HOUSEHOLD ELECTRIC POWER CONSUMPTION: ANALYSIS, CLUSTERING, AND PREDICTION WITH PYTHON


Book Description

In this project, you will perform analysis, clustering, and prediction on household electric power consumption with Python. The dataset used in this project contains 2075259 measurements gathered between December 2006 and November 2010 (47 months). The attributes in the dataset are as follows:

● date: date in format dd/mm/yyyy
● time: time in format hh:mm:ss
● global_active_power: household global minute-averaged active power (in kilowatt)
● global_reactive_power: household global minute-averaged reactive power (in kilowatt)
● voltage: minute-averaged voltage (in volt)
● global_intensity: household global minute-averaged current intensity (in ampere)
● sub_metering_1: energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered)
● sub_metering_2: energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light
● sub_metering_3: energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner

In this project, you will perform clustering using KMeans to get 5 clusters. The machine learning models used in this project to perform regression on total number of purchase and to predict clusters as target variable are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM, Gradient Boosting, XGB, and MLP. Finally, you will plot the decision boundary, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
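As a rough illustration of how the attributes above fit together, the sketch below parses records in the stated date and time formats and converts the minute-averaged active power (kilowatt) into watt-hours, so that it can be compared with the three sub-metering readings. The semicolon delimiter, the "?" marker for missing readings, the column spellings, and the sample values are assumptions made for this example, not details taken from the book.

```python
import csv
import io
from datetime import datetime

# Two made-up rows; delimiter, "?" marker and column spellings are assumptions.
SAMPLE = (
    "date;time;global_active_power;global_reactive_power;voltage;"
    "global_intensity;sub_metering_1;sub_metering_2;sub_metering_3\n"
    "16/12/2006;17:24:00;4.2;0.4;234.8;18.0;0.0;1.0;17.0\n"
    "16/12/2006;17:25:00;?;?;?;?;?;?;?\n"
)


def parse_rows(text):
    """Yield (timestamp, readings) pairs, skipping rows with missing values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    for row in reader:
        values = list(row.values())
        if "?" in values or "" in values:
            continue  # incomplete measurement, ignore it in this sketch
        stamp = datetime.strptime(
            row["date"] + " " + row["time"], "%d/%m/%Y %H:%M:%S"
        )
        readings = {
            name: float(value)
            for name, value in row.items()
            if name not in ("date", "time")
        }
        yield stamp, readings


def unmetered_wh(readings):
    """Watt-hours used in one minute by equipment outside the three sub-meters."""
    # A power of P kilowatts sustained for one minute is P * 1000 / 60 watt-hours.
    total_wh = readings["global_active_power"] * 1000 / 60
    metered_wh = sum(readings[f"sub_metering_{i}"] for i in (1, 2, 3))
    return total_wh - metered_wh


rows = list(parse_rows(SAMPLE))
for stamp, readings in rows:
    print(stamp.isoformat(), round(unmetered_wh(readings), 3))
```

The unit conversion is the part worth noting: the global power is reported in kilowatts while the sub-meters report watt-hours, so the two cannot be subtracted directly.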




Mastering Time Series Analysis and Forecasting with Python


Book Description

Decode the language of time with Python. Discover powerful techniques to analyze, forecast, and innovate.

Key Features
● Dive into time series analysis fundamentals, progressing to advanced Python techniques.
● Gain practical expertise with real-world datasets and hands-on examples.
● Strengthen skills with code snippets, exercises, and projects for deeper understanding.

Book Description
"Mastering Time Series Analysis and Forecasting with Python" is an essential handbook tailored for those seeking to harness the power of time series data in their work. The book begins with foundational concepts and seamlessly guides readers through Python libraries such as Pandas, NumPy, and Plotly for effective data manipulation, visualization, and exploration. Offering pragmatic insights, it enables adept visualization, pattern recognition, and anomaly detection. Advanced discussions cover feature engineering and a spectrum of forecasting methodologies, including machine learning and deep learning techniques such as ARIMA, LSTM, and CNN. Additionally, the book covers multivariate and multiple time series forecasting, providing readers with a comprehensive understanding of advanced modeling techniques and their applications across diverse domains. Readers develop expertise in crafting precise predictive models and addressing real-world complexities. Complete with illustrative examples, code snippets, and hands-on exercises, this manual empowers readers to excel, make informed decisions, and derive optimal value from time series data.

What you will learn
● Understand the fundamentals of time series data, including temporal patterns, trends, and seasonality.
● Proficiently utilize Python libraries such as pandas, NumPy, and matplotlib for efficient data manipulation and visualization.
● Conduct exploratory analysis of time series data, including identifying patterns, detecting anomalies, and extracting meaningful features.
● Build accurate and reliable predictive models using a variety of machine learning and deep learning techniques, including ARIMA, LSTM, and CNN.
● Perform multivariate and multiple time series forecasting, allowing for more comprehensive analysis and prediction across diverse datasets.
● Evaluate model performance using a range of metrics and validation techniques, ensuring the reliability and robustness of predictive models.

Table of Contents
1. Introduction to Time Series
2. Overview of Time Series Libraries in Python
3. Visualization of Time Series Data
4. Exploratory Analysis of Time Series Data
5. Feature Engineering on Time Series
6. Time Series Forecasting – ML Approach Part 1
7. Time Series Forecasting – ML Approach Part 2
8. Time Series Forecasting - DL Approach
9. Multivariate Time Series, Metrics, and Validation
Index
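The "metrics and validation" theme listed above can be illustrated with a small sketch of expanding-window (walk-forward) validation, in which every test block lies strictly after its training data. This is not code from the book: the function names, the made-up series, and the naive "repeat the last value" forecast are all assumptions chosen to keep the example dependency-free.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_range, test_range) pairs; training always precedes testing."""
    for k in range(n_splits):
        test_end = n_samples - (n_splits - 1 - k) * test_size
        test_start = test_end - test_size
        if test_start < 1:
            raise ValueError("not enough samples for the requested splits")
        yield range(0, test_start), range(test_start, test_end)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [10, 12, 13, 12, 15, 16, 18, 17, 19, 21]  # made-up observations

scores = []
for train_idx, test_idx in expanding_window_splits(len(series), n_splits=3, test_size=2):
    last_seen = series[train_idx[-1]]          # naive forecast: repeat last training value
    predicted = [last_seen] * len(test_idx)
    actual = [series[i] for i in test_idx]
    scores.append(mean_absolute_error(actual, predicted))

print(scores)
```

Unlike shuffled cross-validation, this scheme never lets a model see observations that come after the ones it is asked to predict, which is the usual reason for preferring it on time series.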




Computational Science – ICCS 2019


Book Description

The five-volume set LNCS 11536, 11537, 11538, 11539, and 11540 constitutes the proceedings of the 19th International Conference on Computational Science, ICCS 2019, held in Faro, Portugal, in June 2019. The total of 65 full papers and 168 workshop papers presented in this book set were carefully reviewed and selected from 573 submissions (228 submissions to the main track and 345 submissions to the workshops).

The papers were organized in topical sections named:

Part I: ICCS Main Track
Part II: ICCS Main Track; Track of Advances in High-Performance Computational Earth Sciences: Applications and Frameworks; Track of Agent-Based Simulations, Adaptive Algorithms and Solvers; Track of Applications of Matrix Methods in Artificial Intelligence and Machine Learning; Track of Architecture, Languages, Compilation and Hardware Support for Emerging and Heterogeneous Systems
Part III: Track of Biomedical and Bioinformatics Challenges for Computer Science; Track of Classifier Learning from Difficult Data; Track of Computational Finance and Business Intelligence; Track of Computational Optimization, Modelling and Simulation; Track of Computational Science in IoT and Smart Systems
Part IV: Track of Data-Driven Computational Sciences; Track of Machine Learning and Data Assimilation for Dynamical Systems; Track of Marine Computing in the Interconnected World for the Benefit of the Society; Track of Multiscale Modelling and Simulation; Track of Simulations of Flow and Transport: Modeling, Algorithms and Computation
Part V: Track of Smart Systems: Computer Vision, Sensor Networks and Machine Learning; Track of Solving Problems with Uncertainties; Track of Teaching Computational Science; Poster Track ICCS 2019

Chapter “Comparing Domain-decomposition Methods for the Parallelization of Distributed Land Surface Models” is available open access under a Creative Commons Attribution 4.0 International License via link.springer.com.




Deep Learning for Time Series Forecasting


Book Description

Deep learning methods offer a lot of promise for time series forecasting, such as the automatic learning of temporal dependence and the automatic handling of temporal structures like trends and seasonality. With clear explanations, standard Python libraries, and step-by-step tutorial lessons you’ll discover how to develop deep learning models for your own time series forecasting projects.
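Before a neural network can learn temporal dependence, a series is commonly reframed as input/output windows. The sketch below shows one way to do that in plain Python; the function name `split_sequence`, its parameters, and the sample values are assumptions for illustration and are not quoted from the book.

```python
def split_sequence(series, n_in, n_out=1):
    """Turn a series into (input window, output window) pairs for supervised learning."""
    samples, targets = [], []
    for start in range(len(series) - n_in - n_out + 1):
        samples.append(series[start:start + n_in])                   # past n_in steps
        targets.append(series[start + n_in:start + n_in + n_out])    # next n_out steps
    return samples, targets


X, y = split_sequence([10, 20, 30, 40, 50, 60], n_in=3, n_out=2)
for window, target in zip(X, y):
    print(window, "->", target)
```

Each row of `X` could then be fed to a model as one training example, with the matching row of `y` as the multi-step forecast target.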




Least Squares Support Vector Machines


Book Description

This book focuses on Least Squares Support Vector Machines (LS-SVMs), which are reformulations of standard SVMs. LS-SVMs are closely related to regularization networks and Gaussian processes but additionally emphasize and exploit primal-dual interpretations from optimization theory. The authors explain the natural links between LS-SVM classifiers and kernel Fisher discriminant analysis. Bayesian inference of LS-SVM models is discussed, together with methods for imposing sparseness and employing robust statistics. The framework is further extended towards unsupervised learning by considering PCA and its kernel version as a one-class modelling problem. This leads to new primal-dual support vector machine formulations for kernel PCA and kernel CCA. Furthermore, LS-SVM formulations are given for recurrent networks and control. In general, support vector machines may pose heavy computational challenges for large data sets. For this purpose, a method of fixed size LS-SVM is proposed where the estimation is done in the primal space in relation to a Nyström sampling with active selection of support vectors. The methods are illustrated with several examples.
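To give a feel for what the reformulation buys, the sketch below fits an LS-SVM for function estimation by solving a single linear system in the dual variables, rather than the quadratic program of a standard SVM. It is a minimal illustration under stated assumptions: the RBF kernel, the values of `gamma` and `sigma`, the toy data, and the hand-written Gaussian elimination are choices made for this example and are not taken from the book.

```python
import math


def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_ls_svm(xs, ys, gamma=1e6, sigma=1.0):
    """Return (b, alpha) solving [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        kernel_row = [
            rbf(xs[i], xs[j], sigma) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ]
        A.append([1.0] + kernel_row)
    solution = solve(A, [0.0] + list(ys))
    return solution[0], solution[1:]


def predict(x, xs, b, alpha, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, xs))


train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.0, 1.0, 4.0, 9.0]
b, alpha = fit_ls_svm(train_x, train_y)
print([round(predict(x, train_x, b, alpha), 3) for x in train_x])
```

Note that every training point receives a nonzero dual weight here, which is why the blurb mentions separate methods for imposing sparseness.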




Forecasting and Assessing Risk of Individual Electricity Peaks


Book Description

The overarching aim of this open access book is to present self-contained theory and algorithms for investigation and prediction of electric demand peaks. A cross-section of popular demand forecasting algorithms from statistics, machine learning and mathematics is presented, followed by extreme value theory techniques with examples. In order to achieve carbon targets, good forecasts of peaks are essential. For instance, shifting demand or charging a battery depends on correct and timely demand predictions. Historically, the majority of forecasting algorithms focused on average load prediction. In order to model the peaks, methods from extreme value theory are applied. This allows us to study extremes without making any assumption on the central parts of the demand distribution and to predict beyond the range of available data. While applied to individual loads, the techniques described in this book can be extended naturally to substations, or to commercial settings. The extreme value theory techniques presented can also be used across other disciplines, for example for predicting heavy rainfalls, wind speed, solar radiation and extreme weather events. The book is intended for students, academics, engineers and professionals who are interested in short-term load prediction, energy data analytics, battery control, demand side response and data science in general.
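One common entry point to extreme value analysis is the peaks-over-threshold view, which looks only at the amounts by which observations exceed a high threshold and ignores the central part of the distribution. The sketch below is a minimal illustration of that idea, not the book's algorithm: the function names, the threshold, and the load values are made up for this example.

```python
def exceedances(loads, threshold):
    """Amounts by which observations exceed the threshold."""
    return [x - threshold for x in loads if x > threshold]


def mean_excess(loads, threshold):
    """Average size of an exceedance, given that the threshold was exceeded."""
    excess = exceedances(loads, threshold)
    if not excess:
        raise ValueError("no observations above the threshold")
    return sum(excess) / len(excess)


def exceedance_rate(loads, threshold):
    """Fraction of observations that lie above the threshold."""
    return len(exceedances(loads, threshold)) / len(loads)


# Made-up half-hourly household loads in kWh.
loads = [0.25, 0.5, 2.0, 0.5, 2.5, 0.25, 3.5, 0.5]
threshold = 1.5

print(exceedances(loads, threshold))
print(round(mean_excess(loads, threshold), 3))
print(exceedance_rate(loads, threshold))
```

In practice a tail distribution would then be fitted to the exceedances; that step is left out here to keep the sketch free of external libraries.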




Distributed Computer and Communication Networks


Book Description

This book constitutes the refereed proceedings of the 21st International Conference on Distributed Computer and Communication Networks, DCCN 2018, held in Moscow, Russia, in September 2018. The 50 full papers and the 9 short papers were carefully reviewed and selected from 168 submissions. The papers cover the following topics: computer and communication networks architecture optimization; control in computer and communication networks; performance and QoS/QoE evaluation in wireless networks; analytical modeling and simulation of next-generation communications systems; queueing theory and reliability theory applications in computer networks; wireless 4G/5G networks, cm- and mm-wave radio technologies; RFID technology and its application in intellectual transportation networks; Internet of Things, wearables, and applications of distributed information systems; probabilistic and statistical models in information systems; mathematical modeling of high-tech systems; mathematical modeling and control problems; distributed and cloud computing systems, big data analytics.




Discrete Choice Methods with Simulation


Book Description

This book describes the new generation of discrete choice methods, focusing on the many advances that are made possible by simulation. Researchers use these statistical methods to examine the choices that consumers, households, firms, and other agents make. Each of the major models is covered: logit, generalized extreme value, or GEV (including nested and cross-nested logits), probit, and mixed logit, plus a variety of specifications that build on these basics. Simulation-assisted estimation procedures are investigated and compared, including maximum simulated likelihood, method of simulated moments, and method of simulated scores. Procedures for drawing from densities are described, including variance reduction techniques such as antithetics and Halton draws. Recent advances in Bayesian procedures are explored, including the use of the Metropolis-Hastings algorithm and its variant Gibbs sampling. The second edition adds chapters on endogeneity and expectation-maximization (EM) algorithms. No other book incorporates all these fields, which have arisen in the past 25 years. The procedures are applicable in many fields, including energy, transportation, environmental studies, health, labor, and marketing.
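The two variance reduction ideas named above can be sketched in a few lines. A Halton sequence spreads draws evenly over the unit interval by reversing the digits of the draw index in a prime base, and an antithetic draw pairs each uniform value u with 1 - u. The function names and the choice of bases below are assumptions for illustration; this is not code from the book.

```python
def halton(index, base):
    """Return element number `index` (starting at 1) of the Halton sequence for a prime base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction   # mirror the base-`base` digits around the radix point
        fraction /= base
    return result


def with_antithetics(draws):
    """Append the mirror image 1 - u of every uniform draw u."""
    return list(draws) + [1.0 - u for u in draws]


base2 = [halton(i, 2) for i in range(1, 5)]
print(base2)
print(with_antithetics(base2))
print([round(halton(i, 3), 4) for i in range(1, 5)])
```

To draw from another density, each uniform value would typically be passed through the inverse cumulative distribution function of that density.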




Data-driven Analytics for Sustainable Buildings and Cities


Book Description

This book explores the interdisciplinary and transdisciplinary fields of energy systems, occupant behavior, thermal comfort, air quality and economic modelling across the levels of buildings, communities and cities, through various data analytical approaches. It highlights the complex interplay of heating/cooling, ventilation and power systems in different processes, such as design, renovation and operation, for buildings, communities and cities. Methods from classical statistics, machine learning and artificial intelligence are applied to analyses of different building/urban components and systems. Knowledge from this book helps accelerate the sustainability of society by using data analysis to improve the liveability of both built and urban environments. This book targets a broad readership with specific experience and knowledge in data analysis, energy systems, the built environment and urban planning. As such, it appeals to researchers, graduate students, data scientists, engineers, consultants, urban scientists, investors and policymakers, with interests in energy flexibility, building/city resilience and climate neutrality.




Understanding Machine Learning


Book Description

Introduces machine learning and its algorithmic paradigms, explaining the principles behind automated learning approaches and the considerations underlying their usage.