FOUR PROJECTS: PREDICTION AND FORECASTING USING MACHINE LEARNING WITH PYTHON


Book Description

PROJECT 1: GOLD PRICE ANALYSIS AND FORECASTING USING MACHINE LEARNING WITH PYTHON

The challenge of this project is to accurately predict the future adjusted closing price of a Gold ETF over a given period. This is a regression problem, because the output value (the adjusted closing price) is continuous. Data for this study was collected from various sources and covers November 18th 2011 to January 1st 2019. The dataset has 1718 rows and 80 columns. The attributes gathered include the oil price, the Standard and Poor’s (S&P) 500 index, the Dow Jones Index, US bond rates (10 years), Euro USD exchange rates, prices of the precious metals Silver and Platinum and of other metals such as Palladium and Rhodium, and prices of the US Dollar Index, Eldorado Gold Corporation, and the Gold Miners ETF. To forecast the adjusted closing price of gold with regression, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression.
The machine learning models used to predict gold daily returns as the target variable are the K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, MLP classifier, and Extra Trees classifier. Finally, you will plot the decision boundary, distribution of features, feature importance, predicted values versus true values, confusion matrix, learning curve, performance of the model, and scalability of the model.

PROJECT 2: WIND POWER ANALYSIS AND FORECASTING USING MACHINE LEARNING WITH PYTHON

Renewable energy remains one of the most important topics for a sustainable future. Wind, being a perennial source of power, could be utilized to satisfy our power requirements. With the rise of wind farms, wind power forecasting would prove to be quite useful. The dataset contains various weather, turbine and rotor features. Data was recorded from January 2018 to March 2020, with readings taken at 10-minute intervals. A long-term wind forecasting technique is thus required. The attributes in the dataset are as follows: ActivePower, AmbientTemperature, BearingShaftTemperature, Blade1PitchAngle, Blade2PitchAngle, Blade3PitchAngle, ControlBoxTemperature, GearboxBearingTemperature, GearboxOilTemperature, GeneratorRP, GeneratorWinding1Temperature, GeneratorWinding2Temperature, HubTemperature, MainBoxTemperature, NacellePosition, ReactivePower, RotorRPM, TurbineStatus, WTG, WindDirection, and WindSpeed. To forecast active power with regression, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression.
To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict categorized active power as the target variable are the K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot the decision boundary, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.

PROJECT 3: MACHINE LEARNING FOR CONCRETE COMPRESSIVE STRENGTH ANALYSIS AND PREDICTION WITH PYTHON

Concrete is the most important material in civil engineering. The concrete compressive strength is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate. The actual concrete compressive strength (MPa) for a given mixture at a specific age (days) was determined in the laboratory. The dataset is in raw form (not scaled). It has 1030 observations and 9 attributes: 8 quantitative input variables and 1 quantitative output variable. The attributes in the dataset are as follows: Cement (component 1); Blast Furnace Slag (component 2); Fly Ash (component 3); Water (component 4); Superplasticizer (component 5); Coarse Aggregate (component 6); Fine Aggregate (component 7); Age; and Concrete compressive strength.
To perform regression on concrete compressive strength, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict clusters as the target variable are the K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot the decision boundary, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.

PROJECT 4: DATA SCIENCE FOR SALES ANALYSIS, FORECASTING, CLUSTERING, AND PREDICTION WITH PYTHON

The dataset used in this project is from Walmart, a renowned retail corporation that operates a chain of hypermarkets. Walmart has provided data combining 45 stores, including store information and monthly sales. The data is provided on a weekly basis. Walmart wants to find the impact of holidays on store sales. To that end, the dataset includes four holiday weeks: Christmas, Thanksgiving, Super Bowl, and Labor Day. In this project, you are going to analyze and forecast weekly sales, perform clustering, and predict the resulting clusters. The dataset covers sales from 2010-02-05 to 2012-11-01.
The attributes in the dataset are as follows: Store - the store number; Date - the week of sales; Weekly_Sales - sales for the given store; Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week); Temperature - temperature on the day of sale; Fuel_Price - cost of fuel in the region; CPI - prevailing consumer price index; and Unemployment - prevailing unemployment rate. To perform regression on weekly sales, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict clusters as the target variable are the K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot the decision boundary, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
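
The cluster-then-classify workflow named in these project descriptions (group the rows with K-Means, then train a classifier to predict the resulting cluster) can be sketched in a few lines. This is only an illustration under stated assumptions, not the book's code: K-Means and the nearest-neighbor vote are hand-written in plain Python rather than taken from a library, the evenly spaced centroid initialization is a simplification chosen for reproducibility, and the function names and two-feature toy points are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iterations=100):
    """Plain K-Means; returns (labels, centroids).

    Assumption: centroids start at evenly spaced input points so the run is
    deterministic. Library implementations usually use random or k-means++ starts.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # New centroid is the per-feature mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a cluster label by majority vote among the k nearest points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: squared_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per row, two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)

# The cluster labels now act as the target variable for a classifier.
predicted_low = knn_predict(points, labels, (1.1, 0.9))
predicted_high = knn_predict(points, labels, (7.9, 8.0))
print(labels, predicted_low, predicted_high)
```

Treating the cluster label as a target turns an unsupervised grouping into a supervised task, so the same classifier evaluation tools (confusion matrix, learning curve) apply to it.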




Time Series Forecasting in Python


Book Description

Build predictive models from time-based patterns in your data. Master statistical models including new deep learning approaches for time series forecasting.

In Time Series Forecasting in Python you will learn how to:

Recognize a time series forecasting problem and build a performant predictive model
Create univariate forecasting models that account for seasonal effects and external variables
Build multivariate forecasting models to predict many time series at once
Leverage large datasets by using deep learning for forecasting time series
Automate the forecasting process

Time Series Forecasting in Python teaches you to build powerful predictive models from time-based data. Every model you create is relevant, useful, and easy to implement with Python. You’ll explore interesting real-world datasets like Google’s daily stock price and economic data for the USA, quickly progressing from the basics to developing large-scale models that use deep learning tools like TensorFlow.

About the technology

You can predict the future, with a little help from Python, deep learning, and time series data! Time series forecasting is a technique for modeling time-centric data to identify upcoming events. New Python libraries and powerful deep learning tools make accurate time series forecasts easier than ever before.

About the book

Time Series Forecasting in Python teaches you how to get immediate, meaningful predictions from time-based data such as logs, customer analytics, and other event streams. In this accessible book, you’ll learn statistical and deep learning methods for time series forecasting, fully demonstrated with annotated Python code. Develop your skills with projects like predicting the future volume of drug prescriptions, and you’ll soon be ready to build your own accurate, insightful forecasts.

What's inside

Create models for seasonal effects and external variables
Multivariate forecasting models to predict multiple time series
Deep learning for large datasets
Automate the forecasting process

About the reader

For data scientists familiar with Python and TensorFlow.

About the author

Marco Peixeiro is a seasoned data science instructor who has worked as a data scientist for one of Canada’s largest banks.

Table of Contents

PART 1 TIME WAITS FOR NO ONE
1 Understanding time series forecasting
2 A naive prediction of the future
3 Going on a random walk

PART 2 FORECASTING WITH STATISTICAL MODELS
4 Modeling a moving average process
5 Modeling an autoregressive process
6 Modeling complex time series
7 Forecasting non-stationary time series
8 Accounting for seasonality
9 Adding external variables to our model
10 Forecasting multiple time series
11 Capstone: Forecasting the number of antidiabetic drug prescriptions in Australia

PART 3 LARGE-SCALE FORECASTING WITH DEEP LEARNING
12 Introducing deep learning for time series forecasting
13 Data windowing and creating baselines for deep learning
14 Baby steps with deep learning
15 Remembering the past with LSTM
16 Filtering a time series with CNN
17 Using predictions to make more predictions
18 Capstone: Forecasting the electric power consumption of a household

PART 4 AUTOMATING FORECASTING AT SCALE
19 Automating time series forecasting with Prophet
20 Capstone: Forecasting the monthly average retail price of steak in Canada
21 Going above and beyond




Python Machine Learning Projects


Book Description

As machine learning is increasingly leveraged to find patterns, conduct analysis, and make decisions — sometimes without final input from humans who may be impacted by these findings — it is crucial to invest in bringing more stakeholders into the fold. This book of Python projects in machine learning tries to do just that: to equip the developers of today and tomorrow with tools they can use to better understand, evaluate, and shape machine learning to help ensure that it is serving us all. This book will set you up with a Python programming environment if you don’t have one already, then provide you with a conceptual understanding of machine learning in the chapter “An Introduction to Machine Learning.” What follows next are three Python machine learning projects. They will help you create a machine learning classifier, build a neural network to recognize handwritten digits, and give you a background in deep reinforcement learning through building a bot for Atari.




Machine Learning for Time Series Forecasting with Python


Book Description

Learn how to apply the principles of machine learning to time series modeling with this indispensable resource.

Machine Learning for Time Series Forecasting with Python is an incisive and straightforward examination of one of the most crucial elements of decision-making in finance, marketing, education, and healthcare: time series modeling. Despite the centrality of time series forecasting, few business analysts are familiar with the power or utility of applying machine learning to time series modeling. Author Francesca Lazzeri, a distinguished machine learning scientist and economist, corrects that deficiency by providing readers with a comprehensive and approachable explanation and treatment of the application of machine learning to time series forecasting.

Written for readers who have little to no experience in time series forecasting or machine learning, the book comprehensively covers all the topics necessary to:

Understand time series forecasting concepts, such as stationarity, horizon, trend, and seasonality
Prepare time series data for modeling
Evaluate time series forecasting models’ performance and accuracy
Understand when to use neural networks instead of traditional time series models in time series forecasting

Machine Learning for Time Series Forecasting with Python is full of real-world examples, resources and concrete strategies to help readers explore and transform data and develop usable, practical time series forecasts. Perfect for entry-level data scientists, business analysts, developers, and researchers, this book is an invaluable and indispensable guide to the fundamental and advanced concepts of machine learning applied to time series modeling.
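
The concepts this description lists (stationarity, horizon, trend, seasonality) can be made concrete with a small worked example. The sketch below is not taken from the book: it builds a hypothetical series with a linear trend and a four-step seasonal pattern, shows that differencing at the seasonal lag leaves a constant series, and produces a seasonal naive forecast over a chosen horizon. The function names, the period of 4, and the toy values are all assumptions made for illustration.

```python
def difference(series, lag=1):
    """Subtract the value `lag` steps earlier from each observation."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_naive_forecast(series, period, horizon):
    """Repeat the last full season to forecast `horizon` steps ahead."""
    last_season = series[-period:]
    return [last_season[h % period] for h in range(horizon)]


# Hypothetical series: trend of +2 per step plus a repeating 4-step pattern.
PERIOD = 4
seasonal_pattern = [0, 5, 0, -5]
series = [2 * t + seasonal_pattern[t % PERIOD] for t in range(12)]

# Differencing at the seasonal lag cancels the repeating pattern and turns the
# linear trend into a constant, a common step toward a stationary series.
seasonal_diff = difference(series, lag=PERIOD)

# The horizon is how many future steps are forecast.
forecast = seasonal_naive_forecast(series, period=PERIOD, horizon=6)

print(series)
print(seasonal_diff)
print(forecast)
```

Note that the seasonal naive forecast copies the last season as-is and ignores the trend, which is why it is normally used as a baseline rather than as a final model.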




Practical Time Series Analysis


Book Description

Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase.

Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques. Author Aileen Nielsen offers an accessible, well-rounded introduction to time series in both R and Python that will have data scientists, software engineers, and researchers up and running quickly.

You’ll get the guidance you need to confidently:

Find and wrangle time series data
Undertake exploratory time series data analysis
Store temporal data
Simulate time series data
Generate and select features for a time series
Measure error
Forecast and classify time series with machine or deep learning
Evaluate accuracy and performance




Deep Learning for Time Series Forecasting


Book Description

Deep learning methods offer a lot of promise for time series forecasting, such as the automatic learning of temporal dependence and the automatic handling of temporal structures like trends and seasonality. With clear explanations, standard Python libraries, and step-by-step tutorial lessons you’ll discover how to develop deep learning models for your own time series forecasting projects.




Machine Learning for Time-Series with Python


Book Description

Get better insights from time-series data and become proficient in model performance analysis.

Key Features

Explore popular and modern machine learning methods including the latest online and deep learning algorithms
Learn to increase the accuracy of your predictions by matching the right model with the right problem
Master time series via real-world case studies on operations management, digital marketing, finance, and healthcare

The Python time-series ecosystem is huge and often quite hard to get a good grasp on, especially since there are so many new libraries and new models. This book aims to deepen your understanding of time series by providing a comprehensive overview of popular Python time-series packages and helping you build better predictive systems.

Machine Learning for Time-Series with Python starts by re-introducing the basics of time series and then builds your understanding of traditional autoregressive models as well as modern non-parametric models. By observing practical examples and the theory behind them, you will become confident with loading time-series datasets from any source, deep learning models like recurrent neural networks and causal convolutional network models, and gradient boosting with feature engineering.

This book will also guide you in matching the right model to the right problem by explaining the theory behind several useful models. You'll also have a look at real-world case studies covering weather, traffic, biking, and stock market data. By the end of this book, you should feel at home with effectively analyzing and applying machine learning methods to time-series.

What you will learn

Understand the main classes of time series and learn how to detect outliers and patterns
Choose the right method to solve time-series problems
Characterize seasonal and correlation patterns through autocorrelation and statistical techniques
Get to grips with time-series data visualization
Understand classical time-series models like ARMA and ARIMA
Implement deep learning models, like Gaussian processes, transformers, and state-of-the-art machine learning models
Become familiar with many libraries like Prophet, XGBoost, and TensorFlow

Who this book is for

This book is ideal for data analysts, data scientists, and Python developers who want instantly useful and practical recipes to implement today, and a comprehensive reference book for tomorrow. Basic knowledge of the Python programming language is a must, while familiarity with statistics will help you get the most out of this book.




Introduction to Time Series Forecasting With Python


Book Description

Time series forecasting is different from other machine learning problems. The key difference is the fixed sequence of observations and the constraints and additional structure this provides. In this Ebook, finally cut through the math and specialized methods for time series forecasting. Using clear explanations, standard Python libraries and step-by-step tutorials you will discover how to load and prepare data, evaluate model skill, and implement forecasting models for time series data.
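
Because the order of observations is fixed, model skill on a time series is usually measured without shuffling: the model only ever sees data that precedes the point it predicts. A minimal sketch of that idea follows, assuming a persistence baseline (predict the previous value) scored by walk-forward validation with RMSE. It is an illustration rather than code from the ebook, and the function names and toy series are hypothetical.

```python
import math


def walk_forward_persistence(series, n_test):
    """Step through the last `n_test` points in time order.

    At each step the forecast is the most recent known value (persistence),
    after which the true observation is revealed and added to the history.
    """
    history = list(series[:-n_test])
    test = list(series[-n_test:])
    predictions = []
    for actual in test:
        predictions.append(history[-1])  # forecast uses only past data
        history.append(actual)           # the true value becomes part of history
    return predictions, test


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observations in time order.
series = [10, 12, 13, 12, 15, 16, 18, 17]

predictions, test = walk_forward_persistence(series, n_test=3)
score = rmse(test, predictions)

print(predictions, test, score)
```

Any candidate forecasting model can be dropped into the same loop in place of `history[-1]`; a model that cannot beat the persistence score is not adding skill.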




Python Machine Learning


Book Description

Unlock deeper insights into Machine Learning with this vital guide to cutting-edge predictive analytics.

About This Book

Leverage Python's most powerful open-source libraries for deep learning, data wrangling, and data visualization
Learn effective strategies and best practices to improve and optimize machine learning systems and algorithms
Ask – and answer – tough questions of your data with robust statistical models, built for a range of datasets

Who This Book Is For

If you want to find out how to use Python to start answering critical questions of your data, pick up Python Machine Learning – whether you want to get started from scratch or want to extend your data science knowledge, this is an essential and unmissable resource.

What You Will Learn

Explore how to use different machine learning models to ask different questions of your data
Learn how to build neural networks using Keras and Theano
Find out how to write clean and elegant Python code that will optimize the strength of your algorithms
Discover how to embed your machine learning model in a web application for increased accessibility
Predict continuous target outcomes using regression analysis
Uncover hidden patterns and structures in data with clustering
Organize data using effective pre-processing techniques
Get to grips with sentiment analysis to delve deeper into textual and social media data

In Detail

Machine learning and predictive analytics are transforming the way businesses and other organizations operate. Being able to understand trends and patterns in complex data is critical to success, becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace. Python can help you deliver key insights into your data – its unique capabilities as a language let you build sophisticated algorithms and statistical models that can reveal new perspectives and answer key questions that are vital for success.
Python Machine Learning gives you access to the world of predictive analytics and demonstrates why Python is one of the world's leading data science languages. If you want to ask better questions of data, or need to improve and extend the capabilities of your machine learning systems, this practical data science book is invaluable. Covering a wide range of powerful Python libraries, including scikit-learn, Theano, and Keras, and featuring guidance and tips on everything from sentiment analysis to neural networks, you'll soon be able to answer some of the most important questions facing you and your organization.

Style and approach

Python Machine Learning connects the fundamental theoretical principles behind machine learning to their practical application in a way that focuses you on asking and answering the right questions. It walks you through the key elements of Python and its powerful machine learning libraries, while demonstrating how to get to grips with a range of statistical models.




Forecasting: principles and practice


Book Description

Forecasting is required in many situations. Stocking an inventory may require forecasts of demand months in advance. Telecommunication routing requires traffic forecasts a few minutes ahead. Whatever the circumstances or time horizons involved, forecasting is an important aid in effective and efficient planning. This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly.