Python 3 and Data Analytics Pocket Primer


Book Description

As part of the best-selling Pocket Primer series, this book is designed to introduce the reader to the basic concepts of data analytics using Python 3. It is intended to be a fast-paced introduction to some basic features of data analytics and also covers statistics, data visualization, and data cleaning. The book includes numerous code samples using NumPy, Pandas, Matplotlib, and Seaborn, and features an appendix on regular expressions. Companion files with source code and color figures are available online by emailing the publisher with proof of purchase.

FEATURES:
- Includes a concise introduction to Python 3
- Provides a thorough introduction to data and data cleaning
- Covers NumPy and Pandas
- Introduces statistical concepts and data visualization (Matplotlib/Seaborn)
- Features an appendix on regular expressions
- Includes companion files with source code and figures




Python Tools for Data Scientists Pocket Primer


Book Description

As part of the best-selling Pocket Primer series, this book is designed to provide a thorough introduction to numerous Python tools for data scientists. The book covers features of NumPy and Pandas, how to write regular expressions, and how to perform data cleaning tasks. It includes separate chapters on data visualization and working with Sklearn and SciPy. Companion files with source code are available.

FEATURES:
- Introduces Python, NumPy, Sklearn, SciPy, and awk
- Covers data cleaning tasks and data visualization
- Features numerous code samples throughout
- Includes companion files with source code




Dealing With Data Pocket Primer


Book Description

As part of the best-selling Pocket Primer series, this book is designed to introduce the reader to the basic concepts of managing data using a variety of computer languages and applications. It is intended to be a fast-paced introduction to some basic features of data management and covers statistical concepts, data-related techniques, features of Pandas, RDBMS, SQL, NLP topics, Matplotlib, and data visualization. Companion files with source code and color figures are available.

FEATURES:
- Covers Pandas, RDBMS, NLP, data cleaning, SQL, and data visualization
- Introduces probability and statistical concepts
- Features numerous code samples throughout
- Includes companion files with source code and figures




Data Science Fundamentals Pocket Primer


Book Description

As part of the best-selling Pocket Primer series, this book is designed to introduce the reader to the basic concepts of data science using Python 3 and other computer applications. It is intended to be a fast-paced introduction to some basic features of data analytics and also covers statistics, data visualization, linear algebra, and regular expressions. The book includes numerous code samples using Python, NumPy, R, SQL, NoSQL, and Pandas. Companion files with source code and color figures are available.

FEATURES:
- Includes a concise introduction to Python 3 and linear algebra
- Provides a thorough introduction to data visualization and regular expressions
- Covers NumPy, Pandas, R, and SQL
- Introduces probability and statistical concepts
- Features numerous code samples throughout
- Includes companion files with source code and figures




Data Pipelines Pocket Reference


Book Description

Data pipelines are the foundation for success in data analytics. Moving data from numerous diverse sources and transforming it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack. You'll learn common considerations and key decision points when implementing pipelines, such as batch versus streaming data ingestion and build versus buy. This book addresses the most common decisions made by data professionals and discusses foundational concepts that apply to open source frameworks, commercial products, and homegrown solutions.

You'll learn:
- What a data pipeline is and how it works
- How data is moved and processed on modern data infrastructure, including cloud platforms
- Common tools and products used by data engineers to build pipelines
- How pipelines support analytics and reporting needs
- Considerations for pipeline maintenance, testing, and alerting




Data Science and Machine Learning


Book Description

- Focuses on mathematical understanding
- Presentation is self-contained, accessible, and comprehensive
- Full color throughout
- Extensive list of exercises and worked-out examples
- Many concrete algorithms with actual code




A Primer on Scientific Programming with Python


Book Description

The book serves as a first introduction to computer programming of scientific applications, using the high-level Python language. The exposition is example- and problem-oriented, where the applications are taken from mathematics, numerical calculus, statistics, physics, biology and finance. The book teaches "Matlab-style" and procedural programming as well as object-oriented programming. High school mathematics is a required background and it is advantageous to study classical and numerical one-variable calculus in parallel with reading this book. Besides learning how to program computers, the reader will also learn how to solve mathematical problems, arising in various branches of science and engineering, with the aid of numerical methods and programming. By blending programming, mathematics and scientific applications, the book lays a solid foundation for practicing computational science.

From the reviews:

“Langtangen ... does an excellent job of introducing programming as a set of skills in problem solving. He guides the reader into thinking properly about producing program logic and data structures for modeling real-world problems using objects and functions and embracing the object-oriented paradigm. ... Summing Up: Highly recommended.” F. H. Wild III, Choice, Vol. 47 (8), April 2010

“Those of us who have learned scientific programming in Python ‘on the streets’ could be a little jealous of students who have the opportunity to take a course out of Langtangen’s Primer.” John D. Cook, The Mathematical Association of America, September 2011

“This book goes through Python in particular, and programming in general, via tasks that scientists will likely perform. It contains valuable information for students new to scientific computing and would be the perfect bridge between an introduction to programming and an advanced course on numerical methods or computational science.” Alex Small, IEEE, CiSE Vol. 14 (2), March/April 2012

“This fourth edition is a wonderful, inclusive textbook that covers pretty much everything one needs to know to go from zero to fairly sophisticated scientific programming in Python...” Joan Horvath, Computing Reviews, March 2015




Learning Spark


Book Description

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark. Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms.

Through step-by-step walk-throughs, code snippets, and notebooks, you’ll be able to:
- Learn Python, SQL, Scala, or Java high-level Structured APIs
- Understand Spark operations and SQL Engine
- Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
- Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
- Perform analytics on batch and streaming data using Structured Streaming
- Build reliable data pipelines with open source Delta Lake and Spark
- Develop machine learning pipelines with MLlib and productionize models using MLflow




Python for Finance


Book Description

The financial industry has recently adopted Python at a tremendous rate, with some of the largest investment banks and hedge funds using it to build core trading and risk management systems. Updated for Python 3, the second edition of this hands-on book helps you get started with the language, guiding developers and quantitative analysts through Python libraries and tools for building financial applications and interactive financial analytics. Using practical examples throughout the book, author Yves Hilpisch also shows you how to develop a full-fledged framework for Monte Carlo simulation-based derivatives and risk analytics, based on a large, realistic case study. Much of the book uses interactive IPython Notebooks.




The Data Science Design Manual


Book Description

This engaging and clearly written textbook/reference provides a must-have introduction to the rapidly emerging interdisciplinary field of data science. It focuses on the principles fundamental to becoming a good data scientist and the key skills needed to build systems for collecting, analyzing, and interpreting data. The Data Science Design Manual is a source of practical insights that highlights what really matters in analyzing data, and provides an intuitive understanding of how these core concepts can be used. The book does not emphasize any particular programming language or suite of data-analysis tools, focusing instead on high-level discussion of important design principles. This easy-to-read text ideally serves the needs of undergraduate and early graduate students embarking on an “Introduction to Data Science” course. It reveals how this discipline sits at the intersection of statistics, computer science, and machine learning, with a distinct heft and character of its own. Practitioners in these and related fields will find this book perfect for self-study as well.

Additional learning tools:
- Contains “War Stories,” offering perspectives on how data science applies in the real world
- Includes “Homework Problems,” providing a wide range of exercises and projects for self-study
- Provides a complete set of lecture slides and online video lectures at www.data-manual.com
- Provides “Take-Home Lessons,” emphasizing the big-picture concepts to learn from each chapter
- Recommends exciting “Kaggle Challenges” from the online platform Kaggle
- Highlights “False Starts,” revealing the subtle reasons why certain approaches fail
- Offers examples taken from the data science television show “The Quant Shop” (www.quant-shop.com)