DevOps for Data Science


Book Description

Data Scientists are experts at analyzing, modelling and visualizing data but, at one point or another, have all encountered difficulties in collaborating with or delivering their work to the people and systems that matter. Born out of the agile software movement, DevOps is a set of practices, principles and tools that help software engineers reliably deploy work to production. This book takes the lessons of DevOps and aplies them to creating and delivering production-grade data science projects in Python and R. This book’s first section explores how to build data science projects that deploy to production with no frills or fuss. Its second section covers the rudiments of administering a server, including Linux, application, and network administration before concluding with a demystification of the concerns of enterprise IT/Administration in its final section, making it possible for data scientists to communicate and collaborate with their organization’s security, networking, and administration teams. Key Features: • Start-to-finish labs take readers through creating projects that meet DevOps best practices and creating a server-based environment to work on and deploy them. • Provides an appendix of cheatsheets so that readers will never be without the reference they need to remember a Git, Docker, or Command Line command. • Distills what a data scientist needs to know about Docker, APIs, CI/CD, Linux, DNS, SSL, HTTP, Auth, and more. • Written specifically to address the concern of a data scientist who wants to take their Python or R work to production. There are countless books on creating data science work that is correct. This book, on the otherhand, aims to go beyond this, targeted at data scientists who want their work to be than merely accurate and deliver work that matters.




Practical DataOps


Book Description

Gain a practical introduction to DataOps, a new discipline for delivering data science at scale inspired by practices at companies such as Facebook, Uber, LinkedIn, Twitter, and eBay. Organizations need more than the latest AI algorithms, hottest tools, and best people to turn data into insight-driven action and useful analytical data products. Processes and thinking employed to manage and use data in the 20th century are a bottleneck for working effectively with the variety of data and advanced analytical use cases that organizations have today. This book provides the approach and methods to ensure continuous rapid use of data to create analytical data products and steer decision making. Practical DataOps shows you how to optimize the data supply chain from diverse raw data sources to the final data product, whether the goal is a machine learning model or other data-orientated output. The book provides an approach to eliminate wasted effort and improve collaboration between data producers, data consumers, and the rest of the organization through the adoption of lean thinking and agile software development principles. This book helps you to improve the speed and accuracy of analytical application development through data management and DevOps practices that securely expand data access, and rapidly increase the number of reproducible data products through automation, testing, and integration. The book also shows how to collect feedback and monitor performance to manage and continuously improve your processes and output. What You Will LearnDevelop a data strategy for your organization to help it reach its long-term goals Recognize and eliminate barriers to delivering data to users at scale Work on the right things for the right stakeholders through agile collaboration Create trust in data via rigorous testing and effective data management Build a culture of learning and continuous improvement through monitoring deployments and measuring outcomes Create cross-functional self-organizing teams focused on goals not reporting lines Build robust, trustworthy, data pipelines in support of AI, machine learning, and other analytical data products Who This Book Is For Data science and advanced analytics experts, CIOs, CDOs (chief data officers), chief analytics officers, business analysts, business team leaders, and IT professionals (data engineers, developers, architects, and DBAs) supporting data teams who want to dramatically increase the value their organization derives from data. The book is ideal for data professionals who want to overcome challenges of long delivery time, poor data quality, high maintenance costs, and scaling difficulties in getting data science output and machine learning into customer-facing production.




Enterprise DevOps for Architects


Book Description

An architect's guide to designing, implementing, and integrating DevOps in the enterprise Key FeaturesDesign a DevOps architecture that is aligned with the overall enterprise architectureDesign systems that are ready for AIOps and make the move toward NoOpsArchitect and implement DevSecOps pipelines, securing the DevOps enterpriseBook Description Digital transformation is the new paradigm in enterprises, but the big question remains: is the enterprise ready for transformation using native technology embedded in Agile/DevOps? With this book, you'll see how to design, implement, and integrate DevOps in the enterprise architecture while keeping the Ops team on board and remaining resilient. The focus of the book is not to introduce the hundreds of different tools that are available for implementing DevOps, but instead to show you how to create a successful DevOps architecture. This book provides an architectural overview of DevOps, AIOps, and DevSecOps – the three domains that drive and accelerate digital transformation. Complete with step-by-step explanations of essential concepts, practical examples, and self-assessment questions, this DevOps book will help you to successfully integrate DevOps into enterprise architecture. You'll learn what AIOps is and what value it can bring to an enterprise. Lastly, you will learn how to integrate security principles such as zero-trust and industry security frameworks into DevOps with DevSecOps. By the end of this DevOps book, you'll be able to develop robust DevOps architectures, know which toolsets you can use for your DevOps implementation, and have a deeper understanding of next-level DevOps by implementing Site Reliability Engineering (SRE). What you will learnCreate DevOps architecture and integrate it with the enterprise architectureDiscover how DevOps can add value to the quality of IT deliveryExplore strategies to scale DevOps for an enterpriseArchitect SRE for an enterprise as next-level DevOpsUnderstand AIOps and what value it can bring to an enterpriseCreate your AIOps architecture and integrate it into DevOpsCreate your DevSecOps architecture and integrate it with the existing DevOps setupApply zero-trust principles and industry security frameworks to DevOpsWho this book is for This book is for enterprise architects and consultants who want to design DevOps systems for the enterprise. It provides an architectural overview of DevOps, AIOps, and DevSecOps. If you're looking to learn about the implementation of various tools within the DevOps toolchain in detail, this book is not for you.




Accelerate


Book Description

Winner of the Shingo Publication Award Accelerate your organization to win in the marketplace. How can we apply technology to drive business value? For years, we've been told that the performance of software delivery teams doesn't matter―that it can't provide a competitive advantage to our companies. Through four years of groundbreaking research to include data collected from the State of DevOps reports conducted with Puppet, Dr. Nicole Forsgren, Jez Humble, and Gene Kim set out to find a way to measure software delivery performance―and what drives it―using rigorous statistical methods. This book presents both the findings and the science behind that research, making the information accessible for readers to apply in their own organizations. Readers will discover how to measure the performance of their teams, and what capabilities they should invest in to drive higher performance. This book is ideal for management at every level.




The DevOps Handbook


Book Description

Increase profitability, elevate work culture, and exceed productivity goals through DevOps practices. More than ever, the effective management of technology is critical for business competitiveness. For decades, technology leaders have struggled to balance agility, reliability, and security. The consequences of failure have never been greater―whether it's the healthcare.gov debacle, cardholder data breaches, or missing the boat with Big Data in the cloud. And yet, high performers using DevOps principles, such as Google, Amazon, Facebook, Etsy, and Netflix, are routinely and reliably deploying code into production hundreds, or even thousands, of times per day. Following in the footsteps of The Phoenix Project, The DevOps Handbook shows leaders how to replicate these incredible outcomes, by showing how to integrate Product Management, Development, QA, IT Operations, and Information Security to elevate your company and win in the marketplace.




Python for DevOps


Book Description

Much has changed in technology over the past decade. Data is hot, the cloud is ubiquitous, and many organizations need some form of automation. Throughout these transformations, Python has become one of the most popular languages in the world. This practical resource shows you how to use Python for everyday Linux systems administration tasks with today’s most useful DevOps tools, including Docker, Kubernetes, and Terraform. Learning how to interact and automate with Linux is essential for millions of professionals. Python makes it much easier. With this book, you’ll learn how to develop software and solve problems using containers, as well as how to monitor, instrument, load-test, and operationalize your software. Looking for effective ways to "get stuff done" in Python? This is your guide. Python foundations, including a brief introduction to the language How to automate text, write command-line tools, and automate the filesystem Linux utilities, package management, build systems, monitoring and instrumentation, and automated testing Cloud computing, infrastructure as code, Kubernetes, and serverless Machine learning operations and data engineering from a DevOps perspective Building, deploying, and operationalizing a machine learning project




Tools and Techniques for Software Development in Large Organizations: Emerging Research and Opportunities


Book Description

The development of software has expanded substantially in recent years. As these technologies continue to advance, well-known organizations have begun implementing these programs into the ways they conduct business. These large companies play a vital role in the economic environment, so understanding the software that they utilize is pertinent in many aspects. Researching and analyzing the tools that these corporations use will assist in the practice of software engineering and give other organizations an outline of how to successfully implement their own computational methods. Tools and Techniques for Software Development in Large Organizations: Emerging Research and Opportunities is an essential reference source that discusses advanced software methods that prominent companies have adopted to develop high quality products. This book will examine the various devices that organizations such as Google, Cisco, and Facebook have implemented into their production and development processes. Featuring research on topics such as database management, quality assurance, and machine learning, this book is ideally designed for software engineers, data scientists, developers, programmers, professors, researchers, and students seeking coverage on the advancement of software devices in today’s major corporations.




Data Science on AWS


Book Description

With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level upyour skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance. Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment Tie everything together into a repeatable machine learning operations pipeline Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more




Data Engineering on Azure


Book Description

Build a data platform to the industry-leading standards set by Microsoft’s own infrastructure. Summary In Data Engineering on Azure you will learn how to: Pick the right Azure services for different data scenarios Manage data inventory Implement production quality data modeling, analytics, and machine learning workloads Handle data governance Using DevOps to increase reliability Ingesting, storing, and distributing data Apply best practices for compliance and access control Data Engineering on Azure reveals the data management patterns and techniques that support Microsoft’s own massive data infrastructure. Author Vlad Riscutia, a data engineer at Microsoft, teaches you to bring an engineering rigor to your data platform and ensure that your data prototypes function just as well under the pressures of production. You'll implement common data modeling patterns, stand up cloud-native data platforms on Azure, and get to grips with DevOps for both analytics and machine learning. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology Build secure, stable data platforms that can scale to loads of any size. When a project moves from the lab into production, you need confidence that it can stand up to real-world challenges. This book teaches you to design and implement cloud-based data infrastructure that you can easily monitor, scale, and modify. About the book In Data Engineering on Azure you’ll learn the skills you need to build and maintain big data platforms in massive enterprises. This invaluable guide includes clear, practical guidance for setting up infrastructure, orchestration, workloads, and governance. As you go, you’ll set up efficient machine learning pipelines, and then master time-saving automation and DevOps solutions. The Azure-based examples are easy to reproduce on other cloud platforms. What's inside Data inventory and data governance Assure data quality, compliance, and distribution Build automated pipelines to increase reliability Ingest, store, and distribute data Production-quality data modeling, analytics, and machine learning About the reader For data engineers familiar with cloud computing and DevOps. About the author Vlad Riscutia is a software architect at Microsoft. Table of Contents 1 Introduction PART 1 INFRASTRUCTURE 2 Storage 3 DevOps 4 Orchestration PART 2 WORKLOADS 5 Processing 6 Analytics 7 Machine learning PART 3 GOVERNANCE 8 Metadata 9 Data quality 10 Compliance 11 Distributing data




Data Science from Scratch


Book Description

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases