The Evolution of Data Products


Book Description

This report examines the important shifts in data products. Drawing from diverse examples, including iTunes, Google's self-driving car, and patient monitoring, author Mike Loukides explores the "disappearance" of data, the power of combining data, and the difference between discovery and recommendation. Looking ahead, the analysis finds the real changes in our lives will come from products and companies that reveal data results, not the data itself.




Data Analytics with Hadoop


Book Description

Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you’ll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You’ll also learn about the analytical processes and data systems available to build and empower data products that can handle, and actually require, huge amounts of data.

Understand core concepts behind Hadoop and cluster computing
Use design patterns and parallel analytical algorithms to create distributed data analysis jobs
Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase
Use Sqoop and Apache Flume to ingest data from relational databases
Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames
Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark’s MLlib




Data Mesh


Book Description

Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice. Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance. This book shows you why and how.

Examine the current data landscape from the perspective of business and organizational needs, environmental challenges, and existing architectures
Analyze the landscape's underlying characteristics and failure modes
Get a complete introduction to data mesh principles and its constituents
Learn how to design a data mesh architecture
Move beyond a monolithic data lake to a distributed data mesh




Designing Great Data Products


Book Description

In the past few years, we’ve seen many data products based on predictive modeling. These products range from weather forecasting to recommendation engines like Amazon's. Prediction technology can be interesting and mathematically elegant, but we need to take the next step: going from recommendations to products that can produce optimal strategies for meeting concrete business objectives. We already know how to build these products: they've been in use for the past decade or so, but they're not as common as they should be. This report shows how to take the next step: to go from simple predictions and recommendations to a new generation of data products with the potential to revolutionize entire industries.




Data Products and the Data Mesh


Book Description

"Data Products and the Data Mesh" is a comprehensive guide that explores the emerging paradigm of the data mesh and its implications for organizations navigating the data-driven landscape. This book equips readers with the knowledge and insights needed to design, build, and manage effective data products within the data mesh framework. The book starts by introducing the core concepts and principles of the data mesh, highlighting the shift from centralized data architectures to decentralized, domain-oriented approaches. It delves into the key components of the data mesh, including federated data governance, data marketplaces, data virtualization, and adaptive data products. Each chapter provides in-depth analysis, practical strategies, and real-world examples to illustrate the application of these concepts. Readers will gain a deep understanding of how the data mesh fosters a culture of data ownership, collaboration, and innovation. They will explore the role of modern data architectures, such as data marketplaces, in facilitating decentralized data sharing, access, and monetization. The book also delves into the significance of emerging technologies like blockchain, AI, and machine learning in enhancing data integrity, security, and value creation. Throughout the book, readers will discover practical insights and best practices to overcome challenges related to data governance, scalability, privacy, and compliance. They will learn how to optimize data workflows, leverage domain-driven design principles, and harness the power of data virtualization to drive meaningful insights and create impactful data products. "Data Products and the Data Mesh" is an essential resource for data professionals, architects, and leaders seeking to navigate the complex world of data products within the data mesh paradigm. 
It provides a comprehensive roadmap for building a scalable, decentralized, and innovative data ecosystem that empowers organizations to unlock the full potential of their data assets and drive data-driven success.




Applied Data Science


Book Description

This book has two main goals: to define data science through the work of data scientists and their results, namely data products, while simultaneously providing the reader with relevant lessons learned from applied data science projects at the intersection of academia and industry. As such, it is not a replacement for a classical textbook (i.e., it does not elaborate on fundamentals of methods and principles described elsewhere), but systematically highlights the connection between theory, on the one hand, and its application in specific use cases, on the other. With these goals in mind, the book is divided into three parts: Part I pays tribute to the interdisciplinary nature of data science and provides a common understanding of data science terminology for readers with different backgrounds. These six chapters are geared towards drawing a consistent picture of data science and were predominantly written by the editors themselves. Part II then broadens the spectrum by presenting views and insights from diverse authors – some from academia and some from industry, ranging from financial to health and from manufacturing to e-commerce. Each of these chapters describes a fundamental principle, method or tool in data science by analyzing specific use cases and drawing concrete conclusions from them. The case studies presented, and the methods and tools applied, represent the nuts and bolts of data science. Finally, Part III was again written from the perspective of the editors and summarizes the lessons learned that have been distilled from the case studies in Part II. The section can be viewed as a meta-study on data science across a broad range of domains, viewpoints and fields. Moreover, it provides answers to the question of what the mission-critical factors for success in different data science undertakings are. 
The book targets professionals as well as students of data science: first, practicing data scientists in industry and academia who want to broaden their scope and expand their knowledge by drawing on the authors’ combined experience. Second, decision makers in businesses who face the challenge of creating or implementing a data-driven strategy and who want to learn from success stories spanning a range of industries. Third, students of data science who want to understand both the theoretical and practical aspects of data science, vetted by real-world case studies at the intersection of academia and industry.




Implementing Data Mesh


Book Description

As data continues to grow and become more complex, organizations seek innovative solutions to manage their data effectively. Data Mesh is one solution that provides a new approach to managing data in complex organizations. This practical guide offers step-by-step guidance on how to implement data mesh in your organization. In this book, Jean-Georges Perrin and Eric Broda focus on the key components of data mesh and provide practical advice supported by code. You'll explore a simple and intuitive process for identifying key data mesh components and data products, and learn about a consistent set of interfaces and access methods that make data products easy to consume. This approach ensures that your data products are easily accessible and the data mesh ecosystem is easy to navigate.

With this book, you'll learn how to:

Identify, define, and build data products that interoperate within an enterprise data mesh
Build a data mesh fabric that binds data products together
Build and deploy data products in a data mesh
Establish the organizational structure to operate data products, data platforms, and data fabric
Learn an innovative architecture that brings data products and data fabric together into the data mesh

About the authors: Jean-Georges "JG" Perrin is a technology leader focusing on building innovative and modern data platforms. Eric Broda is a technology executive, practitioner, and founder of a boutique consulting firm that helps global enterprises realize value from data.







EOS Data Products Handbook


Book Description

Description of the data products that will be produced from the named scientific missions.