An Introduction to Agile Data Engineering Using Data Vault 2. 0


Book Description

The world of data warehousing is changing. Big Data & Agile are hot topics. But companies still need to collect, report, and analyze their data. Usually this requires some form of data warehousing or business intelligence system. So how do we do that in the modern IT landscape in a way that allows us to be agile and either deal directly or indirectly with unstructured and semi structured data?The Data Vault System of Business Intelligence provides a method and approach to modeling your enterprise data warehouse (EDW) that is agile, flexible, and scalable. This book will give you a short introduction to Agile Data Engineering for Data Warehousing and Data Vault 2.0. I will explain why you should be trying to become Agile, some of the history and rationale for Data Vault 2.0, and then show you the basics for how to build a data warehouse model using the Data Vault 2.0 standards.In addition, I will cover some details about the Business Data Vault (what it is) and then how to build a virtual Information Mart off your Data Vault and Business Vault using the Data Vault 2.0 architecture.So if you want to start learning about Agile Data Engineering with Data Vault 2.0, this book is for you.




Building a Scalable Data Warehouse with Data Vault 2.0


Book Description

The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of different sizes, from small to large-size corporations. Due to its simplified design, which is adapted from nature, the Data Vault 2.0 standard helps prevent typical data warehousing failures. "Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations to create a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (data mart) of the Data Vault 2.0 architecture including implementation best practices. Drawing upon years of practical experience and using numerous examples and an easy to understand framework, Dan Linstedt and Michael Olschimke discuss: - How to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes. - Important data warehouse technologies and practices. - Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture. - Provides a complete introduction to data warehousing, applications, and the business context so readers can get-up and running fast - Explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse - Demystifies data vault modeling with beginning, intermediate, and advanced techniques - Discusses the advantages of the data vault approach over other techniques, also including the latest updates to Data Vault 2.0 and multiple improvements to Data Vault 1.0




Agile Data Warehouse Design


Book Description

Agile Data Warehouse Design is a step-by-step guide for capturing data warehousing/business intelligence (DW/BI) requirements and turning them into high performance dimensional models in the most direct way: by modelstorming (data modeling + brainstorming) with BI stakeholders. This book describes BEAM✲, an agile approach to dimensional modeling, for improving communication between data warehouse designers, BI stakeholders and the whole DW/BI development team. BEAM✲ provides tools and techniques that will encourage DW/BI designers and developers to move away from their keyboards and entity relationship based tools and model interactively with their colleagues. The result is everyone thinks dimensionally from the outset! Developers understand how to efficiently implement dimensional modeling solutions. Business stakeholders feel ownership of the data warehouse they have created, and can already imagine how they will use it to answer their business questions. Within this book, you will learn: ✲ Agile dimensional modeling using Business Event Analysis & Modeling (BEAM✲) ✲ Modelstorming: data modeling that is quicker, more inclusive, more productive, and frankly more fun! ✲ Telling dimensional data stories using the 7Ws (who, what, when, where, how many, why and how) ✲ Modeling by example not abstraction; using data story themes, not crow's feet, to describe detail ✲ Storyboarding the data warehouse to discover conformed dimensions and plan iterative development ✲ Visual modeling: sketching timelines, charts and grids to model complex process measurement - simply ✲ Agile design documentation: enhancing star schemas with BEAM✲ dimensional shorthand notation ✲ Solving difficult DW/BI performance and usability problems with proven dimensional design patterns Lawrence Corr is a data warehouse designer and educator. As Principal of DecisionOne Consulting, he helps clients to review and simplify their data warehouse designs, and advises vendors on visual data modeling techniques. He regularly teaches agile dimensional modeling courses worldwide and has taught dimensional DW/BI skills to thousands of students. Jim Stagnitto is a data warehouse and master data management architect specializing in the healthcare, financial services, and information service industries. He is the founder of the data warehousing and data mining consulting firm Llumino.




The Data Vault Guru


Book Description

The data vault methodology presents a unique opportunity to model the enterprise data warehouse using the same automation principles applicable in today's software delivery, continuous integration, continuous delivery and continuous deployment while still maintaining the standards expected for governing a corporation's most valuable asset: data. This book provides at first the landscape of a modern architecture and then as a thorough guide on how to deliver a data model that flexes as the enterprise flexes, the data vault. Whether the data is structured, semi-structured or even unstructured one thing is clear, there is always a model either applied early (schema-on-write) or applied late (schema-on-read). Today's focus on data governance requires that we know what we retain about our customers, the data vault provides that focus by delivering a methodology focused on all aspects about the customer and provides some of the best practices for modern day data compliance.The book will delve into every data vault modelling artefact, its automation with sample code, raw vault, business vault, testing framework, a build framework, sample data vault models, how to build automation patterns on top of a data vault and even offer an extension of data vault that provides automated timeline correction, not to mention variation of data vault designed to provide audit trails, metadata control and integration with agile delivery tools.




Agile Data Warehousing


Book Description

Contains a six-stage plan for starting new warehouse projects and guiding programmers step-by-step until they become a world-class, Agile development team. It describes also how to avoid or contain the fierce opposition that radically new methods can encounter from the traditionally-minded IS departments found in many large companies.




Super Charge Your Data Warehouse


Book Description

Do You Know If Your Data Warehouse Flexible, Scalable, Secure and Will It Stand The Test Of Time And Avoid Being Part Of The Dreaded "Life Cycle"? The Data Vault took the Data Warehouse world by storm when it was released in 2001. Some of the world's largest and most complex data warehouse situations understood the value it gave especially with the capabilities of unlimited scaling, flexibility and security. Here is what industry leaders say about the Data Vault "The Data Vault is the optimal choice for modeling the EDW in the DW 2.0 framework" - Bill Inmon, The Father of Data Warehousing "The Data Vault is foundationally strong and an exceptionally scalable architecture" - Stephen Brobst, CTO, Teradata "The Data Vault should be considered as a potential standard for RDBMS-based analytic data management by organizations looking to achieve a high degree of flexibility, performance and openness" - Doug Laney, Deloitte Analytics Institute "I applaud Dan's contribution to the body of Business Intelligence and Data Warehousing knowledge and recommend this book be read by both data professionals and end users" - Howard Dresner, From the Foreword - Speaker, Author, Leading Research Analyst and Advisor You have in your hands the work, experience and testing of 2 decades of building data warehouses. The Data Vault model and methodology has proven itself in hundreds (perhaps thousands) of solutions in Insurance, Crime-Fighting, Defense, Retail, Finance, Banking, Power, Energy, Education, High-Tech and many more. Learn the techniques and implement them and learn how to build your Data Warehouse faster than you have ever done before while designing it to grow and scale no matter what you throw at it. Ready to "Super Charge Your Data Warehouse"?




Corporate Information Factory


Book Description

The "father of data warehousing" incorporates the latesttechnologies into his blueprint for integrated decision supportsystems Today's corporate IT and data warehouse managers are required tomake a small army of technologies work together to ensure fast andaccurate information for business managers. Bill Inmon created theCorporate Information Factory to solve the needs ofthese managers. Since the First Edition, the design of the factoryhas grown and changed dramatically. This Second Edition, revisedand expanded by 40% with five new chapters, incorporates thesechanges. This step-by-step guide will enable readers to connecttheir legacy systems with the data warehouse and deal with a hostof new and changing technologies, including Web access mechanisms,e-commerce systems, ERP (Enterprise Resource Planning) systems. Thebook also looks closely at exploration and data mining servers foranalyzing customer behavior and departmental data marts forfinance, sales, and marketing.




Data Architecture


Book Description




The Elephant in the Fridge


Book Description

You want the rigor of good data architecture at the speed of agile? Then this is the missing link - your step-by-step guide to Data Vault success. Success with a Data Vault starts with the business and ends with the business. Sure, there's some technical stuff in the middle, and it is absolutely essential - but it's not sufficient on its own. This book will help you shape the business perspective, and weave it into the more technical aspects of Data Vault modeling. You can read the foundational books and go on courses, but one massive risk still remains. Dan Linstedt, the founder of the Data Vault, very clearly directs those building a Data Vault to base its design on an "enterprise ontology". And Hans Hultgren similarly stresses the importance of the business concepts model. So it's important. We get that. But: What on earth is an enterprise ontology/business concept model, 'cause I won't know if I've got one if I don't know what I'm looking for? If I can't find one, how do I get my hands on such a thing? Even if I have one of these wonderful things, how do I apply it to get the sort of Data Vault that's recommended? It's actually not as hard as some would fear to answer all of these questions, and it's certainly worth the effort. This book just might save you a world of pain. It's a supplement to other material on Data Vault modeling, but it's the vital missing link to finding simplicity for Data Vault success.




Knowledge Graphs


Book Description

This book provides a comprehensive and accessible introduction to knowledge graphs, which have recently garnered notable attention from both industry and academia. Knowledge graphs are founded on the principle of applying a graph-based abstraction to data, and are now broadly deployed in scenarios that require integrating and extracting value from multiple, diverse sources of data at large scale. The book defines knowledge graphs and provides a high-level overview of how they are used. It presents and contrasts popular graph models that are commonly used to represent data as graphs, and the languages by which they can be queried before describing how the resulting data graph can be enhanced with notions of schema, identity, and context. The book discusses how ontologies and rules can be used to encode knowledge as well as how inductive techniques—based on statistics, graph analytics, machine learning, etc.—can be used to encode and extract knowledge. It covers techniques for the creation, enrichment, assessment, and refinement of knowledge graphs and surveys recent open and enterprise knowledge graphs and the industries or applications within which they have been most widely adopted. The book closes by discussing the current limitations and future directions along which knowledge graphs are likely to evolve. This book is aimed at students, researchers, and practitioners who wish to learn more about knowledge graphs and how they facilitate extracting value from diverse data at large scale. To make the book accessible for newcomers, running examples and graphical notation are used throughout. Formal definitions and extensive references are also provided for those who opt to delve more deeply into specific topics.