Learning OpenTelemetry


Book Description

OpenTelemetry is a revolution in observability data. Instead of running multiple uncoordinated pipelines, users get a single integrated stream that combines multiple sources of high-quality telemetry data: tracing, metrics, logs, RUM, eBPF, and more. This practical guide shows you how to set up, operate, and troubleshoot the OpenTelemetry observability system. Authors Austin Parker, head of developer relations at Lightstep and OpenTelemetry Community Maintainer, and Ted Young, cofounder of the OpenTelemetry project, cover every OpenTelemetry component, as well as observability best practices for many popular cloud, platform, and data services such as Kubernetes and AWS Lambda. You'll learn how OpenTelemetry enables OSS libraries and services to provide their own native instrumentation, a first in the industry.
Ideal for application developers, OSS maintainers, operators and infrastructure teams, and managers and team leaders, this book guides you through:
• The principles of modern observability
• All OpenTelemetry components and how they fit together
• A practical approach to instrumenting platforms and applications
• Methods for installing, operating, and troubleshooting an OpenTelemetry-based observability solution
• Ways to roll out and maintain end-to-end observability across a large organization
• How to write and maintain consistent, high-quality instrumentation without a lot of work




Cloud Observability in Action


Book Description

Don’t fly blind. Observability gives you actionable insights into your cloud native systems, from pinpointing errors, to increasing developer productivity, to tracking compliance. Observability is the difference between an error message and an error explanation with a recipe for resolving the error! You know exactly which service is affected, who’s responsible for its repair, and even how it can be optimized in the future. Cloud Observability in Action teaches you how to set up an observability system that learns from a cloud application’s signals, logging, and monitoring, all using free and open source tools.
In Cloud Observability in Action you will learn how to:
• Apply observability in cloud native systems
• Understand observability signals, including their costs and benefits
• Apply good practices around instrumentation and signal collection
• Deliver dashboarding, alerting, and SLOs/SLIs at scale
• Choose the correct signal types for given roles or tasks
• Pick the right observability tool for any given function
• Communicate the benefits of observability to management
A well-designed observability system provides insight into bugs and performance issues in cloud native applications. It helps your development team understand the impact of code changes, measure optimizations, and track user experience. Best of all, observability can even automate your error handling so that machine users apply their own fixes. No more 3AM calls for emergency outages.
About the technology
Cloud native systems are made up of hundreds of moving parts. When something goes wrong, it’s not enough to know there is a problem; you need to know where it is, what it is, and how to fix it. This book takes you beyond traditional monitoring, explaining observability systems that turn application telemetry into actionable insights.
About the book
Cloud Observability in Action gives you the background and techniques you need to successfully introduce observability into cloud-based serverless and Kubernetes environments. In it, you’ll learn to use open standards and tools like OpenTelemetry, Prometheus, and Grafana to build your own observability system and end reliance on proprietary software. You’ll discover insights from different telemetry signals, including logs, metrics, traces, and profiles. Plus, the book’s rigorous cost-benefit analysis ensures you’re getting a real return on your observability investment.
What's inside
• Observability in and of cloud native systems
• Dashboarding, alerting, and SLOs/SLIs at scale
• Signal types for any role or task
• State-of-the-art open source observability tools
About the reader
For application developers, platform owners, DevOps, and SREs.
About the author
Michael Hausenblas is a Product Owner in the AWS open source observability team.
Table of Contents
1 End-to-end observability
2 Signal types
3 Sources
4 Agents and instrumentation
5 Backend destinations
6 Frontend destinations
7 Cloud operations
8 Distributed tracing
9 Developer observability
10 Service level objectives
11 Signal correlation




Mastering Prometheus


Book Description

Learn how to effectively implement, manage, and optimize Prometheus for monitoring your systems
Key Features
• Achieve high availability with Prometheus by using Thanos
• Integrate Prometheus into your broader observability stack with OpenTelemetry
• Tweak, tune, and debug Prometheus to reliably scale without limits
• Purchase of the print or Kindle book includes a free PDF eBook
Book Description
With an increased focus on observability and reliability, establishing a scalable and reliable monitoring environment is more important than ever. Over the last decade, Prometheus has emerged as the leading open-source, time-series based monitoring software catering to this demand. This book is your guide to scaling, operating, and extending Prometheus from small on-premises workloads to multi-cloud globally distributed workloads and everything in between. Starting with an introduction to Prometheus and its role in observability, the book provides a walkthrough of its deployment. You’ll explore Prometheus’s query language and TSDB data model, followed by dynamic service discovery for monitoring targets and refining alerting through custom templates and formatting. The book then demonstrates horizontal scaling of Prometheus via sharding and federation, while equipping you with debugging techniques and strategies to fine-tune data ingestion. Advancing through the chapters, you’ll manage Prometheus at scale through CI validations and templating with Jsonnet, and integrate Prometheus with other projects such as OpenTelemetry, Thanos, VictoriaMetrics, and Mimir.
By the end of this book, you’ll have practical knowledge of Prometheus and its ecosystem, which will help you discern when, why, and how to scale it to meet your ever-growing needs.
What you will learn
• Deploy Prometheus and Node Exporter to public clouds and Kubernetes
• Gain in-depth knowledge of how Prometheus's underlying code works
• Build your own custom service-discovery providers for Prometheus
• Debug Prometheus performance issues to identify cardinality issues in your environment
• Use VictoriaMetrics and/or Grafana Mimir for remote storage of Prometheus data
• Define and implement SLO-based alerting
Who this book is for
The book is for site reliability engineers (SREs), developers, and platform engineers involved in the monitoring and observability of their team or company’s systems. A background in Prometheus is assumed, so the book dedicates minimal time to the basics of getting Prometheus up and running. Whether you aim to expand monitoring capabilities, streamline configuration management, or enhance integration with existing tools, this book will help you maximize the potential of your Prometheus monitoring stack.




Prometheus: Up & Running


Book Description

Get up to speed with Prometheus, the metrics-based monitoring system used by tens of thousands of organizations in production. This practical guide provides application developers, sysadmins, and DevOps practitioners with a hands-on introduction to the most important aspects of Prometheus, including dashboarding and alerting, direct code instrumentation, and metric collection from third-party systems with exporters. This open source system has gained popularity over the past few years for good reason. With its simple yet powerful data model and query language, Prometheus does one thing, and it does it well. Author and Prometheus developer Brian Brazil guides you through Prometheus setup, the Node Exporter, and the Alertmanager, then demonstrates how to use them for application and infrastructure monitoring.
• Know where and how much instrumentation to apply to your application code
• Identify metrics with labels using unique key-value pairs
• Get an introduction to Grafana, a popular tool for building dashboards
• Learn how to use the Node Exporter to monitor your infrastructure
• Use service discovery to provide different views of your machines and services
• Use Prometheus with Kubernetes and examine exporters you can use with containers
• Convert data from other monitoring systems into the Prometheus format




Go for DevOps


Book Description

Achieve reliable release automation and get zero troublesome notifications on your release day. Purchase of the print or Kindle book includes a free eBook in the PDF format.
Key Features
• Develop the skills to build command-line tools to control thousands of nodes
• Use Go to create Terraform providers and GitHub actions and extend Kubernetes
• Gain the knowledge to build DevOps workflows that are understandable, scalable, and safe
Book Description
Go is the go-to language for DevOps libraries and services, and without it, achieving fast and safe automation is a challenge. With the help of Go for DevOps, you'll learn how to deliver services with ease and safety, becoming a better DevOps engineer in the process. Some of the key things this book will teach you are how to write Go software to automate configuration management, update remote machines, author custom automation in GitHub Actions, and interact with Kubernetes. As you advance through the chapters, you'll explore how to automate the cloud using software development kits (SDKs), extend HashiCorp's Terraform and Packer using Go, develop your own DevOps services with gRPC and REST, design system agents, and build robust workflow systems.
By the end of this Go for DevOps book, you'll understand how to apply development principles to automate operations and provide operational insights using Go, which will allow you to react quickly to resolve system failures before your customers realize something has gone wrong.
What you will learn
• Understand the basic structure of the Go language to begin your DevOps journey
• Interact with filesystems to read or stream data
• Communicate with remote services via REST and gRPC
• Explore writing tools that can be used in the DevOps environment
• Develop command-line operational software in Go
• Work with popular frameworks to deploy production software
• Create GitHub actions that streamline your CI/CD process
• Write a ChatOps application with Slack to simplify production visibility
Who this book is for
This book is for Ops and DevOps engineers who would like to use Go to develop their own DevOps tooling or integrate custom features with DevOps tools such as Kubernetes, GitHub Actions, HashiCorp Packer, and Terraform. Experience with some type of programming language, but not necessarily Go, is necessary to get started with this book.




Flow Architectures


Book Description

Software development today is embracing events and streaming data, which optimizes not only how technology interacts but also how businesses integrate with one another to meet customer needs. This phenomenon, called flow, consists of patterns and standards that determine which activity and related data is communicated between parties over the internet. This book explores critical implications of that evolution: What happens when events and data streams help you discover new activity sources to enhance existing businesses or drive new markets? What technologies and architectural patterns can position your company for opportunities enabled by flow? James Urquhart, global field CTO at VMware, guides enterprise architects, software developers, and product managers through the process.
• Learn the benefits of flow dynamics when businesses, governments, and other institutions integrate via events and data streams
• Understand the value chain for flow integration through Wardley mapping visualization and promise theory modeling
• Walk through basic concepts behind today's event-driven systems marketplace
• Learn how today's integration patterns will influence the real-time events flow in the future
• Explore why companies should architect and build software today to take advantage of flow in coming years




Learning Dapr


Book Description

Get the authoritative guide to Dapr, the distributed application runtime that works with new and existing programming languages alike. Written by the model’s creators, this introduction shows you how Dapr not only unifies stateless, stateful, and actor programming models but also runs everywhere, in the cloud or on the edge. Authors Haishi Bai and Yaron Schneider with Microsoft’s Azure CTO team explain that, with Dapr, you don’t need to include any SDKs or libraries in your user code. Instead, you automatically get flexible binding, state management, the actor pattern, pub-sub, reliable messaging, and many more features. This book shows developers, architects, CIOs, students, and computing enthusiasts how to get started with Dapr.
• Learn the new programming model for cloud native applications
• Write high-performance distributed applications without drilling into technical details
• Use Dapr with any language or framework to write microservices easily
• Learn how Dapr provides consistency and portability through open APIs and extensible, community-driven components
• Explore how Dapr handles state, resource bindings, and pub-sub messaging to enable resilient event-driven architectures that scale
• Integrate cloud applications with various SaaS offerings, such as machine learning




Learning Modern Linux


Book Description

If you use Linux in development or operations and need a structured approach to help you dive deeper, this book is for you. Author Michael Hausenblas also provides tips and tricks for improving your workflow with this open source operating system. Whether you're a developer, software architect, or site reliability engineer, this hands-on guide focuses on ways to use Linux for your everyday needs, from development to office-related tasks. Along the way, you'll gain hands-on experience with modern Linux terminals and shells, and learn how to manage your workloads. You'll understand how to run Linux applications by using containers, systemd, modern filesystems, and immutable distros such as Flatcar and Bottlerocket.
• Use Linux as a modern work environment, rather than just from an admin perspective
• Learn critical components such as the Linux kernel, terminal multiplexer, human-friendly shells, and portable shell scripting
• Become familiar with access control, from file permissions to capabilities, and understand the role of filesystems as a fundamental building block
• Learn about application dependency management and containers
• Gain hands-on experience with the Linux networking stack and tooling, including DNS
• Apply modern operating system observability to manage your workloads
• Become familiar with interprocess communication, virtual machines, and selected security topics




Reliable Machine Learning


Book Description

Whether you're part of a small startup or a multinational corporation, this practical book shows data scientists, software and site reliability engineers, product managers, and business owners how to run and establish ML reliably, effectively, and accountably within your organization. You'll gain insight into everything from how to do model monitoring in production to how to run a well-tuned model development team in a product organization. By applying an SRE mindset to machine learning, authors and engineering professionals Cathy Chen, Kranti Parisa, Niall Richard Murphy, D. Sculley, Todd Underwood, and featured guest authors show you how to run an efficient and reliable ML system. Whether you want to increase revenue, optimize decision making, solve problems, or understand and influence customer behavior, you'll learn how to perform day-to-day ML tasks while keeping the bigger picture in mind.
You'll examine:
• What ML is: how it functions and what it relies on
• Conceptual frameworks for understanding how ML "loops" work
• How effective productionization can make your ML systems easily monitorable, deployable, and operable
• Why ML systems make production troubleshooting more difficult, and how to compensate accordingly
• How ML, product, and production teams can communicate effectively




Cloud-Native Observability with OpenTelemetry


Book Description

Leverage OpenTelemetry's API, libraries, tools, and the collector to produce and collect telemetry, and use open-source tools to analyze distributed traces, check metrics and logs, and gain insights into application health
Key Features
• Get to grips with OpenTelemetry, an open-source cloud-native software observability standard
• Use vendor-neutral tools to instrument applications to produce better telemetry and improve observability
• Understand how telemetry data can be correlated and interpreted to understand distributed systems
Book Description
Cloud-Native Observability with OpenTelemetry is a guide to helping you look for answers to questions about your applications. This book teaches you how to produce telemetry from your applications using an open standard to retain control of data. OpenTelemetry provides the tools necessary for you to gain visibility into the performance of your services. It allows you to instrument your application code through vendor-neutral APIs, libraries, and tools. By reading Cloud-Native Observability with OpenTelemetry, you'll learn about the concepts and signals of OpenTelemetry: traces, metrics, and logs. You'll practice producing telemetry for these signals by configuring and instrumenting a distributed cloud-native application using the OpenTelemetry API. The book also guides you through deploying the collector, as well as telemetry backends necessary to help you understand what to do with the data once it's emitted. You'll look at various examples of how to identify application performance issues through telemetry. By analyzing telemetry, you'll also be able to better understand how an observable application can improve the software development life cycle. By the end of this book, you'll be well versed in OpenTelemetry and able to instrument services using the OpenTelemetry API to produce distributed traces, metrics, logs, and more.
What you will learn
• Understand the core concepts of OpenTelemetry
• Explore concepts in distributed tracing, metrics, and logging
• Discover the APIs and SDKs necessary to instrument an application using OpenTelemetry
• Explore what auto-instrumentation is and how it can help accelerate application instrumentation
• Configure and deploy the OpenTelemetry Collector
• Get to grips with how different open-source backends can be used to analyze telemetry data
• Understand how to correlate telemetry in common scenarios to get to the root cause of a problem
Who this book is for
This book is for software engineers, library authors, and systems operators looking to better understand their infrastructure, services, and applications by leveraging telemetry data like never before. Working knowledge of Python programming is assumed for the example applications that you'll be building and instrumenting using the OpenTelemetry API and SDK. Some familiarity with Go programming, Linux, and Docker is preferable to help you set up additional components in various examples throughout the book.