Elasticsearch in Action


Book Description

Summary Elasticsearch in Action teaches you how to build scalable search applications using Elasticsearch. You'll ramp up fast, with an informative overview and an engaging introductory example. Within the first few chapters, you'll pick up the core concepts you need to implement basic searches and efficient indexing. With the fundamentals well in hand, you'll go on to gain an organized view of how to optimize your design. Perfect for developers and administrators building and managing search-oriented applications. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Modern search seems like magic—you type a few words and the search engine appears to know what you want. With the Elasticsearch real-time search and analytics engine, you can give your users this magical experience without having to do complex low-level programming or understand advanced data science algorithms. You just install it, tweak it, and get on with your work. About the Book Elasticsearch in Action teaches you how to write applications that deliver professional quality search. As you read, you'll learn to add basic search features to any application, enhance search results with predictive analysis and relevancy ranking, and use saved data from prior searches to give users a custom experience. This practical book focuses on Elasticsearch's REST API via HTTP. Code snippets are written mostly in bash using cURL, so they're easily translatable to other languages. What's Inside What is a great search application? Building scalable search solutions Using Elasticsearch with any language Configuration and tuning About the Reader For developers and administrators building and managing search-oriented applications. About the Authors Radu Gheorghe is a search consultant and software engineer. Matthew Lee Hinman develops highly available, cloud-based systems. Roy Russo is a specialist in predictive analytics. Table of Contents PART 1 CORE ELASTICSEARCH FUNCTIONALITY Introducing Elasticsearch Diving into the functionality Indexing, updating, and deleting data Searching your data Analyzing your data Searching with relevancy Exploring your data with aggregations Relations among documents PART 2 ADVANCED ELASTICSEARCH FUNCTIONALITY Scaling out Improving performance Administering your cluster




Elasticsearch: The Definitive Guide


Book Description

Whether you need full-text search or real-time analytics of structured data—or both—the Elasticsearch distributed search engine is an ideal way to put your data to work. This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with the complexities of human language, geolocation, and relationships. If you’re a newcomer to both search and distributed systems, you’ll quickly learn how to integrate Elasticsearch into your application. More experienced users will pick up lots of advanced techniques. Throughout the book, you’ll follow a problem-based approach to learn why, when, and how to use Elasticsearch features. Understand how Elasticsearch interprets data in your documents Index and query your data to take advantage of search concepts such as relevance and word proximity Handle human language through the effective use of analyzers and queries Summarize and group data to show overall trends, with aggregations and analytics Use geo-points and geo-shapes—Elasticsearch’s approaches to geolocation Model your data to take advantage of Elasticsearch’s horizontal scalability Learn how to configure and monitor your cluster in production




Relevant Search


Book Description

Summary Relevant Search demystifies relevance work. Using Elasticsearch, it teaches you how to return engaging search results to your users, helping you understand and leverage the internals of Lucene-based search engines. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Users are accustomed to and expect instant, relevant search results. To achieve this, you must master the search engine. Yet for many developers, relevance ranking is mysterious or confusing. About the Book Relevant Search demystifies the subject and shows you that a search engine is a programmable relevance framework. You'll learn how to apply Elasticsearch or Solr to your business's unique ranking problems. The book demonstrates how to program relevance and how to incorporate secondary data sources, taxonomies, text analytics, and personalization. In practice, a relevance framework requires softer skills as well, such as collaborating with stakeholders to discover the right relevance requirements for your business. By the end, you'll be able to achieve a virtuous cycle of provable, measurable relevance improvements over a search product's lifetime. What's Inside Techniques for debugging relevance? Applying search engine features to real problems? Using the user interface to guide searchers? A systematic approach to relevance? A business culture focused on improving search About the Reader For developers trying to build smarter search with Elasticsearch or Solr. About the Authors Doug Turnbull is lead relevance consultant at OpenSource Connections, where he frequently speaks and blogs. John Berryman is a data engineer at Eventbrite, where he specializes in recommendations and search. Foreword author, Trey Grainger, is a director of engineering at CareerBuilder and author of Solr in Action. Table of Contents The search relevance problem Search under the hood Debugging your first relevance problem Taming tokens Basic multifield search Term-centric search Shaping the relevance function Providing relevance feedback Designing a relevance-focused search application The relevance-centered enterprise Semantic and personalized search




Solr in Action


Book Description

Summary Solr in Action is a comprehensive guide to implementing scalable search using Apache Solr. This clearly written book walks you through well-documented examples ranging from basic keyword searching to scaling a system for billions of documents and queries. It will give you a deep understanding of how to implement core Solr capabilities. About the Book Whether you're handling big (or small) data, managing documents, or building a website, it is important to be able to quickly search through your content and discover meaning in it. Apache Solr is your tool: a ready-to-deploy, Lucene-based, open source, full-text search engine. Solr can scale across many servers to enable real-time queries and data analytics across billions of documents. Solr in Action teaches you to implement scalable search using Apache Solr. This easy-to-read guide balances conceptual discussions with practical examples to show you how to implement all of Solr's core capabilities. You'll master topics like text analysis, faceted search, hit highlighting, result grouping, query suggestions, multilingual search, advanced geospatial and data operations, and relevancy tuning. This book assumes basic knowledge of Java and standard database technology. No prior knowledge of Solr or Lucene is required. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. What's Inside How to scale Solr for big data Rich real-world examples Solr as a NoSQL data store Advanced multilingual, data, and relevancy tricks Coverage of versions through Solr 4.7 About the Authors Trey Grainger is a director of engineering at CareerBuilder. Timothy Potter is a senior member of the engineering team at LucidWorks. The authors work on the scalability and reliability of Solr, as well as on recommendation engine and big data analytics technologies. Table of Contents PART 1 MEET SOLR Introduction to Solr Getting to know Solr Key Solr concepts Configuring Solr Indexing Text analysis PART 2 CORE SOLR CAPABILITIES Performing queries and handling results Faceted search Hit highlighting Query suggestions Result grouping/field collapsing Taking Solr to production PART 3 TAKING SOLR TO THE NEXT LEVEL SolrCloud Multilingual search Complex query operations Mastering relevancy




Spark in Action


Book Description

Summary The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop. Foreword by Rob Thomas. About the technology Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem. About the book Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms. What's inside Writing Spark applications in Java Spark application architecture Ingestion through files, databases, streaming, and Elasticsearch Querying distributed datasets with Spark SQL About the reader This book does not assume previous experience with Spark, Scala, or Hadoop. About the author Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years. Table of Contents PART 1 - THE THEORY CRIPPLED BY AWESOME EXAMPLES 1 So, what is Spark, anyway? 2 Architecture and flow 3 The majestic role of the dataframe 4 Fundamentally lazy 5 Building a simple app for deployment 6 Deploying your simple app PART 2 - INGESTION 7 Ingestion from files 8 Ingestion from databases 9 Advanced ingestion: finding data sources and building your own 10 Ingestion through structured streaming PART 3 - TRANSFORMING YOUR DATA 11 Working with SQL 12 Transforming your data 13 Transforming entire documents 14 Extending transformations with user-defined functions 15 Aggregating your data PART 4 - GOING FURTHER 16 Cache and checkpoint: Enhancing Spark’s performances 17 Exporting data and building full data pipelines 18 Exploring deployment




Lucene in Action


Book Description

When Lucene first hit the scene five years ago, it was nothing short ofamazing. By using this open-source, highly scalable, super-fast search engine,developers could integrate search into applications quickly and efficiently.A lot has changed since then-search has grown from a "nice-to-have" featureinto an indispensable part of most enterprise applications. Lucene now powerssearch in diverse companies including Akamai, Netflix, LinkedIn,Technorati, HotJobs, Epiphany, FedEx, Mayo Clinic, MIT, New ScientistMagazine, and many others. Some things remain the same, though. Lucene still delivers high-performancesearch features in a disarmingly easy-to-use API. Due to its vibrant and diverseopen-source community of developers and users, Lucene is relentlessly improving,with evolutions to APIs, significant new features such as payloads, and ahuge increase (as much as 8x) in indexing speed with Lucene 2.3. And with clear writing, reusable examples, and unmatched advice on bestpractices, Lucene in Action, Second Edition is still the definitive guide todeveloping with Lucene. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.




Elasticsearch Essentials


Book Description

Harness the power of ElasticSearch to build and manage scalable search and analytics solutions with this fast-paced guide About This Book New to ElasticSearch? Here's what you need—a highly practical guide that gives you a quick start with ElasticSearch using easy-to-follow examples; get up and running with ElasticSearch APIs in no time Get the latest guide on ElasticSearch 2.0.0, which contains concise and adequate information on handling all the issues a developer needs to know while handling data in bulk with search relevancy Learn to create large-scale ElasticSearch clusters using best practices Learn from our experts—written by Bharvi Dixit who has extensive experience in working with search servers (especially ElasticSearch) Who This Book Is For Anyone who wants to build efficient search and analytics applications can choose this book. This book is also beneficial for skilled developers, especially ones experienced with Lucene or Solr, who now want to learn Elasticsearch quickly. What You Will Learn Get to know about advanced Elasticsearch concepts and its REST APIs Write CRUD operations and other search functionalities using the ElasticSearch Python and Java clients Dig into wide range of queries and find out how to use them correctly Design schema and mappings with built-in and custom analyzers Excel in data modeling concepts and query optimization Master document relationships and geospatial data Build analytics using aggregations Setup and scale Elasticsearch clusters using best practices Learn to take data backups and secure Elasticsearch clusters In Detail With constantly evolving and growing datasets, organizations have the need to find actionable insights for their business. ElasticSearch, which is the world's most advanced search and analytics engine, brings the ability to make massive amounts of data usable in a matter of milliseconds. It not only gives you the power to build blazing fast search solutions over a massive amount of data, but can also serve as a NoSQL data store. This guide will take you on a tour to become a competent developer quickly with a solid knowledge level and understanding of the ElasticSearch core concepts. Starting from the beginning, this book will cover these core concepts, setting up ElasticSearch and various plugins, working with analyzers, and creating mappings. This book provides complete coverage of working with ElasticSearch using Python and performing CRUD operations and aggregation-based analytics, handling document relationships in the NoSQL world, working with geospatial data, and taking data backups. Finally, we'll show you how to set up and scale ElasticSearch clusters in production environments as well as providing some best practices. Style and approach This is an easy-to-follow guide with practical examples and clear explanations of the concepts. This fast-paced book believes in providing very rich content focusing majorly on practical implementation. This book will provide you with step-by-step practical examples, letting you know about the common errors and solutions along with ample screenshots and code to ensure your success.




Docker in Action, Second Edition


Book Description

Summary Docker in Action, Second Edition teaches you the skills and knowledge you need to create, deploy, and manage applications hosted in Docker containers. This bestseller has been fully updated with new examples, best practices, and a number of entirely new chapters. About the technology The idea behind Docker is simple—package just your application and its dependencies into a lightweight, isolated virtual environment called a container. Applications running inside containers are easy to install, manage, and remove. This simple idea is used in everything from creating safe, portable development environments to streamlining deployment and scaling for microservices. In short, Docker is everywhere. About the book Docker in Action, Second Edition teaches you to create, deploy, and manage applications hosted in Docker containers running on Linux. Fully updated, with four new chapters and revised best practices and examples, this second edition begins with a clear explanation of the Docker model. Then, you go hands-on with packaging applications, testing, installing, running programs securely, and deploying them across a cluster of hosts. With examples showing how Docker benefits the whole dev lifecycle, you’ll discover techniques for everything from dev-and-test machines to full-scale cloud deployments. What's inside Running software in containers Packaging software for deployment Securing and distributing containerized applications About the reader Written for developers with experience working with Linux. About the author Jeff Nickoloff and Stephen Kuenzli have designed, built, deployed, and operated highly available, scalable software systems for nearly 20 years.




Learning Elastic Stack 7.0


Book Description

A beginner's guide to storing, managing, and analyzing data with the updated features of Elastic 7.0 Key FeaturesGain access to new features and updates introduced in Elastic Stack 7.0Grasp the fundamentals of Elastic Stack including Elasticsearch, Logstash, and KibanaExplore useful tips for using Elastic Cloud and deploying Elastic Stack in production environmentsBook Description The Elastic Stack is a powerful combination of tools for techniques such as distributed search, analytics, logging, and visualization of data. Elastic Stack 7.0 encompasses new features and capabilities that will enable you to find unique insights into analytics using these techniques. This book will give you a fundamental understanding of what the stack is all about, and help you use it efficiently to build powerful real-time data processing applications. The first few sections of the book will help you understand how to set up the stack by installing tools, and exploring their basic configurations. You’ll then get up to speed with using Elasticsearch for distributed searching and analytics, Logstash for logging, and Kibana for data visualization. As you work through the book, you will discover the technique of creating custom plugins using Kibana and Beats. This is followed by coverage of the Elastic X-Pack, a useful extension for effective security and monitoring. You’ll also find helpful tips on how to use Elastic Cloud and deploy Elastic Stack in production environments. By the end of this book, you’ll be well versed with the fundamental Elastic Stack functionalities and the role of each component in the stack to solve different data processing problems. What you will learnInstall and configure an Elasticsearch architectureSolve the full-text search problem with ElasticsearchDiscover powerful analytics capabilities through aggregations using ElasticsearchBuild a data pipeline to transfer data from a variety of sources into Elasticsearch for analysisCreate interactive dashboards for effective storytelling with your data using KibanaLearn how to secure, monitor and use Elastic Stack’s alerting and reporting capabilitiesTake applications to an on-premise or cloud-based production environment with Elastic StackWho this book is for This book is for entry-level data professionals, software engineers, e-commerce developers, and full-stack developers who want to learn about Elastic Stack and how the real-time processing and search engine works for business analytics and enterprise search applications. Previous experience with Elastic Stack is not required, however knowledge of data warehousing and database concepts will be helpful.




Mastering Elastic Stack


Book Description

Get the most out of the Elastic Stack for various complex analytics using this comprehensive and practical guide About This Book Your one-stop solution to perform advanced analytics with Elasticsearch, Logstash, and Kibana Learn how to make better sense of your data by searching, analyzing, and logging data in a systematic way This highly practical guide takes you through an advanced implementation on the ELK stack in your enterprise environment Who This Book Is For This book cater to developers using the Elastic stack in their day-to-day work who are familiar with the basics of Elasticsearch, Logstash, and Kibana, and now want to become an expert at using the Elastic stack for data analytics. What You Will Learn Build a pipeline with help of Logstash and Beats to visualize Elasticsearch data in Kibana Use Beats to ship any type of data to the Elastic stack Understand Elasticsearch APIs, modules, and other advanced concepts Explore Logstash and it's plugins Discover how to utilize the new Kibana UI for advanced analytics See how to work with the Elastic Stack using other advanced configurations Customize the Elastic Stack and plugin development for each of the component Work with the Elastic Stack in a production environment Explore the various components of X-Pack in detail. In Detail Even structured data is useless if it can't help you to take strategic decisions and improve existing system. If you love to play with data, or your job requires you to process custom log formats, design a scalable analysis system, and manage logs to do real-time data analysis, this book is your one-stop solution. By combining the massively popular Elasticsearch, Logstash, Beats, and Kibana, elastic.co has advanced the end-to-end stack that delivers actionable insights in real time from almost any type of structured or unstructured data source. If your job requires you to process custom log formats, design a scalable analysis system, explore a variety of data, and manage logs, this book is your one-stop solution. You will learn how to create real-time dashboards and how to manage the life cycle of logs in detail through real-life scenarios. This book brushes up your basic knowledge on implementing the Elastic Stack and then dives deeper into complex and advanced implementations of the Elastic Stack. We'll help you to solve data analytics challenges using the Elastic Stack and provide practical steps on centralized logging and real-time analytics with the Elastic Stack in production. You will get to grip with advanced techniques for log analysis and visualization. Newly announced features such as Beats and X-Pack are also covered in detail with examples. Toward the end, you will see how to use the Elastic stack for real-world case studies and we'll show you some best practices and troubleshooting techniques for the Elastic Stack. Style and approach This practical guide shows you how to perform advanced analytics with the Elastic stack through real-world use cases. It includes common and some not so common scenarios to use the Elastic stack for data analysis.