IBM InfoSphere Streams: Accelerating Deployments with Analytic Accelerators


Book Description

This IBM® Redbooks® publication describes visual development, visualization, adapters, analytics, and accelerators for IBM InfoSphere® Streams (V3), a key component of the IBM Big Data platform. Streams was designed to analyze data in motion, and can perform analysis on incredibly high volumes with high velocity, using a wide variety of analytic functions and data types. The Visual Development environment extends Streams Studio with drag-and-drop development, provides round tripping with existing text editors, and is ideal for rapid prototyping. Adapters facilitate getting data in and out of Streams, and V3 supports WebSphere MQ, Apache Hadoop Distributed File System, and IBM InfoSphere DataStage. Significant analytics include the native Streams Processing Language, SPSS Modeler analytics, Complex Event Processing, TimeSeries Toolkit for machine learning and predictive analytics, Geospatial Toolkit for location-based applications, and Annotation Query Language for natural language processing applications. Accelerators for Social Media Analysis and Telecommunications Event Data Analysis sample programs can be modified to build production level applications. Want to learn how to analyze high volumes of streaming data or implement systems requiring high performance across nodes in a cluster? Then this book is for you.




Beyond Big Data


Book Description

Drive Powerful Business Value by Extending MDM to Social, Mobile, Local, and Transactional Data Enterprises have long relied on Master Data Management (MDM) to improve customer-related processes. But MDM was designed primarily for structured data. Today, crucial information is increasingly captured in unstructured, transactional, and social formats: from tweets and Facebook posts to call center transcripts. Even with tools like Hadoop, extracting usable insight is difficult—often, because it’s so difficult to integrate new and legacy data sources. In Beyond Big Data, five of IBM’s leading data management experts introduce powerful new ways to integrate social, mobile, location, and traditional data. Drawing on pioneering experience with IBM’s enterprise customers, they show how Social MDM can help you deepen relationships, improve prospect targeting, and fully engage customers through mobile channels. Business leaders and practitioners will discover powerful new ways to combine social and master data to improve performance and uncover new opportunities. Architects and other technical leaders will find a complete reference architecture, in-depth coverage of relevant technologies and use cases, and domain-specific best practices for their own projects. Coverage Includes How Social MDM extends fundamental MDM concepts and techniques Architecting Social MDM: components, functions, layers, and interactions Identifying high value relationships: person to product and person to organization Mapping Social MDM architecture to specific products and technologies Using Social MDM to create more compelling customer experiences Accelerating your transition to highly-targeted, contextual marketing Incorporating mobile data to improve employee productivity Avoiding privacy and ethical pitfalls throughout your ecosystem Previewing Semantic MDM and other emerging trends




Spatio-Temporal Data Streams


Book Description

This SpringerBrief presents the fundamental concepts of a specialized class of data stream, spatio-temporal data streams, and demonstrates their distributed processing using Big Data frameworks and platforms. It explores a consistent framework which facilitates a thorough understanding of all different facets of the technology, from basic definitions to state-of-the-art techniques. Key topics include spatio-temporal continuous queries, distributed stream processing, SQL-like language embedding, and trajectory stream clustering. Over the course of the book, the reader will become familiar with spatio-temporal data streams management and data flow processing, which enables the analysis of huge volumes of location-aware continuous data streams. Applications range from mobile object tracking and real-time intelligent transportation systems to traffic monitoring and complex event processing. Spatio-Temporal Data Streams is a valuable resource for researchers studying spatio-temporal data streams and Big Data analytics, as well as data engineers and data scientists solving data management and analytics problems associated with this class of data.




Security and Privacy Trends in Cloud Computing and Big Data


Book Description

It is essential for an organization to know before involving themselves in cloud computing and big data, what are the key security requirements for applications and data processing. Big data and cloud computing are integrated together in practice. Cloud computing offers massive storage, high computation power, and distributed capability to support processing of big data. In such an integrated environment the security and privacy concerns involved in both technologies become combined. This book discusses these security and privacy issues in detail and provides necessary insights into cloud computing and big data integration. It will be useful in enhancing the body of knowledge concerning innovative technologies offered by the research community in the area of cloud computing and big data. Readers can get a better understanding of the basics of cloud computing, big data, and security mitigation techniques to deal with current challenges as well as future research opportunities.




Supporting Mobile and Sensing Solutions with IBM Intelligent Pervasive Platform


Book Description

This IBM® RedpaperTM publication describes the IBM Intelligent Pervasive Platform and IBM Intelligent Pervasive Platform Sensor Extender. Intelligent Pervasive Platform (IPP) enables you to create mobile applications that seamlessly deliver customized information and services to mobile users, based on their profile and location data. IPP Sensor Extender (based on IPP) supports the collection and standardization of sensing device data from sensor platforms that consist of sensing devices, packaging devices, and gateways. After standardization, the sensing device data can be forwarded to other applications that report, analyze, and display this data. IPP Sensor Extender supports the Internet of Things and machine-to-machine communication. This paper provides the following information about IPP and IPP Sensor Extender: A description of each solution The high-level architecture for each solution, including key components Details of the data models for each solution, including examples of common data representations Integration with other IBM products Use in industry solutions The paper is written for developers and others who are interested in using IPP and IPP Sensor Extender to create mobile and sensing applications.




IBM Cloud Private System Administrator's Guide


Book Description

IBM® Cloud Private is an application platform for developing and managing containerized applications across hybrid cloud environments, on-premises and public clouds. It is an integrated environment for managing containers that includes the container orchestrator Kubernetes, a private image registry, a management console, and monitoring frameworks. This IBM Redbooks covers tasks performed by IBM Cloud Private system administrators such as installation for high availability, configuration, backup and restore, using persistent volumes, networking, security, logging and monitoring. Istio integration, troubleshooting and so on. As part of this project we also developed several code examples and you can download those from the IBM Redbooks GitHub location: https://github.com/IBMRedbooks. The authors team has many years of experience in implementing IBM Cloud Private and other cloud solutions in production environments, so throughout this document we took the approach of providing you the recommended practices in those areas. If you are an IBM Cloud Private system administrator, this book is for you. If you are developing applications on IBM Cloud Private, you can see the IBM Redbooks publication IBM Cloud Private Application Developer's Guide, SG24-8441.




Optimizing DB2 Queries with IBM DB2 Analytics Accelerator for z/OS


Book Description

The IBM® DB2® Analytics Accelerator Version 2.1 for IBM z/OS® (also called DB2 Analytics Accelerator or Query Accelerator in this book and in DB2 for z/OS documentation) is a marriage of the IBM System z® Quality of Service and Netezza® technology to accelerate complex queries in a DB2 for z/OS highly secure and available environment. Superior performance and scalability with rapid appliance deployment provide an ideal solution for complex analysis. This IBM Redbooks® publication provides technical decision-makers with a broad understanding of the IBM DB2 Analytics Accelerator architecture and its exploitation by documenting the steps for the installation of this solution in an existing DB2 10 for z/OS environment. In this book we define a business analytics scenario, evaluate the potential benefits of the DB2 Analytics Accelerator appliance, describe the installation and integration steps with the DB2 environment, evaluate performance, and show the advantages to existing business intelligence processes.




Big Data and Hadoop


Book Description

This book introduces you to the Big Data processing techniques addressing but not limited to various BI (business intelligence) requirements, such as reporting, batch analytics, online analytical processing (OLAP), data mining and Warehousing, and predictive analytics. The book has been written on IBMs Platform of Hadoop framework. IBM Infosphere BigInsight has the highest amount of tutorial matter available free of cost on Internet which makes it easy to acquire proficiency in this technique. This therefore becomes highly vunerable coaching materials in easy to learn steps. The book optimally provides the courseware as per MCA and M. Tech Level Syllabi of most of the Universities. All components of big Data Platform like Jaql, Hive Pig, Sqoop, Flume , Hadoop Streaming, Oozie: HBase, HDFS, FlumeNG, Whirr, Cloudera, Fuse , Zookeeper and Mahout: Machine learning for Hadoop has been discussed in sufficient Detail with hands on Exercises on each.




IBM Integrated Synchronization: Incremental Updates Unleashed


Book Description

The IBM® Db2® Analytics Accelerator (Accelerator) is a logical extension of Db2 for IBM z/OS® that provides a high-speed query engine that efficiently and cost-effectively runs analytics workloads. The Accelerator is an integrated back-end component of Db2 for z/OS. Together, they provide a hybrid workload-optimized database management system that seamlessly manages queries that are found in transactional workloads to Db2 for z/OS and queries that are found in analytics applications to Accelerator. Each query runs in its optimal environment for maximum speed and cost efficiency. The incremental update function of Db2 Analytics Accelerator for z/OS updates Accelerator-shadow tables continually. Changes to the data in original Db2 for z/OS tables are propagated to the corresponding target tables with a high frequency and a brief delay. Query results from Accelerator are always extracted from recent, close-to-real-time data. An incremental update capability that is called IBM InfoSphere® Change Data Capture (InfoSphere CDC) is provided by IBM InfoSphere Data Replication for z/OS up to Db2 Analytics Accelerator V7.5. Since then, an extra new replication protocol between Db2 for z/OS and Accelerator that is called IBM Integrated Synchronization was introduced. With Db2 Analytics Accelerator V7.5, customers can choose which one to use. IBM Integrated Synchronization is a built-in product feature that you use to set up incremental updates. It does not require InfoSphere CDC, which is bundled with IBM Db2 Analytics Accelerator. In addition, IBM Integrated Synchronization has more advantages: Simplified administration, packaging, upgrades, and support. These items are managed as part of the Db2 for z/OS maintenance stream. Updates are processed quickly. Reduced CPU consumption on the mainframe due to a streamlined, optimized design where most of the processing is done on the Accelerator. This situation provides reduced latency. Uses IBM Z® Integrated Information Processor (zIIP) on Db2 for z/OS, which leads to reduced CPU costs on IBM Z and better overall performance data, such as throughput and synchronized rows per second. On z/OS, the workload to capture the table changes was reduced, and the remainder can be handled by zIIPs. With the introduction of an enterprise-grade Hybrid Transactional Analytics Processing (HTAP) enabler that is also known as the Wait for Data protocol, the integrated low latency protocol is now enabled to support more analytical queries running against the latest committed data. IBM Db2 for z/OS Data Gate simplifies delivering data from IBM Db2 for z/OS to IBM Cloud® Pak® for Data for direct access by new applications. It uses the special-purpose integrated synchronization protocol to maintain data currency with low latency between Db2 for z/OS and dedicated target databases on IBM Cloud Pak for Data.




Harness the Power of Big Data The IBM Big Data Platform


Book Description

Boost your Big Data IQ! Gain insight into how to govern and consume IBM’s unique in-motion and at-rest Big Data analytic capabilities Big Data represents a new era of computing—an inflection point of opportunity where data in any format may be explored and utilized for breakthrough insights—whether that data is in-place, in-motion, or at-rest. IBM is uniquely positioned to help clients navigate this transformation. This book reveals how IBM is infusing open source Big Data technologies with IBM innovation that manifest in a platform capable of "changing the game." The four defining characteristics of Big Data—volume, variety, velocity, and veracity—are discussed. You’ll understand how IBM is fully committed to Hadoop and integrating it into the enterprise. Hear about how organizations are taking inventories of their existing Big Data assets, with search capabilities that help organizations discover what they could already know, and extend their reach into new data territories for unprecedented model accuracy and discovery. In this book you will also learn not just about the technologies that make up the IBM Big Data platform, but when to leverage its purpose-built engines for analytics on data in-motion and data at-rest. And you’ll gain an understanding of how and when to govern Big Data, and how IBM’s industry-leading InfoSphere integration and governance portfolio helps you understand, govern, and effectively utilize Big Data. Industry use cases are also included in this practical guide.




Recent Books