Book Description
"Like all organizations, libraries are generating more data than ever before and are keen to use it. Data manipulation and analysis is far easier than most people imagine. This book demystifies the process of working with data, familiarizing readers with a small number of simple tools and easily digestible but powerful concepts. Using tools that come with desktop computers, readers will learn to extract, manipulate, and analyze data (and metadata) of any size and complexity. Kyle Banerjee, an experienced author on data and digital library topics, is determined to take the fear out of the command line. This book will be useful to librarians developing their skills, as it introduces concepts and tools gradually.

Starter topics, most of which can be accomplished with a single-word command, will include:
- using the output of one program as input for another
- redirecting those results to any file or program
- sorting files of any size by any criteria
- identifying duplicates
- listing the number of occurrences for each entry

As readers develop a firm grasp of the fundamentals, they will learn progressively more sophisticated tasks such as comparing files, converting data from one format to another, reformatting values (e.g. converting inconsistent dates to a consistent format), combining data from multiple files, and communicating with APIs (Application Programming Interfaces) built into their systems. Each chapter includes more examples that power users might appreciate, but that others can skip without impeding their ability to understand anything else in the book.

Table of Contents
1. Introduction
2. Getting started
3. Directing output - making programs and files work with each other
4. Regular expressions -- the Swiss Army knife of data
5. Understanding data formats Model, namespaces, and validation
6. Application Programming Interfaces (APIs) - talk to programs across the Web
7. Putting it all together
8. More advanced topics
9. One-line solutions for common library tasks
10. Command reference
11. Glossary"--