Industrial Parsing of Software Manuals


Book Description

The task of language engineering is to develop the technology for building computer systems which can perform useful linguistic tasks such as machine assisted translation, text retrieval, message classification and document summarisation. Such systems often require the use of a parser which can extract specific types of grammatical data from pre-defined classes of input text. There are many parsers already available for use in language engineering systems. However, many different linguistic formalisms and parsing algorithms are employed. Grammatical coverage varies, as does the nature of the syntactic information extracted. Direct comparison between systems is difficult because each is likely to have been evaluated using different test criteria. In this volume, eight different parsers are applied to the same task, that of analysing a set of sentences derived from software instruction manuals. Each parser is presented in a separate chapter. Evaluation of performance is carried out using a standard set of criteria with the results being presented in a set of tables which have the same format for each system. Three additional chapters provide further analysis of the results as well as discussing possible approaches to the standardisation of parse tree data. Five parse trees are provided for each system in an appendix, allowing further direct comparison between systems by the reader. The book will be of interest to students, researchers and practitioners in the areas of computational linguistics, computer science, information retrieval, language engineering, linguistics and machine assisted translation.




The Definitive ANTLR 4 Reference


Book Description

Programmers run into parsing problems all the time. Whether it's a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language, ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top of them. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features. Build your own languages with ANTLR v4, using ANTLR's new advanced parsing technology. In this book, you'll learn how ANTLR automatically builds a data structure representing the input (parse tree) and generates code that can walk the tree (visitor). You can use that combination to implement data readers, language interpreters, and translators. You'll start by learning how to identify grammar patterns in language reference manuals and then slowly start building increasingly complex grammars. Next, you'll build applications based upon those grammars by walking the automatically generated parse trees. Then you'll tackle some nasty language problems by parsing files containing more than one language (such as XML, Java, and Javadoc). You'll also see how to take absolute control over parsing by embedding Java actions into the grammar. You'll learn directly from well-known parsing expert Terence Parr, the ANTLR creator and project lead. You'll master ANTLR grammar construction and learn how to build language tools using the built-in parse tree visitor mechanism. The book teaches using real-world examples and shows you how to use ANTLR to build such things as a data file reader, a JSON-to-XML translator, an R parser, and a Java class->interface extractor. This book is your ticket to becoming a parsing guru! What You Need: ANTLR 4.0 or above. Java development tools. Ant build system optional (needed only for building ANTLR from source).
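The parse-tree-plus-visitor combination mentioned in the description can be illustrated with a small hand-written sketch. This is not ANTLR-generated code and does not use the ANTLR runtime: the tiny arithmetic grammar, the names `Num`, `BinOp`, `Parser`, `Visitor` and `Evaluator`, and the simplified tree shape are all assumptions made for illustration. It only shows the general idea of building a tree from the input and then walking it with a visitor.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]


def tokenize(text):
    """Split the input into integer literals and the symbols + * ( )."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch in "+*()":
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


class Parser:
    """Recursive descent for the hypothetical grammar:
    expr : term ('+' term)* ;  term : atom ('*' atom)* ;
    atom : INT | '(' expr ')' ;
    """

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.take()
            node = BinOp("+", node, self.term())
        return node

    def term(self):
        node = self.atom()
        while self.peek() == "*":
            self.take()
            node = BinOp("*", node, self.atom())
        return node

    def atom(self):
        tok = self.take()
        if tok == "(":
            node = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok.isdigit():
            return Num(int(tok))
        raise ValueError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return tree


class Visitor:
    """Dispatch to visit_<ClassName> based on the node's type."""

    def visit(self, node):
        return getattr(self, "visit_" + type(node).__name__)(node)


class Evaluator(Visitor):
    """One possible visitor: evaluate the expression tree."""

    def visit_Num(self, node):
        return node.value

    def visit_BinOp(self, node):
        left, right = self.visit(node.left), self.visit(node.right)
        return left + right if node.op == "+" else left * right


result = Evaluator().visit(parse("1 + 2 * (3 + 4)"))
print(result)
```

Because the tree is separate from the code that walks it, a second visitor (for example, one that pretty-prints or translates the expression) could reuse the same parser unchanged.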




Parsing Techniques


Book Description

This second edition of Grune and Jacobs’ brilliant work presents new developments and discoveries that have been made in the field. Parsing, also referred to as syntax analysis, has been and continues to be an essential part of computer science and linguistics. Parsing techniques have grown considerably in importance, both in computer science, where advanced compilers often use general CF parsers, and in computational linguistics, where such parsers are the only option. They are used in a variety of software products including Web browsers, interpreters in computer devices, and data compression programs; and they are used extensively in linguistics.




CASL User Manual


Book Description

CASL, the Common Algebraic Specification Language, was designed by the members of CoFI, the Common Framework Initiative for algebraic specification and development, and is a general-purpose language for practical use in software development for specifying both requirements and design. CASL is already regarded as a de facto standard, and various sublanguages and extensions are available for specific tasks. This book illustrates and discusses how to write CASL specifications. The authors first describe the origins, aims and scope of CoFI, and review the main concepts of algebraic specification languages. The main part of the book explains CASL specifications, with chapters on loose, generated and free specifications, partial functions, sub- and supersorts, structuring specifications, genericity and reusability, architectural specifications, and version control. The final chapters deal with tool support and libraries, and present a realistic case study involving the standard benchmark for comparing specification frameworks. The book is aimed at software researchers and professionals, and follows a tutorial style with highlighted points, illustrative examples, and a full specification and library index. A separate, complementary LNCS volume contains the CASL Reference Manual.




Mastering PHP 4.1


Book Description

Build Dynamic, Database-Driven Web Sites

PHP is a fully developed, server-side embedded scripting language, and its importance for web application development has grown with the rise of the Apache web server. Are you a novice programmer? This book starts with the basics and takes you wherever you want to go. A seasoned pro? You'll be amazed at how much you can capitalize on PHP's power and object-oriented support, and how it leverages your knowledge of other languages. Finally, if you're a PHP user in search of an authoritative reference, you need look no further. Mastering PHP 4.1 guides you through all levels of real-world web programming problems and provides expert advice on which solutions work best.

Coverage includes: reading and writing files; validating data with regular expressions; accessing MySQL and PostgreSQL databases; accessing LDAP servers; generating images and PDF documents on the fly; building authentication and access-control systems; sending e-mail and building web-to-e-mail interfaces; creating your own classes; closing common security holes in PHP scripts; parsing and generating XML documents; using sessions to store persistent data; debugging misbehaving scripts; encrypting and decrypting sensitive data.

Note: CD-ROM/DVD and other supplementary materials are not included as part of the eBook file.




Programming Python


Book Description

If you've mastered Python's fundamentals, you're ready to start using it to get real work done. Programming Python will show you how, with in-depth tutorials on the language's primary application domains: system administration, GUIs, and the Web. You'll also explore how Python is used in databases, networking, front-end scripting layers, text processing, and more. This book focuses on commonly used tools and libraries to give you a comprehensive understanding of Python’s many roles in practical, real-world programming. You'll learn language syntax and programming techniques in a clear and concise manner, with lots of examples that illustrate both correct usage and common idioms. Completely updated for version 3.x, Programming Python also delves into the language as a software development tool, with many code examples scaled specifically for that purpose.

Topics include: Quick Python tour: build a simple demo that includes data representation, object-oriented programming, object persistence, GUIs, and website basics; System programming: explore system interface tools and techniques for command-line scripting, processing files and folders, running programs in parallel, and more; GUI programming: learn to use Python’s tkinter widget library; Internet programming: access client-side network protocols and email tools, use CGI scripts, and learn website implementation techniques; More ways to apply Python: implement data structures, parse text-based information, interface with databases, and extend and embed Python.




Recent Trends in Algebraic Development Techniques


Book Description

This book constitutes the thoroughly refereed post-workshop proceedings of the 14th International Workshop on Algebraic Development Techniques, WADT'99, held in Toulouse, France in September 1999. The 23 revised full papers presented together with three invited papers were carefully reviewed and selected from 69 workshop presentations. The papers address the following topics: algebraic specification and other specification formalisms, test and validation, concurrent processes applications, logic and validation, combining formalisms, subsorts and partiality, structuring, rewriting, co-algebras and sketches, refinement, institutions and categories, and ASM specifications.




Computational Linguistics and Intelligent Text Processing


Book Description

CICLing 2005 (www.CICLing.org) was the 6th Annual Conference on Intelligent Text Processing and Computational Linguistics. It was intended to provide a balanced view of the cutting-edge developments in both the theoretical foundations of computational linguistics and the practice of natural-language text processing with its numerous applications. A feature of CICLing conferences is their wide scope, which covers nearly all areas of computational linguistics and all aspects of natural language processing applications. This year we were honored by the presence of our keynote speakers Christian Boitet (CLIPS-IMAG, Grenoble), Kevin Knight (ISI), Daniel Marcu (ISI), and Ellen Riloff (University of Utah), who delivered excellent extended lectures and organized lively discussions and encouraging tutorials; their invited papers are published in this volume. Of 151 submissions received, 88 were selected for presentation (53 as full papers and 35 as short papers), by exactly 200 authors from 26 countries: USA (15 papers); Mexico (12); China (9.5); Spain (7.5); South Korea (5.5); Singapore (5); Germany (4.8); Japan (4); UK (3.5); France (3.3); India (3); Italy (3); Czech Republic (2.5); Romania (2.3); Brazil, Canada, Greece, Ireland, Israel, the Netherlands, Norway, Portugal, Sweden, Switzerland (1 each); Hong Kong (0.5); and Russia (0.5). These counts include the invited papers. Internationally co-authored papers are counted in equal fractions.
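The fractional per-country figures above (such as 9.5 or 3.3) follow from counting co-authored papers in equal fractions. A minimal sketch of one way such a tally could be computed is given below. It assumes that each paper's single unit is split equally among the distinct countries of its authors; the description does not say whether the split is per country or per author, so that rule is an assumption. The function name `count_by_country` and the sample data are hypothetical and are not taken from the conference records.

```python
from collections import defaultdict
from fractions import Fraction


def count_by_country(papers):
    """Give each paper a weight of 1, split equally among its distinct countries.

    `papers` is a list of lists of country names, one inner list per paper.
    """
    totals = defaultdict(Fraction)
    for countries in papers:
        distinct = sorted(set(countries))
        if not distinct:
            continue  # skip papers with no country information
        share = Fraction(1, len(distinct))
        for country in distinct:
            totals[country] += share
    return dict(totals)


# Hypothetical sample data, for illustration only.
papers = [
    ["USA"],                        # single-country paper: USA gets 1
    ["USA", "Mexico"],              # two countries: 1/2 each
    ["Mexico", "Spain", "France"],  # three countries: 1/3 each
]

totals = count_by_country(papers)
for country, total in sorted(totals.items()):
    # Rounded to one decimal place, as in the published figures.
    print(f"{country}: {float(total):.1f}")
```

Using exact fractions and rounding only for display keeps the per-country totals summing to the number of papers, which explains why published one-decimal figures may not add up exactly.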