Workflows for e-Science


Book Description

This timely book presents an overview of the current state of the art within established projects, covering many different aspects of workflow, from users to tool builders. It surveys active research from a number of different perspectives, includes theoretical aspects of workflow, and deals with workflow for e-Science as opposed to e-Commerce. The topics covered will be of interest to a wide range of practitioners.




Scientific Workflows


Book Description

Creating scientific workflow applications is a very challenging task due to the complexity of the distributed computing environments involved, the complex control and data flow requirements of scientific applications, and the lack of high-level languages and tool support. In particular, sophisticated expertise in distributed computing is commonly required to determine the software entities to perform computations of workflow tasks, the computers on which workflow tasks are to be executed, the actual execution order of workflow tasks, and the data transfer between them. Qin and Fahringer present a novel workflow language called Abstract Workflow Description Language (AWDL) and the corresponding standards-based, knowledge-enabled tool support, which simplifies the development of scientific workflow applications. AWDL is an XML-based language for describing scientific workflow applications at a high level of abstraction. It is designed in a way that allows users to concentrate on specifying such workflow applications without dealing with either the complexity of distributed computing environments or any specific implementation technology. This research monograph is organized into five parts: overview, programming, optimization, synthesis, and conclusion, and is complemented by an appendix and an extensive reference list. The topics covered in this book will be of interest to both computer science researchers (e.g. in distributed programming, grid computing, or large-scale scientific applications) and domain scientists who need to apply workflow technologies in their work, as well as engineers who want to develop distributed and high-throughput workflow applications, languages and tools.




Business and Scientific Workflows


Book Description

Focuses on how to use web service computing and service-based workflow technologies to develop timely, effective workflows for both business and scientific fields. Utilizing web computing and Service-Oriented Architecture (SOA), Business and Scientific Workflows: A Web Service Oriented Approach focuses on how to design, analyze, and deploy web service based workflows for both business and scientific applications in many areas of healthcare and biomedicine. It also discusses and presents the recent research and development results. This informative reference features application scenarios that include healthcare and biomedical applications, such as personalized healthcare processing, DNA sequence data processing, and electrocardiogram wave analysis, and presents: updated research and development results on the composition technologies of web services for ever-sophisticated service requirements from various users and communities; fundamental methods such as Petri nets and social network analysis to advance the theory and applications of workflow design and web service composition; practical and real applications of the developed theory and methods for such platforms as personalized healthcare and Biomedical Informatics Grids; and the authors' efforts on advancing service composition methods for both business and scientific software systems, with theoretical and empirical contributions. With workflow-driven service composition and reuse being a hot topic in both academia and industry, this book is ideal for researchers, engineers, scientists, professionals, and students who work on service computing, software engineering, business and scientific workflow management, the internet, and management information systems (MIS).




Automated Optimization Methods for Scientific Workflows in e-Science Infrastructures


Book Description

Scientific workflows have emerged as a key technology that assists scientists with the design, management, execution, sharing and reuse of in silico experiments. Workflow management systems simplify the management of scientific workflows by providing graphical interfaces for their development, monitoring and analysis. Nowadays, e-Science combines such workflow management systems with large-scale data and computing resources into complex research infrastructures. For instance, e-Science allows the conveyance of best practice research in collaborations by providing workflow repositories, which facilitate the sharing and reuse of scientific workflows. However, scientists are still faced with different limitations while reusing workflows. One of the most common challenges they meet is the need to select appropriate applications and their individual execution parameters. If scientists do not want to rely on default or experience-based parameters, the best-effort option is to test different workflow set-ups using either trial and error approaches or parameter sweeps. Both methods may be inefficient or time consuming respectively, especially when tuning a large number of parameters. Therefore, scientists require an effective and efficient mechanism that automatically tests different workflow set-ups in an intelligent way and will help them to improve their scientific results. This thesis addresses the limitation described above by defining and implementing an approach for the optimization of scientific workflows. In the course of this work, scientists’ needs are investigated and requirements are formulated resulting in an appropriate optimization concept. In a following step, this concept is prototypically implemented by extending a workflow management system with an optimization framework, including general mechanisms required to conduct workflow optimization. 
As optimization is an ongoing research topic, different algorithms are provided by pluggable extensions (plugins) that can be loosely coupled with the framework, resulting in a generic and quickly extendable system. In this thesis, an exemplary plugin is introduced which applies a Genetic Algorithm for parameter optimization. In order to accelerate workflow optimization, and thereby make it feasible at all, e-Science infrastructures are utilized for the parallel execution of scientific workflows. This is enabled by additional extensions that support the execution of applications and workflows on distributed computing resources. The implementation, and with it the general approach to workflow optimization, is experimentally verified by four use cases in the life science domain. All workflows were significantly improved, which demonstrates the advantage of the proposed workflow optimization. Finally, a new collaboration-based approach is introduced that harnesses optimization provenance to make optimization faster and more robust in the future.




Provenance and Annotation of Data and Processes


Book Description

This book constitutes the thoroughly refereed post-conference proceedings of the Second International Provenance and Annotation Workshop, IPAW 2008, held in Salt Lake City, UT, USA, in June 2008. The 14 revised full papers and 15 revised short and demo papers presented together with 2 keynote lectures were carefully reviewed and selected from 40 submissions. The papers are organized in topical sections on provenance: models and querying; provenance: visualization, failures, identity; provenance and workflows; provenance for streams and collaboration; and applications.




A Framework for Model-Driven Scientific Workflow Engineering


Book Description

Scientific workflows are an important means, in the context of data-intensive science, for reliable and efficient scientific data processing in distributed computing infrastructures such as Grids. A common trend is to adapt existing and established business workflow technologies instead of developing new technologies from scratch. This thesis provides a model-driven approach for scientific workflow engineering, in which domain-specific languages (DSLs) tailored for a certain scientific domain are used for scientific workflow modeling, and automated mapping techniques for technical execution are developed and evaluated. The Business Process Model and Notation (BPMN) is used at the domain-specific layer and the Web Services Business Process Execution Language (BPEL) at the technical layer. The implementation uses the Eclipse Modeling Framework (EMF) and is evaluated in three application scenarios.




Scientific and Statistical Database Management


Book Description

This book constitutes the refereed proceedings of the 21st International Conference on Scientific and Statistical Database Management, SSDBM 2009, held in New Orleans, LA, USA in June 2009. The 29 revised full papers and 12 revised short papers including poster and demo papers presented together with three invited presentations were carefully reviewed and selected from 76 submissions. The papers are organized in topical sections on improving the end-user experience, indexing, physical design and energy, application experience, workflow, query processing, similarity search, mining, as well as spatial data.




Scientific Data Management


Book Description

Dealing with the volume, complexity, and diversity of data currently being generated by scientific experiments and simulations often causes scientists to waste productive time. Scientific Data Management: Challenges, Technology, and Deployment describes cutting-edge technologies and solutions for managing and analyzing vast amounts of data, helping




Guide to e-Science


Book Description

This guidebook on e-science presents real-world examples of practices and applications, demonstrating how a range of computational technologies and tools can be employed to build essential infrastructures supporting next-generation scientific research. Each chapter provides introductory material on core concepts and principles, as well as descriptions and discussions of relevant e-science methodologies, architectures, tools, systems, services and frameworks. Features: includes contributions from an international selection of preeminent e-science experts and practitioners; discusses use of mainstream grid computing and peer-to-peer grid technology for “open” research and resource sharing in scientific research; presents varied methods for data management in data-intensive research; investigates issues of e-infrastructure interoperability, security, trust and privacy for collaborative research; examines workflow technology for the automation of scientific processes; describes applications of e-science.




Case-Based Reasoning Research and Development


Book Description

This book constitutes the refereed proceedings of the 19th International Conference on Case-Based Reasoning, held in London, UK, in September 2011. The 32 contributions presented together with 3 invited talks were carefully reviewed and selected from 67 submissions. The presentations and posters covered a wide range of CBR topics of interest both to practitioners and researchers, including CBR methodology covering case representation, similarity, retrieval, and adaptation; provenance and maintenance; recommender systems; multi-agent collaborative systems; data mining; time series analysis; Web applications; knowledge management; legal reasoning; healthcare systems and planning systems.