SciDAC's Earth System Grid Center for Enabling Technologies Semiannual Progress Report October 1, 2010 Through March 31, 2011


Book Description

This report summarizes work carried out by the Earth System Grid Center for Enabling Technologies (ESG-CET) from October 1, 2010 through March 31, 2011. It discusses ESG-CET highlights for the reporting period, overall progress, period goals, and collaborations, and lists papers and presentations. To learn more about our project and to find previous reports, please visit the ESG-CET Web sites: http://esg-pcmdi.llnl.gov/ and/or https://wiki.ucar.edu/display/esgcet/Home. This report will be forwarded to managers in the Department of Energy (DOE) Scientific Discovery through Advanced Computing (SciDAC) program and the Office of Biological and Environmental Research (OBER), as well as national and international collaborators and stakeholders (e.g., those involved in the Coupled Model Intercomparison Project, phase 5 (CMIP5) for the Intergovernmental Panel on Climate Change (IPCC) 5th Assessment Report (AR5); the Community Earth System Model (CESM); the Climate Science Computational End Station (CCES); SciDAC II: A Scalable and Extensible Earth System Model for Climate Change Science; the North American Regional Climate Change Assessment Program (NARCCAP); the Atmospheric Radiation Measurement (ARM) program; the National Aeronautics and Space Administration (NASA), the National Oceanic and Atmospheric Administration (NOAA)), and also to researchers working on a variety of other climate model and observation evaluation activities. The ESG-CET executive committee consists of Dean N. Williams, Lawrence Livermore National Laboratory (LLNL); Ian Foster, Argonne National Laboratory (ANL); and Don Middleton, National Center for Atmospheric Research (NCAR). 
The ESG-CET team is a group of researchers and scientists with diverse domain knowledge, whose home institutions include eight laboratories and two universities: ANL, Los Alamos National Laboratory (LANL), Lawrence Berkeley National Laboratory (LBNL), LLNL, NASA/Jet Propulsion Laboratory (JPL), NCAR, Oak Ridge National Laboratory (ORNL), Pacific Marine Environmental Laboratory (PMEL)/NOAA, Rensselaer Polytechnic Institute (RPI), and University of Southern California, Information Sciences Institute (USC/ISI). All ESG-CET work is accomplished under DOE open-source guidelines and in close collaboration with the project's stakeholders, domain researchers, and scientists. Through the ESG project, the ESG-CET team has developed and delivered a production environment that serves a worldwide climate research community with climate data from multiple climate model sources (e.g., CMIP (IPCC) and CESM), ocean model data (e.g., Parallel Ocean Program), observation data (e.g., Atmospheric Infrared Sounder, Microwave Limb Sounder), and analysis and visualization tools. Data holdings are distributed across multiple sites including LANL, LBNL, LLNL, NCAR, and ORNL as well as unfunded partner sites such as the Australian National University (ANU) National Computational Infrastructure (NCI), the British Atmospheric Data Center (BADC), the Geophysical Fluid Dynamics Laboratory/NOAA, the Max Planck Institute for Meteorology (MPI-M), the German Climate Computing Centre (DKRZ), and NASA/JPL. As we transition from development activities to production and operations, the ESG-CET team is tasked with making data available to all users who want to understand it, process it, extract value from it, visualize it, and/or communicate it to others. This ongoing effort is extremely large and complex, but it will be incredibly valuable for building 'science gateways' to critical climate resources (such as CESM, CMIP5, ARM, NARCCAP, and Atmospheric Infrared Sounder (AIRS)) for processing the next IPCC assessment report. Continued ESG progress will result in a production-scale system that will empower scientists to attempt new and exciting data exchanges, which could ultimately lead to breakthrough climate science discoveries.




SciDAC's Earth System Grid Center for Enabling Technologies Semi-Annual Progress Report for the Period October 1, 2009 Through March 31, 2010


Book Description

This report summarizes work carried out by the ESG-CET during the period October 1, 2009 through March 31, 2010. It includes discussion of highlights, overall progress, period goals, collaborations, papers, and presentations. To learn more about our project, and to find previous reports, please visit the Earth System Grid Center for Enabling Technologies (ESG-CET) website. This report will be forwarded to the DOE SciDAC program management, the Office of Biological and Environmental Research (OBER) program management, national and international collaborators and stakeholders (e.g., the Community Climate System Model (CCSM), the Intergovernmental Panel on Climate Change (IPCC) 5th Assessment Report (AR5), the Climate Science Computational End Station (CCES), the SciDAC II: A Scalable and Extensible Earth System Model for Climate Change Science, the North American Regional Climate Change Assessment Program (NARCCAP), and other wide-ranging climate model evaluation activities).




SciDAC's Earth System Grid Center for Enabling Technologies Semi-Annual Progress Report for the Period April 1, 2009 Through September 30, 2009


Book Description

This report summarizes work carried out by the ESG-CET during the period April 1, 2009 through September 30, 2009. It includes discussion of highlights, overall progress, period goals, collaborations, papers, and presentations. To learn more about our project, and to find previous reports, please visit the Earth System Grid Center for Enabling Technologies (ESG-CET) website. This report will be forwarded to the DOE SciDAC program management, the Office of Biological and Environmental Research (OBER) program management, national and international collaborators and stakeholders (e.g., the Community Climate System Model (CCSM), the Intergovernmental Panel on Climate Change (IPCC) 5th Assessment Report (AR5), the Climate Science Computational End Station (CCES), the SciDAC II: A Scalable and Extensible Earth System Model for Climate Change Science, the North American Regional Climate Change Assessment Program (NARCCAP), and other wide-ranging climate model evaluation activities). During this semi-annual reporting period, the ESG-CET team continued its efforts to complete software components needed for the ESG Gateway and Data Node. These components include Data Versioning, Data Replication, DataMover-Lite (DML) and Bulk Data Mover (BDM), Metrics, Product Services, and Security, which together form ESG-CET's first beta release. The launch of the beta release is scheduled for late October with the installation of ESG Gateways at NCAR and LLNL/PCMDI. Using the newly developed ESG Data Publisher, the ESG II CMIP3 (IPCC AR4) data holdings (approximately 35 TB) will be among the first datasets to be published into the new ESG enterprise system. In addition, NCAR's ESG II data holdings (approximately 200 TB) will also be published into the new system.
This period also saw the testing of the ESG Data Node at various collaboration sites, including the British Atmospheric Data Center (BADC), the Max-Planck-Institute for Meteorology, the University of Tokyo Center for Climate System Research, and the Australian National University. In total, 14 national and international sites installed an ESG Data Node for testing during this period. We also continued to provide production-level services to the community, providing researchers worldwide with access to CMIP3 (IPCC AR4), CCES, CCSM, Parallel Climate Model (PCM), Parallel Ocean Program (POP), Cloud Feedback Model Intercomparison Project (CFMIP), and NARCCAP data.




DOE SciDAC's Earth System Grid Center for Enabling Technologies Final Report


Book Description

The mission of the Earth System Grid Federation (ESGF) is to provide the worldwide climate-research community with access to the data, information, model codes, analysis tools, and intercomparison capabilities required to make sense of enormous climate data sets. Its specific goals are to (1) provide an easy-to-use and secure web-based data access environment for data sets; (2) add value to individual data sets by presenting them in the context of other data sets and tools for comparative analysis; (3) address the specific requirements of participating organizations with respect to bandwidth, access restrictions, and replication; (4) ensure that the data are readily accessible through the analysis and visualization tools used by the climate research community; and (5) transfer infrastructure advances to other domain areas. For the ESGF, the U.S. Department of Energy's (DOE's) Earth System Grid Center for Enabling Technologies (ESG-CET) team has led international development and delivered a production environment for managing and accessing ultra-scale climate data. This production environment includes multiple national and international climate projects (such as the Community Earth System Model and the Coupled Model Intercomparison Project), ocean model data (such as the Parallel Ocean Program), observation data (Atmospheric Radiation Measurement Best Estimate, Carbon Dioxide Information and Analysis Center, Atmospheric Infrared Sounder, etc.), and analysis and visualization tools, all serving a diverse user community. 
These data holdings and services are distributed across multiple ESG-CET sites (such as ANL, LANL, LBNL/NERSC, LLNL/PCMDI, NCAR, and ORNL) and at unfunded partner sites, such as the Australian National University National Computational Infrastructure, the British Atmospheric Data Centre, the National Oceanic and Atmospheric Administration Geophysical Fluid Dynamics Laboratory, the Max Planck Institute for Meteorology, the German Climate Computing Centre, the National Aeronautics and Space Administration Jet Propulsion Laboratory, and the National Oceanic and Atmospheric Administration. The ESGF software is distinguished from other collaborative knowledge systems in the climate community by its widespread adoption, federation capabilities, and broad developer base. It is the leading source for present climate data holdings, including the most important and largest data sets in the global-climate community, and - assuming its development continues - we expect it to be the leading source for future climate data holdings as well. Recently, ESG-CET extended its services beyond data-file access and delivery to include more detailed information products (scientific graphics, animations, etc.), secure binary data-access services (based upon the OPeNDAP protocol), and server-side analysis. The latter capabilities allow users to request data subsets transformed through commonly used analysis and intercomparison procedures. As we transition from development activities to production and operations, the ESG-CET team is tasked with making data available to all users seeking to understand, process, extract value from, visualize, and/or communicate it to others. This ongoing effort, though daunting in scope and complexity, will greatly magnify the value of numerical climate model outputs and climate observations for future national and international climate-assessment reports. 
The ESG-CET team also faces substantial technical challenges due to the rapidly increasing scale of climate simulation and observational data, which will grow, for example, from less than 50 terabytes for the last Intergovernmental Panel on Climate Change (IPCC) assessment to multiple Petabytes for the next IPCC assessment. In a world of exponential technological change and rapidly growing sophistication in climate data analysis, an infrastructure such as ESGF must constantly evolve if it is to remain relevant and useful. Regretfully, we submit our final report at the end of project funding. To continue to serve the climate-science community, we are currently seeking additional funding. Such funding would allow us to maintain and enhance ESGF production and operation of this vital endeavor of cataloging, serving, and analyzing ultra-scale climate science data. At this time, the entire ESG-CET team would like to take this opportunity to sincerely thank our funding agencies in the DOE Scientific Discovery through Advanced Computing (SciDAC) program and the Office of Biological and Environmental Research (OBER) - as well as our national and international collaborators, stakeholders, and partners - for allowing us to work with you and serve the community these past several years.




DOE SciDAC's Earth System Grid Center for Enabling Technologies Final Report for University of Southern California Information Sciences Institute


Book Description

The mission of the Earth System Grid Federation (ESGF) is to provide the worldwide climate-research community with access to the data, information, model codes, analysis tools, and intercomparison capabilities required to make sense of enormous climate data sets. Its specific goals are to (1) provide an easy-to-use and secure web-based data access environment for data sets; (2) add value to individual data sets by presenting them in the context of other data sets and tools for comparative analysis; (3) address the specific requirements of participating organizations with respect to bandwidth, access restrictions, and replication; (4) ensure that the data are readily accessible through the analysis and visualization tools used by the climate research community; and (5) transfer infrastructure advances to other domain areas. For the ESGF, the U.S. Department of Energy's (DOE's) Earth System Grid Center for Enabling Technologies (ESG-CET) team has led international development and delivered a production environment for managing and accessing ultra-scale climate data. This production environment includes multiple national and international climate projects (such as the Community Earth System Model and the Coupled Model Intercomparison Project), ocean model data (such as the Parallel Ocean Program), observation data (Atmospheric Radiation Measurement Best Estimate, Carbon Dioxide Information and Analysis Center, Atmospheric Infrared Sounder, etc.), and analysis and visualization tools, all serving a diverse user community. 
These data holdings and services are distributed across multiple ESG-CET sites (such as ANL, LANL, LBNL/NERSC, LLNL/PCMDI, NCAR, and ORNL) and at unfunded partner sites, such as the Australian National University National Computational Infrastructure, the British Atmospheric Data Centre, the National Oceanic and Atmospheric Administration Geophysical Fluid Dynamics Laboratory, the Max Planck Institute for Meteorology, the German Climate Computing Centre, the National Aeronautics and Space Administration Jet Propulsion Laboratory, and the National Oceanic and Atmospheric Administration. The ESGF software is distinguished from other collaborative knowledge systems in the climate community by its widespread adoption, federation capabilities, and broad developer base. It is the leading source for present climate data holdings, including the most important and largest data sets in the global-climate community, and--assuming its development continues--we expect it to be the leading source for future climate data holdings as well. Recently, ESG-CET extended its services beyond data-file access and delivery to include more detailed information products (scientific graphics, animations, etc.), secure binary data-access services (based upon the OPeNDAP Data Access Protocol), and server-side analysis. The latter capabilities allow users to request data subsets transformed through commonly used analysis and intercomparison procedures. As we transition from development activities to production and operations, the ESG-CET team is tasked with making data available to all users seeking to understand, process, extract value from, visualize, and/or communicate it to others--this is of course if funding continues at some level. This ongoing effort, though daunting in scope and complexity, would greatly magnify the value of numerical climate model outputs and climate observations for future national and international climate-assessment reports. 
The ESG-CET team also faces substantial technical challenges due to the rapidly increasing scale of climate simulation and observational data, which will grow, for example, from less than 50 terabytes for the last Intergovernmental Panel on Climate Change (IPCC) assessment to multiple Petabytes for the next IPCC assessment. In a world of exponential technological change and rapidly growing sophistication in climate data analysis, an infrastructure such as ESGF must constantly evolve if it is t ...




DOE SciDAC's Earth System Grid Center for Enabling Technologies Final Report for University of Southern California Information Sciences Institute


Book Description

The mission of the Earth System Grid Federation (ESGF) is to provide the worldwide climate-research community with access to the data, information, model codes, analysis tools, and intercomparison capabilities required to make sense of enormous climate data sets. Its specific goals are to (1) provide an easy-to-use and secure web-based data access environment for data sets; (2) add value to individual data sets by presenting them in the context of other data sets and tools for comparative analysis; (3) address the specific requirements of participating organizations with respect to bandwidth, access restrictions, and replication; (4) ensure that the data are readily accessible through the analysis and visualization tools used by the climate research community; and (5) transfer infrastructure advances to other domain areas. For the ESGF, the U.S. Department of Energy’s (DOE’s) Earth System Grid Center for Enabling Technologies (ESG-CET) team has led international development and delivered a production environment for managing and accessing ultra-scale climate data. This production environment includes multiple national and international climate projects (such as the Community Earth System Model and the Coupled Model Intercomparison Project), ocean model data (such as the Parallel Ocean Program), observation data (Atmospheric Radiation Measurement Best Estimate, Carbon Dioxide Information and Analysis Center, Atmospheric Infrared Sounder, et cetera), and analysis and visualization tools, all serving a diverse user community. 
These data holdings and services are distributed across multiple ESG-CET sites (such as ANL, LANL, LBNL/NERSC, LLNL/PCMDI, NCAR, and ORNL) and at unfunded partner sites, such as the Australian National University National Computational Infrastructure, the British Atmospheric Data Centre, the National Oceanic and Atmospheric Administration Geophysical Fluid Dynamics Laboratory, the Max Planck Institute for Meteorology, the German Climate Computing Centre, the National Aeronautics and Space Administration Jet Propulsion Laboratory, and the National Oceanic and Atmospheric Administration. The ESGF software is distinguished from other collaborative knowledge systems in the climate community by its widespread adoption, federation capabilities, and broad developer base. It is the leading source for present climate data holdings, including the most important and largest data sets in the global-climate community, and - assuming its development continues - we expect it to be the leading source for future climate data holdings as well. Recently, ESG-CET extended its services beyond data-file access and delivery to include more detailed information products (scientific graphics, animations, et cetera), secure binary data-access services (based upon the OPeNDAP Data Access Protocol), and server-side analysis. The latter capabilities allow users to request data subsets transformed through commonly used analysis and intercomparison procedures. As we transition from development activities to production and operations, the ESG-CET team is tasked with making data available to all users seeking to understand, process, extract value from, visualize, and/or communicate it to others - this is of course if funding continues at some level. This ongoing effort, though daunting in scope and complexity, would greatly magnify the value of numerical climate model outputs and climate observations for future national and international climate-assessment reports. 
The ESG-CET team also faces substantial technical challenges due to the rapidly increasing scale of climate simulation and observational data, which will grow, for example, from less than 50 terabytes for the last Intergovernmental Panel on Climate Change (IPCC) assessment to multiple Petabytes for the next IPCC assessment. In a world of exponential technological change and rapidly growing sophistication in climate data analysis, an infrastructure such as ESGF must constantly evolve if it is t...




PMEL Contributions to the Collaboration


Book Description

Drawing to a close after five years of funding from DOE's ASCR and BER program offices, the SciDAC-2 project called the Earth System Grid (ESG) Center for Enabling Technologies has successfully established a new capability for serving data from distributed centers. The system enables users to access, analyze, and visualize data using a globally federated collection of networks, computers, and software. The ESG software, now known as the Earth System Grid Federation (ESGF), has attracted a broad developer base and has been widely adopted, so that it now serves the most comprehensive multi-model climate data sets in the world. The system is used to support international climate model intercomparison activities as well as high-profile U.S. DOE, NOAA, NASA, and NSF projects. It currently provides more than 25,000 users access to more than half a petabyte of climate data (from models and from observations) and has enabled over 1,000 scientific publications.




The Earth System Grid Center for Enabling Technologies


Book Description

This report discusses a project that used prototyping technology to access and analyze climate data. This project was initially funded under the DOE's Next Generation Internet (NGI) program, with follow-on support from BER and the Mathematical, Information, and Computational Sciences (MICS) office. In this prototype, we developed Data Grid technologies for managing the movement and replication of large datasets, and applied these technologies in a practical setting (i.e., an ESG-enabled data browser based on current climate data analysis tools), achieving cross-country transfer rates of more than 500 Mb/s. Having demonstrated the potential for remotely accessing and analyzing climate data located at sites across the U.S., we won the "Hottest Infrastructure" award in the Network Challenge event. While the ESG I prototype project substantiated a proof of concept ("Turning Climate Datasets into Community Resources"), the SciDAC Earth System Grid (ESG) II project made this a reality. Our efforts targeted the development of metadata technologies (standard schema, XML metadata extraction based on netCDF, and a Metadata Catalog Service), security technologies (Web-based user registration and authentication, and community authorization), data transport technologies (GridFTP-enabled OPeNDAP-G for high-performance access, robust multiple file transport and integration with mass storage systems, and support for dataset aggregation and subsetting), as well as web portal technologies to provide interactive access to climate data holdings. At this point, the technology was in place and assembled, and ESG II was poised to make a substantial impact on the climate modelling community.




Data Management and Analysis for the Earth System Grid


Book Description

The international climate community is expected to generate hundreds of petabytes of simulation data within the next five to seven years. This data must be accessed and analyzed by thousands of analysts worldwide in order to provide accurate and timely estimates of the likely impact of climate change on physical, biological, and human systems. Climate change is thus not only a scientific challenge of the first order but also a major technological challenge. To address this technological challenge, the Earth System Grid Center for Enabling Technologies (ESG-CET) has been established within the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC)-2 program, with support from the offices of Advanced Scientific Computing Research and Biological and Environmental Research. ESG-CET's mission is to provide climate researchers worldwide with access to the data, information, models, analysis tools, and computational capabilities required to make sense of enormous climate simulation datasets. Its specific goals are to (1) make data more useful to climate researchers by developing Grid technology that enhances data usability; (2) meet specific distributed database, data access, and data movement needs of national and international climate projects; (3) provide a universal and secure web-based data access portal for broad multi-model data collections; and (4) provide a wide range of Grid-enabled climate data analysis tools and diagnostic methods to international climate centers and U.S. government agencies. Building on the successes of the previous Earth System Grid (ESG) project, which has enabled thousands of researchers to access tens of terabytes of data from a small number of ESG sites, ESG-CET is working to integrate a far larger number of distributed data providers, high-bandwidth wide-area networks, and remote computers in a highly collaborative problem-solving environment.