Data Access and the ECCO Ocean and Ice State Estimate

Principal Investigator (PI): Patrick Heimbach, University of Texas, Austin

Co-Investigators (Co-PIs): Ian Fenty and Thomas Huang, NASA’s Jet Propulsion Laboratory (JPL)

Project Summary

The Estimating the Circulation and Climate of the Ocean (ECCO) global ocean state estimation system is the premier tool for synthesizing NASA’s diverse Earth system observations into a complete physical description of the planet’s ocean and sea ice system. ECCO state estimates are of particular significance to NASA because satellite observations, although global in coverage, remain sparse in space and time relative to the inherent scales of ocean variability and are blind to all but shallow-water, optically clear environments.

The volume of ECCO products is increasing rapidly as more satellite data come online. ECCO now faces two serious Big Data challenges: (a) managing the increasing volume of new satellite data that goes into making new ECCO products, and (b) ensuring that the scientific community continues to be able to use ECCO products efficiently as the volume of data continues to grow.

The ECCO-Cloud project addresses these Big Data challenges by:

  • Expanding and accelerating the integration of NASA Earth system data into ECCO through automated preprocessing and transformation.
  • Radically streamlining the integration of updated ECCO products into NASA’s Earth Observing System Data and Information System (EOSDIS), specifically NASA’s Physical Oceanography Distributed Active Archive Center (PO.DAAC).
  • Facilitating and expanding the scientific utilization of NASA remote sensing data integrated in ECCO by the growing community of interdisciplinary researchers in the oceanographic, sea-ice, sea level rise, and climate science fields.

The ECCO-Cloud system comprises four subsystems:

  1. Data Harvesting: Queries the Earthdata Common Metadata Repository (CMR) for new satellite data, retrieves these data, and stages the data to Object Store to trigger the Product Transformation Workflow.
  2. Product Transformation Workflow: Preprocesses and transforms the harvested satellite data so that the data can be used in the ECCO state estimation system in the high-performance computing (HPC) environment at NASA’s Ames Research Center.
  3. Data Assembly Workflow: Takes the raw ECCO output from HPC and generates Climate and Forecast (CF) metadata-compliant NetCDF output for long-term archive at PO.DAAC.
  4. Analysis: Ingests the CF-compliant NetCDF ECCO products for cloud-optimized data analysis services.
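
The Data Harvesting step can be illustrated with a minimal sketch. It assumes the public CMR granule search endpoint and its `short_name`, `temporal`, and `page_size` query parameters. The dataset short name and the sample response below are hypothetical placeholders, and this is not the ECCO-Cloud harvester itself. The sketch only builds the query URL and picks data links out of an already-parsed JSON response, so it runs without network access.

```python
from urllib.parse import urlencode

# Public CMR granule search endpoint (JSON response format).
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"


def build_granule_query(short_name, start, end, page_size=100):
    """Build a CMR granule search URL for one dataset and time window."""
    params = {
        "short_name": short_name,        # dataset identifier (hypothetical here)
        "temporal": f"{start},{end}",    # ISO 8601 start,end
        "page_size": page_size,
    }
    return f"{CMR_GRANULE_SEARCH}?{urlencode(params)}"


def extract_data_links(cmr_response):
    """Return download URLs from a parsed CMR JSON response.

    Assumes the 'feed' -> 'entry' -> 'links' layout, where data links
    carry a 'rel' value ending in '/data#'.
    """
    links = []
    for entry in cmr_response.get("feed", {}).get("entry", []):
        for link in entry.get("links", []):
            is_data = link.get("rel", "").endswith("/data#")
            if is_data and not link.get("inherited", False):
                links.append(link["href"])
    return links


if __name__ == "__main__":
    # Hypothetical dataset name and time window, for illustration only.
    print(build_granule_query("EXAMPLE_SST_L4",
                              "2020-01-01T00:00:00Z",
                              "2020-01-31T23:59:59Z",
                              page_size=10))
```

In a real harvester, the returned links would be downloaded and staged to object storage. That part depends on credentials and storage configuration and is left out of the sketch.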

Figure: ECCO data architecture schematic.

ECCO-Cloud's end-to-end System Architecture: An integration between Amazon Web Services (center images with AWS cloud icon) and the NASA Pleiades supercomputer at NASA's Ames Research Center (upper right box) enables full automation for Data Harvesting, Product Generation, Distribution, and Data Visualization and Analysis.

The ECCO-Cloud project improves the quality of current and future ECCO products by bridging the elastic cloud and NASA's HPC environments. The project formalizes ECCO packaging according to CF standards for the global climate community, and streamlines delivery to PO.DAAC to enable rapid access to the latest ECCO products by the research and informatics communities. ECCO-Cloud also provides a production-quality, cloud-based visualization and analysis platform and tool for data exploration and collaboration.
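
To give a rough sense of what packaging according to CF standards involves, the sketch below checks a few basic CF metadata expectations on plain dictionaries: a global `Conventions` attribute naming a CF version, and `units` plus either `standard_name` or `long_name` on each variable. The function name and the sample variables are illustrative assumptions. Real CF compliance covers far more (coordinates, bounds, cell methods), and this is not the project's packaging code.

```python
def check_cf_metadata(global_attrs, variables):
    """Return a list of basic CF metadata problems (empty if none found).

    global_attrs: dict of file-level attributes.
    variables:    dict mapping variable name -> dict of its attributes.
    """
    problems = []

    # CF files declare the convention version in a global attribute.
    conventions = str(global_attrs.get("Conventions", ""))
    if "CF-" not in conventions:
        problems.append("global attribute 'Conventions' should name a CF version")

    for name, attrs in variables.items():
        # Dimensional quantities need units.
        if "units" not in attrs:
            problems.append(f"{name}: missing 'units'")
        # CF recommends at least one descriptive name per variable.
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(f"{name}: needs 'standard_name' or 'long_name'")

    return problems


if __name__ == "__main__":
    # Illustrative variables; the names and attributes are assumptions.
    example_vars = {
        "THETA": {"units": "degree_C",
                  "standard_name": "sea_water_potential_temperature"},
        "SALT": {"long_name": "Salinity"},  # deliberately missing units
    }
    for problem in check_cf_metadata({"Conventions": "CF-1.8"}, example_vars):
        print(problem)
```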

This work leveraged 1) the latest ECCO global ocean state estimate, 2) new software tools developed to display, analyze, extract, subset, reproject, and download ocean physical parameters from the ECCO state estimate (e.g., temperature, salinity, currents, atmosphere-ocean heat fluxes, and sea level), 3) experience in hosting and rapidly accessing the tens of gigabytes of binary output files that comprise the complete ECCO state estimate, and 4) recent developments allowing new simulations based on ECCO’s Oceanic General Circulation Model (OGCM) to be run on the Amazon Elastic Compute Cloud (Amazon EC2).
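
As a small illustration of the "subset" operation mentioned above, the following sketch extracts a latitude-longitude bounding box from a two-dimensional field. It assumes the field has already been interpolated to a regular latitude-longitude grid and is stored as nested lists. The function name and the sample values are hypothetical, and the actual ECCO tools are not shown here.

```python
def subset_bbox(lats, lons, field, lat_range, lon_range):
    """Subset a 2-D field (indexed [lat][lon]) to a bounding box.

    lat_range and lon_range are inclusive (min, max) tuples.
    Returns the retained latitudes, longitudes, and field values.
    """
    lat_idx = [i for i, lat in enumerate(lats)
               if lat_range[0] <= lat <= lat_range[1]]
    lon_idx = [j for j, lon in enumerate(lons)
               if lon_range[0] <= lon <= lon_range[1]]

    sub_lats = [lats[i] for i in lat_idx]
    sub_lons = [lons[j] for j in lon_idx]
    sub_field = [[field[i][j] for j in lon_idx] for i in lat_idx]
    return sub_lats, sub_lons, sub_field


if __name__ == "__main__":
    # Made-up 3 x 4 grid of values, for illustration only.
    lats = [-10, 0, 10]
    lons = [100, 110, 120, 130]
    field = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12]]
    print(subset_bbox(lats, lons, field, (0, 10), (110, 120)))
```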

Project Status

The project has completed its two years of development, and the ECCO Data Analysis Tool (DAT) is now part of the official ECCO Group website (https://ecco-group.org). All software developed through this project is available as open source. The project also generated many supporting documents that are hosted on the Earthdata Code Collaborative (ECC).

Publications and Presentations (listed alphabetically)

Fenty, I. (2020). “ECCO Town Hall.” American Geophysical Union (AGU) Ocean Sciences Meeting, San Diego, CA. 20 February 2020.

—. (2020). “ECCO Town Hall.” AGU Fall Meeting (virtual). 3 December 2020.

—. (2019). “ECCO Town Hall.” AGU Fall Meeting, San Francisco, CA. 9 December 2019.

Fenty, I. & Heimbach, P. (2018). “ECCO Town Hall.” AGU Fall Meeting, Washington, D.C. 10 December 2018.

Ford, E. & Huang, T. (2019). “Analytics Center Framework for Estimating the Circulation and Climate of the Ocean,” NASA ESDIS System Engineering Technical Interchange Meeting. NASA’s Goddard Space Flight Center, Greenbelt, MD. July 2019.

Greguska, F., Wilson, B. & Huang, T. (2019). “Apache Science Data Analytics Platform (SDAP).” ApacheCon North America, Las Vegas. 9 September 2019.

Huang, T. (2020). “From NASA Innovation to Professional Open Source – Apache Science Data Analytics Platform (SDAP).” Earth Science Information Partners (ESIP) Winter Meeting, Bethesda, MD. 7 to 9 January 2020.

—. (2020). “Autonomously Sustainable Solution for Big Ocean Science.” AGU Ocean Sciences Meeting, San Diego, CA. 20 February 2020.

—. (2019). “Advancing Technology for Big Ocean Science through Partnership and Open Source.” 99th American Meteorological Society (AMS) Annual Meeting, Phoenix, AZ. 8 January 2019.

—. (2019). “Overview of JPL Data Science for Earth Science.” Keynote Talk of ESA Big Data from Space (BiDS’19), Munich, Germany. February 2019.

—. (2019). “Apache Science Data Analytics Platform (SDAP).” European Geosciences Union (EGU) General Assembly, Vienna, Austria. 7 to 12 April 2019. Geophysical Research Abstracts, Vol. 21, EGU2019-18602.

—. (2019). “From Data to Insights: Shift Toward Data Analytics.” 2019 Collaborative Conference on Computation & Data Intensive Science, Canberra, Australia. 6 to 10 May 2019.

—. (2019). “Apache SDAP – A Disruptive Technology Solution for Earth Science.” 2019 Earth Science Technology Forum. NASA’s Ames Research Center, Silicon Valley, CA. 12 June 2019. Invited speaker.

—. (2019). “Open Source Data-Intensive Platform for the Cloud.” ESIP Summer Meeting, Tacoma, WA. July 2019.

—. (2019). “From Data to Insights: Shift Toward Data Analytics.” 2019 Data Analytics for Canadian Climate Services (DACCS), Montreal, Quebec, Canada. September 2019.

—. (2019). “Advancing Technology Through Open Source.” OceanObs 2019, Honolulu, HI. 15 to 20 September 2019. Invited talk and panelist.

—. (2019). “Analysis Ready Storage using Apache SDAP.” CEOS 48th Meeting of the Working Group on Information Systems & Services (WGISS-48). Hanoi, Vietnam. October 2019. Invited remote talk.

—. (2019). “Advancing Data Science Technology Through Open Source.” National Academy of Sciences 2nd Meeting of the Committee on Advancing Commercialization from Federal Labs. Washington, D.C. December 2019.

—. (2019). “Aiming for Autonomously and Sustainable Solution for Spatiotemporal Analysis.” AGU Fall Meeting, San Francisco, CA. 9 December 2019.

—. (2018). “Lessons Learned in Creating Science Data Analysis Solutions for the Cloud.” AGU Fall Meeting, Washington, D.C. 10 to 14 December 2018.

Huang, T., Fenty, I. & Heimbach, P. (2019). “Analytics Center Framework for Estimating the Circulation and Climate of the Ocean.” 2019 International Geoscience and Remote Sensing Symposium (IGARSS), Yokohama, Japan. 1 August 2019. Paper 3899.

Yam, E., et al. (2019). “A Cloud Environment for Automated Processing and Data-Analysis of ECCO Ocean and Ice State Estimates.” NASA Earth Science Data System Working Groups (ESDSWG), Annapolis, MD. March 2019.

Updated January 12, 2021

Page Last Updated: Mar 1, 2021 at 4:17 PM EST