Enhanced Quality Screening for Earth Science Data
Principal Investigator (PI): Ed Armstrong, NASA's Jet Propulsion Laboratory (JPL)
Data subsetting and aggregation services such as the Open-source Project for a Network Data Access Protocol (OPeNDAP), Live Access Server (LAS), and Thematic Real-time Environmental Distributed Data Services (THREDDS) help Earth science users access large volumes of data, especially satellite data, more easily. Most NASA and other government Distributed Active Archive Centers (DAACs) now offer these as part of their standard services suite. However, the data of interest are frequently accompanied by quality information, in the form of a flag or other metric, that a user must apply to the actual geophysical data to make them meaningful and understandable. In the case of OPeNDAP, this is a three-step process: a user must first request the geophysical data, then request the quality data, and finally apply the quality information (e.g., a flag) to the geophysical data to make the result usable for further analysis or even visualization.
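The three-step process described above can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the two lists stand in for the responses to the separate geophysical and quality requests, and the names `sst_kelvin`, `quality_level`, and `screen_by_quality`, as well as the convention that a higher quality level is better, are assumptions made for the example.

```python
from typing import List, Optional, Sequence


def screen_by_quality(
    values: Sequence[float],
    quality: Sequence[int],
    min_level: int,
) -> List[Optional[float]]:
    """Keep values whose quality level is at least min_level; mask the rest as None."""
    if len(values) != len(quality):
        raise ValueError("geophysical and quality arrays must have the same length")
    return [v if q >= min_level else None for v, q in zip(values, quality)]


# Step 1: stand-in for the response to the geophysical data request (hypothetical values).
sst_kelvin = [290.1, 291.4, 289.8, 292.0]

# Step 2: stand-in for the response to the separate quality data request (hypothetical values).
quality_level = [5, 2, 4, 0]

# Step 3: the user applies the quality information to the geophysical data.
screened = screen_by_quality(sst_kelvin, quality_level, min_level=4)
print(screened)  # [290.1, None, 289.8, None]
```

Each of the three steps here is the user's responsibility, which is the burden a combined request would remove.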
To address the solicitation's interest area of tools and technologies that improve users’ ability to efficiently discover, find, access, and readily use multi-mission, multi-instrument Earth science data, we propose to design and deploy a quality screening service that makes the discovery and application of quality information as transparent to the user as possible. In function, this will be a web-based service that allows any client to make a Representational State Transfer (REST) URL-based request for data access and subsetting of a granule while, in the same request, accessing and applying the quality information as specified by the user. The system will utilize the NASA-developed Webification (AKA w10n) service, which provides efficient RESTful access to all facets of a data store, including attributes, metadata, and array data. A query service based in part on the JPL-developed OpenSearch protocol will be implemented to bridge different data and quality attribute dialects and to generate the proper RESTful requests to one or more w10n instances from a single high-level query. Where possible, our approach will also leverage the quality flagging standardization found in metadata-rich satellite granules that adhere to the Climate and Forecast (CF) metadata conventions. For example, the CF attributes flag_meanings, flag_values, and flag_masks can be identified and used to autonomously recognize arrays that contain quality information and to provide meaning and context to quality flags. This service will first incorporate data streams from the Group for High Resolution Sea Surface Temperature (GHRSST) as well as preliminary products from the Soil Moisture Active Passive (SMAP) decadal mission. Each of these missions produces data granules that are complex but well documented with rich metadata.
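As a rough illustration of how the CF flag attributes mentioned above could be used, the following Python sketch recognizes a quality variable from its attributes and translates a flag value into its meanings. It follows the CF conventions' rules for flag_values (mutually exclusive values), flag_masks (independent bit fields), and the two combined. The function names and the sample attribute dictionaries are hypothetical and are not taken from GHRSST or SMAP products or from the proposed service.

```python
from typing import Any, Dict, List


def is_quality_variable(attrs: Dict[str, Any]) -> bool:
    """A variable is treated as a flag array if it has flag_meanings plus values or masks."""
    return "flag_meanings" in attrs and (
        "flag_values" in attrs or "flag_masks" in attrs
    )


def decode_flag(value: int, attrs: Dict[str, Any]) -> List[str]:
    """Return the flag meanings that apply to a single flag value."""
    meanings = attrs["flag_meanings"].split()  # CF stores meanings as a blank-separated string
    values = attrs.get("flag_values")
    masks = attrs.get("flag_masks")
    decoded = []
    for i, meaning in enumerate(meanings):
        if masks is not None and values is not None:
            # Both present: the masked value must equal the flag value at the same position.
            hit = (value & masks[i]) == values[i]
        elif masks is not None:
            # Masks only: each mask is an independent Boolean bit field.
            hit = (value & masks[i]) != 0
        else:
            # Values only: mutually exclusive enumerated states.
            hit = value == values[i]
        if hit:
            decoded.append(meaning)
    return decoded


# Hypothetical attribute sets for illustration only.
exclusive_attrs = {
    "flag_meanings": "no_data bad_data acceptable best_quality",
    "flag_values": [0, 1, 2, 3],
}
bitfield_attrs = {
    "flag_meanings": "land ice cloud",
    "flag_masks": [1, 2, 4],
}

print(is_quality_variable(exclusive_attrs))    # True
print(is_quality_variable({"units": "kelvin"}))  # False
print(decode_flag(2, exclusive_attrs))         # ['acceptable']
print(decode_flag(5, bitfield_attrs))          # ['land', 'cloud']
```

Because the decoding depends only on the attributes, the same logic applies to any granule that carries these CF attributes, without dataset-specific code.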
Although we focus on the datasets from these two missions, our goal is to provide a generalized and streamlined service that can be applied to any Earth science data that meet a minimum set of metadata requirements. Our primary focus will be to improve scientific computing with satellite-based Earth science data, but our service will also support on-demand visualization, real-time data exploration and experimentation, and any other service or methodology that requires the application of quality flagging for data utilization.
Last Updated: Feb 18, 2020 at 1:28 PM EST