Data Quality Screening Service
Principal Investigator (PI): Chris Lynnes, NASA's Goddard Space Flight Center
NASA Earth observation data typically include a rich set of quality information. This information can range from simple quality flags to complicated bitmasks to external criteria such as cloud cover. Additional quality information is often included in external documentation or peer-reviewed literature and may include latitude dependencies, physiographic criteria (e.g., not valid over deserts), temporal functions, etc. The explanations of these quality factors also appear in a variety of forms, such as data product documentation and journal articles. Thus, it is a laborious process for data users to locate the quality explanations, read the quality indicators and code an interpretation of the indicators. The effort expended is compounded many times, because each user must repeat the process independently. If it is merely difficult for science users, it is virtually impossible for machine applications to assimilate and use the quality information; they must either reject the data product or ignore the quality information.
We propose to solve this problem through the construction of an ontology-driven quality screening service. The service will be deployable as a simple Representational State Transfer (REST) or Simple Object Access Protocol (SOAP) Web Service at data centers, taking as input the original data product and producing as output the same product, but with quality screening applied as requested by the user or client. In addition, predefined quality screens will be provided according to the cognizant science team's recommendations as stated in their documentation.
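As a rough illustration of what "the same product, but with quality screening applied" could mean for a simple quality flag or a bitmask, the sketch below masks out values that fail a screening criterion while keeping the output the same shape as the input. It is not the proposed service itself: the function names (`max_flag`, `bits_clear`, `screen`), the sample values, and the meaning assigned to each bit are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical screening criterion: returns True when a quality value is acceptable.
Criterion = Callable[[int], bool]


def max_flag(threshold: int) -> Criterion:
    """Accept values whose simple quality flag is at or below `threshold`."""
    return lambda q: q <= threshold


def bits_clear(mask: int) -> Criterion:
    """Accept values for which none of the bits in `mask` are set."""
    return lambda q: (q & mask) == 0


def screen(
    data: Sequence[float],
    quality: Sequence[int],
    criterion: Criterion,
    fill: Optional[float] = None,
) -> List[Optional[float]]:
    """Return a copy of `data` with values that fail `criterion` replaced by `fill`.

    The output has the same length and ordering as the input, so downstream
    code can treat it like the original data.
    """
    if len(data) != len(quality):
        raise ValueError("data and quality must have the same length")
    return [v if criterion(q) else fill for v, q in zip(data, quality)]


# Made-up sample values, for illustration only.
values = [280.1, 281.4, 279.9, 282.0]

# Simple flag: assume 0 = best, 1 = good, 2 = do not use.
flags = [0, 1, 2, 0]
screened_by_flag = screen(values, flags, max_flag(1))

# Bitmask: assume bits 1 and 2 (mask 0b110) mark conditions to reject.
bitmasks = [0b000, 0b010, 0b100, 0b001]
screened_by_bits = screen(values, bitmasks, bits_clear(0b110))

print(screened_by_flag)
print(screened_by_bits)
```

Expressing each screen as a small, named criterion is one way a predefined, science-team-recommended screen and a user-requested screen could share the same code path; the criteria actually used by the service would come from the ontology rather than being hard-coded as they are here.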
The quality screening indicators and recommendations will be encoded in a community-based data quality ontology that builds on an ongoing NASA project. This will allow the ontology to be reused, expanded and maintained as new missions (such as the Decadal Survey missions) come online. In addition, the ontology-based approach will make the quality screening and associated information usable by machine-based applications such as models and decision support systems.
Last Updated: Feb 18, 2020 at 1:22 PM EST