ESDS Program

Data Management Guidance for ESD-Funded Researchers

Background and Motivation

NASA's Earth Science Division (ESD) has a long-standing commitment to the full and open sharing of all data with research and applications communities, private industry, academia, and the general public, as defined in the ESD Data and Information Policy. The principles of this policy are consistent with NASA’s recently updated Scientific Information Policy for the Science Mission Directorate (SPD-41a), which established more specific data management responsibilities for SMD-funded researchers, repositories, and missions.

NASA's Science Mission Directorate (SMD) has released guidance for its research community (SMD Open-Source Science Guidance) and Frequently Asked Questions (FAQ) on the Scientific Information Policy to provide guidelines, best practices, and examples of open-source science to support the SMD scientific community in implementing the requirements of SPD-41a and achieving the broader goal of moving science towards openness.

This page builds upon the SMD Open-Source Science Guidance and provides additional guidance for ESD-funded researchers on how to implement the data management requirements in SPD-41a. The recommendations provided here reflect general best practices and resources for the ESD research community and should be considered alongside any additional guidance provided by specific programs or funding solicitations. This guidance will be updated over time to best support the needs of the community, including feedback obtained during A Year of Open Science.

This page does not establish policy requirements related to SMD-funded activities. While the guidance described here is compliant with SPD-41a and the ESD Data and Information Policy, it may not be comprehensive or complete, and there may be other ways in which to comply with existing or future agreements or requirements not described here.

Research Data in Scope of SPD-41a

Data are defined as scientific or technically relevant information that can be stored digitally and accessed electronically. This includes any scientifically useful data associated with an award. In particular, the information needed to validate the scientific conclusions of peer-reviewed publications must be shared at the time of publication. This does not include laboratory notebooks, preliminary analyses, intermediate data products, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects such as laboratory specimens.

Data subject to specific laws, regulations, or policies (e.g., Export Administration Regulations [EAR] or International Traffic in Arms Regulations [ITAR]) that would prevent the release of this information are exempt from requirements for making data publicly available. Section II-C of SPD-41a lists additional laws, regulations, and policies that generate exceptions to data sharing requirements.

Open Science and Data Management Plans

SPD-41a requires that all SMD-funded scientific activities that are expected to produce scientific data shall include an open science and data management plan (OSDMP) describing the management, preservation, and release of data to facilitate implementation of relevant scientific information policies.

The scope of an OSDMP differs between research proposers, data producers, and NASA Distributed Active Archive Centers (DAACs). Guidance is provided below for each audience.

OSDMP for Proposals to Earth Science Programs

Starting with ROSES-2023, the proposal DMP will be one component of the broader OSDMP that describes how additional categories of scientific information (e.g., software and publications) will be managed and shared openly. Please follow any directions on preparing an OSDMP that are included in a specific funding solicitation. For general information on the preparation of an OSDMP, see the Open Science and Data Management Plan section of the SMD Open-Source Science Guidance and the associated Earth Science-specific OSDMP template (DOC). See How to Create and Maintain an Open Science and Data Management Plan for Earth Science Research Proposals.

OSDMP for ESD-Funded Data Producers

Each organization funded by NASA to produce data is required to prepare an OSDMP. The OSDMP addresses the management of data from Earth science missions from the time of their data collection/observation to their entry into permanent archives. The OSDMP Template for Data Producers ensures that OSDMPs are in a consistent and useful format.

Open Science and Data Management Plan (OSDMP) Template for Data Producers (PDF)

OSDMP for Data Repositories

Organizations funded by NASA to archive and/or distribute data are also required to prepare an Open Science and Data Management Plan (OSDMP) to describe their data operations. The purpose of the OSDMP Template for DAACs is to provide the DAACs with guidance on the contents of data management plans.

Open Science and Data Management Plan (OSDMP) Template for DAACs (DOC)

Timeline for Sharing Research Data

Data produced from the proposed work shall be made publicly available, and the timeline for that process must be included in the proposal's OSDMP.

All scientifically useful data associated with an SMD research award shall be made publicly available by the end of the period of performance of the research award, whether or not the data would be needed to validate the scientific conclusions of a peer-reviewed publication. This includes data required to derive the findings communicated in figures, maps, and tables as well as scientifically useful data from models and simulations.

Scientifically useful data needed to validate the scientific conclusions of peer-reviewed manuscripts resulting from SMD-funded scientific activities, including data from models and simulations developed using SMD funding, shall become publicly available no later than the publication date of the corresponding peer-reviewed article.

There is no period of exclusive access. PIs may request a reasonable period for calibration and validation of data, but should discuss whether this period will exceed 6 months and the timeframe of any dependencies. Extended delays without reasonable justification will not be viewed favorably.
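The two deadlines above can be combined into a single rule of thumb. The following is a minimal sketch, not an official tool: the function and parameter names are hypothetical, and it does not model any calibration and validation period a PI may request.

```python
from datetime import date
from typing import Optional


def release_deadline(period_of_performance_end: date,
                     publication_date: Optional[date] = None,
                     validates_publication: bool = False) -> date:
    """Return the latest date by which a dataset should be public.

    Hypothetical helper: the parameter names are illustrative and are
    not taken from SPD-41a.
    """
    # Data needed to validate a peer-reviewed article are due by the
    # publication date, but never later than the end of the award.
    if validates_publication and publication_date is not None:
        return min(publication_date, period_of_performance_end)
    # All other scientifically useful data are due by the end of the
    # period of performance.
    return period_of_performance_end
```

Taking the earlier of the two dates reflects that the end of the period of performance applies to all scientifically useful data, whether or not a publication depends on them.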

Where to Share Data

Data shall be published (i.e., shared and archived) in locations that ensure their accessibility and preservation. In general, ESD-funded researchers should follow the guidance for how to share data as described in their funding solicitation. For most ESD-funded researchers, the designated repository will be a NASA DAAC. See the DAAC Assignment section below for more information on this designation.

If the solicitation does not designate a specific NASA repository, additional options for where to share research data include:

  • In a non-SMD, federally supported data repository (e.g., data.nasa.gov); if using a different federally supported repository, this should be indicated in the data management plan
  • In public repositories already in use by the scientific community that have characteristics consistent with Desirable Characteristics of Data Repositories for Federally Funded Research; further examples will be provided, but in the meantime see the examples provided by the NIH and USGS
  • If appropriate for the field or journal, as machine-readable tables in the supplemental material of a peer-reviewed publication; this may be the best solution for small datasets or individual tables that accompany a peer-reviewed publication, but the use of a community-recognized repository is encouraged

The method for sharing the data must be described as part of the OSDMP. Especially for very large datasets that might not fit into existing guidance, the PI must describe in their OSDMP how they will share the data and what is appropriate to share, and must include the costs of preparing data for archiving in their budget.

DAAC Assignment

ESD promotes data publication that is performed collaboratively by ESD-funded researchers and a NASA DAAC. For ESD-funded researchers, the designated NASA repository is often a NASA DAAC. DAACs are assigned after a ROSES proposal is awarded and approved by the ESD. PIs should not contact a DAAC before an award is made.

The DAAC assignment process is as follows:

  1. PI obtains funding approval
  2. PI discusses DAAC assignment with their NASA Program Scientist. Together the PI and Program Scientist decide which DAAC is the best fit for the data (see Table 1)
  3. PI fills in a request form for the desired DAAC
  4. PI submits the request for review
  5. DAAC is assigned and assignment is approved by NASA headquarters
  6. DAAC contacts PI

The following information is needed to complete DAAC assignment:

  • Data processing level
  • Data format. This should be an open, non-proprietary, Earth science standard format such as NetCDF, GeoTIFF, or Cloud-optimized GeoTIFF
  • Spatial and temporal extent of data
  • Estimated total data volume
  • Frequency and volume of data delivery
  • Sample data
  • Existing data documentation and Digital Object Identifiers (DOIs). For data that reside in a NASA DAAC, a DOI will be provided. For data, software, and algorithms archived outside a DAAC, you may need to provide a DOI. SMD-funded investigators shall have a persistent identifier that meets the standards of a digital persistent identifier service as defined in the NSPM-33 Implementation Guidance. Instructions for this process can be found at DOI Frequently Asked Questions. ESD provides assistance in the publication of algorithms via the Algorithm Publication Tool (APT)
  • Funding information
  • Description of why the DAAC is appropriate for that data
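Before starting a request, it can help to track which of the items above are still missing. The sketch below is one way to do that; the class, its field names, and the helper function are hypothetical and do not correspond to any official DAAC request form.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DaacAssignmentInfo:
    """Hypothetical record of the items listed above; not an official form."""
    processing_level: str = ""
    data_format: str = ""
    spatial_temporal_extent: str = ""
    estimated_total_volume: str = ""
    delivery_frequency_and_volume: str = ""
    sample_data: str = ""
    documentation_and_dois: str = ""
    funding_information: str = ""
    daac_justification: str = ""


def missing_items(info: DaacAssignmentInfo) -> List[str]:
    """Return the names of fields that are still empty."""
    return [f.name for f in fields(info) if not getattr(info, f.name).strip()]


# Example values are placeholders for illustration only.
draft = DaacAssignmentInfo(processing_level="Level 2", data_format="NetCDF")
print(missing_items(draft))
```

Keeping every item as free text is a deliberate simplification, since the level of detail expected for each item is settled with the Program Scientist and the DAAC.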

The DAAC will provide guidance on data formats and metadata standards and will work with the PI to ensure the data meet the requirements of SPD-41a.

Once your data product has been published, the DAAC will continue to provide support and maintenance of your data product while it remains available to the public. DAACs are domain-focused data repositories supporting the specific needs of science disciplines, while also enabling cross-disciplinary data usage. The DAACs, as custodians of NASA Earth science data, provide data publication, data access, and data user support. This role is essential in preserving your data and the information needed so that a new user in the future can understand how the data were used for deriving information.

Table 1: The 12 NASA DAACs and their primary scientific disciplines.
NASA DAAC | Acronym | Scientific Disciplines
Alaska Satellite Facility DAAC | ASF DAAC | SAR Products, Change Detection, Sea Ice, Polar Processes
Atmospheric Science Data Center | ASDC | Radiation Budget, Clouds, Aerosols, Tropospheric Composition
Crustal Dynamics Data Information System | CDDIS | Space Geodesy, Solid Earth
Global Hydrometeorology Resource Center DAAC | GHRC DAAC | Lightning, Severe Weather Interactions, Atmospheric Convection, Hurricanes, Storm-induced Hazards
Goddard Earth Sciences Data and Information Services Center | GES DISC | Global Precipitation, Solar Irradiance, Atmospheric Composition and Dynamics, Water and Energy
Land Processes DAAC | LP DAAC | Land data products
Level 1 and Atmosphere Archive and Distribution System DAAC | LAADS DAAC | Moderate Resolution Imaging Spectroradiometer (MODIS) Level 1 data (geolocation, L1A, and radiance L1B) and Atmosphere (Level 2 and Level 3)
National Snow and Ice Data Center DAAC | NSIDC DAAC | Cryospheric Processes, Sea Ice, Snow, Ice Sheets, Frozen Ground, Glaciers, Soil Moisture
Oak Ridge National Laboratory DAAC | ORNL DAAC | Biogeochemical Dynamics, Ecological Data, Environmental Processes
Ocean Biology DAAC | OB.DAAC | Ocean Biology
Physical Oceanography DAAC | PO.DAAC | Gravity, Ocean Circulation, Ocean Heat Budget, Ocean Surface Topography, Ocean Temperature, Ocean Waves, Ocean Winds, Ocean Salinity, Surface Water
Socioeconomic Data and Applications Center | SEDAC | Synthesized Earth Science, Socioeconomic Data

All data derived from the proposal will, by virtue of being published by a NASA DAAC, reside in the Earthdata Cloud and be indexed, using appropriate metadata, in NASA’s Common Metadata Repository (CMR). All data are archived at least at a basic level of service. The basic level requirements are designed to provide a minimal level of data stewardship that makes the data findable and accessible.
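Because published collections are indexed in the CMR, they can be discovered with a keyword search. The following is a minimal sketch that only builds a search URL and does not send a request. The endpoint and the `keyword` and `page_size` parameters are assumptions drawn from the public CMR search API rather than from this page, so confirm them against the CMR documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed public CMR search endpoint; confirm against the CMR API documentation.
CMR_COLLECTIONS_URL = "https://cmr.earthdata.nasa.gov/search/collections.json"


def cmr_collection_query(keyword: str, page_size: int = 10) -> str:
    """Build a CMR collection search URL for a free-text keyword."""
    params = {"keyword": keyword, "page_size": page_size}
    return f"{CMR_COLLECTIONS_URL}?{urlencode(params)}"


# "soil moisture" is an arbitrary example search term.
url = cmr_collection_query("soil moisture")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client to retrieve matching collection records.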

FAIR Data

When sharing data, ESD-funded researchers should follow the FAIR Guiding Principles for scientific data management and stewardship. The FAIR Principles include ensuring that data are:

  • Findable - consistent and persistent descriptions make scientific data easy to find by both humans and computers
  • Accessible - use of standard, open protocols ensures data and metadata can be accessed by all
  • Interoperable - formal, accessible, and widely adopted semantics and vocabularies are used to expand data usability across systems and communities
  • Reusable - data are richly described according to standards to ensure they can be combined or replicated, and usage rights are clarified

How to Share Data

SPD-41a established the following requirements (denoted using “shall”) or recommendations (denoted using “should”) for the sharing of research data developed using SMD funding. These items help ensure that data are preserved and accessible to support reproducibility and reuse and that they are consistent with the FAIR Guiding Principles. Data archived in a DAAC meet these criteria:

  • Open accessibility: Publicly available, SMD-funded data shall be made available without fee or restriction of use. The data shall be shared in a repository that provides broad, equitable, and maximally open access to datasets and their metadata free of charge in a timely manner after submission, consistent with legal and policy requirements related to maintaining privacy and confidentiality, Tribal and national data sovereignty, and protection of sensitive data. The data shall be accessible to the public (lay and scientific) without pre-approval
  • Format: SMD-funded data and metadata shall be made available for access, download, or export in non-proprietary, modifiable, open, and machine-readable formats consistent with standards used in the disciplines the repository serves
  • Inclusion of metadata: SMD-funded data shall include robust, standards-compliant metadata that clearly and explicitly describe the data. This metadata can be indexed in an online catalog, such as NASA’s Common Metadata Repository, to enable discovery, reuse, and citation of SMD-funded data
  • Clear guidance on use: Publicly available SMD-funded data shall be reusable with a clear, open, and accessible data license. This gives the user a clear statement that the scientific data are in the worldwide public domain and that the public may use them freely. In some cases, there might be existing restrictions on releasing the data due to intellectual property rights, contract restrictions, underlying licenses, or other issues. If unsure, contact legal counsel who can help with intellectual property rights, or ask for clarification at HQ-SMD-SPD41@mail.nasa.gov
  • Persistent identifiers: Publicly available SMD-funded data collections shall be citable using unique persistent identifiers (e.g., DOI) assigned by the repository to support data discovery, reporting, and research assessment. As part of the archival process, a DAAC will assign a DOI to the data
  • Findability: SMD-funded data shall be findable, such that the data can be retrieved, downloaded, indexed, and searched. The data must be shared in a repository that ensures the data are searchable and are provided with descriptive metadata alongside the data collections
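For data shared outside a DAAC, a simple self-check against the items above can catch gaps before submission. The sketch below is illustrative only: the record keys, the short list of open formats, and the function name are hypothetical, and passing this check does not by itself establish compliance with SPD-41a.

```python
from typing import Dict, List

# Illustrative list only; a repository may accept other open formats.
OPEN_FORMATS = {"netcdf", "geotiff", "cloud-optimized geotiff"}


def unmet_sharing_items(record: Dict[str, str]) -> List[str]:
    """Return which sharing items a dataset record still lacks.

    The record keys ("format", "license", "doi", "metadata_standard",
    "repository") are hypothetical names chosen for this sketch.
    """
    problems = []
    # Format: must be one of the open formats listed above.
    if record.get("format", "").lower() not in OPEN_FORMATS:
        problems.append("format")
    # License, persistent identifier, metadata, and repository must be stated.
    for key in ("license", "doi", "metadata_standard", "repository"):
        if not record.get(key, "").strip():
            problems.append(key)
    return problems


# Placeholder record: only the format has been filled in so far.
print(unmet_sharing_items({"format": "GeoTIFF"}))
```

An empty list means every item tracked by the sketch has a value; it says nothing about whether the values themselves are adequate.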

Glossary of Open-Source Science Terms

See ‘Glossary of Open-Source Science Terms’ in the SMD Open-Source Science Guidance for definitions of terms used throughout this page.

How to Provide Feedback

Contact us to provide comments and feedback on the guidance.
