Data Chat: Mark McInerney
The technical requirements for moving NASA EOSDIS data into the commercial cloud are enormous. Mark McInerney oversees this effort and helps ensure these data are interoperable with similar data from other agencies.
Interview conducted and edited by Josh Blumenfeld, NASA EOSDIS Science Writer
Mark McInerney does not hesitate when asked what his favorite planet is: Earth. This is not surprising for someone who has spent the last quarter-century working with Earth science data and making these data work more efficiently. After more than a decade as an operational meteorologist with the National Weather Service (NWS), Mark now oversees the technical requirements of NASA’s Earth Science Data and Information System (ESDIS) Project as the Deputy Project Manager-Technical.
In his current position, Mark coordinates the implementation of technical strategies for moving data in NASA’s Earth Observing System Data and Information System (EOSDIS) collection into the commercial cloud and helps ensure that agency data policies are in line with the needs of a cloud-based data system. This work requires not only close collaboration with NASA’s Earth Science Data Systems (ESDS) Program, but also cross-agency collaboration with other federal agencies that hold Earth science data, such as NOAA and USGS. When discussing his work, he echoes a common theme: We are on the cusp of an exciting new era of scientific discovery.
How did your work with NOAA and the Weather Service prepare you for your current position managing ESDIS Project technical operations?
My background is in both meteorology and software engineering and distributed computing. I married these two, first using them operationally as a forecaster to analyze the data, and later going to the Meteorological Development Laboratory and NOAA’s Systems Engineering Center to build the tools that the forecasters use. This background was really helpful since I was using a vast amount of Earth observation data from satellites, aircraft, balloons, and field observations. All of this is very similar to data in the EOSDIS collection.
What are your responsibilities as Deputy ESDIS Project Manager-Technical?
ESDIS implements organizational, program-level initiatives from NASA’s ESDS Program. My colleagues and I at the Deputy Project Manager level ensure that there is a highly skilled and diverse team capable of implementing these strategies.
All Deputy Project Managers work together and share responsibilities to support the entire project, including making sure we have the proper organizational structure and that it’s governed properly. We also have to make sure the EOSDIS DAACs [Distributed Active Archive Centers] are aligned properly.
A large portion of my work is related to the evolution of EOSDIS data and services into the commercial cloud. I also reach out and represent NASA and ESDIS in cross-agency collaboration efforts. Most recently I’ve been working with NOAA and USGS on interoperability of the commercial cloud. We’re building our commercial cloud environment, but NASA also is working with other federal satellite-based agencies to build in a similar fashion so you have this interoperability inside the commercial cloud across these agencies. This is important since NASA has vast amounts of viable datasets that the science community needs. But this science community also needs to be able to incorporate data from NOAA and USGS. We all work together to make sure agencies are building capabilities that are supported broadly.
It sounds like there is a lot of interagency work being done to put all of these similar data together and make them work together.
Exactly. There are different agencies, different funding streams, different requirements. I mean, NOAA is an operational organization, but they do have a research component as well.
This is pretty challenging: How do you work with partner agencies to build out capabilities that are interoperable? There’s a service called Pangeo that NASA supported early on through its competitive ACCESS [Advancing Collaborative Connections for Earth System Science] Program. This looks promising and it might be how we connect these different streams. Both NOAA and USGS seem supportive of this. It allows individual agencies to go to the commercial cloud, meet the needs and requirements of our respective agencies, and yet align our data to be accessed in a way that further fosters scientific discovery. [For more information about NASA’s Pangeo ACCESS project, please see the Earthdata article EOSDIS Data in the Cloud: User Requirements.]
Looking at the big picture, why is it important to evolve NASA Earth observing data and services into the commercial cloud?
When the EOSDIS was established in the 1990s, it was aligned by science discipline; each of the DAACs is discipline-specific and the DAACs are geographically dispersed. This has worked fine. But now we have the ability to keep the DAACs intact and move the data for which they are responsible into a centralized commercial cloud environment and work with these data in a centralized location. This will expedite scientific discovery and do this at an overall lower cost. By cost, I mean not only the cost in equipment, but also cost in terms of time.
Here’s an example. Earth is a coupled model – ocean interacts with atmosphere, atmosphere interacts with ocean, etc. A lot of times scientists need access to atmospheric data, which EOSDIS archives primarily at ASDC [Atmospheric Science Data Center] or GES DISC [Goddard Earth Sciences Data and Information Services Center]. Ocean data would be at PO.DAAC [Physical Oceanography DAAC]. Not to mention, they might need land data from LP DAAC [Land Processes DAAC]. For a scientist to access these data today without a commercial cloud environment, they have to invest in equipment with enough storage capacity to download the required data from, in this example, three or more DAACs and then invest the time to conduct extensive data manipulation and analysis. Just getting all the necessary data into one place to start an analysis could take days, weeks, or longer.
Now with the commercial cloud, even if we’re not quite at the place where we can sequence data to facilitate machine learning or predictive analytics-type operations, the data will be centrally located. This reduces both the time the scientist in this example needs to download data from different DAACs and the investment in IT infrastructure needed to store the data and run their analysis. They just bring their algorithms to the cloud, access the data in-place, and accomplish their objectives in much less time and for much less expense.
This example shows how moving to the commercial cloud can help expedite scientific discovery. But this evolution also has a system-level benefit for ESDIS: the flexibility to easily build out capabilities as needed. ESDIS will become much more nimble in terms of its ability to ingest, archive, and distribute data. So, moving to the commercial cloud has system-level value and, of course, the value of science discovery.
What are some of the technical challenges facing the ESDIS Project over the next five to 10 years?
We operationalized the Earthdata cloud platform in July 2019, and this was a huge milestone. We now have an infrastructure that allows us to archive and distribute data using the commercial cloud, but we’re not quite complete. We’re still going to be onboarding observational datasets, and the technical challenges that are involved with this still exist.
We’re building up a data lake systematically over these next five to 10 years. It will continue as new missions are launched and new data come in, and we have to look at how we do this in a commercial cloud environment without any degradation of the service our users have come to expect. They need to be able to download the data, process them, and work with them easily and efficiently.
The challenge of the future is how to sequence data differently in the cloud to support machine learning, improved analytics, or predictive analytics capabilities. On the one hand, you have all these data that are now in the cloud from sources like NASA, NOAA, and USGS. On the other hand, you have to know how all these agencies store and sequence data so that everything can be interoperable in the cloud. This is incredibly challenging, and I don’t think anyone has really solved this yet. We have to change the way we think now that the data are going to be all in one place.
How does your work complement the work being done by ESDIS system architects and system engineers?
Going to the commercial cloud, we’re breaking new ground everywhere. The IT security capabilities that NASA has sanctioned also are evolving for the commercial cloud. One of my tasks is to help ESDIS Project architects and engineers identify areas where agency capabilities and policies don’t quite fit the commercial cloud environment and then work with NASA program executives to help get these policy changes implemented at the agency level. It’s a constant feedback loop to support our system architects.
Tell me about your work leading the EOSDIS cloud evolution prototyping project to evaluate commercial cloud technologies for core EOSDIS capabilities.
The EOSDIS Cloud Evolution Project, or ExCEL, kicked off in 2016. The idea behind this was to take key components of the EOSDIS architecture, like ingest, archive, and distribution, and test these capabilities inside Amazon Web Services. Our goal was to see if we could operate in a commercial cloud environment both cost-effectively and technologically.
There was no question that we could technically build and architect a system to evolve EOSDIS data into the commercial cloud. Because the commercial cloud is a pay-as-you-go model, the biggest challenges were government and federal policies, security, and business requirements and capabilities. The federal government has the Antideficiency Act [ADA], which basically says that you can’t spend money you don’t have. This made the business components of setting up a commercial cloud environment more challenging: we had to ensure that we set up and designed a system that would never violate the ADA. We were able to do this.
Overall, this prototyping project was highly successful. In fact, the CMR [Common Metadata Repository] and the Earthdata Search applications, which are key EOSDIS capabilities for supporting the use of our data, were so successful in the ExCEL project that we moved to operationalize them right away [in the commercial cloud]. We easily proved that these two applications were much more efficient in the commercial cloud environment.
NASA’s Computing Services Program Office (CSPO) recently approved Microsoft Azure as a cloud service provider, joining Amazon Web Services (AWS). Prior to this, all ESDIS commercial cloud work had been based on AWS. Does the availability of Microsoft Azure impact any of this ongoing work? Are you looking at using both services?
While it does not impact any ongoing work, we do plan to look at integrating Microsoft Azure into our Earthdata cloud environment later this year to extend the range of cloud service providers available to EOSDIS. This won’t impact AWS, but this will bring additional cloud service provider opportunities for Earthdata cloud work.
It’s important to note that ESDIS has a really strong partnership with NASA’s CSPO. They’ve spent an enormous amount of time and energy bringing commercial vendors and service providers to projects like ESDIS. This saves us a lot of time because we don’t have to set up these contractual agreements on our own; we just plug into what they approve on our behalf. In addition, they are working directly with us to implement components of the Earthdata cloud and are part of our development process. The CSPO has been with us every step of the way in our cloud efforts.
What are you most excited about over the next five years?
Let me first say that words can’t even begin to describe how much work it took to build out a commercial cloud platform for EOSDIS and get us to this point. The dedicated team of engineers and architects who accomplished this are some of the most highly skilled and mission-focused people I’ve ever worked with. This has been a huge milestone for them, and I’m really glad to have been part of this effort.
What I’m really excited about is how this new commercial cloud environment will further support scientific discovery. I really believe that science that requires observational data will be expedited because of the work that we’re doing. Scientists will not have to focus on their IT, but can devote their focus to their algorithms. Everything will be built in for them to just tap into the data. I can’t wait to see what will come out of this when scientists only have to focus on science, and not IT requirements. We’re going to see some great science coming out of this.
Published July 29, 2020
Page Last Updated: Oct 21, 2021 at 7:43 PM EDT