A Self-Supervised Learning (SSL) Framework for Discovery
Enabling a machine to find phenomena of interest in vast quantities of satellite imagery begins with training the machine on what it’s seeking.
Derek Koehl, NASA IMPACT Science Writer
(Editor’s note: NASA’s Interagency Implementation and Advanced Concepts Team [IMPACT] is a component of NASA’s Earth Science Data Systems [ESDS] Program, and works to further the ESDS goal of overseeing the lifecycle of Earth science data to maximize the scientific return of NASA’s missions and experiments for scientists, decision makers, and society.)
Machine learning can achieve impressive results when trained on human-labeled datasets. However, hard problems remain in domains where labeled data are sparse, such as Earth science. Self-supervised learning (SSL) is a method designed to address this challenge. Using clever tricks that range from representation clustering to random transform comparisons, self-supervised learning for computer vision is a growing area of machine learning whose goal is simple: learn meaningful vector representations of images, such that similar images have similar vector representations, without requiring human labels for each image.
In particular, remote sensing is characterized by a huge number of images and, depending on the data survey, a reasonable amount of metadata contextualizing each image, such as location, time of day, temperature, and wind. However, when a phenomenon of interest cannot be found from a metadata search alone, research teams will often spend hundreds of hours conducting visual inspections, combing through imagery using applications such as NASA Worldview, which enables the interactive exploration of all 197 million square miles of Earth’s surface with more than 20 years of daily global satellite imagery.
This is the fundamental challenge addressed by a collaboration between IMPACT and the SpaceML initiative. This collaboration produced the Worldview image search pipeline. A key component of the pipeline is the self-supervised learner, which employs SSL to build the model store. The SSL model sits on top of an unlabeled pool of data and circumvents the random search process. Leveraging the vector representations generated by the SSL, researchers can provide a single reference image and search for similar images, thus enabling rapid curation of datasets of interest from massive unlabeled datasets.
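To make the reference-image search concrete, the following is a minimal sketch of ranking stored embeddings by cosine similarity to a query embedding. It is not the pipeline's actual code: the function names, the tile identifiers, and the three-dimensional toy vectors are all hypothetical, and a real system would use embeddings produced by the trained SSL model and most likely an approximate nearest-neighbor index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(reference, embeddings, k=3):
    """Return the k (image_id, score) pairs closest to the reference embedding."""
    scored = [
        (image_id, cosine_similarity(reference, vector))
        for image_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings; real ones would come from the trained encoder.
embeddings = {
    "tile_001": [0.9, 0.1, 0.0],
    "tile_002": [0.8, 0.2, 0.1],
    "tile_003": [0.0, 0.1, 0.9],
    "tile_004": [0.1, 0.9, 0.2],
}
reference = [1.0, 0.0, 0.0]  # embedding of the single reference image
top_matches = most_similar(reference, embeddings, k=2)
```

Because the ranking depends only on the vectors, no labels are needed at query time: the quality of the results rests entirely on how well the learned representation groups similar images.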
The impetus behind this collaboration is to streamline and increase the efficiency of Earth science research. Rudy Venguswamy, the SSL developer and winner of the Exceptional Contribution Award from the IMPACT team, explains:
Machine learning has the potential to radically transform how we find out about things happening in our universe to, proverbially, more quickly find needles in our various haystacks. When I started building the SSL as a package, I wanted to build something for scientists in diverse fields, not just machine learning experts.
The SSL tool was released as an open-source package built on PyTorch Lightning. The Worldview image search pipeline uses compatible Graphics Processing Unit (GPU)-based transforms from the NVIDIA DALI package for augmentation and can be trained across multiple GPUs, leading to a 5- to 10-fold increase in the speed of self-supervised training. New transforms in each epoch are critical to model learning, so improvements to the speed of transforms have a direct impact on training speed.
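The role of fresh transforms can be illustrated with a small CPU-only sketch. This is a stand-in written with the Python standard library, not the DALI API or the package's own augmentation code: the helper names, the toy 4x4 "image", and the choice of crop and flip are assumptions made for illustration. The point it shows is that every epoch draws new random parameters, so the model sees two different views of the same image each time.

```python
import random

def random_crop(image, size, rng):
    """Cut a size x size window at a random offset."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)    # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def random_flip(image, rng):
    """Mirror the image left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

def two_views(image, size, rng):
    """Produce two independently augmented views of the same image."""
    return tuple(
        random_flip(random_crop(image, size, rng), rng) for _ in range(2)
    )

rng = random.Random(0)  # seeded only to make the sketch repeatable
image = [[row * 4 + col for col in range(4)] for row in range(4)]  # toy 4x4 image

for epoch in range(3):
    # New random draws on every pass, so the views change from epoch to epoch.
    view_a, view_b = two_views(image, 2, rng)
```

Since this work is repeated for every image in every epoch, it is easy to see why moving the transforms onto the GPU has such a direct effect on total training time.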
The package is built with customizability in mind. Currently, the SimCLR and SimSiam SSL architectures are supported, and the package allows users to specify custom encoder architectures to the model as well as change model parameters with optional arguments specific to each model. For instance, researchers can specify their own pre-trained encoder or use one of the defaults provided that are pre-trained on ImageNet data.
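For readers unfamiliar with SimCLR, its training objective is commonly described as a normalized temperature-scaled cross-entropy (NT-Xent) loss over pairs of augmented views. The sketch below is a plain-Python illustration of that loss under stated assumptions, not the package's implementation: the function name, the default temperature of 0.5, and the two-dimensional toy embeddings are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math

def _normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss; views_a[i] and views_b[i] embed two views of image i."""
    z = [_normalize(v) for v in views_a + views_b]
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        logits = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Every other embedding in the batch competes with the positive one.
        denominator = sum(math.exp(logits[k]) for k in range(2 * n) if k != i)
        total += -math.log(math.exp(logits[positive]) / denominator)
    return total / (2 * n)

# Toy embeddings: matched pairs should score a lower loss than mismatched ones.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss falls when the two views of an image land close together and away from the rest of the batch, which is what pushes similar images toward similar vector representations without any labels.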
As an example of the capabilities of the SSL, the image at right represents data from the Worldview website. The team trained SimCLR using the SSL on a data sample of approximately 50,000 images and plotted a t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization, reducing the dimensionality of the embeddings to plot on a 2D plane. With no human labels for the imagery, the machine manages to cluster images intuitively.
As part of the larger Worldview imagery pipeline, the SSL helps streamline the research process as scientists study phenomena such as wildfires, oil spills, desertification, and the polar vortex.
Article originally published April 15, 2021, on the IMPACT Blog and reprinted with permission.
Published April 26, 2021
Page Last Updated: Apr 26, 2021 at 10:22 AM EDT