GeoEngine: A Platform for Production-Ready Geospatial Research

Nov 09, 2023


Geospatial machine learning has seen tremendous academic advancement, but its practical application has been constrained by the difficulty of operationalizing performant and reliable solutions. Sourcing satellite imagery in real-world settings, handling terabytes of training data, and managing machine learning artifacts are a few of the challenges that have severely limited downstream innovation. In this paper, we introduce the GeoEngine platform for reproducible and production-ready geospatial machine learning research. GeoEngine removes key technical hurdles to adopting computer vision and deep learning-based geospatial solutions at scale. It is the first end-to-end geospatial machine learning platform, simplifying access to insights locked behind petabytes of imagery. Backed by a rigorous research methodology, this geospatial framework gives researchers powerful abstractions for image sourcing, dataset development, model development, large-scale training, and model deployment. We present the GeoEngine architecture and explain our design rationale in detail. We also describe several real-world use cases of image sourcing, dataset development, and model building that have helped different organisations build and deploy geospatial solutions.

CVPR Demo 2022

Contributed by

Sagar Verma, Siddharth Gupta, Hal Shin, Akash Panigrahi, Shubham Goswami, Shweta Pardeshi, Natanael Exe, Ujwal Dutta, Tanka Raj Joshi, Nitin Bhojwani

Related Research

Synthetix: Pipeline for Synthetic Geospatial Data Generation

Remote sensing is crucial in various domains, such as agriculture, urban planning, environmental monitoring, and disaster management. However, acquiring real-world remote sensing data can be challenging due to cost, logistical constraints, and privacy concerns. To overcome these limitations, synthetic data has emerged as a promising approach. We present an overview of the use of synthetic data for remote sensing applications. In this regard, we address three conditions that can drastically affect the optimization of computer vision algorithms: lighting conditions, fidelity of the 3D model, and resolution of the synthetic imagery data. We propose a highly configurable pipeline called Synthetix as part of our GeoEngine platform for synthetic data generation. Synthetix allows us to quickly create large amounts of aerial and satellite imagery under varying conditions, given a few samples of 3D objects on real-world scenes. We demonstrate our pipeline’s effectiveness by generating 3D scenes from 35 real-world locations and using these scenes to generate different versions of datasets that let us study the effect of these three conditions. We conduct an in-depth ablation study and show that considering different environments and weather conditions increases the reliability and robustness of deep learning networks.

02 January 2024

Post Wildfire Burnt-up Detection using Siamese UNet

In this article, we present an approach for detecting areas burnt by wildfire in Sentinel-2 images using Siamese neural networks. A Siamese network lets us efficiently encode the feature extraction process for pairs of images: its two branches, which share weights, capture and combine information at different resolutions to make predictions. This design allows us to effectively analyze the changes between two remote sensing images, enabling precise identification of areas impacted by forest wildfires in the state of California as part of the ChaBuD challenge. This in turn assists local authorities in monitoring the impacted regions and facilitates the restoration process. We experimented with various model architectures trained on the ChaBuD dataset and carefully evaluated their performance. Through rigorous testing and analysis, we achieved promising results, ultimately obtaining a final private score (IoU) of 0.7495 on the hidden test dataset. The code is available, and we also deploy the final model as a point solution for anyone to use.

09 November 2023
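
The burnt-area abstract above reports its result as an IoU (Intersection over Union) score. As a rough illustration of what that number measures, the sketch below computes the standard IoU between a predicted and a reference binary mask. It is not the ChaBuD challenge's scoring code, which is not described here; the function name `iou`, the toy masks, and the convention of returning 1.0 for two empty masks are all assumptions made for this example.

```python
def iou(pred, target):
    """Intersection over Union for two binary masks given as nested lists of 0/1."""
    same_shape = len(pred) == len(target) and all(
        len(p) == len(t) for p, t in zip(pred, target)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked burnt in both masks
    union = 0         # pixels marked burnt in at least one mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumed convention: two completely empty masks count as perfect agreement.
    return intersection / union if union else 1.0


# Toy 3x4 masks (hypothetical data): 1 = burnt pixel, 0 = unburnt pixel.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reference = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(iou(predicted, reference))  # 3 shared pixels / 5 pixels in the union
```

A score of 1.0 means the predicted burnt area matches the reference exactly, while 0.0 means the two masks do not overlap at all.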