🔵 COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails

Miguel Espinosa, Valerio Marsocci, Yuru Jia, Elliot J. Crowley, Mikolaj Czerkawski


⚠️ NOTE: This is a prototype Beta model of COP-GEN. It is trained on image thumbnails from Major TOM and does not yet support raw source data. Elevation is represented as a hillshade visualisation. The full COP-GEN model is coming soon.

1. Generate: Click the 🏭 Generate button to synthesize outputs without any conditions. The outputs will be shown below, and that's it: you've generated your first sample! 🧑‍🎨️
2. Optionally, define inputs: To condition your generation, upload your own thumbnails or click 🔄 Load a random sample to fetch one from Major TOM.
3. Select conditions: Each image loaded into the inputs panel is used as a conditioning input. Leave the input panel empty for the modalities you wish to generate (clear an element by clicking x in its top-right corner).
4. Additional options: You can control the number of generation steps (a higher number may produce better quality, but will take more time), or set a fixed seed for reproducible results.
5. Reuse outputs: Click ♻️ Reuse to feed any generated sample back into the model as an input.
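The conditioning rule in steps 2 and 3 can be sketched as a simple partition: any modality slot with an image becomes a condition, and any empty slot becomes a generation target. This is a minimal illustrative sketch; the function name and the dictionary-based interface are assumptions, not the demo's actual API (the modality keys mirror the four Major TOM thumbnail types used here).

```python
# Hypothetical helper (illustrative only, not the demo's real API).
# Modalities with an uploaded thumbnail act as conditions; empty slots
# are the modalities the model will generate.
MODALITIES = ["s2l1c", "s2l2a", "s1rtc", "dem"]  # optical (L1C/L2A), radar, elevation

def split_conditions(inputs: dict) -> tuple[list, list]:
    """Partition modality slots into conditioning inputs and generation targets."""
    conditions = [m for m in MODALITIES if inputs.get(m) is not None]
    targets = [m for m in MODALITIES if inputs.get(m) is None]
    return conditions, targets
```

For example, uploading only an S2-L2A thumbnail would condition the model on it and generate the remaining three modalities.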

Outputs


(Optional) Input Conditions

Ready? Go back up and press 🏭 Generate again!


In remote sensing, multi-modal data from various sensors capturing the same scene offers rich opportunities, but learning a unified representation across these modalities remains a significant challenge. Traditional methods have often been limited to single or dual-modality approaches. In this paper, we introduce COP-GEN-Beta, a generative diffusion model trained on optical, radar, and elevation data from the Major TOM dataset. What sets COP-GEN-Beta apart is its ability to map any subset of modalities to any other, enabling zero-shot modality translation after training. This is achieved through a sequence-based diffusion transformer, where each modality is controlled by its own timestep embedding. We extensively evaluate COP-GEN-Beta on thumbnail images from the Major TOM dataset, demonstrating its effectiveness in generating high-quality samples. Qualitative and quantitative evaluations validate the model's performance, highlighting its potential as a powerful pre-trained model for future remote sensing tasks.