
BIFOLD at IGARSS 2024

Prof. Begüm Demir's team presented four research papers

The BIFOLD research group Big Data Analytics for Earth Observation (BigEarth), led by Prof. Begüm Demir, presented four papers at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024) in Athens, Greece. The group focuses on fundamental research at the intersection of data management and machine learning for Earth observation. The IGARSS 2024 Scientific Program focused on sustainable development in line with the United Nations 2030 Agenda. This leading event attracted over 3,000 esteemed scientists and professionals from the global Remote Sensing community.

Begüm Demir organized a community-contributed session on Data-Centric AI for Geosciences together with Manil Maskey from NASA, Charlotte Pelletier from Université Bretagne Sud, Sylvain Lobry from Université Paris Cité, Ribana Roscher from Forschungszentrum Jülich and University of Bonn, and Marc Rußwurm from Wageningen University. Over the following days, the team presented four BIFOLD research papers:

Transformer-based Federated Learning Across Decentralized and Unshared Archives for Remote Sensing Image Classification

Abstract: Federated learning (FL) aims to collaboratively learn deep learning model parameters from decentralized data archives (i.e., clients) without accessing training data on the clients. However, the training data across clients might not be independent and identically distributed (non-IID), which can make it difficult to achieve optimal model convergence. In this work, we investigate the capability of state-of-the-art transformer architectures to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). The considered transformer architectures are compared in terms of their: 1) robustness to training data heterogeneity; 2) local training complexity; and 3) aggregation complexity under different non-IID levels. On the basis of the performed analysis, some guidelines are derived for a proper selection of transformer architecture in the context of FL for RS MLC.

Authors: Barış Büyüktaş, Kenneth Weitzel, Sebastian Völkers, Felix Zailskas, Begüm Demir
Paper: https://arxiv.org/abs/2405.15405
Code: https://git.tu-berlin.de/rsim/FL-Transformer
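
To make the federated setting described in the abstract a bit more tangible, here is a minimal FedAvg-style sketch in PyTorch: each client trains a copy of the shared model on its own, unshared data, and only the resulting parameters are averaged. This is an illustrative sketch under simple assumptions (placeholder model, data loaders, and size-weighted averaging), not the paper's implementation, which is available at the repository linked above.

```python
# Minimal FedAvg-style sketch of the federated setting described above.
# Placeholder code for illustration only; the paper's actual implementation
# is at https://git.tu-berlin.de/rsim/FL-Transformer.
import copy

import torch
import torch.nn as nn


def local_update(model: nn.Module, loader, epochs: int = 1, lr: float = 1e-3):
    """Train a copy of the global model on one client's unshared data."""
    local_model = copy.deepcopy(model)
    optimizer = torch.optim.SGD(local_model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()  # multi-label classification objective
    local_model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(local_model(images), labels.float())
            loss.backward()
            optimizer.step()
    return local_model.state_dict()


def federated_average(client_states, client_sizes):
    """Aggregate client models by a size-weighted average of their parameters."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        if global_state[key].is_floating_point():
            global_state[key] = sum(
                state[key] * (size / total)
                for state, size in zip(client_states, client_sizes)
            )
    return global_state
```

A full communication round would broadcast the current global parameters to every client, run local_update on each, and load the federated_average result back into the global model before the next round.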
 


MagicBathyNet: A Multimodal Remote Sensing Dataset for Benchmarking Learning-based Bathymetry and Pixel-based Classification in Shallow Waters
 
Abstract: Accurate, detailed, and regularly updated bathymetry, coupled with complex semantic content, is crucial for the undermapped shallow water areas facing intense climatological and anthropogenic pressures. Current methods that derive bathymetry or pixel-based seabed classes from remote sensing imagery mainly rely on non-open data. This lack of openly accessible benchmark archives prevents the wider use of deep learning methods in such applications. To address this issue, in this paper we present MagicBathyNet, a benchmark dataset made up of image patches of Sentinel-2, SPOT-6 and aerial imagery, bathymetry in raster format, and annotations of seabed classes. MagicBathyNet is then exploited to benchmark state-of-the-art methods in learning-based bathymetry and pixel-based classification.

Authors: Panagiotis Agrafiotis, Łukasz Janowski, Dimitrios Skarlatos, Begüm Demir
Paper: https://arxiv.org/abs/2405.15477
Code: https://github.com/pagraf/MagicBathyNet
Dataset: www.magicbathy.eu/magicbathynet.html
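
For readers who want a feel for how such a dataset might be consumed, the sketch below pairs an image patch with its bathymetry raster using rasterio. File names, directory layout, and the normalisation are purely hypothetical assumptions; the actual dataset structure and loaders are documented on the dataset page and in the repository linked above.

```python
# Hypothetical loading sketch for a paired image / bathymetry patch.
# Paths and normalisation are illustrative assumptions, not the dataset's
# actual layout (see www.magicbathy.eu/magicbathynet.html for that).
from pathlib import Path

import numpy as np
import rasterio


def load_patch_pair(image_path: Path, depth_path: Path):
    """Read a multispectral image patch and its bathymetry raster."""
    with rasterio.open(image_path) as src:
        image = src.read().astype(np.float32)   # (bands, height, width)
    with rasterio.open(depth_path) as src:
        depth = src.read(1).astype(np.float32)  # (height, width) depth values

    # Simple per-band standardisation; a real pipeline would use the
    # dataset's own statistics instead.
    image = (image - image.mean(axis=(1, 2), keepdims=True)) / (
        image.std(axis=(1, 2), keepdims=True) + 1e-6
    )
    return image, depth


# Example call with placeholder file names:
# img, depth = load_patch_pair(Path("s2_patch_0001.tif"), Path("depth_0001.tif"))
```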


Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images

Abstract: This paper explores the use of data augmentation, specifically channel transformations like solarize, grayscale, and brightness adjustments, in deep learning for remote sensing (RS) image classification. In the RS community, there is an ongoing debate about the proper application of these techniques, as they might create physically inconsistent spectral data. To address this issue, we propose an approach to estimate whether these augmentations impact the physical information in RS images. Our approach calculates a score to measure how well a pixel's signature aligns within a time series, accounting for natural variations due to acquisition conditions or vegetation changes. By comparing the scores of original (i.e., unaugmented) and augmented pixel signatures, we assess the physical consistency of the augmentations. The results show that channel augmentations with scores exceeding the expected deviation of original pixel signatures do not improve model performance.

Authors: Tom Burgert, Begüm Demir
Paper: https://arxiv.org/abs/2403.14547
Code: https://git.tu-berlin.de/rsim/physical-consistency
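
The core idea, comparing how well an original versus an augmented pixel signature fits into the pixel's own time series, can be illustrated with a toy score. The distance-based score and the grayscale toy augmentation below are simplifying assumptions made for this sketch; the paper defines its own consistency measure (see the code link above).

```python
# Toy illustration of the idea above: score how far a (possibly augmented)
# pixel signature lies from the signatures observed for that pixel over time.
# The score and the toy augmentation are assumptions for illustration only;
# the paper's measure is implemented at
# https://git.tu-berlin.de/rsim/physical-consistency.
import numpy as np


def alignment_score(signature: np.ndarray, series: np.ndarray) -> float:
    """Distance of a spectral signature to its closest observation in the
    pixel's time series, normalised by the series' natural spread."""
    distances = np.linalg.norm(series - signature, axis=1)
    spread = np.mean(np.linalg.norm(series - series.mean(axis=0), axis=1)) + 1e-9
    return float(distances.min() / spread)


def grayscale(signature: np.ndarray) -> np.ndarray:
    """Toy channel augmentation: replace every band by the band mean."""
    return np.full_like(signature, signature.mean())


# Toy example: 12 observations of one pixel with 4 spectral bands.
rng = np.random.default_rng(0)
series = rng.normal(0.3, 0.05, size=(12, 4)).astype(np.float32)
original, augmented = series[0], grayscale(series[0])

# A score far above that of the original signature suggests the augmentation
# pushed the pixel outside its natural (physically plausible) variability.
print(alignment_score(original, series), alignment_score(augmented, series))
```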

 

Multi-Modal Vision Transformers for Crop Mapping from Satellite Image Time Series

Abstract: Using images acquired by different satellites has been shown to improve classification performance in the framework of crop mapping from satellite image time series (SITS). Existing state-of-the-art architectures use self-attention mechanisms to process the temporal dimension and convolutions for the spatial dimension of SITS. Motivated by the success of purely attention-based architectures in crop mapping from single-modal SITS, in this paper we introduce several multi-modal multitemporal transformer-based architectures. Experimental results demonstrate significant improvements over state-of-the-art architectures with both convolutional and self-attention components.

Authors: Theresa Follath, David Mickisch, Jan Hemmerling, Stefan Erasmi, Marcel Schwieder, Begüm Demir
Paper: https://arxiv.org/pdf/2406.16513v1
Code: https://git.tu-berlin.de/rsim/mmtsvit
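
As a rough idea of what such an architecture can look like, here is a generic multi-modal temporal transformer sketch in PyTorch that embeds per-acquisition spectral vectors from two modalities, lets self-attention run over the joint token sequence, and pools over time for classification. The two modalities (labelled S1 and S2 here), token layout, dimensions, and fusion strategy are assumptions made for this sketch; the paper's actual architecture variants are in the repository linked above.

```python
# Generic multi-modal temporal transformer sketch for SITS, for illustration
# only. Fusion strategy and hyperparameters are assumptions; the paper's
# architectures are at https://git.tu-berlin.de/rsim/mmtsvit.
import torch
import torch.nn as nn


class MultiModalTemporalTransformer(nn.Module):
    def __init__(self, bands_s1=2, bands_s2=10, d_model=64, n_classes=17,
                 max_len=64):
        super().__init__()
        # Per-modality linear embeddings of the per-acquisition spectral vectors.
        self.embed_s1 = nn.Linear(bands_s1, d_model)
        self.embed_s2 = nn.Linear(bands_s2, d_model)
        self.pos = nn.Parameter(torch.zeros(1, 2 * max_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, s1, s2):
        # s1: (batch, T1, bands_s1) and s2: (batch, T2, bands_s2) time series.
        tokens = torch.cat([self.embed_s1(s1), self.embed_s2(s2)], dim=1)
        tokens = tokens + self.pos[:, : tokens.shape[1]]
        encoded = self.encoder(tokens)          # self-attention over both modalities
        return self.head(encoded.mean(dim=1))   # temporal pooling + classifier


# Toy usage: a batch of 8 pixel time series with 30 S1 and 20 S2 acquisitions.
model = MultiModalTemporalTransformer()
logits = model(torch.randn(8, 30, 2), torch.randn(8, 20, 10))
print(logits.shape)  # torch.Size([8, 17])
```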

More details about the conference