Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems
Creating machine learning models from satellite data is useful for applications ranging from environmental mapping and monitoring to urban planning and emergency response. However, leveraging the latest advancements in machine learning requires considerable expert knowledge and hands-on experience. Automated Machine Learning (AutoML) addresses this issue by creating high-performing models through automatically making machine learning design choices in a data-driven manner. Current AutoML systems have been benchmarked on natural image datasets. However, satellite images differ from natural images in several ways, for instance in the bit-wise resolution and in the number and type of spectral bands, which raises questions about the applicability of current AutoML systems to satellite data tasks. In this thesis, we demonstrate how AutoML can be leveraged for classification tasks on satellite data. Specifically, we examined the image classification task of an AutoML system (Auto-Keras) and created two new variants of it for satellite image classification that incorporate transfer learning using models pre-trained with (i) natural images (using ImageNet) and (ii) remote sensing datasets. For evaluation, we compared the performance of these variants against manually designed architectures on a benchmark set of 7 satellite datasets. Our results show that, except for 2 datasets, the AutoML systems outperform the best model we could find in the remote sensing literature. This project highlights the usefulness of a customized satellite data search space in AutoML systems. Our new AutoML variant, using remote sensing pre-trained models, performed better than the ImageNet variant on small datasets with a limited amount of training data, and it also found the best automated model for the datasets composed of near-infrared, green, and red bands.
The original AutoML system used is Auto-Keras.
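The abstract points out that satellite images differ from natural images in bit-wise resolution and in the number and type of spectral bands. The following is a minimal sketch of one way such data could be adapted to a model that expects three-channel input in the range [0, 1]. It is not the preprocessing used in the thesis. The function name `to_three_band_float`, the 12-bit depth, the band order, and the synthetic patch are all assumptions made for illustration.

```python
import numpy as np


def to_three_band_float(image, band_indices, bit_depth):
    """Select three spectral bands and rescale raw integers to [0, 1].

    image: array of shape (height, width, bands) holding raw sensor counts.
    band_indices: three band positions to keep (hypothetical ordering).
    bit_depth: radiometric resolution of the sensor in bits (assumed known).
    """
    if len(band_indices) != 3:
        raise ValueError("expected exactly three band indices")
    max_value = float(2 ** bit_depth - 1)
    selected = image[..., list(band_indices)].astype(np.float32)
    return np.clip(selected / max_value, 0.0, 1.0)


# Synthetic 4-band, 12-bit patch standing in for a real satellite tile.
rng = np.random.default_rng(0)
patch = rng.integers(0, 2 ** 12, size=(64, 64, 4), dtype=np.uint16)

# Assumed band order for this toy patch: blue, green, red, near-infrared.
NIR, RED, GREEN = 3, 2, 1
model_input = to_three_band_float(patch, (NIR, GREEN, RED), bit_depth=12)
print(model_input.shape, model_input.dtype)
```

Dividing by the sensor's maximum value rather than by 255 keeps the full dynamic range of the raw data, and the choice of near-infrared, green, and red mirrors the band composition mentioned in the abstract. Real datasets may store bands in a different order or at a different bit depth, so both values would need to be checked against the dataset documentation.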
The experimental evaluation made use of the following datasets:
- So2Sat Sentinel-2
- UC Merced
- BrazilDam Sentinel-2 (2019)
- Brazilian Cerrado-Savanna Scenes
- Brazilian Coffee Scenes
This work is described in the following publications:
- Nelly R. Palacios Salinas (supervisors: Dr. Mitra Baratchi, Dr. Jan van Rijn & Dr. Andreas Vollrath)
Automated Machine Learning for Satellite Data: Integrating remote sensing pre-trained models into AutoML systems. Master's Thesis in Computer Science at Leiden Institute of Advanced Computer Science, Leiden University, 2021.
- Nelly Rosaura Palacios Salinas, Mitra Baratchi, Jan N. van Rijn & Andreas Vollrath
Automated Machine Learning for Satellite Data: Integrating remote sensing pre-trained models into AutoML systems. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. 2021, to appear.