The high quality and relevance of Matthieu’s thesis contributed to significant results within the OCE project.
About this thesis
Earth observation missions have gathered a large amount of information over time. This is due in particular to frequent revisits of the same region, improved spatial resolution, and wider swaths (the spatial coverage of an acquisition). Remote sensing, which was once confined to the study of a single image, has gradually turned into the analysis of long time series of multispectral images acquired at different dates. The annual flow of satellite images is expected to reach several petabytes in the near future.
The availability of such a large amount of data is an asset for developing advanced processing chains. The machine learning techniques used in remote sensing have greatly improved. The robustness of traditional machine learning approaches was often limited by the amount of available data, and new techniques have been developed to make effective use of this large data flow. However, the amount of data and the complexity of the algorithms embedded in the new processing pipelines require substantial computing power.
In parallel, the computing power available for image processing has also increased. Graphics Processing Units (GPUs) are increasingly being used, and the use of public or private clouds is becoming more widespread. The power required for image processing is now available at a reasonable cost, and the design of new processing pipelines must take this factor into account.
Unfortunately, in remote sensing, the volume of data currently available for exploitation has become a problem because of the computing power required to analyze it. Traditional remote sensing algorithms have often been designed for data that can be held in main memory throughout processing. This assumption no longer holds given the number of images and their resolution, so these algorithms need to be reviewed and adapted for large-scale data processing. This need is not specific to remote sensing: it is also found in other sectors, such as the web, medicine, and speech recognition, which have already solved some of these problems. Some of the techniques and technologies developed in these other domains still need to be adapted before they can be applied to satellite images.
This thesis focuses on remote sensing algorithms for processing massive data volumes. In particular, a first machine learning algorithm is studied and adapted for a distributed implementation. The aim of this implementation is scalability, i.e. the ability to process a large quantity of data given suitable computing power. The second proposed methodology is based on recent algorithms for training convolutional neural networks and describes how to apply them to our use cases on satellite images.
- P. Bolon – Savoie Mont Blanc University – Chairman
- J-Y. Tourneret – IRIT Laboratory – PhD Advisor
- H. Wendt – IRIT Laboratory – PhD Advisor
- M. Ortner – Airbus Defense & Space – IRT PhD Advisor
- M. Spigai – Thales Alenia Space – IRT PhD Advisor
- F. Turpin – LTCI, Télécom ParisTech – Rapporteur
- G. Mercier – IMT Atlantique, eXo maKina – Rapporteur
- J. Inglada – CESBIO – Examiner
M. Le Goff, J-Y. Tourneret, H. Wendt, M. Ortner, M. Spigai: “Distributed Boosting for Cloud Detection”.
International Geoscience and Remote Sensing Symposium (IGARSS 2016), Jul 2016, Beijing, China.
Abstract: The SPOT 6-7 satellite ground segment includes a systematic and automatic cloud detection step in order to feed a catalogue with a binary cloud mask and an appropriate confidence measure. In order to significantly improve SPOT cloud detection and eliminate frequent manual re-labeling, we study a new automatic cloud detection technique that is adapted to large datasets. The proposed method is based on a modified distributed boosting algorithm. Experiments conducted using the Apache Spark framework on a SPOT 6 image database with various landscapes and cloud coverage show promising results.
M. Le Goff, M. Ortner, M. Spigai and G. Flandin: “Massive learning-Based Image Processing in the cloud”.
Innovation IT Day, Toulouse (France), 2016.
M. Le Goff: “Problématiques big data et cloud computing pour les futurs services d’observation de la Terre” [Big data and cloud computing challenges for future Earth observation services].
Innovation IT Day, Toulouse (France), 2015.