
The Computer Science, Systems, Information and Knowledge Processing Laboratory (LISTIC) conducts substantial research on image analysis and artificial intelligence, for both academic and industrial projects. To cope with the significant demand for image analysis and the massive volumes of data to be processed, the laboratory has built up expertise together with the MUST mesocentre, particularly in GPU computing.
The many studies carried out include:
- Description of spatiotemporal data for video indexing
- Analysis of hyperspectral images: deep-learning and optimization of the MUST computing platform
- Affective computing by vision
- Structural monitoring and Earth observation by remote sensing
- Technology transfer to businesses
This work has led the platform to complement its conventional computing resources (CPU) with dedicated computing servers equipped with graphics processors (GPU) for current artificial intelligence workloads.
Description of spatiotemporal data for video indexing
Work on image and video sequence indexing has been developed around the international TRECVid competition. As part of the IRIM consortium, which brought together up to 14 French laboratories, LISTIC has developed content description methods that let search engines match a video query with similar content in large databases. The IRIM consortium ranked fourth in the world four times over the period 2012-2017.
Because the videos are processed in parallel by the hundreds of processors (CPU) of MUST, computation times are greatly reduced.
The figure below illustrates spatiotemporal data extracted from videos by LISTIC. A bio-inspired retina model, available in the OpenCV library, selects areas of interest in a video and decomposes it into temporal (movement) and structural (texture, detail) information. These data are then merged to create a relevant descriptor that allows the content to be indexed.
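The split between temporal and structural information can be illustrated with a deliberately simplified sketch. It is not the OpenCV retina model: it assumes that frame differences stand in for movement and that local grey-level differences stand in for texture and detail. All function names and the two-value descriptor format are hypothetical.

```python
# Simplified stand-in for the temporal/structural split described above.
# This is NOT the OpenCV retina model; frames are nested lists of grey levels.

def temporal_map(prev_frame, frame):
    """Per-pixel absolute change between two consecutive frames (movement cue)."""
    return [[abs(cur - prev) for prev, cur in zip(row_prev, row_cur)]
            for row_prev, row_cur in zip(prev_frame, frame)]

def structural_map(frame):
    """Per-pixel sum of absolute horizontal and vertical differences (detail cue)."""
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Differences are set to zero on the last column and last row.
            dx = frame[y][x + 1] - frame[y][x] if x + 1 < width else 0.0
            dy = frame[y + 1][x] - frame[y][x] if y + 1 < height else 0.0
            out[y][x] = abs(dx) + abs(dy)
    return out

def mean(values_2d):
    """Average of all values in a 2D nested list."""
    flat = [v for row in values_2d for v in row]
    return sum(flat) / len(flat)

def describe(prev_frame, frame):
    """Merge both cues into a two-value descriptor (hypothetical format)."""
    return (mean(temporal_map(prev_frame, frame)), mean(structural_map(frame)))

# Two 3x3 frames: a bright pixel moves one step to the right.
frame_a = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
frame_b = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]
descriptor = describe(frame_a, frame_b)
```

A real descriptor would be computed over selected areas of interest and many frames; this toy version only shows how the two kinds of information are obtained separately and then merged.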
Analysis of hyperspectral images: deep-learning and optimization of the MUST computing platform
LISTIC conducts extensive academic work on Earth observation. In this context, it has worked on the analysis of hyperspectral images to characterize soil types, using innovative approaches based on deep neural networks (deep learning).
A hyperspectral image (source: https://en.wikipedia.org/wiki/Hyperspectral_imaging)
As suggested by the image above, hyperspectral images describe each pixel with several hundred spectral bands. The information is very dense, but the associated knowledge is very limited. Optimizing the millions of parameters of a neural network in this context is therefore difficult.
To solve this problem, LISTIC proposed neural networks whose structure is adapted to the data. Very light in terms of number of parameters, these models are combined with targeted optimization strategies that significantly reduce the number of learning steps. These proposals were validated through optimizations carried out on the MUST platform, on dedicated computing servers equipped with graphics processors (GPU).
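To give a feel for what "light in terms of number of parameters" can mean, the sketch below compares parameter counts for two generic layer types applied to a 100-band spectrum. It assumes that weights are shared along the spectral axis (a 1D convolution), which is one common way to adapt a network to spectral data; the source does not state that the LISTIC models work this way, and the layer sizes are hypothetical.

```python
# Parameter counts for two generic layer types (illustrative assumption only;
# not the architecture used by LISTIC).

def dense_params(n_inputs, n_units):
    """Fully connected layer: one weight per input/unit pair, plus one bias per unit."""
    return n_inputs * n_units + n_units

def conv1d_params(kernel_size, in_channels, out_channels):
    """1D convolution: weights are shared along the spectral axis, plus one bias per filter."""
    return kernel_size * in_channels * out_channels + out_channels

n_bands = 100  # bands per pixel, as in the example image of this section

dense_count = dense_params(n_bands, 64)  # hypothetical layer with 64 units
conv_count = conv1d_params(5, 1, 16)     # hypothetical layer with 16 filters of width 5
```

Here the dense layer needs 6464 parameters and the convolutional layer 96, because the latter reuses the same small filters at every position of the spectrum.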
The figure below gives an example of the result. Starting from a hyperspectral image with 100 bands per pixel, shown on the left in pseudo colors, the proposed model predicts a soil category for each pixel, as illustrated in color in the figure on the right. Performance is compared to expert knowledge of the terrain, shown in the central figure. The model correctly predicts 97% of the pixels not seen during training, having been trained on only 4% of the data.
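The evaluation protocol described above (train on a small fraction of the pixels, measure accuracy on the rest) can be sketched as follows. The data are synthetic, the function names are hypothetical, and the error rate is chosen arbitrarily for the example; it does not reproduce the reported result.

```python
import random

def split_pixels(n_pixels, train_fraction, seed=0):
    """Randomly split pixel indices into a training set and a held-out set."""
    rng = random.Random(seed)
    indices = list(range(n_pixels))
    rng.shuffle(indices)
    n_train = round(n_pixels * train_fraction)
    return indices[:n_train], indices[n_train:]

def overall_accuracy(predicted, reference, indices):
    """Fraction of the given pixels whose predicted class matches the reference."""
    correct = sum(1 for i in indices if predicted[i] == reference[i])
    return correct / len(indices)

# Synthetic ground truth: 1000 pixels, 3 soil classes.
reference = [i % 3 for i in range(1000)]
train_idx, test_idx = split_pixels(len(reference), train_fraction=0.04)

# Synthetic predictions: correct everywhere except on 48 held-out pixels.
predicted = list(reference)
for i in test_idx[:48]:
    predicted[i] = (predicted[i] + 1) % 3

held_out_accuracy = overall_accuracy(predicted, reference, test_idx)
```

The key point is that accuracy is computed only on `test_idx`, the pixels the model never saw during training.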
Contact: Alexandre Benoit
Affective computing by vision
Can artificial intelligence come close to human performance in recognizing the violent nature of a scene?
So-called “parental control” software is intended to prevent access to unwanted images or videos, by locking specific applications or by filtering websites and navigation data. Any image has an emotional impact on the person who looks at it: depending on its content, it affects them, sometimes in the long term. Such software can be defeated, for example when a user consults inappropriate keyword search results, accesses social networks, or views multimedia content with a vague description. One way around this pitfall is “affective computing”, that is, the ability of a computer system to recognize the sensitive or emotional nature of a video by automatically analyzing its content, independently of the description given by the source. This would make it possible to know in advance how suitable a video or a set of images is, and thus to assess calmly whether viewing it is advisable.
As part of the SAINS project funded by the USMB, and using MUST computing resources, including MATLAB, LISTIC carried out an affective computing study aimed at automatically classifying video content (over 11,000 clips examined), as if a robot were viewing them on behalf of the recipient. The videos were categorized solely through statistical and semantic analysis of their pixels.
The results of the study showed that artificial intelligence is capable, through deep learning, of achieving high rates of recognition of the violent nature of a scene.
Structural monitoring and Earth observation by remote sensing
PHOENIX is an ANR project initiated by LISTIC. Started in October 2015, it was the result of four years of collaboration with French and Brazilian university experts, as well as with the SONDRA research laboratory.
Its objective: to study the resilience to change of large-scale structures such as alpine glaciers and the Amazon rainforest. To do so, the past and current states of these structures have been analyzed in order to characterize the consequences of their transformation, through the use of mathematical models applied to large quantities of remote sensing data.
As part of thesis work, the PHOENIX project focused on the calculation of displacement fields of the Argentière glacier. The figure below shows pixel velocity fields computed on this glacier from two pairs of images, taken respectively by the right and left cameras positioned facing the glacier.
Areas in dark blue on the four velocity maps are assumed to be static; areas in red are the most dynamic.
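As a rough illustration of how a displacement can be measured between two images from the same camera, here is a minimal block-matching sketch. It is an assumption for illustration, not the method used in the PHOENIX project: it searches, within a small window, for the shift that minimizes the sum of absolute differences for one image patch. Function names and the toy images are hypothetical.

```python
# Minimal block-matching sketch (an assumption, not the PHOENIX method).
# Images are nested lists of grey levels; all names are hypothetical.

def patch_cost(img_a, img_b, top, left, size, dy, dx):
    """Sum of absolute differences between a patch of img_a and the
    same-sized patch of img_b shifted by (dy, dx)."""
    return sum(
        abs(img_a[top + y][left + x] - img_b[top + dy + y][left + dx + x])
        for y in range(size)
        for x in range(size)
    )

def estimate_displacement(img_a, img_b, top, left, size, max_shift):
    """Return the (dy, dx) shift, within +/- max_shift, that best matches the patch."""
    height, width = len(img_b), len(img_b[0])
    best_cost, best_shift = None, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Skip shifts that would push the patch outside the second image.
            if top + dy < 0 or left + dx < 0:
                continue
            if top + dy + size > height or left + dx + size > width:
                continue
            cost = patch_cost(img_a, img_b, top, left, size, dy, dx)
            if best_cost is None or cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift

def blank(height, width):
    """Image filled with zeros."""
    return [[0] * width for _ in range(height)]

# Toy pair: a 2x2 textured block moves 1 pixel down and 2 pixels right.
first, second = blank(8, 8), blank(8, 8)
block = [[5, 6], [7, 8]]
for y in range(2):
    for x in range(2):
        first[2 + y][2 + x] = block[y][x]
        second[3 + y][4 + x] = block[y][x]

shift = estimate_displacement(first, second, top=2, left=2, size=2, max_shift=2)
```

Repeating this for every patch of the image yields a displacement field; dividing each shift by the time elapsed between the two images gives a velocity in pixels per unit of time. A patch whose best shift is (0, 0) would correspond to a static area.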
Building on this study, and in order to analyze very long time series, LISTIC is studying the use of a new automatic method of calculating displacement fields: its output data will be used to propose generative and predictive models of these displacement fields. Because these are complex deep learning models requiring many hours of high-performance computing, the use of MUST computing resources has become the norm.
Contact: Abdourrahmane M. ATTO
Technology transfer and business support
LISTIC helps develop technological building blocks and transfer them to industry. The laboratory has thus supported the companies AboutGoods and SmarterPlan on object detection problems in images, using approaches based on deep neural networks.
- For AboutGoods, the task is to automatically detect sales receipts in a simple photograph taken with a smartphone. This solution opens the way to automatic analysis of the shopping list, allowing the creation of consumption monitoring services.
- For the SmarterPlan startup, the task is to detect multiple objects in buildings from 360° shots. The goal is to create new services around building management processes and their data (BIM).
As part of these studies, dedicated computing servers equipped with graphics processors (GPUs) for artificial intelligence workloads were purchased and installed on the MUST platform. They significantly reduce calculation times, thus optimizing deep-learning approaches.