Recherche et Informatique Fondamentale

Diplômés 2018

Le 6 septembre 2018, sept étudiants du parcours recherche ont soutenu leur mémoire de fin d'études. Félicitations à eux !

End-to-end available bandwidth estimation in virtualized environments, Alessandro Baldi Vitali

Tutor: Guillaume Urvoy-Keller, Université Côte d’Azur, CNRS, I3S, France.

Network performance evaluation has always been an important research topic in the computer science domain. Nowadays it is even more important because of the more and more increasing use of distributed applications that have to spread data across multiple servers that could be geographically distant. The scope of this internship is to find a methodology to measure end-to-end available bandwidth in a non intrusive manner. However, the measurement of this metric, in particular in a virtualized context, presents a series of challenges. For example, the nature of the metric is inherently highly dependent on many factors such as duration of measurement and latency. Furthermore, the problem of multipath needs to be taken into account in the measurement process and it complicates the interpretation of the results. We first evaluated Pathload, that was designed in the early 2000s and it is still consider as the best tool to measure the spare capacity of the path in a non virtualized context. As the existence of multiple paths constitute a challenge, we put the focus on this problem. We did a study about multiple paths that exist between AWS data centers (DCs). The initial idea was to create an a priori map of the number of different paths using the legacy Traceroute utility and its improved version Paris-traceroute. This phase transformed into a deep study of the complex infrastructure used by AWS to interconnect its DCs. Finally, we discovered that our methodology applied in order to count all the multiple paths between AWS DCs couldn’t effectively succeed because of the complexity of AWS infrastructure. We are now trying to use the informations acquired with this work to design a methodology that could successfully lead to a correct estimation of end-to-end available bandwidth.

Asynchronous Approximate Distributed Computation for Machine Learning, Gianmarco Calbi

Tutor: Giovanni Neglia, Université Côte d’Azur, Inria, CNRS, I3S, France.

Distributed systems enable exploiting multiple machines to perform massive calculations. Within large-scale machine learning as well, it is becoming more and more popular to perform the training on multi-machines clusters. Distributed gradient methods for functions minimization are often used for this purpose since many machine learning problems are formulated as minimization of some loss function on a training set of examples. In this document, we study how the variability of task execution times at cluster executors affects the system throughput. In particular, a simple but accurate model allows us to quantify how the time to solve the minimization problem depends on the network of information exchanges among the executors. Interestingly, we show that, even when communication overhead may be neglected, the clique (i.e., a model where executors share information with all the others) is not necessarily the most effective topology, as commonly assumed in previous works, instead there are situations where it would rather be worthwhile to deploy the distributed computation over a less-connected network.

Optimization algorithms for Network Slicing, Adrien Gausseran

Tutors: Joanna Moulierac, Frédéric Giroire, Université Côte d’Azur, CNRS, I3S, France.

Network slicing is a form of network allowing multiple virtual networks which employ different services to be maps over a common network infrastructure. A slice is a virtual network using VNFs (for virtual network function) in a particular order and connecting one or multiple sources to a destination. A VNF is a network function that runs on a virtual machine on top of the hardware networking infrastructure. The combination of multiple virtual network functions forms the service utilisation of a slice. For this internship, we will focus on the smallest defnition of a slice: a service function chain as described in Optimization of Network Service Chain Provisioning : the virtual network will be a single path linking the source to the VNFs in a determined order to the destination. This internship takes place because of the appearance of network slicing enabled by the emergence and upcoming deployment of 5G mobile technology. Network slicing is likely to be a major 5G network technology because of the new services supported by 5G. These new services and new ways of using the network will impose different constraints and requirements on the network in terms of functionality. By maximising the flexibility of 5G networks and optimising both the utilisation of the infrastructure and the allocation of resources, network slicing will enable greater energy and cost efficiencies compared to earlier mobile networks. This will allow new products and services to be brought to market quickly and easily adapted to changing demands, resulting in increased revenues for operators and more services for users. Using the emerging technology paradigms used by netwwork slicing, this study proposes an elegant and practical solution to increase network usage by slices by up to 40% and reduce operational costs generated by network usage by reducing license usage by 20%.

Adrien is now a PhD student in the COATI (Combinatorics, Optimization, and Algorithms for Telecommunications) team, a joint project-team between Inria Sophia Antipolis - Méditerranée and the I3S laboratory.

Integration of structural constraints into TSP models, Nicolas Isoart

Encadrants : Jean-Charles Régin, David Coudert, Université Côte d’Azur, CNRS/INRIA, I3S, France

Cette étude montre comment l’ajout de contraintes structurelles peut améliorer la résolution du TSP en programmation par contraintes. L’idée est d’ajouter un point de vue plus structurel au point de vue basé principalement sur les coûts de l’état de l’art. En d’autres termes, nous cherchons des motifs empêchant l’existence d’un TSP dans un graphe ou, à l’inverse, des motifs imposant que ses arêtes soient dans le TSP. Nous donnons une règle de déduction structurelle et nous définissons une structure de données permettant de la traiter. Puis, nous donnons un algorithme dans sa version statique et dynamique, tous deux en O(|M \ CST|) + O(|M|) + O(n + m) où M est le nombre d’arêtes nécessairement dans le TSP, CST est la structure de données défini, n est le nombre de nœuds et m est le nombre d’arêtes. Enfin, nous comparons nos résultats avec l’état de l’art.

Nicolas is now a PhD student at the I3S laboratory.

Statistical Shape Modelling, Florent Jousse

Tutors: M. Gonzales, A. Bilger

In this report, I will present my internship works about statistical shape modelling which is a powerful tool as part of medical imaging and automatic image analysis, also called computer vision. Throughout these six month, I have studied the state of the art of shape modelling which brought me to an implemention of a recent version of shape models called Gaussian Process Morphable Models created by Lüthi et al. in 2016. This implementation as been integrated to the Quantificare suite as a C++ library by the development team to make 3D image registration, anatomic landmark detection and segmentation. The anatomic landmarks are used to compute measure of angles and distance on the face. The segmentation is used to match intra-patient models to show results of treatments. We will explain how this technique works and what are the concepts behind. Then, how we improved the performances of the registration algorithm between the shape model and the image.

M. Lüthi, T. Gerig, C. Jud and T. Vetter, “Gaussian Process Morphable Models,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 8, pp. 1860-1873, 1 Aug. 2018. doi: 10.1109/TPAMI.2017.2739743

On the decidability of synchronizability for mailbox communicating automata, Laetitia Laversa

Tutors: Etienne Lozes, Cinzia Di Giusto, Université Côte d’Azur, CNRS, I3S, France.

Distributed systems are increasingly common and it is therefore important to ensure that they are error-free. We are interested here in their communication and more particularly in the notion of synchronizability. This property makes it possible to compare an asynchronous system with a synchronous system. We know that synchronizability is undecidable for peer-to-peer systems, so this report focuses on the decidability of synchronizability for mailbox systems.

Laetitia is now a PhD student at the I3S laboratory.

Analysis and classification of heart signals, Alessandro Vasquez

Tutor: Frédéric Précioso, Université Côte d’Azur, CNRS, I3S, France.

During the last years, the Deep Learning domain have grown exponentially both in number of papers published each year and in the general public attention. What’s interesting in this domain is the performance of neural networks in classifying different types of data such as images, audio and video. One of the best advantages of Deep Learning over other techniques is that it has proved to perform very well on raw data, i.e. not applying heavy preprocessing but rather feed the raw data directly to a learner (i.e. a Machine Learning model) hoping that it will learn somehow to characterize the target’s properties for resolving a given problem (of classification or prediction, for example). This approach is often called “End to End Machine Learning”, since it is based on the idea of letting the model learn automatically all the necessary features during all the problem-solving’s steps, or at least most of them. It’s also referred to as “agnostic approach”, because developing this type of model do not require specific, domain-related knowledge: let’s take for example a set of images of brains MRI (magnetic resonance images). Supposing that some of the brains have a cancer and others don’t, and that each image is labelled as “positive” or “negative” to cancer, I can design and implement a neural network, feed it with the raw images so that it learns to classify an image as positive or negative to cancer. I can the evaluate the neural network’s performances without having a specific knowledge on cancer or image analysis. On the other hand, I need to have an expertise on neural network design, implementation and data management.

The objective of this internship is to investigate the performances of end-to-end models applied on a dataset of heartbeat-related electric signals (recorded as electrocardiograms, or ECG). Being partnered with the project IADB (Intégration et Apprentissage sur les Données Biomédicales or Integration and Learning on Biomedical Data) of the Université Côte d’Azur, we wanted to apply Deep Learning on waveform medical data, so we turned our attention to ECGs. The questions we wanted to answer are thus:

  • Can we develop an end-to-end model for analyzing ECGs? • How well does it performs?
  • Can we improve the model?
  • Can we reuse this model for studying other problems related to medical waveform data?