SM207-1 Deep Learning for Cell Tracking on 3D + time live embryos.  
Admin  
Encadrant : Emmanuel FAURE
Labo/Organisme : LIRMM
URL : https://www.irit.fr/~Emmanuel.Faure/
Ville : MONTPELLIER

Description  

Machine learning, and more specifically deep learning, gives very promising results in various domains of biological research. The aim of the internship is to apply deep learning to the field of biological imaging, and more specifically to 4D (3D plus time) cell segmentation: the recognition and tracking of each cell and its progeny in live developing embryos. The internship will start from a preexisting database of more than 10 fully-segmented embryos (>10,000 tracked cells). We would like to create an end-to-end method for complex 3D cell tracking using deep learning. In this internship, depending on the profile and interests of the intern, we can:
- Implement and adapt to our dataset existing end-to-end deep learning methods for instance segmentation (the identification of individual objects).
- Explore theoretical aspects of the adaptation of the memory concept in Recurrent Neural Networks (LSTM) to cell tracking in 3D+time images.
- Use deep learning to predict and define the rules of cell division during embryogenesis.
This 5-6 month internship can lead to a PhD project, potentially combining computational approaches with the experimental validation of predictions.

URL sujet détaillé :

Remarques : PROFILE AND RESEARCH SKILLS:

The internship is mainly intended for students with an initial training in maths and computer science, though biologists with very good skills in computer programming and statistical analysis could be considered. An interest in biology is welcome. Team spirit, autonomy, dynamism and creativity would be appreciated. Proficiency in technical English or French must be sufficient to write technical documentation and to interact verbally on a daily basis.
Required programming language: Python.
The selected student should have basic knowledge of at least one of the following topics: machine learning theory, image processing, high-performance computing, or the use of deep learning libraries (e.g. Keras).
CONTEXT:
The intern will be based in Montpellier, and the supervision will be provided by a computer scientist Emmanuel Faure (CNRS) and a biologist Patrick Lemaire (CNRS)
Mails : emmanuel.faure.fr and patrick.lemaire.cnrs.fr

Emmanuel Faure, (LIRMM Montpellier from Fall 2018 https://www.lirmm.fr/icar/) is a Data Scientist using recent machine learning techniques to understand quantitative biology and medicine.

Patrick Lemaire's team (CRBM, Montpellier, France; http://www.crbm.cnrs.fr/en/recherche/equipes/) focuses on animal embryonic development, using ascidians as a model system.




SM207-2 Verification of machine learning models  
Admin  
Encadrant : Nathanaël FIJALKOW
Labo/Organisme : LaBRI
URL : http://nathanael-fijalkow.github.io/
Ville : Bordeaux

Description  

Neural networks and other machine learning models have proved to be very successful in a wide range of applications. However, to be safely used in critical scenarios we need guarantees: how accurate and robust is your model? A classical example shows an image-classifying neural network being fooled by a small noise perturbation, classifying a panda as a gibbon with high confidence.
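The phenomenon can be illustrated with a minimal sketch of the fast gradient sign method (FGSM) on a toy linear classifier. This is an illustrative stand-in under stated assumptions, not the actual panda/gibbon experiment, which used a deep network:

```python
import numpy as np

# A minimal sketch of the adversarial-noise phenomenon on a linear
# classifier: FGSM perturbs the input in the direction that most
# increases the classifier's output.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(size=100)          # weights of a toy binary classifier
x = -0.1 * np.sign(w)             # an input confidently classified as class 0

p_before = sigmoid(w @ x)         # confidence for class 1 (very low)

# FGSM step: for a linear model, the gradient of the logit w.r.t. x is w,
# so the worst-case L-infinity perturbation is eps * sign(w).
eps = 0.25
x_adv = x + eps * np.sign(w)

p_after = sigmoid(w @ x_adv)      # confidence for class 1 (now very high)
print(p_before, p_after)
```

A perturbation bounded by 0.25 per coordinate flips the decision with high confidence, which is exactly the kind of fragility that verification aims to exclude.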

Verification of machine learning models is a growing field, fostering ideas from automata theory, program verification, invariant synthesis, and dynamical systems.

The goal of the internship is to study and construct algorithms for verifying neural networks
with simple architectures. This in particular requires defining and comparing correctness and
robustness notions. The outcome of the internship can be either theoretical or practical through
implementation.
URL sujet détaillé : https://nathanael-fijalkow.github.io/pdf/internship_learning.pdf

Remarques : A PhD scholarship is available for starting in September 2019.



SM207-3 Génération de chemin fluide en dimension 2 ou 3 / Smooth Path Generation in Dimension 2 or 3  
Admin  
Encadrant : Alexis SCHEUER
Labo/Organisme : Loria / Inria Grand-Est
URL : http://members.loria.fr/AScheuer
Ville : Vandoeuvre-lès-Nancy

Description  

A path is the geometric part of a motion, without its time aspect. Smooth paths (C^2 or more), if they respect the motion constraints of a robot (given as a differential equation), can be followed by this robot more precisely than simpler paths (C^1 or less).
Writing motion constraints as a differential equation is usually quite simple. Integrating this differential equation makes it possible to find the motion associated with a given command starting from an initial state, as well as the end state of this motion. Conversely, it is much harder to find the command corresponding to a motion connecting an initial state to an end state, as the existence and uniqueness of such a motion can be questioned.
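The "easy direction" can be sketched as a forward Euler integration of a car-like motion model. The state and command conventions below are illustrative assumptions, not the ISEEML API:

```python
import math

# Forward integration of a car-like robot's motion constraints:
# state = (x, y, heading theta, curvature kappa); command = (speed v,
# curvature derivative sigma). A constant sigma traces a clothoid piece,
# since the curvature then varies linearly along the path.

def integrate(state, v, sigma, duration, dt=1e-3):
    x, y, theta, kappa = state
    t = 0.0
    while t < duration:
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += v * kappa * dt          # heading changes with curvature
        kappa += sigma * dt              # linear curvature: clothoid piece
        t += dt
    return (x, y, theta, kappa)

# Sanity check: zero curvature and zero curvature derivative give a
# straight line of length v * duration.
end = integrate((0.0, 0.0, 0.0, 0.0), v=1.0, sigma=0.0, duration=2.0)
print(end)
```

The hard inverse problem is finding (v, sigma) profiles that connect two given configurations, which is what the library's path-planning machinery addresses.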

As far as paths are concerned, the ISEEML library remains one of the most efficient ways to solve this problem. While it was initially designed for car-like robots' paths, it can be extended to those of any kind of mobile robot, in dimension 2 (wheeled or legged robots) or 3 (planes or drones): clothoid pieces just have to be replaced by pieces of any other kind of spiral, the computations remaining unchanged. It is also possible to handle different motion constraints for the initial or final parts of the paths. Finally, connecting configurations (made of a position, an orientation and a curvature) whose curvature is not zero (or extreme) should be possible.
For this last point, only resolution clues are currently available. During this internship, the student will start by getting accustomed to the ISEEML library, programming the first two points, which are already fully formalised. He or she will then attempt to solve the last point and implement the solution found. Finally, experiments will be carried out, either in an ad-hoc application (using Qt) or in a control benchmark (based on ROS) that switches easily between simulation and real robot handling.
URL sujet détaillé : http://members.loria.fr/AScheuer/research/2018-master-2-subject

Remarques : Internship allowance and accommodation (Crous) may be obtained / possibilité d'indemnité de stage et de logement par le Crous.



SM207-4 Mean Field Optimal Control for Multi-Armed Bandit Problems  
Admin  
Encadrant : Nicolas GAST
Labo/Organisme : LIG
URL : http://mescal.imag.fr/membres/nicolas.gast/
Ville : Grenoble

Description  

Multi-armed bandits are classical models of sequential decision-making problems in which a controller (or learner) must decide at each time step how to allocate its resources among a finite set of alternatives (called arms). They are widely used in online learning today, but the complexity of computing optimal policies for Markovian bandits grows quickly with the number of arms. The goal of the internship is to study the use of recent progress in mean field theory to compute close-to-optimal allocation policies for such problems.

(see the detailed pdf for more details).
URL sujet détaillé : https://team.inria.fr/polaris/files/2018/10/stage_bandits.pdf

Remarques : Co-advisor : Patrick Loiseau (Inria)




SM207-5 Sequential Learning in Combinatorial Bandits under Delayed and Anonymous Feedback  
Admin  
Encadrant : Patrick LOISEAU
Labo/Organisme : Co-encadrement avec Dong Quan Vu (Nokia Bell Labs)
URL : http://lig-membres.imag.fr/loiseapa/
Ville : Grenoble

Description  

Background

The multi-armed bandit (MAB) problem is an elegant model for sequential prediction problems. At each time step, a learner chooses an action (also called an arm) from a given set, then suffers a loss and observes a feedback corresponding to that chosen action. The objective of the learner is to guarantee that the accumulated loss is not much larger than that of the best action in hindsight (that is, to minimize the regret).

In this work, we focus on bandit problems with combinatorial structure and a linear loss function. The classical algorithms for K-armed bandit problems such as EXP3 [1] offer a regret upper bound that is inefficient in this case, because the considered action set typically contains an exponential number of actions. However, exploiting the combinatorial structure of the action set, we can use exponential average weights algorithms (e.g., EXP2 [4]) to bound the regret in terms of the dimension of the representation space (typically a smaller number). These algorithms work, however, only under the classical feedback models: the learner receives either bandit feedback (observing only the loss incurred by the chosen action) or full-information feedback (observing the losses of all actions).
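For concreteness, the classical EXP3 scheme mentioned above can be sketched as follows for a K-armed bandit. This is a minimal illustration of the exponential-weights update with importance-weighted loss estimates, not the combinatorial extension the internship targets:

```python
import math
import random

# EXP3 [1] for a K-armed adversarial bandit: maintain exponential weights
# over arms and build unbiased loss estimates from the single observed
# loss (bandit feedback) via importance weighting.

def exp3(K, losses, eta):
    """losses[t][i] in [0, 1]; only the chosen arm's loss is observed."""
    weights = [1.0] * K
    total_loss = 0.0
    for round_losses in losses:
        s = sum(weights)
        probs = [w / s for w in weights]
        arm = random.choices(range(K), weights=probs)[0]
        loss = round_losses[arm]              # bandit feedback
        total_loss += loss
        estimate = loss / probs[arm]          # unbiased importance weighting
        weights[arm] *= math.exp(-eta * estimate)
    return total_loss

random.seed(0)
K, T = 5, 2000
losses = [[1.0] * K for _ in range(T)]
for t in range(T):
    losses[t][3] = 0.0                        # arm 3 is always best
eta = math.sqrt(math.log(K) / (T * K))
regret = exp3(K, losses, eta) - 0.0           # the best arm's total loss is 0
print(regret)
```

The regret grows sublinearly in T, but the bound scales with the number of actions, which is the limitation that motivates EXP2-style algorithms on combinatorial action sets.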

Work description

In this internship, the goal is to extend combinatorial bandits algorithms and results to more complex feedback structure that would enable realistically modeling applications such as advertising on social networks, scheduling data transmission on a network, or decision making in recommendation systems.

To that end, we will leverage several new feedback settings recently proposed in [2, 3]: (i) Delayed feedback: the loss of time t is only observed at time t + d, for a fixed d > 0. (ii) Anonymous composite feedback: at time t, the learner only observes the total sum of the losses from time steps t − d + 1, t − d + 2, . . . , t, for a fixed d > 0. For these settings, techniques based on the classical EXP3 algorithm for MAB are presented in [2, 3]. The student will investigate theoretically whether these techniques can be extended to combinatorial bandits in order to obtain algorithms with good regret bounds for these cases. He/she will then propose an efficient implementation of the new extended algorithms and perform simulations to validate the results and evaluate the actual performance of the algorithms.

Expected abilities of the student

Strong background in mathematics (in particular probability and statistics); ideally some prior exposure to sequential learning; and basic programming experience (e.g., Python, R or C++).

References

[1] Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32 (2002), no. 1, 48–77.
[2] Nicolò Cesa-Bianchi, Claudio Gentile, and Yishay Mansour. Nonstochastic bandits with composite anonymous feedback.
[3] Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, and Alberto Minora. Delay and cooperation in nonstochastic bandits. Journal of Machine Learning Research 49 (2016), 605–622.
[4] Nicolò Cesa-Bianchi and Gábor Lugosi. Combinatorial bandits. Journal of Computer and System Sciences 78 (2012), no. 5, 1404–1422.
URL sujet détaillé : http://lig-membres.imag.fr/loiseapa/pdfs/InternshipBanditsDelayedFeedback.pdf

Remarques : Co-encadrement avec Dong Quan Vu (Nokia Bell Labs)



SM207-6 Set-based co-simulation, uncertainties and guarantee  
Admin  
Encadrant : Julien ALEXANDRE DIT SANDRETTO
Labo/Organisme : U2IS - ENSTA ParisTech
URL : http://perso.ensta-paristech.fr/~alexandre/
Ville : Palaiseau

Description  

Simulation tools are mainly used in the design steps of systems such as robots, ships, airplanes and missiles. Simulation is also the best approach for developing the software of this kind of complex and critical system.
In the case of a physical system (hybrid system or cyber-physical system), the vehicle, for example, moves in a dynamical environment (wind, current, etc.). It is then interesting to be able to simulate both the environment and the system. The major problem is that the two are strongly coupled, by the physics but also by the software (control loop).
In our lab, we develop a tool for set-based simulation, which allows us to consider many sources of uncertainty while nevertheless producing guarantees on the result.
The subject of the internship is to develop the mathematical theory for set-based co-simulation (a new theorem is needed), implement it (in our tool, in C++) and test the solution on a ship-ocean example (in simulation and perhaps on an actual system).
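As a minimal illustration of the set-based idea, one can propagate an interval enclosure of the state of a simple ODE. This is a toy sketch only; validated simulation tools use rigorous Runge-Kutta methods with certified error bounds, which this naive Euler scheme does not provide:

```python
# Propagate an interval enclosure [lo, hi] of the state of the ODE
# dx/dt = -x with explicit Euler steps. An uncertain initial condition
# is carried through the whole simulation as a set, not a point.
# (Toy sketch: a real validated integrator also bounds the method's
# truncation error, which plain Euler does not.)

def euler_interval(lo, hi, dt, steps):
    for _ in range(steps):
        # For dx/dt = -x, the image of [lo, hi] under x -> x + dt * (-x)
        # is [lo * (1 - dt), hi * (1 - dt)] as long as 0 < dt < 1.
        lo, hi = lo * (1 - dt), hi * (1 - dt)
    return lo, hi

# Uncertain initial condition: x(0) somewhere in [0.9, 1.1].
lo, hi = euler_interval(0.9, 1.1, dt=0.01, steps=100)
print(lo, hi)   # an enclosure near e^(-1) * [0.9, 1.1]
```

Every real trajectory starting in the initial set stays inside the propagated interval, which is the kind of guarantee the lab's tool produces with much stronger machinery.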
URL sujet détaillé :

Remarques : team : Safety of Hybrid Systems
keywords : Mathematics, computer science and basics in physics (ODEs, PDEs)
Salary following the law (around 550 € per month)
A PhD could be proposed to good students



SM207-7 Distance geometry problems  
Admin  
Encadrant : Julien ALEXANDRE DIT SANDRETTO
Labo/Organisme : U2IS - ENSTA ParisTech
URL : http://perso.ensta-paristech.fr/~alexandre/
Ville : Palaiseau

Description  

Some systems are defined only by distances, as in chemistry, localization, sensor networks and robotics.
Distance geometry problems are difficult to solve. From a complexity viewpoint, obtaining an approximate solution to a distance geometry problem is NP-hard. From an optimization viewpoint, distance geometry problems have many local minimizers, which cast doubt on the solutions found.
Distance geometry problems are interesting mathematical problems with important applications in robotics, for solving localization problems, UAV swarm organization, etc. While the distances given by sensors are always uncertain, in these special cases the measurements can also be obtained with delay (conditioned by the dynamics of the UAVs and the communication) or even be completely wrong (due to outliers introduced by sensor faults).
Beyond a distance solver able to handle uncertainties, some more complicated issues can thus be addressed.
We propose an internship to develop a more robust distance solver algorithm using interval analysis in order to bound and certify the results. The Cayley-Menger distance matrix could be a good approach to fix the problem of lack of constraints. The relaxed intersection (or q-intersection) can also be used to counteract the effect of outliers. Finally, introducing the dynamics of the UAVs can be considered to minimize the effect of delays in distance acquisition.
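As a small illustration of the Cayley-Menger approach mentioned above, the following sketch checks whether three pairwise distances are realizable in the plane (for three points, the determinant is proportional to the squared triangle area):

```python
import numpy as np

# Cayley-Menger determinant: given the matrix of squared pairwise
# distances of n points, border it with a row and column of ones (and a
# zero corner) and take the determinant. For three points, the area of
# the triangle satisfies area^2 = -det / 16, so a negative "area^2"
# flags distances that cannot be realized in the plane.

def cayley_menger(d2):
    """d2: symmetric (n x n) matrix of squared pairwise distances."""
    n = d2.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0
    M[1:, 1:] = d2
    return np.linalg.det(M)

# A 3-4-5 right triangle: realizable, with area 6.
d2 = np.array([[0.0,  9.0, 25.0],
               [9.0,  0.0, 16.0],
               [25.0, 16.0, 0.0]])
area2 = -cayley_menger(d2) / 16.0
print(area2)
```

With interval-valued distances, the same determinant evaluated in interval arithmetic bounds the set of consistent configurations, which is the direction the internship would take.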






URL sujet détaillé :

Remarques : team : Safety of Hybrid Systems
keywords : Constraint programming, Mathematics, computer science
Salary following the law (around 550 € per month)
A PhD could be proposed to good students



SM207-8 Risque d'avalanche & visualisation 3D / Avalanche Risk & 3D Visualization  
Admin  
Encadrant : Frédéric POURRAZ
Labo/Organisme : LISTIC
URL : https://www.interreg-francesuisse.eu/beneficiaire/cime/
Ville : Annecy

Description  


The work required during this internship will be done in three stages:
1. A state-of-the-art survey and comparison of two approaches for fusing uncertain, multi-criteria information:
◦ Fuzzy subsets;
◦ The possibilistic approach.
The objective is to identify the use cases of each of these approaches and to study the opportunity of combining them.
2. The application of one or more of these approaches to estimating the level of vigilance to be adopted in the field with regard to avalanche risk.
3. Visualization of the results in a 3D (or even augmented-reality) environment already implemented in the laboratory.
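As a purely hypothetical sketch of stage 2 (the criteria, membership functions and fusion operator below are illustrative assumptions, not the project's actual model), fuzzy memberships can be fused into a vigilance degree as follows:

```python
# Hypothetical fuzzy fusion of two uncertain criteria (fresh snow depth,
# slope steepness) into an avalanche vigilance degree. The criteria,
# thresholds and the min t-norm are illustrative assumptions only.

def ramp(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def vigilance(snow_cm, slope_deg):
    mu_snow = ramp(snow_cm, 10.0, 50.0)      # "significant fresh snow"
    mu_slope = ramp(slope_deg, 25.0, 40.0)   # "steep slope"
    # Conjunctive fusion with the min t-norm: both criteria must hold.
    return min(mu_snow, mu_slope)

print(vigilance(40.0, 35.0))   # the slope is the limiting criterion
print(vigilance(5.0, 45.0))    # little fresh snow: vigilance stays at 0
```

The possibilistic approach would instead attach possibility/necessity degrees to each criterion; comparing the two is precisely the object of stage 1.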
URL sujet détaillé : https://projects.listic.univ-smb.fr/emploi/stage/2018_Sujet_Master_CIME.pdf

Remarques : This internship will be paid according to the ministerial regulations in force.
Depending on the results obtained and the candidate's motivation, three-year PhD funding of 1615 € per month may be offered.



SM207-9 Speculative execution based on GLR/GLL theory  
Admin  
Encadrant : Thierry GOUBIER
Labo/Organisme : LCE / CEA List
URL : http://www-list.cea.fr/
Ville : Palaiseau

Description  

In the context of the current difficulty of feeding the compute units of today's CPUs with enough data without stalling on memory accesses, this internship intends to study the use of generalized context-free parsing theory to represent the speculative execution of a program, relying on the ability of GLL/GLR parsers to fork and merge speculative paths. The work will study how to abstract a computation over a sequence of instructions in this theory, and how to extract memory addresses that could be resolved that way.

A good understanding of compilation and processor architecture is required.
URL sujet détaillé :

Remarques : Paid internship, following the CEA salary scale.



SM207-10 Rust for Performance  
Admin  
Encadrant : Bruno RAFFIN
Labo/Organisme : INRIA Grenoble / Equipe DataMove
URL : https://team.inria.fr/datamove/
Ville : SAINT MARTIN D'HERES

Description  

The analysis of large multidimensional spatiotemporal datasets poses challenging questions regarding storage requirements and query performance. Several data structures have recently been proposed to address these problems; they rely on indexes that pre-compute different aggregations from a dataset known a priori. Consider now the problem of handling streaming datasets, in which data arrive as one or more continuous data streams. Such datasets challenge the data structure, which now has to support dynamic updates (insertions/deletions) and rebalancing operations to perform self-reorganizations.

We developed a Packed-Memory Quadtree (PMQ) that efficiently supports dynamic data insertions and outperforms classical data structures such as R-trees or B-trees. The PMQ ensures good performance by controlling the density of the data internally stored in a large array. It is based on the Packed Memory Array [3,4,5] and shares its amortized cost of O(log^2(N)) per insertion.

Today, however, all processors are multicore, and data structures need to be parallelized to leverage the available compute power. The goal of this internship is to design, develop and test a parallel version of the PMQ using the Rust programming language (or C++).
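The density-control idea behind the PMQ can be sketched as follows. This is a heavily simplified toy: the real Packed Memory Array rebalances windows of the array to achieve the O(log^2(N)) amortized bound, whereas this sketch rebuilds globally and only illustrates the invariant of a sorted array with evenly spread gaps:

```python
import bisect

# Toy packed-memory-array sketch: elements stay sorted in an array with
# gaps (None slots); when the density exceeds a threshold, the array is
# rebuilt twice as large with the gaps spread evenly.

class ToyPMA:
    def __init__(self, capacity=8, max_density=0.5):
        self.slots = [None] * capacity
        self.max_density = max_density
        self.count = 0

    def _spread(self, capacity):
        items = [x for x in self.slots if x is not None]
        self.slots = [None] * capacity
        step = capacity / max(len(items), 1)
        for i, x in enumerate(items):        # place items with even gaps
            self.slots[int(i * step)] = x

    def insert(self, value):
        self.count += 1
        if self.count / len(self.slots) > self.max_density:
            self._spread(2 * len(self.slots))        # density control
        items = [x for x in self.slots if x is not None]
        bisect.insort_left(items, value)             # keep sorted order
        self.slots = items + [None] * (len(self.slots) - len(items))
        self._spread(len(self.slots))

    def sorted_items(self):
        return [x for x in self.slots if x is not None]

pma = ToyPMA()
for v in [5, 1, 9, 3, 7, 2]:
    pma.insert(v)
print(pma.sorted_items())   # [1, 2, 3, 5, 7, 9]
```

The gaps keep most insertions local (shifting only a few neighbors in the real structure); parallelizing exactly this rebalancing is the crux of the internship.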
URL sujet détaillé : https://team.inria.fr/datamove/job-offers/#rust

Remarques : Standard internship allowance.



SM207-11 Data Assimilation at Large Scale  
Admin  
Encadrant : Bruno RAFFIN
Labo/Organisme : INRIA Grenoble / Equipe DataMove
URL : https://team.inria.fr/datamove/
Ville : SAINT MARTIN D'HERES

Description  

Ensemble runs consist in running the same simulation many times with different parameter sets to obtain a robust analysis of the simulation's behavior in the parameter space. The simulation can be a high-resolution, large-scale parallel code or a smaller-scale lightweight version called a meta-model or surrogate model. This process is commonly used for uncertainty quantification, sensitivity analysis and data assimilation. It may require running from thousands to millions of instances of the same simulation, making it an extremely compute-intensive process that will fully benefit from Exascale machines.

Existing approaches show limited scalability. Some rely on intermediate files: each simulation run writes its outputs to files, which are then processed for result analysis. This makes for a flexible process, but writing these data to the file system and reading them back for the analysis step is a strong performance bottleneck at scale. The alternative approach aggregates all simulation runs into one large monolithic MPI job, processing results as soon as they are available and avoiding intermediate files. However, by not taking advantage of the loose synchronization between the different runs, this over-constrains the execution with respect to compute-resource allocation, application flexibility and fault tolerance.

Recent approaches like the Melissa framework adopt an elastic architecture. Simulations or groups of simulations (depending on the level of synchronization needed between runs) are submitted to the machine's batch scheduler independently. Once these jobs start, they dynamically connect to a parallel data-processing server. This server receives the data from the running simulations and processes them online, in parallel, as soon as they are available, thus avoiding intermediate files. The computed partial results can be fed back to the simulations (needed for data assimilation, for instance). They can also be used to support an adaptive sampling process, where the parameter sets for the next simulation runs are chosen according to these partial results. Such a framework fully exploits the loose synchronization between simulation runs: simulations are submitted and allocated independently, making better use of the machine's resources and supporting efficient fault-tolerance mechanisms.
URL sujet détaillé : https://team.inria.fr/datamove/job-offers/#GRPC

Remarques : Standard internship allowance.



SM207-12 Certified Compilation of Skeletal Semantics  
Admin  
Encadrant : Alan SCHMITT
Labo/Organisme : Inria
URL : http://people.rennes.inria.fr/Alan.Schmitt/
Ville : Rennes

Description  

Skeletal semantics (https://hal.inria.fr/INRIA/hal-01881863) are a new approach to describing the semantics of programming languages in a highly generic and modular way. Such semantics describe the control flow for the evaluation of each construct of the language without specifying the implementation of the primitive constructs. For instance, the semantics for the addition of two expressions could be "evaluate the first expression, then evaluate the second expression, then add the results", without specifying how to add numbers. One may then provide several interpretations of such semantics, such as a big-step semantics, a static semantics (i.e., typing), or an abstract semantics (for abstract interpretation). These interpretations are defined independently of the programming language, and thus may be reused for other languages.
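The idea can be illustrated with a toy sketch (illustrative only, not the actual skeletal-semantics formalism): the traversal of an addition expression is written once, and supplying different primitives yields either a concrete interpreter or a crude type checker:

```python
# Toy illustration of the skeletal-semantics idea: the control flow for
# evaluating an addition is written once, while the meaning of "add" and
# of literals is supplied separately as primitives.

def eval_skeleton(expr, prims):
    """expr is a literal or a tuple ('add', e1, e2)."""
    if isinstance(expr, tuple) and expr[0] == 'add':
        v1 = eval_skeleton(expr[1], prims)    # evaluate the first expression
        v2 = eval_skeleton(expr[2], prims)    # then the second
        return prims['add'](v1, v2)           # then combine the results
    return prims['lit'](expr)

expr = ('add', 1, ('add', 2, 3))

# Interpretation 1: big-step evaluation.
concrete = {'lit': lambda n: n, 'add': lambda a, b: a + b}
print(eval_skeleton(expr, concrete))          # 6

# Interpretation 2: static semantics (typing) over the same skeleton.
typing = {'lit': lambda n: 'int',
          'add': lambda t1, t2: 'int' if t1 == t2 == 'int' else 'error'}
print(eval_skeleton(expr, typing))            # 'int'
```

Certified compilation would add a third interpretation, one that emits target code, together with a proof that its behavior matches the evaluation interpretation.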

The objective of the internship is to study the certified compilation of skeletal semantics. The student will choose a source language, for instance a simple WHILE language, and a target language, for instance a stack-based language. The student will define these languages as skeletal semantics, and will define the compilation from source to target as an interpretation of the source semantics. The student will then establish a set of necessary conditions on the behavior of primitive constructs to certify that the compilation is correct: the behavior of a compiled program corresponds to the behavior of the source program.
URL sujet détaillé : http://people.rennes.inria.fr/Alan.Schmitt/internships/skeletons.pdf

Remarques : Remuneration possible. A grant is available, within an ANR project, for continuing with a PhD.



SM207-13 Gradual Typing for dynamic languages  
Admin  
Encadrant : Giuseppe CASTAGNA
Labo/Organisme : IRIF, Institut de Recherche en Informatique Fondamentale
URL : https://www.irif.fr/~gc
Ville : Paris

Description  

We propose three main internships on gradual typing for dynamic languages (though others are possible).

CONTEXT

The idea behind gradual typing [1] is to integrate, in a unique framework, both static typing (where the types in a program are checked against each other before its execution) and dynamic typing (where the types are only controlled during the execution).

Gradual typing reconciles static and dynamic typing by allowing the programmer to obtain the strong guarantees offered by static typing for certain parts of a program while keeping the flexibility of dynamic typing for other parts. Formally, this is done by adding a new type annotation, often called dynamic, which acts as a fully dynamic type. That is, values of this type will pass any static type check, and their type will only be checked at runtime. What is important is that preexisting dynamically-typed code remains valid, but can easily be upgraded with more type annotations. This is why gradual typing has been attracting more and more interest and has been adopted by some of the major players in information technology, as shown by the development of the languages Dart by Google, TypeScript by Microsoft and Flow by Facebook.

Regarding static typing, one of the tools that proved to be among the most suited to Web development is the use of union and intersection types, also known as set-theoretic types [3,4]. The aim of set-theoretic types is to integrate the mathematical definitions, principles, and operations of set-theory (such as the union, the intersection, or the complement of sets) in a type system. This allows for a much more precise control over types, thus providing stronger static guarantees than with simple types.


GOALS

We propose three different internships on gradual typing for polymorphic languages, which will build upon the work we have already developed on gradual typing with set-theoretic types [6,7]:

SUBJECT 1: Occurrence typing.
Occurrence typing is a technique developed for Typed Racket that uses flow information for typing highly dynamic code [2]. The idea is that when a dynamic type test is performed on a variable, we can refine our type inference according to whether the test has succeeded or not. For instance, if in an if-expression we test whether a variable x has type integer, then we know that in the then branch x has type integer (the intersection with its static type) and that in the else branch it has its static type minus the type integer. The goal of this internship is to formalize occurrence typing in the context of semantic subtyping so as to extract the maximum amount of information from the flow of a program.
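The refinement described above can be sketched with types modeled as sets of base types (a toy illustration, not the semantic-subtyping formalism): the then branch intersects the static type with the tested type, and the else branch subtracts it.

```python
# Occurrence-typing refinement with types modeled as sets of base types:
# testing "x is an integer" refines x's static type by intersection in
# the then branch and by set difference in the else branch.

def refine(static_type, tested_type):
    then_type = static_type & tested_type      # the test succeeded
    else_type = static_type - tested_type      # the test failed
    return then_type, else_type

# x declared with the union type int | str.
static = {'int', 'str'}
then_t, else_t = refine(static, {'int'})
print(then_t)   # {'int'}: in the then branch, x is an int
print(else_t)   # {'str'}: in the else branch, x is a str
```

The internship's challenge is doing this for full set-theoretic types (unions, intersections, negations) and for arbitrary tested expressions, not just variables.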


SUBJECT 2: Code analysis for inferring intersection types
Intersection types are a key feature for a feasible typing of dynamic languages, where the same function can be used in contexts so different that they cannot be captured by common polymorphic type systems. The idea is that in each context the function is used with some given type; thus, the function has all these types, and therefore it also has the intersection of all these types. While it is relatively easy to check whether a function has an intersection type specified by the programmer via an annotation, as is done in Flow, it is very hard (actually, undecidable) to infer an intersection type for code that is partially annotated or not annotated at all. The goal of this internship is to use the results of a static analysis (and, time permitting, profiling information) to drive the inference of intersection types for functions defined in dynamic languages.

SUBJECT 3: Abstract machines for cast calculi.
In order to provide strong safety guarantees, gradually-typed languages can be compiled into intermediate versions which contain explicit type casts that dynamically check the correctness of the gradually-typed portions of a program. The implementation of such intermediate representations is the subject of active research and of a heated debate [5], but there is no existing work on the compilation of casts when these are on set-theoretic types. The goal of the internship is to study the current literature on the implementation of cast calculi, use it to define an abstract machine for cast calculi with set-theoretic types, and, time permitting, implement it.


References

[1] Jeremy Siek and Walid Taha. Gradual typing for functional languages. In Scheme and Functional Programming Workshop (September 2006), pp. 81–92.

[2] Occurrence typing, The Typed Racket Guide.

[3] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Pietro Abate: Polymorphic Functions with Set-Theoretic Types: Part 2: Local Type Inference and Type Reconstruction. POPL 2015.

[4] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Hyeonseung Im, Sergueï Lenglet, Luca Padovani: Polymorphic functions with set-theoretic types: part 1: syntax, semantics, and evaluation. POPL 2014.

[5] Asumu Takikawa, Daniel Feltey, Ben Greenman, Max New, Jan Vitek, Matthias Felleisen. Is Sound Gradual Typing Dead? POPL 2016.

[6] Giuseppe Castagna and Victor Lanvin. Gradual typing with union and intersection types. In International Conference on Functional Programming, 2017.

[7] Giuseppe Castagna, Victor Lanvin, Tommaso Petrucciani, and Jeremy Siek. Gradual typing: A new perspective (subtyping and implicit parametric polymorphism). POPL 2019, to appear.

URL sujet détaillé : https://www.irif.fr/~gc/stageGradualTyping.en.html

Remarques : Part of this project is proposed in collaboration with Facebook Language Research. It will be pursued in collaboration with Jeremy Siek (Indiana University) in sabbatical leave at IRIF from January 2019 (possibility of a joint supervision).



SM207-14 Hierarchical Parallelization  
Admin  
Encadrant : Christophe ALIAS
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/christophe.alias
Ville : LYON

Description  

Technological trends in circuit design force the duplication of computation units to obtain more performance. Splitting a program into parallel units and setting up the synchronisations by hand is bug-prone and expensive. Hence the trend to rely on a parallelizing compiler to automate this task. Numerous problems must be addressed by such a compiler. Where is the parallelism in the program? Which parts must be parallelized, and how? How should the hardware resources be allocated?

The polyhedral model addresses these issues in a single unifying framework. Under certain conditions, the iterations of for loops may be summarized by polyhedra. Geometric reasoning then allows compiler analyses (e.g. dependence analysis, scheduling) to be built with linear programming and geometric operations (union, intersection, projection). But this precision comes at a cost: the complexity of polyhedral analyses limits the size of compilable programs to a few tens of lines!

In this internship, we propose to study how to scale the polyhedral model. We will leverage a divide-and-conquer approach: the program is divided into subprograms, each subprogram is analyzed separately, and then we "gather the pieces". We will focus on loop tiling, a polyhedral loop transformation which divides the iteration domain of nested for loops into atomic blocks.
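Loop tiling itself can be illustrated on a matrix multiplication: the iteration domain of the three nested loops is divided into B x B x B atomic blocks, improving locality while computing exactly the same result as the untiled loops.

```python
# Tiled matrix multiplication: the outer three loops enumerate tiles of
# the iteration domain, the inner three loops enumerate the points of
# one tile. Behavior is identical to the classical triple loop.

def matmul_tiled(A, Bmat, n, B=2):
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, B):                     # loops over tiles
        for jj in range(0, n, B):
            for kk in range(0, n, B):
                for i in range(ii, min(ii + B, n)):   # loops inside a tile
                    for j in range(jj, min(jj + B, n)):
                        for k in range(kk, min(kk + B, n)):
                            C[i][j] += A[i][k] * Bmat[k][j]
    return C

n = 4
A = [[i * n + j for j in range(n)] for i in range(n)]
I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# Multiplying by the identity must leave A unchanged.
print(matmul_tiled(A, I, n) == [[float(x) for x in row] for row in A])
```

The internship's question is how such tilings, chosen locally for each subprogram, can be composed into one consistent global tiling.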

Program:
1) Study the possible loop tilings on a composition of polyhedral programs (e.g. several matrix multiplications). Figure out how local loop tilings may be composed to form a global loop tiling. Specify the properties of a "good" composition.
2) Propose an algorithm for hierarchical loop tiling which fulfills the properties formalized in step 1.
3) Implement and test the algorithm. The approach will be validated experimentally on the benchmarks of the polyhedral community.

Requirements: notions in compilation and parallelism.

URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/modular-tiling-en.pdf

Remarques :



SM207-15 Automatic circuit transformations for fault-tolerance and their certification in Coq  
Admin  
Encadrant : Pascal FRADET
Labo/Organisme : Inria Grenoble Rhone Alpes
URL : https://team.inria.fr/spades/
Ville : Grenoble (Montbonnot)

Description  

Circuit fault-tolerance is nowadays a major research issue for safety-critical applications, due to the increased risk of soft errors. This risk results from the continuous shrinking of transistor sizes, which makes components more sensitive to radiation.

The standard fault-tolerance technique is Triple Modular Redundancy (TMR), which consists in triplicating the circuit and inserting majority voters to filter out corrupted bits [1]. Recently, we have proposed several novel fault-tolerance techniques based on time redundancy [2], trading throughput for minimal hardware overhead. Circuits are represented using a functional hardware description language, and fault-tolerance techniques as program (circuit) transformations. Since fault tolerance is typically used in critical domains (aerospace, nuclear power, etc.), the correctness of such transformations is essential. If there is little doubt about the correctness of simple transformations such as TMR, this is not the case for more intricate ones. We have used the Coq proof assistant to formalize the semantics of a hardware description language and several fault models, in order to formally prove the correctness of our time-redundant transformations [3].
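The TMR idea is simple enough to sketch (an illustration at the word level, not the actual circuit transformation):

```python
# Triple Modular Redundancy in a nutshell: run three copies of the same
# combinational logic and let a bitwise majority voter mask any single
# corrupted copy.

def voter(a, b, c):
    """Bitwise majority of three redundant outputs."""
    return (a & b) | (a & c) | (b & c)

def tmr(logic, x, faults=(0, 0, 0)):
    # XOR with a fault mask models a transient bit flip in one copy.
    outputs = [logic(x) ^ f for f in faults]
    return voter(*outputs)

logic = lambda x: x & 0b1010          # some combinational function

print(bin(tmr(logic, 0b1111)))                        # fault-free: 0b1010
print(bin(tmr(logic, 0b1111, faults=(0b0100, 0, 0)))) # single fault masked
```

A single corrupted copy is always outvoted, but an SEMT flipping bits in two copies at once defeats the voter, which is exactly the limitation the internship addresses.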

With recent nanoscale technologies, a high-energy particle strike often produces several errors, called SEMTs (single event multiple transients), in adjacent flip-flops or wires. However, the techniques mentioned above can only handle a single error at a time.

The objective of the work is to study extensions of these techniques to tolerate SEMT. After a bibliographic study of fault-tolerance transformations for digital circuits, the following topics could be considered.

- The description of SEMT fault-models using information about the location of cells, gates and wires. (e.g. what simultaneous corruptions are possible, how they propagate, etc.)

- The analysis of circuits to infer where the different cells, gates and wires should be placed relative to each other to prevent SEMTs from leading to actual failures.

- The design of a new fault-tolerance technique (possibly by combining several already known ones) able to mask SEMT.

- The formalization of the SEMT fault-models, masking techniques and their certification in the Coq proof assistant.

Depending on his/her background and interests, the candidate may focus on only one or two of the above tasks.

References :

[1] R. E. Lyons and W. Vanderkulk.
The Use of Triple-Modular Redundancy to Improve Computer Reliability.
IBM Journal. 1962.

[2] Dmitry Burlyaev, Pascal Fradet, and Alain Girault.
Automatic time-redundancy transformation for fault-tolerant circuits.
In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015.

[3] Dmitry Burlyaev and Pascal Fradet.
Formal verification of automatic circuit transformations for fault-tolerance.
In Formal Methods in Computer-Aided Design, FMCAD, 2015.


Context :

The main goal of the Spades team is the safe
design of real-time embedded systems.
URL sujet détaillé : :

Remarques :



SM207-16 Hardware Compilation  
Admin  
Encadrant : Christophe ALIAS
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/christophe.alias
Ville : LYON

Description  

A process network is a collection of pure functions communicating through channels. Process networks are a natural intermediate representation for hardware compilation: the front-end extracts the parallelism and derives the process network; then the back-end maps the process network to hardware. In this internship, we propose to focus on the compilation of channels to hardware.

Compiler optimizations for parallelism and data locality deeply restructure the execution order of the program, and hence the read/write patterns in communication channels. This breaks most FIFO channels, which then have to be implemented with addressable buffers. Expensive hardware is required to enforce synchronizations, which often results in dramatic performance loss.
We have proposed an algorithm to reorganize the channels so that the FIFOs broken by loop tiling can be recovered. We have proved this algorithm to be complete on DPN, a constrained family of process networks.

The goal of this internship is to study the completeness of FIFO recovery on general process networks. Does there exist an algorithm that recovers the FIFOs broken by any tiling transformation? In the positive case (our belief), find such an algorithm and prove its completeness. Otherwise, point out the fundamental reasons why such an algorithm can never exist.
- Study several polyhedral programs on which the technique developed in [hip3es] does not work.
- Propose a complete algorithm together with a proof of completeness, or prove that such an algorithm can never exist.
- In the positive case, implement and test your algorithm. The approach will be validated experimentally on the benchmarks of the polyhedral community.
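To illustrate the property at stake, the toy sketch below (my own formulation, not taken from the subject) states when a channel can remain a FIFO: the consumer must read values in exactly the order the producer writes them, which a tiling of one side can break:

```python
# Hypothetical sketch of the property behind FIFO recovery: a channel can be
# implemented as a FIFO iff the consumer reads values in exactly the order
# the producer writes them.  Loop tiling permutes these orders and may break it.

def is_fifo(write_order, read_order):
    # Both arguments are lists of data-element ids, in execution order.
    return write_order == read_order

# Untiled producer/consumer over a 2x2 iteration domain: row-major traversal
# on both sides, so the channel is a FIFO.
row_major = [(i, j) for i in range(2) for j in range(2)]
# After transforming only the consumer (here: to column-major), the orders
# differ and the channel needs an addressable buffer instead.
col_major = [(i, j) for j in range(2) for i in range(2)]
```

The recovery question is then whether channels can always be split or reindexed so that each resulting channel passes this check again.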

Requirements. Notions in compilers and parallelism

URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/fifo-en.pdf

Remarques :



SM207-17 Types for reconfiguration  
Admin  
Encadrant : Jean-bernard STEFANI
Labo/Organisme : INRIA Grenoble / LIG
URL : https://team.inria.fr/spades/
Ville : Montbonnot

Description  

Scientific context

Location graphs [1] are a new formal model for software components, currently developed in the INRIA SPADES team,
that combines dynamic reconfiguration, graph control structures and isolation properties,
and subsumes located variants of the pi-calculus [2] as well as more abstract computational models,
such as Milner's bigraphs [3].


Subject of the internship

The goal of this internship is to develop a simple I/O type system for location graphs,
inspired by notions of I/O types for the pi-calculus [4], of mobility types in Boxed Ambients [5],
and of types for a dynamic component-based language [6]. Apart from communication,
location graphs have capabilities for passivating locations and for modifying the graph structure.
An I/O type system for location graphs needs to take these capabilities into account, so as to
prevent simple but less usual forms of errors such as unlawful update, binding and movement in a
location graph. If successful, the internship can also begin to investigate more complex forms of
types dealing with structural invariants, including conditions such as encapsulation or isolation
between components, a topic related to the aliasing problem in object-oriented programming languages
and in particular to the notions of ownership types that have been developed to cope with it [7].


References

[1] J.B. Stefani : "Components as Location Graphs", LNCS 8997, Springer, 2015.

[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.

[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.

[4] D. Sangiorgi, D. Walker : "The pi-calculus: A Theory of Mobile Processes", Cambridge University Press, 2001.

[5] M. Bugliesi, S. Crafa, M. Merro, V. Sassone: "Communication and mobility control in boxed ambients",
Information and Computation 202(1), 2005.

[6] J.C. Seco, L. Caires : "Types for dynamic reconfiguration", LNCS 3924, Springer, 2006.

[7] D. Clarke, J. Ostlund, I. Sergey, T. Wrigstad: "Ownership Types: A Survey", LNCS 7850, Springer, 2013.

URL sujet détaillé : :

Remarques :



SM207-18 Delivering VoD services to vehicles along a highway  
Admin  
Encadrant : Thomas BEGIN
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/thomas.begin/
Ville : Lyon

Description  

Vehicular networks, bringing connectivity to vehicles for safety and entertainment purposes, are expected to be deployed in the upcoming years. While different technologies, including LTE and WiMAX, have been considered, IEEE 802.11p (aka WiFi) has drawn much attention.

In this work, we consider the following scenario. Passengers in a vehicle wish to use a VoD (Video on Demand) service in order to watch a movie during their journey. Several RSUs (Road Side Units) delivering a wireless connection to passing vehicles are scattered along the road. We assume that the network operator may adapt the quality of the video (changing its codec) or block the download for cars having a bad connection, so that the passengers watch their movie without any disruption. To achieve this goal, one needs to download enough data and fill a buffer with frames that are kept to be played later, when the vehicle has no connection. It is worth keeping in mind that, with IEEE 802.11, cars may download the video at different speeds (transmission rates) depending on their distance to the nearest RSU.

We have already developed a theoretical model to forecast the performance and Quality of Experience (expressed as a disruption time) of each vehicle. This model was validated against a discrete-event simulator [1].

The objective of this work is to design (and validate against simulations) an algorithm that selects the video codec or blocks the download of remote vehicles, so that passengers experience seamless playback of their movie. The algorithm will be based on the predictive model presented in [1] (or an improved version of it), which takes into account the number of RSUs, the expected number of contending vehicles on each RSU, the bandwidth of the RSUs, the distance to travel, the length of the movie, etc.
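As a toy illustration of the buffering mechanism described above (a deliberately simplified sketch with made-up units; it is not the model of [1]): playback stalls exactly when the buffer runs empty outside RSU coverage.

```python
# Hypothetical toy buffering model: the vehicle plays 1 frame per time step
# and downloads `rate` frames per step while under RSU coverage.  The
# disruption time is the number of steps where playback stalls.

def disruption_time(coverage, rate, playback=1):
    # coverage: list of booleans, one per time step (True = under a RSU)
    buffer_level = 0
    disrupted = 0
    for connected in coverage:
        if connected:
            buffer_level += rate
        if buffer_level >= playback:
            buffer_level -= playback
        else:
            disrupted += 1  # not enough buffered frames: playback stalls
    return disrupted
```

In this toy model, doubling the download rate (e.g. by choosing a lighter codec) while in coverage directly shrinks the disruption time of the uncovered stretch, which is the trade-off the selection algorithm must manage.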

Keywords: selective algorithm; modeling; Video on Demand; 802.11 (WiFi); discrete-event simulator (ns-3); performance evaluation.

[1] Video on Demand in IEEE 802.11p-based Vehicular Networks: Analysis and Dimensioning. T.Begin, A. Busson, I. Guérin-Lassous, A. Boukerche - ACM MSWIM 2018 - Montreal (Canada).


URL sujet détaillé : http://perso.ens-lyon.fr/thomas.begin/tmp/stage-VoD-vehicular.pdf

Remarques : Supervisors: Thomas Begin, Isabelle Guérin-Lassous and Anthony Busson
E-Mails: thomas.begin-lyon.fr; isabelle.guerin-lassous-lyon.fr; anthony.busson-lyon.fr




SM207-19 Logics for reconfiguration  
Admin  
Encadrant : Jean-bernard STEFANI
Labo/Organisme : INRIA Grenoble / LIG
URL : https://team.inria.fr/spades/
Ville : Montbonnot

Description  

Scientific context

Location graphs [1] are a new formal model for software components, currently developed in the INRIA SPADES team,
that combines dynamic reconfiguration, graph control structures and isolation properties,
and subsumes located variants of the pi-calculus [2] as well as more abstract computational models,
such as Milner's bigraphs [3].

Subject of the internship

The goal of this internship is to develop an adequate logic for location graphs.
An adequate logic for a computational model provides a logical characterization of
behavioral equivalence in the model. In the case of location graphs,
such a logic should cover modalities à la Hennessy-Milner [4],
as well as spatial modalities to deal with the underlying hypergraph structure of a location graph.
Several works have looked into adequate logics for computational models with spatial
modalities, such as Ambient calculi [5,6]. The difficulty in location graphs lies in
taking into account encapsulation and isolation constraints. These would probably require
considering constructs analogous to those found in separation logic [7,8,9].
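To make the modal core concrete, here is a toy Hennessy-Milner checker over a plain labelled transition system (an assumption of mine for illustration; it has none of the spatial or separation features the internship targets):

```python
# Toy sketch: checking Hennessy-Milner formulas over a labelled transition
# system given as a set of (state, action, state) triples.  An adequate logic
# for location graphs would extend this modal core with spatial modalities.

def holds(lts, state, phi):
    # phi: ("true",) | ("and", p, q) | ("not", p) | ("dia", action, p)
    kind = phi[0]
    if kind == "true":
        return True
    if kind == "and":
        return holds(lts, state, phi[1]) and holds(lts, state, phi[2])
    if kind == "not":
        return not holds(lts, state, phi[1])
    if kind == "dia":  # <a>phi: some a-successor satisfies phi
        return any(holds(lts, s2, phi[2])
                   for (s1, a, s2) in lts if s1 == state and a == phi[1])
    raise ValueError(kind)

# The processes a.b.0 (state "p") and a.0 (state "q") are distinguished
# by the formula <a><b>true, as adequacy requires.
lts = {("p", "a", "p1"), ("p1", "b", "p2"), ("q", "a", "q1")}
phi = ("dia", "a", ("dia", "b", ("true",)))
```

Adequacy means precisely that two states satisfy the same formulas iff they are behaviorally equivalent, as the pair p/q above illustrates in one direction.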


References

[1] J.B. Stefani : "Components as Location Graphs", LNCS 8997, Springer, 2015.

[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.

[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.

[4] L. Aceto, A. Ingolfsdottir, K.G. Larsen, and J. Srba : "Reactive Systems", Cambridge University Press,
2007.

[5] L. Caires and L. Cardelli. A spatial logic for concurrency (part i). Inf. Comput., 186(2), 2003.

[6] L. Caires and L. Cardelli. A spatial logic for concurrency - ii. Theor. Comput. Sci., 322(3), 2004.

[7] P. O'Hearn : "Resources, concurrency and local reasoning", Theoretical Computer Science, 375(1-3), 2007.

[8] P. W. O'Hearn, H. Yang, and J. C. Reynolds : "Separation and information hiding",
ACM Trans. Program. Lang. Syst., 31(3), 2009.

[9] A. Francalanza, J. Rathke, and V. Sassone : "Permission-based separation logic for message-passing concurrency",
Logical Methods in Computer Science, 7(3), 2011.
URL sujet détaillé : :

Remarques :



SM207-20 Online assignment games – trading off mis-match and waiting costs  
Admin  
Encadrant : Bary PRADELSKI
Labo/Organisme : LIG
URL : https://barypradelski.com/
Ville : Grenoble

Description  

Assignment markets are ubiquitous: from assigning sellers to buyers, workers to jobs, or workload to machines. Shapley and Shubik (1972) introduced a model where each seller is looking for exactly one buyer and vice versa. They show that, generically,
there exists a unique and optimal matching.

This project will study online assignment markets, where buyers and sellers arrive to the market over time. Such markets may be classic labor markets or market institutions that have emerged in recent years, such as ride-sharing applications. For any market, modern technology allows the market designer to vary (at low cost) the points in time at which the market is cleared (Emek et al., 2016; Akbarpour et al., 2017). At one end of
the scale, she can immediately match buyers and sellers (and remove them from the market). At the other end of the scale, she can wait until all buyers and sellers have arrived and then apply the optimal matching. We shall call this design decision of the market designer the clearing schedule. This project seeks to understand:
1. How does the expected payoff change dependent on the clearing schedule?
2. What are (optimal) clearing schedules with respect to a specific cost function of
the waiting time?
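As a toy illustration of the two extremes of the clearing schedule (the scalar-type mis-match cost below is my own simplification, not part of the subject):

```python
# Toy model: each agent has a scalar type; the mis-match cost of a pair is
# |type gap|.  Immediate clearing pairs agents in arrival order (no waiting),
# batch clearing waits for everyone and then pairs optimally, trading
# waiting time for a lower mis-match cost.

def immediate_cost(buyers, sellers):
    # Pair the i-th buyer with the i-th seller as they arrive.
    return sum(abs(b - s) for b, s in zip(buyers, sellers))

def batch_cost(buyers, sellers):
    # With |type gap| costs on the line, sorting both sides and pairing
    # in order yields the optimal (minimum-cost) assignment.
    return sum(abs(b - s) for b, s in zip(sorted(buyers), sorted(sellers)))
```

On the arrival sequence buyers = [9, 1], sellers = [0, 10], immediate clearing costs 18 while batch clearing costs 2; adding a per-step waiting cost between these extremes is exactly question 2 above.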

The outcome of this research will ideally be presented at an international conference
with both academics and market design practitioners.

Required skills: Mandatory knowledge of basic probability theory and combinatorics.
In addition, simulation skills can be of use.
URL sujet détaillé : :

Remarques : The internship may be continued as a PhD. The internship is paid.



SM207-21 Pervasive platform for learning-based applications  
Admin  
Encadrant : Philippe LALANDA
Labo/Organisme : LIG
URL : http://lig-membres.imag.fr/lalanda/
Ville : Grenoble

Description  

The purpose of the internship is to provide new pervasive platforms and architectures based on the fog paradigm, in order to run smart, AI-based applications close to devices and users. This approach contrasts with current cloud-based solutions, which present significant drawbacks in terms of security, performance, and affordability.

The project will extend an existing service-oriented platform (see http://self-star.imag.fr) with new abstractions and mechanisms facilitating the development and evolution of smart applications. These extensions are meant to integrate individualized learning models and interact with remote platforms to distribute computing. The core of the work will be to better specify pervasive applications and their needs.

The work will be done in cooperation with Hong Kong Polytechnic University (a top-100 university). It will be applied to energy savings and crowd management in Hong Kong high-rise buildings, for which more than four years of data will be made available. Several learning-based applications have also already been developed by the Hong Kong team. A trip to Hong Kong might be necessary.

URL sujet détaillé : :

Remarques : Highly motivated student needed to work in an international context.



SM207-22 Deep learning for anomaly detection  
Admin  
Encadrant : Liyun HE GUELTON
Labo/Organisme : Worldline
URL : https://fr.worldline.com/
Ville : Villeurbanne

Description  

We are seeking a self-motivated and enthusiastic machine learning student to join the Worldline R&D team.

Machine learning interns will have the opportunity to explore the application of deep learning algorithms to anomaly detection problems. In this internship, sequential information will be extracted automatically to help anomaly detection. Recurrent Neural Networks (RNNs) are often used for this purpose. The work consists in using different RNN extensions in parallel, so as to deal with multi-source problems. In this context, you will have the opportunity to work on the latest general-purpose GPU technologies available on our in-house servers, with a variety of deep learning frameworks already set up.
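As a framework-free sketch of the underlying idea (a deliberately tiny scalar cell of my own, not a realistic Keras or PyTorch model): a recurrent cell summarizes the sequence seen so far, and the anomaly score of the next event is the model's prediction error on it.

```python
# Minimal sketch of RNN-based anomaly detection with a scalar Elman-style
# cell.  Real experiments would use trained LSTM layers instead; the weights
# here are arbitrary illustrative constants.
import math

def rnn_hidden(sequence, w_in=0.5, w_rec=0.8):
    # h_t = tanh(w_in * x_t + w_rec * h_{t-1}), starting from h_0 = 0.
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h

def anomaly_score(sequence, next_value, w_out=1.0, **cell_kw):
    # Score = squared error between a linear readout of the hidden state
    # and the actually observed next value: large score = surprising event.
    pred = w_out * rnn_hidden(sequence, **cell_kw)
    return (pred - next_value) ** 2
```

Running several such cells in parallel, one per data source, and combining their scores is the multi-source direction the internship explores with proper LSTM extensions.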

Skills acquired at the end of the internship:
- Programming with efficient technologies for the cases encountered
- Practical experience with machine learning
- Progress in AI algorithms, knowing how to opt for the right strategy and a suitable algorithm
- Producing demonstrative and documented results
- Developing technologies that are not restricted to one use-case
- Collaborating with researchers and engineers
- Participating in the proposition of new exploratory fields

URL sujet détaillé : https://jobs.atos.net/job/lyon/deep-learning-for-anomaly-detection-h-f-en-stage/5343/9159388

Remarques :



SM207-23 Evaluating the impact of cache control technologies on HPC workloads  
Admin  
Encadrant : Guillaume HUARD
Labo/Organisme :
URL : https://team.inria.fr/polaris
Ville : GRENOBLE

Description  

Many scientific processes can be expressed as a workflow graph where nodes are
computational tasks (scientific simulations, analysis, visualization) and edges
are data exchanges between tasks. Traditional workflow infrastructures exchange
data through files. However, the increasing gap between I/O bandwidth and
computational capabilities in current and future supercomputers
requires a change in the way scientists are analyzing data produced by
simulations. File-based workflows must now be replaced by in situ workflows
performing data extraction, data reduction, or online visualization before
storing relevant data to the file system.

On current supercomputers, an in situ workflow execution can span thousands
of nodes. Such an execution may exhibit single tasks distributed across
multiple nodes, or multiple tasks sharing the same node. In the latter
case, tasks must be isolated from each other to avoid potential performance
degradation.

For shared resources such as caches in many-core architectures, recent
technologies have been made available to partition them among distinct groups
of processes, for example the Cache Allocation Technology (CAT) on modern Intel
processors. Using an interface similar to low-level container mechanisms, one
can use this technology to split shared caches between competing workloads.

Unfortunately, the impact of such technology on the performance of HPC
workloads is still not well understood, making it difficult to choose the right
allocation policy, or to tune it on the fly during execution.

To address this issue, the purpose of this internship is to perform a wide
experimental study of the impact of cache controls on HPC workloads, with a
focus on their impact on runtime in the context of multiple workloads
competing for resources. This study will result in the design of predictive
models of how a cache control policy would impact a known workload at
runtime, focusing on either cache capacity or cache bandwidth control. The
evaluation will use currently available processors and existing software
allowing the control of Intel's CAT.
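As a toy illustration of why partitioning can help (a software LRU simulation with made-up workloads of my own, not a real CAT measurement): a streaming workload can evict the entire working set of a cache-friendly neighbor unless the cache is split.

```python
# Simulate two workloads either sharing one LRU cache or running on
# partitioned caches, mimicking the isolation that Intel CAT provides.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.refs = 0

    def access(self, addr):
        self.refs += 1
        if addr in self.store:
            self.hits += 1
            self.store.move_to_end(addr)  # mark as most recently used
        else:
            self.store[addr] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

# Workload A re-uses a small working set; workload B streams over unique data.
wl_a = [("A", i % 4) for i in range(100)]
wl_b = [("B", i) for i in range(100)]
interleaved = [x for pair in zip(wl_a, wl_b) for x in pair]

shared = LRUCache(6)                     # one cache shared by both workloads
for addr in interleaved:
    shared.access(addr)

part_a, part_b = LRUCache(4), LRUCache(2)  # same total capacity, partitioned
for addr in wl_a:
    part_a.access(addr)
for addr in wl_b:
    part_b.access(addr)
```

In the shared cache, B's stream keeps evicting A's 4-address working set, so A never hits; with a 4-way partition, A hits on every access after warm-up. Predicting where this crossover lies for real HPC workloads is the point of the study.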

This internship is part of a collaboration between Grenoble University and
Argonne National Laboratory. A successful candidate will have the opportunity
of visiting Argonne during the summer to further this work.
URL sujet détaillé : :

Remarques :



SM207-24 Online HPC workload data analysis for light and reactive control  
Admin  
Encadrant : Guillaume HUARD
Labo/Organisme :
URL : https://team.inria.fr/polaris
Ville : GRENOBLE

Description  

Many scientific processes can be expressed as a workflow graph where nodes are
computational tasks (scientific simulations, analysis, visualization) and edges
are data exchanges between tasks. Traditional workflow infrastructures exchange
data through files. However, the increasing gap between I/O bandwidth and
computational capabilities in current and future supercomputers
requires a change in the way scientists are analyzing data produced by
simulations. File-based workflows must now be replaced by in situ workflows
performing data extraction, data reduction, or online visualization before
storing relevant data to the file system.

On current supercomputers, an in situ workflow execution can span thousands
of nodes. Such an execution may exhibit single tasks distributed across
multiple nodes, or multiple tasks sharing the same node. In the latter
case, tasks must be isolated from each other to avoid potential performance
degradation.

However, the behavior of each task and the performance of the HPC application
itself depend on several external factors: the input dataset, the number of
computing resources available, possible network contention, etc. In such a
situation, the HPC runtime should adapt its placement and isolation policies
online, in reaction to the actual execution it controls. The cost of running
this control process can often be high, and the runtime chooses to perform it
sporadically to lessen its global impact. The main issue here is to decide
when it is most pertinent to run the control process in order to adapt the
resource policies.

The purpose of this internship is to design a method to analyze an application
online, that is, during its execution. The idea is to continuously collect a
trace containing software and hardware metrics, and to analyze it in order
to detect changes in the execution. We are mainly interested in detecting
phase changes (the application switches from computation to communication or
conversely), load changes (the input dataset is irregular and the computation
is not balanced) and environmental changes (there is network contention, some
nodes are down). The objective is to sort out what is relevant (has an impact
on performance) from what is not, and to trigger the runtime control
algorithms only when a relevant change occurs.
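A lightweight detector of the kind the subject calls for could look like the following sketch (a one-sided CUSUM test on a metric stream; the parameters and metric are illustrative assumptions of mine):

```python
# Online change detection with a one-sided CUSUM test: flags a sustained
# upward shift of a metric (e.g. cache-miss rate) relative to a baseline,
# in O(1) memory per sample, i.e. without storing the whole trace.

def cusum_detect(stream, baseline, drift=0.5, threshold=5.0):
    # Returns the index at which a sustained upward shift from `baseline`
    # is detected, or None.  `drift` absorbs noise around the baseline;
    # `threshold` trades detection delay against false alarms.
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + (x - baseline - drift))
        if s > threshold:
            return t
    return None
```

A detector like this would run on each collected metric and wake up the heavier runtime control algorithms only when it fires, which is exactly the "light and reactive" control loop targeted here.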

This internship is part of a collaboration between Grenoble University and
Argonne National Laboratory. A successful candidate will have the opportunity
of visiting Argonne during the summer to further this work.
URL sujet détaillé : :

Remarques :



SM207-25 Verified programming of stream functions  
Admin  
Encadrant : Jean-pierre TALPIN
Labo/Organisme : Inria project-team TEA
URL : https://www.irisa.fr/prive/talpin
Ville : Rennes

Description  

In this internship, we focus on the verified programming language F* (Inria-Microsoft), in order to define in it a concurrency model supporting the generation of reactive code (preemptive execution, dynamic scheduling) or real-time code (non-preemptive execution, static scheduling), using F*'s Kremlin code generator and the Meta-F* metaprogramming tools.
URL sujet détaillé : https://www.irisa.fr/prive/talpin/frp.pdf

Remarques : Internship co-supervised by Yuan Rui Zhang, post-doctoral researcher, in the context of the Composite associate team with UC San Diego (MESL)



SM207-26 A logical framework to verify requirements of hybrid system models  
Admin  
Encadrant : Jean-pierre TALPIN
Labo/Organisme : Inria project-team TEA
URL : https://www.irisa.fr/prive/talpin
Ville : Rennes

Description  

The objective of the internship is to build upon previous work on compositional verification of hybrid systems. This work yielded a component-based approach to system design with an executable semantics of contracts [3], and formalized cyber-physical systems as components with a strong logical foundation grounded in differential dynamic logic (dL), together with an automatic composition theorem for proving system properties from individual component contracts.
URL sujet détaillé : https://www.irisa.fr/prive/talpin/cps.pdf

Remarques : Internship co-supervised by Benoit Boyer, Mitsubishi Electric R&D Europe (MERCE)



SM207-27 Study and convergence proof of a deterministic Machine Learning algorithm  
Admin  
Encadrant : Roland GROZ
Labo/Organisme : LIG
URL : https://www.liglab.fr/fr/la-recherche/axes-et-equipes-de-recherche/vasco
Ville : Grenoble

Description  


Rigorous software development, based on formal methods, requires models of the program. As such models are not always available, various inference methods have been proposed to reconstruct them. The system is assumed to be representable, at some level of abstraction, by a finite-state machine with inputs and outputs.
For several years, the VASCO team at LIG has been working on "active" inference techniques, which consist in building this state machine by observing its outputs in response to various inputs. These learning techniques rely on black-box testing and assume access neither to the source code nor even to the binary code of the software.

These techniques fall within the theoretical framework of grammatical inference: recovering a model of a grammar (or, here, of automata) from observed sequences of events (time series, viewed as words of a language over the alphabet of events).

They have been widely used for about a dozen years to recover models of systems tested in black-box mode (model reverse-engineering), whether for verification, documentation, the search for security flaws, etc.

STARTING POINT

The VASCO team at LIG in Grenoble, in collaboration with the University of São Paulo, has recently developed a new algorithm called hW, which is very efficient in the case where the system cannot be reset, so that it must be learned along a single execution. The hW algorithm performs very well on average, but its convergence is difficult to prove.

WORK TO BE DONE

The intern will study the hypotheses that enable or facilitate a convergence proof for this inference algorithm.

A generalization of the algorithm may also be studied, to infer automata that are not strongly connected while minimizing the number of resets.

The theoretical work may be complemented by:
- experimental analyses on benchmarks of software systems
- the use of the algorithm on one or several case studies (for example, reverse-engineering of applications embedded on smart cards).
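As an illustration of one ingredient of such active inference (a sketch of my own, not the hW algorithm itself): without resets, the learner must identify the current state of a black-box Mealy machine by its responses to a characterizing set W.

```python
# Toy sketch: a Mealy machine and state identification via a characterizing
# set W (a set of input words whose outputs distinguish every pair of states).

class Mealy:
    def __init__(self, trans, out, state):
        self.trans, self.out, self.state = trans, out, state

    def step(self, a):
        o = self.out[(self.state, a)]
        self.state = self.trans[(self.state, a)]
        return o

    def run(self, word):
        return tuple(self.step(a) for a in word)

def signature(machine, state, W):
    # Responses that `state` would give to every word of W, computed on
    # fresh copies so the machine under test is not consumed.
    sigs = []
    for w in W:
        m = Mealy(machine.trans, machine.out, state)
        sigs.append(m.run(w))
    return tuple(sigs)

# Two-state toggle machine: input 'a' outputs the current state then toggles.
trans = {(0, 'a'): 1, (1, 'a'): 0}
out = {(0, 'a'): 0, (1, 'a'): 1}
W = [('a',)]  # here a single one-letter query already separates the states
```

Distinct signatures mean W separates the two states; the subtlety hW addresses is that, on a real black box, executing a W-query itself moves the machine to a new unknown state, which is where the convergence analysis gets hard.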


REFERENCES

[Dagstuhl2016] Amel Bennaceur, Reiner Hähnle, Karl Meinke.
Machine Learning for Dynamic Software Analysis: Potentials and Limits
International Dagstuhl Seminar 16172, Dagstuhl Castle, April 24-27, Springer 2018.

[GSBO18] Roland Groz, Adenilso Simao, Nicolas Bremond, Catherine Oriat.
Revisiting AI and Testing methods to Infer FSM Models of Black-Box Systems
AST 2018 (Automated Software Testing), Göteborg, May 2018.

URL sujet détaillé : :

Remarques :



SM207-28 Towards Efficient Big Data Management with Transient Storage Systems  
Admin  
Encadrant : Gabriel ANTONIU
Labo/Organisme :
URL : https://team.inria.fr/kerdata/
Ville : Rennes

Description  

In the High Performance Computing (HPC) domain, huge amounts of data are collected from complex simulations of physical phenomena (such as climate modeling, galaxy evolution, etc.), and data analysis techniques from the field of Big Data are used to extract relevant information. Needless to say, data storage is a cornerstone of Big Data processing. Whereas traditional HPC systems usually separate computational resources from storage, using parallel file systems, new supercomputers now often have local storage devices (such as SSDs or NVRAM) located on their compute nodes. This local storage allows users to deploy new types of distributed storage systems along with the applications. Such a storage system, deployed only for the duration of an application's execution, is called a transient storage system.

This internship focuses on answering the following research question: "How can we efficiently deploy and terminate a transient storage system?" Indeed, a transient storage system only exists during the run time of the application that uses it, and initially hosts no data. When the application terminates, data would be lost if not backed up. Thus, two steps are needed: loading data from persistent storage at initialization time, and dumping the data onto persistent storage before termination. Both operations involve large data transfers that may slow down the initialization and termination of such storage systems. Optimizing these transfers would lead to faster application deployment and termination.

Multiple relevant contributions can be achieved during this internship:

1. Theoretical contributions: establish a lower bound (mathematical model) on the duration of the operations of loading and dumping data, in order to identify their bottlenecks and to have a baseline for the evaluation of implementations of such systems.

2. Algorithmic contributions: optimize the algorithms used to transfer the data between nodes. For instance, one can take the network topology into account, or use AI to monitor the application's behavior and load/dump data in the background.

3. Experimental contributions: implement the algorithms in Pufferbench (a benchmark designed to evaluate commission and decommission algorithms) to experiment with the algorithms previously designed.

4. Validation contributions: implement and evaluate a microservice able to efficiently run the studied operations using Mochi (a set of libraries designed to quickly build distributed storage systems - https://www.mcs.anl.gov/research/projects/mochi/).
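For contribution 1, a first back-of-the-envelope version of such a lower bound might look like this (the two-bottleneck model and parameter names are assumptions of mine, offered only as a starting point):

```python
# Sketch of a lower bound on the time to load a transient storage system:
# moving `data_bytes` from persistent storage onto `nodes` compute nodes
# cannot beat either the aggregate bandwidth of the persistent file system
# or the per-node injection bandwidth, whichever binds.

def load_time_lower_bound(data_bytes, nodes, pfs_bw, node_bw):
    # pfs_bw: aggregate bandwidth of the persistent file system (bytes/s)
    # node_bw: network/storage injection bandwidth of one compute node (bytes/s)
    per_node = data_bytes / nodes  # best case: perfectly balanced transfer
    return max(data_bytes / pfs_bw, per_node / node_bw)
```

By symmetry, the same expression bounds the dump phase; measuring how far Pufferbench implementations sit from such a bound identifies the bottleneck to optimize.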

This subject will be done in close collaboration with Matthieu Dorier from Argonne National Laboratory (ANL, USA). In addition to the possibility to interact with top-level researchers and scientists from ANL, the work can involve the use of large-scale HPC experimental facilities available to Inria and to ANL.

Depending on the quality of the achievements, this internship could be followed by a PhD thesis in the same area, in the same collaboration context.
URL sujet détaillé : http://master.irisa.fr/internship/uploaded/5.pdf

Remarques : Advisors:

Gabriel Antoniu - gabriel.antoniu.fr (main advisor)
Nathanaël Cheriere - nathanael.cheriere.fr



SM207-29 HPC-Big Data convergence at processing level by bridging in-situ/in-transit processing with Big Data analytics  
Admin  
Encadrant : Alexandru COSTAN
Labo/Organisme :
URL : https://team.inria.fr/kerdata/
Ville : Rennes

Description  

In the High Performance Computing (HPC) area, the need to get fast and relevant insights from massive amounts of data generated by extreme-scale computations led to the emergence of in situ and in transit processing approaches. They allow data to be visualized and processed in real time, in an interactive way, as they are produced, as opposed to the traditional approach of transferring data off-site after the end of the computation for offline analysis. In the Big Data Analytics (BDA) area, the search for real-time, fast analysis materialized through a different approach: stream-based processing. As the tools and cultures of HPC and BDA have evolved in divergent directions, motivated by different optimization criteria, it becomes clear today that jointly leveraging the progress achieved in the two areas can be an efficient means to address the Big Data challenges, which are now relevant for both HPC and BDA.

This Master internship will be hosted by the KerData team at IRISA/Inria Rennes Bretagne Atlantique (https://team.inria.fr/kerdata/). It aims to propose an approach enabling HPC-Big Data convergence at the data processing level, by exploring alternative solutions to architect a unified framework for extreme-scale data processing. The architecture of such a framework will leverage the extreme scalability demonstrated by in situ/in transit data processing approaches originating in the HPC area, in conjunction with Big Data processing approaches that emerged in the BDA area (batch-based, streaming-based and hybrid). The high-level goal of this framework is to enable the usage of a large spectrum of Big Data analytics techniques at extreme scales, to support precise predictions in real time and fast decision making.

In the process of designing the unified data processing framework, we will leverage the Damaris framework for scalable, asynchronous I/O and in-situ and in- transit visualization and processing, developed by KerData at Inria (https://project.inria.fr/damaris). Damaris already demonstrated its scalability up to 16,000 cores on some of the top supercomputers of Top500, including Titan, Jaguar and Kraken. For the purpose of this internship, Damaris will have to be extended to support Big Data analytics for data processing (e.g., based on the Apache Flink and Apache Spark engines and on their higher-level machine-learning libraries). The work will involve software architecture design, implementation and experimental validation.
In particular, adding programmatic support for Big Data analytics on dedicated cores in Damaris will focus on distributed, high-performance, always-available, elastic and accurate data processing. Furthermore, we plan to design, implement and experimentally validate a mechanism for incrementally migrating running stream tasks from the in-situ processing backend to the in-transit one without stopping the query execution.
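As a toy illustration of the intended migration mechanism (all names are illustrative; this is not the Damaris, Flink or Spark API), a stateful stream operator can be checkpointed on one backend and resumed on another without losing state or stopping the query:

```python
class CountingOperator:
    """Toy stateful stream operator: counts occurrences per key."""
    def __init__(self, state=None):
        self.state = dict(state or {})

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1

    def checkpoint(self):
        return dict(self.state)  # snapshot handed to the new backend

# The "in-situ" backend processes the head of the stream ...
stream = ["a", "b", "a", "c", "a", "b"]
in_situ = CountingOperator()
for event in stream[:3]:
    in_situ.process(event)

# ... then the operator migrates to the "in-transit" backend
# by resuming from the checkpoint, without losing any counts.
in_transit = CountingOperator(in_situ.checkpoint())
for event in stream[3:]:
    in_transit.process(event)

assert in_transit.state == {"a": 3, "b": 2, "c": 1}
```

The real mechanism would of course migrate state incrementally and concurrently with processing; this sketch only shows the state-handoff invariant to preserve.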

This subject will be done in close collaboration with Bogdan Nicolae and Pierre Matri from Argonne National Laboratory (ANL, USA). In addition to the possibility to interact with top-level researchers and scientists from ANL, the work can involve the use of large-scale HPC experimental facilities available at ANL as well as Inria's Grid5000 distributed testbed - https://www.grid5000.fr/
Depending on the quality of the achievements, this internship could be followed by a Ph.D. thesis in the same area, in the same collaboration context.
URL sujet détaillé : http://master.irisa.fr/internship/uploaded/23.pdf

Remarques : Advisors:

Alexandru Costan - alexandru.costan.fr
Gabriel Antoniu - gabriel.antoniu.fr




SM207-30 Safe floating-point optimisation  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

Satisfiability modulo theory is an approach to solving first-order formulas in certain logical theories or combinations thereof. It is used in program verification and security analyses (concolic testing), as well as for certain kinds of optimization problems.

Originally, SMT-solvers only allowed ideal arithmetic types (Z, Q), but some now also allow floating-point types. The idea is that if we wish to prove the absence of corner cases in numerical algorithms implemented with floating-point arithmetic, we must be able to specify problems with floating-point variables and floating-point operations.

One way to address this issue is to "bit-blast" floating-point operations into basic operations over bits, as in a hardware implementation of floating-point. This has been done in certain solvers (Z3...) but results in formulas that are very hard and slow to solve.

We instead wish to be fast in the "easy" case where a formula holds over floating-point values because it holds over the real numbers, using ideal operations, and still holds when rounding error is accounted for. For this, we will soundly take rounding error into account, with increasing precision.
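As an illustration of this idea (a sketch, not part of the subject's material), the standard (1+δ) rounding model for IEEE-754 doubles lets one widen the exact real result of an operation into a sound interval that is guaranteed to contain the floating-point result:

```python
from fractions import Fraction

U = Fraction(1, 2**53)       # unit roundoff for binary64 (round-to-nearest)
ETA = Fraction(1, 2**1074)   # smallest subnormal, soundly covers underflow

def fp_enclosure(real_value):
    """Sound interval enclosing fl(e) when the exact real result of e is real_value."""
    r = Fraction(real_value)
    err = abs(r) * U + ETA
    return (r - err, r + err)

# Exact rational sum of the doubles 0.1 and 0.2 ...
exact = Fraction(0.1) + Fraction(0.2)
lo, hi = fp_enclosure(exact)
# ... soundly encloses the actually rounded floating-point result.
assert lo <= Fraction(0.1 + 0.2) <= hi
```

A property proved for the widened interval then also holds for the floating-point computation; tightening the widening gives the "increasing precision" mentioned above.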

The goal of this project is to explore this approach.
URL sujet détaillé : http://www-verimag.imag.fr/~monniaux/stages/2019/float_optimisation.pdf

Remarques :



SM207-31 Memory-aware scheduling for the StarPU runtime  
Admin  
Encadrant : Loris MARCHAL
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/loris.marchal/
Ville : Lyon

Description  

Parallel computing platforms are used, among others, to perform large-scale numerical simulations and are becoming increasingly complex, as they now include computing accelerators, deep memory hierarchies, and the combination of shared and distributed memories. One way to take advantage of their raw power despite this complexity is to rely on lightweight scheduling runtimes, such as the StarPU runtime. A scientific computing application is then described as a directed graph of tasks, where arcs represent dependencies between tasks. The scheduler allocates available tasks on the various computing resources of the platform. One of the problems in solving such a scheduling problem efficiently is linked to memory: we have to avoid the scenario where the platform runs out of memory, otherwise the application would crash (or experience very bad performance because of swapping).

Solutions to this problem have been proposed. First, schedulers taking into account the amount of available memory have been proposed for the case of a single computing resource or for specific task graphs on parallel resources. More recently, we have proposed a method to add new dependencies to the task graph such that any valid schedule of the resulting graph is guaranteed not to exceed a given amount of memory. The benefit of this method is to deal with memory bottlenecks before the execution, and to allow for runtime schedulers, which are able to dynamically adapt to changing conditions in the computing platforms. Unfortunately, the complexity of the algorithm and the large number of added dependencies make this method hard to use in practice.



The goal of this internship is to improve this method by limiting the needed guarantees: instead of guaranteeing that no execution will exceed the available memory (the actual guarantee), it is indeed enough to make sure that no memory bottlenecks will happen, as runtime schedulers such as StarPU are able to block tasks requesting memory until other tasks release enough of it. The objective is to dramatically limit the number of new dependencies to add to the graph.
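To make the memory issue concrete, a minimal sketch (not StarPU code; the task names and sizes are illustrative) computes the peak memory of a sequential schedule of a task graph, assuming each task's output stays in memory until its last consumer has run:

```python
def peak_memory(schedule, out_size, preds):
    """Peak memory of a sequential schedule.
    schedule: tasks in execution order; out_size[t]: size of t's output;
    preds[t]: tasks whose outputs t reads."""
    # how many consumers still need each output
    readers = {t: 0 for t in schedule}
    for t in schedule:
        for p in preds[t]:
            readers[p] += 1
    live = peak = 0
    for t in schedule:
        live += out_size[t]                 # allocate t's output
        peak = max(peak, live)
        for p in preds[t]:                  # release inputs no longer needed
            readers[p] -= 1
            if readers[p] == 0:
                live -= out_size[p]
    return peak

# Diamond graph a -> {b, c} -> d: a's large output must stay live
# until both b and c have run.
size = {"a": 4, "b": 2, "c": 2, "d": 1}
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
assert peak_memory(["a", "b", "c", "d"], size, preds) == 8
```

A memory-aware method as described above would add dependencies (or let the runtime block tasks) so that this peak stays under the platform's memory budget.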

The internship will consist of the following steps:
- Model the problem using graph concepts, select the most appropriate memory model and express the objective as a graph property.
- Propose algorithmic solutions to solve the problem, with a special care on the complexity of the solutions.
- Test the proposed solutions first by simulation, on actual application graphs.
- Implement some of these solutions on the StarPU runtime.

The distribution of the work between theory (designing algorithms and proving them) and practice (simulation and real implementation) is flexible and will depend on the skills and preferences of the candidate.

URL sujet détaillé : http://perso.ens-lyon.fr/loris.marchal/sujet-stage-M2-en.pdf

Remarques : Co-advised by Samuel Thibault (Univ. Bordeaux), leader of the StarPU development, who will be on leave at the LIP during the internship.



SM207-32 Smoothed analysis of walking strategies in Delaunay triangulation  
Admin  
Encadrant : Olivier DEVILLERS
Labo/Organisme : Loria
URL : http://gamble.loria.fr/
Ville : Nancy

Description  

The smoothed analysis technique allows a trade-off between worst-case complexity and average-case complexity by varying a noise level
(no noise is the worst case; a lot of noise means fully random positions).

The aim of the internship is to apply this kind of analysis to algorithms that traverse a triangulation using the neighborhood relations between triangles.
Such algorithms have applications in routing and point location.
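As a toy illustration of the smoothed-analysis idea, outside the triangulation setting: take a worst-case input for a simple quadratic cost measure (the number of inversions, i.e., the cost of insertion sort), perturb it with Gaussian noise, and observe the expected cost drop as the noise level grows:

```python
import random

def inversions(a):
    """Quadratic inversion count: the cost of insertion sort on a."""
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a)) if a[i] > a[j])

def smoothed_cost(n, sigma, trials=5, seed=0):
    """Average cost on an adversarial (reversed) input perturbed by Gaussian noise."""
    rng = random.Random(seed)
    worst = list(range(n, 0, -1))
    return sum(inversions([x + rng.gauss(0, sigma) for x in worst])
               for _ in range(trials)) / trials

# No noise: the adversarial input keeps its worst-case cost n(n-1)/2.
assert smoothed_cost(100, 0.0) == 100 * 99 / 2
# Large noise: the input is essentially random and the cost drops.
assert smoothed_cost(100, 1000.0) < 100 * 99 / 2
```

The internship asks the analogous question for walk lengths in Delaunay triangulations: how fast does the worst-case walk cost decay as the point positions are perturbed?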
URL sujet détaillé : https://members.loria.fr/Olivier.Devillers/stage-smoothed.pdf

Remarques : Statutory internship stipend



SM207-33 DTIS-2019-07 (to be quoted in all correspondence)  
Admin  
Encadrant : Cédric PRALET
Labo/Organisme : ONERA Midi-Pyrénées (MIP), Département Traitement de l'Information et Systèmes (DTIS), unité SYstèmes intelligents et Décision (SYD).
URL : https://www.onera.fr/en/staff/cedric-pralet
Ville : Toulouse

Description  

Combinatorial optimization problems such as task allocation, scheduling, and vehicle routing occur in many applications in the aerospace domain: when planning Earth or space observation tasks for satellite constellations, when deploying complex tasks on embedded, real-time many-core computing platforms, when planning the deployment of a robot fleet in a given zone, when planning tasks for aircraft assembly lines, etc.
To solve such problems two main classes of algorithms can be distinguished:
complete search methods on the one hand, which are guaranteed to find
a solution if one exists and can prove the optimality of a solution,
and incomplete search methods on the other hand, which explore candidate solutions in a more
heuristic manner, without the guarantees of complete methods, and are
typically used for solving problems under limited computation time budgets, or for solving very high dimensional problems, or both.

Major advances for complete methods have been brought by the use of
/conflict-driven/ techniques, whereby explanations for failures to
satisfy the problem constraints are generated and
recorded by the search process, so as to avoid reproducing bad
decisions and to steer the search process away from bad regions of the
search space. Such techniques are used for instance for the Boolean
satisfiability problem (SAT) in /Conflict Driven Clause Learning
(CDCL)/ [1] algorithms, and for satisfiability modulo theory solvers
(SMT) with the /CDCL(T)/ algorithm or /Model-Constructing
Satisfiability Calculus/ [2], and in Constraint Programming with the
/Lazy Clause Generation/ approach [3].

Nevertheless, complete methods still have issues scaling to high
dimensional problems, which often forces practitioners to resort to incomplete
methods such as /greedy search/, /local search/ or /meta-heuristic
search/ [4], either as main search engines, or to guide decisions
performed by complete search methods by probing the search space.

The goal of this internship is to study to which extent conflict
learning methods inspired by complete search methods can increase the
performance of incomplete search methods. The first step of the work
is to perform a state-of-the-art report on existing hybrid
incomplete/conflict learning methods [5]. The second step is to
propose novel ways to combine incomplete methods and conflict learning
methods, possibly revisiting the nature of conflict information handled
by the algorithms and the way it is processed and used during search.
Last, experiments will be conducted to assess the performance increase of
the proposed hybrid method.
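A minimal sketch of one such hybrid (a clause-weighting local search in the spirit of the breakout method, related in purpose to [5]; the instance is illustrative): clauses that keep appearing in conflicts get heavier weights, which steers the local search away from repeatedly violated constraints:

```python
import random

def local_search_sat(clauses, n_vars, max_flips=10000, seed=1):
    """Clause-weighting local search.  A clause is a list of non-zero ints;
    literal v means variable v is True, -v means it is False."""
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}
    weight = [1] * len(clauses)

    def sat(i):
        return any(assign[abs(l)] == (l > 0) for l in clauses[i])

    for _ in range(max_flips):
        unsat = [i for i in range(len(clauses)) if not sat(i)]
        if not unsat:
            return assign
        current = sum(weight[i] for i in unsat)
        # flip the variable minimising the total weight of unsatisfied clauses
        best_var, best_cost = None, None
        for v in range(1, n_vars + 1):
            assign[v] = not assign[v]
            cost = sum(weight[i] for i in range(len(clauses)) if not sat(i))
            assign[v] = not assign[v]
            if best_cost is None or cost < best_cost:
                best_var, best_cost = v, cost
        if best_cost >= current:
            for i in unsat:        # local minimum: record the conflict by
                weight[i] += 1     # penalising the clauses that keep failing
        else:
            assign[best_var] = not assign[best_var]
    return None

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
model = local_search_sat(clauses, 3)
assert model is not None
assert all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses)
```

The internship would go beyond this: richer conflict information (explanations, nogoods) rather than simple clause weights, and a study of how it should be produced and consumed during incomplete search.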

Bibliography:
[1] Conflict Driven Clause Learning, Chapter 4, Handbook of
Satisfiability. Armin Biere, Marijn Heule, Hans van Maaren and Toby
Walsh (Eds.) IOS Press, 2009.

[2] A Model-Constructing Satisfiability Calculus. Leonardo Mendonça de
Moura and Dejan Jovanovic. Verification, Model Checking, and Abstract
Interpretation, 14th International Conference (VMCAI), 2013.

[3] Solving RCPSP/max by lazy clause generation. Andreas Schutt,
Thibaut Feydy, Peter J. Stuckey and Mark G. Wallace. Journal of
Scheduling 16(3) : 273-289 (2013).

[4] Handbook of Metaheuristics. Michel Gendreau and Jean-Yves
Potvin. Springer Publishing Company,
2010.

[5] Integrating Conflict Driven Clause Learning to Local
Search. Gilles Audemard and Jean-Marie Lagniez and Bertrand Mazure and
Lakhdar Sais. Proceedings 6th International Workshop on Local Search
Techniques in Constraint Satisfaction,
2009.
URL sujet détaillé : https://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis-2019-07.pdf

Remarques : internship co-advised with Rémi DELMAS (ONERA)



SM207-34 Measuring information systems alignment  
Admin  
Encadrant : Jannik LAVAL
Labo/Organisme : DISP Lab - University of Lyon
URL : https://www.disp-lab.fr/
Ville : BRON

Description  

In the context of the flexible and agile factory of the future, it is crucial that the information system (IS), cornerstone of modern organizations, evolves in harmony with the organization it supports. The IS's ability to meet the business and organization needs is called alignment. It is regularly considered one of the main concerns of IS managers. In general, alignment is worked out during the IS design so that the designed IS supports both the business strategies and the business processes involved. However, the environment in which companies operate is itself evolving. This implies an adjustment of the alignment over time and therefore changes to the IS.

Context and objectives:
The ability of an organization to evolve is a key factor of its success, the control of the IS dynamics is therefore a main stake. The objective of the project is to set up models to measure the evolution of IS alignment over time.

To address the above barriers, we will work on three complementary scientific areas:
- Assessment of alignment at the different levels (strategic, tactical and project): the challenge is to model the different levels of alignment and then to define the corresponding parameters enabling to evaluate a given level of alignment and to articulate them. Metrics will be set up for each level. In addition, the evaluation should be as automated as possible.
- Alignment dynamics modelling: The main problem here is to develop a model that reflects the evolution of the alignment over time for the different levels of alignment.
- Dynamic Alignment Adjustment Methodology: The goal here is to recommend what to do in order to control and adjust the IS alignment over time. In other words, the goal is to put together the alignment assessment models and the model of the alignment dynamics and then to exploit them to support the management of the IS alignment.
In addition, we hope to develop a software prototype implementing the proposed methodology.
URL sujet détaillé : https://disp-lab.fr/sites/default/files/disp-2019a_m2_internship_-_va.pdf

Remarques : Internship based at the DISP laboratory of the University of Lyon, in partnership with the ICUBE CSIP team, INSA Strasbourg, which can lead to a PhD thesis.



SM207-35 Transformation Proof From Madeus to Petri Nets  
Admin  
Encadrant : Helene COULLON
Labo/Organisme : IMT Atlantique, Inria, LS2N
URL : http://helene-coullon.fr/
Ville : Nantes

Description  

Due to the complexity of distributed infrastructures, the installation (deployment) and regular maintenance (e.g. updates, fault management, and other elements of normal software life-cycles) of distributed software become a complex administrative task, composed of many steps and well-defined procedures. In the STACK research team at Inria (France), Helene Coullon works on Madeus, a novel formal model for distributed software deployment, as well as its Python implementation in a product called MAD (Madeus Application Deployer). The overall approach that Madeus uses to support more efficient deployment is to expose and leverage opportunities for parallelism in the deployment process. Unfortunately, introducing parallelism also introduces more complexity for the end-user/administrator wanting to deploy a distributed software system.
For this reason, Helene Coullon and Ludovic Henrio work in collaboration with other computer scientists to apply formal verification on deployments written with Madeus. Verification is of major importance for large-scale distributed systems that introduce concurrency and uncertainty.

Among other ongoing work on formal verification, a Madeus deployment process can be translated to a Petri net. As a result, once the transformation is applied, model checkers can be used to perform verifications. The formal semantics that defines a translation from Madeus to Petri nets has already been defined. Petri nets are particularly well adapted to modeling Madeus deployment processes, because the syntax and semantics of a Madeus deployment process are similar to those of a Petri net.

The intern will be responsible for proving that the transformation from a Madeus deployment process to Petri nets is correct, i.e., that the obtained Petri net has the same behavior as the deployment formally defined with Madeus.
The missions of the intern candidate are to:
- understand both the operational semantics of Madeus and its semantics as a translation to Petri nets,
- study which formulation of the semantics of Petri nets is best adapted to this work,
- study the different possibilities for proving the correctness of the transformation and choose one; the main question to be answered here is: what kind of equivalence do we prove between the two semantics and what are its properties,
- perform the proof,
- finally, the proof could be carried out with the Coq proof assistant, but a thorough paper proof would also be satisfactory.
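A minimal Petri net interpreter (a sketch; not the actual Madeus/MAD code, and the deployment steps are illustrative) makes the target semantics concrete: a transition consumes one token from each input place and produces one token in each output place:

```python
class PetriNet:
    """Minimal place/transition net with unit arc weights."""
    def __init__(self, marking):
        self.marking = dict(marking)      # place -> token count
        self.transitions = {}             # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        ins, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= 1 for p in ins)

    def fire(self, name):
        assert self.enabled(name)
        ins, outs = self.transitions[name]
        for p in ins:
            self.marking[p] -= 1          # consume input tokens
        for p in outs:
            self.marking[p] = self.marking.get(p, 0) + 1  # produce outputs

# A two-step deployment: download an artifact, then configure it.
net = PetriNet({"idle": 1})
net.add_transition("download", ["idle"], ["downloaded"])
net.add_transition("configure", ["downloaded"], ["ready"])
net.fire("download")
net.fire("configure")
assert net.marking == {"idle": 0, "downloaded": 0, "ready": 1}
```

The proof targeted by the internship would relate executions of the Madeus operational semantics to firing sequences of the translated net, for a chosen notion of equivalence.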
URL sujet détaillé : http://helene-coullon.fr/download/internship-position-ens.pdf

Remarques : Co-advised by Ludovic Henrio (ludovic.henrio.fr) CNRS researcher in Lyon at the LIP, ENS Lyon.



SM207-36 Rounding 3D meshes  
Admin  
Encadrant : Sylvain LAZARD
Labo/Organisme : Inria Nancy / LORIA
URL : https://members.loria.fr/Sylvain.Lazard/
Ville : Nancy

Description  

In most software, 3D polygonal meshes have vertices whose coordinates are represented with fixed-precision floating-point numbers, say doubles. Unfortunately, operations such as rotations and intersections create vertices whose coordinates cannot be exactly represented with doubles. A natural problem is thus to round the coordinates to doubles without creating self-intersections between the faces and without changing the geometry too much; the faces can, however, be subdivided.
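A 2D analogue of the hazard (a sketch with illustrative coordinates) shows how naively rounding vertices to a grid can flip the exact orientation of a triangle, which is precisely how invalid geometry such as self-intersections appears:

```python
from fractions import Fraction

def orientation(a, b, c):
    """Sign of the exact 2D orientation determinant (rational arithmetic)."""
    d = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (d > 0) - (d < 0)

def snap(p, step=Fraction(1, 10)):
    """Round each coordinate to the nearest point of a coarse grid."""
    return tuple(round(x / step) * step for x in p)

a = (Fraction(0), Fraction(0))
b = (Fraction(1), Fraction(14, 100))
c = (Fraction(2), Fraction(26, 100))
before = orientation(a, b, c)                    # orientation before rounding
after = orientation(snap(a), snap(b), snap(c))   # ... and after naive snapping
assert before == -1 and after == 1               # the orientation flipped
```

Snap-rounding algorithms like the one mentioned below must prevent exactly this kind of sign change (or repair it by subdividing faces).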

We recently presented an algorithm that solves the problem in 3D on the integer grid. This solution is not yet known to be practical because its worst-case complexity, O(n^19), is very bad, but we conjecture a complexity of O(n√n) in practice on non-pathological data.

The goal of this Master thesis, which could lead to a Ph.D. thesis, is to investigate and eventually develop a practical and efficient solution to this problem. The approach will be to implement a simplified version of our algorithm and, in parallel, to work on improving its efficiency, both in theory and in practice.
URL sujet détaillé : https://members.loria.fr/Sylvain.Lazard/stage-snap.pdf

Remarques :



SM207-37 Build a platform for increasing the transparency of social media advertising  
Admin  
Encadrant : Oana GOGA
Labo/Organisme : Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France
URL : https://lig-membres.imag.fr/gogao/
Ville : Grenoble

Description  

Keywords: audit, privacy, online targeted advertising, Twitter, Facebook, browser extension, data collection and analysis,
machine learning, inference, statistics
Lab: Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France (head: Eric Gaussier)
Team in the lab: SLIDE (head: Sihem Amer-Yahia)
Advisor: Oana Goga (CNRS & Univ. Grenoble Alpes, LIG) oana.goga-sws.org https://people.mpi-sws.org/~ogoga/
Project Description:
In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2].
The project consists in building a tool that provides explanations for why a user has been targeted with a particular ad that
does not need the collaboration of the advertising platform. The key idea is to crowdsource the transparency task to
users. The project consists in building a collaborative tool where users donate in a privacy-preserving manner data about
the ads they receive. The platform needs to aggregate data from multiple users (using machine learning) to infer
(statistically) why a user received a particular ad. Intuitively, the tool will group together all users that received the same
ad, and look at the most common demographics and interests of users in the group. The key challenge is to identify the
limits of what we can statistically infer from such a platform.
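A minimal sketch of the statistical aggregation step (all user data below is fabricated for illustration): group the users who received a given ad and keep the attributes shared by most of them:

```python
from collections import Counter

# Hypothetical donated data: per-user attributes, and who received each ad.
profiles = {
    "u1": {"age:18-24", "interest:gaming", "city:Grenoble"},
    "u2": {"age:18-24", "interest:gaming", "city:Lyon"},
    "u3": {"age:25-34", "interest:gaming", "city:Grenoble"},
    "u4": {"age:45-54", "interest:cooking", "city:Paris"},
}
ad_recipients = {"ad42": ["u1", "u2", "u3"]}

def likely_targeting(ad, min_support=0.8):
    """Attributes shared by at least min_support of the ad's recipients."""
    users = ad_recipients[ad]
    counts = Counter(attr for u in users for attr in profiles[u])
    return {attr for attr, n in counts.items() if n / len(users) >= min_support}

# Only "interest:gaming" is common to (at least 80% of) all recipients.
assert likely_targeting("ad42") == {"interest:gaming"}
```

The research question is precisely the limits of such inferences: with how many users, and with what support threshold, does this become statistically reliable?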
Throughout the project the student will be able to familiarize themselves with online targeted advertising ecosystems, learn to collect data from online services, and apply machine learning and statistics concepts to real-world data.
Requirements:
Strong coding skills and strong background in statistics and data mining.
URL sujet détaillé : https://people.mpi-sws.org/~ogoga/offers/Internship_2017_crowdsouring.pdf

Remarques : Possibility of remuneration



SM207-38 Audit the transparency mechanisms provided by social media advertising platforms  
Admin  
Encadrant : Oana GOGA
Labo/Organisme : Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France
URL : https://lig-membres.imag.fr/gogao/
Ville : Grenoble

Description  

Keywords: audit, privacy, online targeted advertising, Twitter, browser extension, data collection and analysis, machine
learning, statistics

Lab: Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France (head: Eric Gaussier)

Team in the lab: SLIDE (head: Sihem Amer-Yahia)

Advisor: Oana Goga (CNRS & Univ. Grenoble Alpes, LIG) oana.goga-sws.org https://people.mpi-sws.org/~ogoga/

Project Description:
In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2].
To enhance transparency, Twitter recently introduced a feature (called "why am I seeing this") that provides users with an explanation for why they have been targeted with a particular ad. While this is a positive step, it is important to check whether these transparency mechanisms are not actually deceiving users, leading to more harm than good. The goal of the project is to verify whether the explanations provided by Twitter satisfy basic properties such as completeness, correctness and consistency. The student will need to develop a browser extension or a mobile application able to collect the ad explanations received by real-world users on Twitter, and to conduct controlled ad campaigns so that we can collect the corresponding ad explanations.
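The properties to audit can be stated as simple set inclusions; a minimal sketch with fabricated attributes (the controlled campaigns provide the ground-truth `actual_targeting` side):

```python
# Hypothetical ground truth from a controlled campaign, versus what the
# platform's "why am I seeing this" explanation claims.
actual_targeting = {"age:18-24", "interest:gaming", "city:Grenoble"}
explanation = {"interest:gaming", "interest:cooking"}

# Correctness: every attribute in the explanation was really used.
correct = explanation <= actual_targeting
# Completeness: every attribute really used appears in the explanation.
complete = actual_targeting <= explanation

assert not correct    # "interest:cooking" was never targeted
assert not complete   # "age:18-24" and "city:Grenoble" are hidden
```

Consistency would additionally compare the explanations shown to different users targeted by the same campaign.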
Throughout the project the student will be able to familiarize themselves with online targeted advertising ecosystems, learn to conduct online experiments and measure their impact, and reflect on what makes a good explanation.

Requirements:
Strong coding skills. Experience in working with data is a plus.
URL sujet détaillé : https://people.mpi-sws.org/~ogoga/offers/Internship_2017_Twitter_explanations.pdf

Remarques : Possibility of remuneration; can be continued with a PhD thesis



SM207-39 Energy management in the Linux kernel  
Admin  
Encadrant : Georges DA COSTA
Labo/Organisme : University of Toulouse
URL : http://www.irit.fr/~Georges.Da-Costa/
Ville : Toulouse

Description  

In the mobile domain, active thermal regulation, such as a fan, is not possible. To keep the hardware within its thermal envelope, software solutions are implemented in the kernel to regulate the temperature; this is the case of the CPU cooling device mechanism, which adjusts the processor frequency according to the temperature.

Two new strategies are to be studied: the injection of idle cycles [1] and the simultaneous (combo) use of idle-cycle injection and frequency changes [2]. Since the number of available frequencies is limited, the idea is to create virtual frequencies by combining real frequencies with idle cycles.
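The virtual-frequency idea reduces to a first-order duty-cycle computation (a sketch that deliberately ignores the latency effects discussed in [3]): running at a real frequency while idling a fraction r of the time yields an average throughput of (1 - r) times that frequency:

```python
def idle_ratio(f_real, f_virtual):
    """Fraction of time to inject idle cycles so that running at f_real
    yields the average throughput of a (virtual) frequency f_virtual."""
    assert 0 < f_virtual <= f_real
    return 1 - f_virtual / f_real

# Emulating 1.5 GHz on a CPU whose closest real frequency is 2.0 GHz:
assert idle_ratio(2.0, 1.5) == 0.25   # idle 25% of the time
assert idle_ratio(2.0, 2.0) == 0.0    # a real frequency needs no injection
```

The driver work in this internship is about realizing such ratios in practice while keeping per-period idle bursts short enough not to hurt latency.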

However, this approach can prove too penalizing for the user if it introduces excessive latency [3].

The objective of the internship is to build a driver, to experiment with the existing framework to evaluate the impact of these solutions, to propose improvements, and to formalize the reasoning mathematically.

The student will be supervised by an IRIT researcher and a kernel development expert.

The final goal is for this driver to be merged into the Android (4.9 and 4.14) and Linux kernels.

Prerequisites:

- Interest in and motivation for this kind of problem
- Desire to work in the open-source world
- Comfort with the OSS/GNU/Linux development environment

For the duration of the internship, the student will be provided with a pre-installed laptop, as well as a development board [4][5] for testing.

Since the working environment is distributed, there are no company premises; IRIT will host the student in its offices.

Without being mandatory, a good command of technical English would be a plus for communicating comfortably with the various participants.


[1]
https://git.linaro.org/people/daniel.lezcano/linux.git/commit/?h=thermal/idle-inject&id=f31b7021f2910ff6969e998861d9fb269c782579

[2]
https://git.linaro.org/people/daniel.lezcano/linux.git/commit/?h=thermal/idle-inject&id=9a136388ef1ceddc35f569abc2b4bb5bc33ef21d

[3] https://www.youtube.com/watch?v=4V1ynaj6dq8

[4]
https://github.com/96boards/documentation/blob/master/ConsumerEdition/HiKey960/HardwareDocs/HardwareUserManual.md

[5]
http://www.cnetfrance.fr/news/hikey-960-huawei-et-google-lancent-un-nano-ordinateur-facon-raspberry-pi-39851830.htm
URL sujet détaillé : :

Remarques : This work will be done in collaboration with (and paid by) Linaro, the company responsible for the thermal/frequency subsystem in Linux on ARM



SM207-40 Complexity of factorization in products of linear forms  
Admin  
Encadrant : Pascal KOIRAN
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/pascal.koiran
Ville : Lyon

Description  

We have recently proposed new algorithms for the factorization of multivariate polynomials into products of linear forms. The goal of this internship will be to improve the proposed algorithms in directions such as:
(i) better running time.
(ii) derandomization.
(iii) ability to handle polynomials of very high degree.
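For the special case of degree 2, the factorization question has a classical linear-algebra answer: a quadratic form factors into a product of two linear forms (over an algebraically closed field) if and only if the rank of its symmetric matrix is at most 2. A sketch in exact rational arithmetic (the general internship targets arbitrary degree; this only illustrates the flavor of the problem):

```python
from fractions import Fraction

def rank(matrix):
    """Rank over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# x^2 - y^2 = (x - y)(x + y): rank 2, so it factors.
assert rank([[1, 0], [0, -1]]) == 2
# x*y + z^2: rank 3, so it is not a product of linear forms.
assert rank([[0, Fraction(1, 2), 0],
             [Fraction(1, 2), 0, 0],
             [0, 0, 1]]) == 3
```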

This internship is suitable for a student interested in algorithms and complexity (and especially arithmetic circuit complexity).
URL sujet détaillé : http://perso.ens-lyon.fr/pascal.koiran/stageM2_1819.pdf

Remarques :



SM207-41 Integer and real octagons in constraint programming  
Admin  
Encadrant : Charlotte TRUCHET
Labo/Organisme : LS2N UMR 6004
URL : http://www.normalesup.org/~truchet/
Ville : Nantes

Description  

The goal of this internship is to study more precisely the abstract domain of octagons, as defined in abstract interpretation, in the setting of constraint programming, and its potential applications to discrete problems. Octagons have already shown good performance on problems with real variables [PTB2014]; moreover, the closure algorithm on octagons [M2006] is very close to dedicated consistency algorithms such as temporal constraint networks [DP1987]. The octagon abstract domain, already well studied in the case of real variables, therefore seems a good candidate for solving problems with integer variables, or with both integer and real variables, under temporal constraints (timetabling, scheduling, etc.).
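The closure algorithm in question is, at its core, a shortest-path computation. A sketch on the simpler difference-bound matrices (octagons run the same Floyd-Warshall closure on a matrix twice as large, one row per +x_i and -x_i, plus a coherence step):

```python
INF = float("inf")

def closure(m):
    """Floyd-Warshall closure of a difference-bound matrix:
    m[i][j] is an upper bound on x_j - x_i (INF if unconstrained)."""
    n = len(m)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

# x1 - x0 <= 2 and x2 - x1 <= 3 imply x2 - x0 <= 5 after closure.
m = [[0, 2, INF],
     [INF, 0, 3],
     [INF, INF, 0]]
closure(m)
assert m[0][2] == 5
```

This propagation of transitive bounds is exactly what dedicated temporal-constraint consistency algorithms compute, which is the connection the internship exploits.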
URL sujet détaillé : https://www.normalesup.org/~truchet/Docs/sujet2018.pdf

Remarques :



SM207-42 Avoiding cluster overload with a control-based approach  
Admin  
Encadrant : Eric RUTTEN
Labo/Organisme : Inria / LIG
URL : https://team.inria.fr/ctrl-a/
Ville : Grenoble

Description  


Work in cooperation with Olivier Richard (Datamove) and Bogdan Robu (Gipsa-lab)

- Context

HPC systems are facing more and more variability in their behavior (related, e.g., to performance and power consumption), and because they are less predictable, their administration requires more complex management. This can be done by monitoring information in the systems and analyzing this data in order to activate appropriate system-level or application-level feedback mechanisms (e.g., informing schedulers, down-clocking CPUs).

Such feedback mechanisms rely on extensive monitoring and analysis, and involve decisions and their execution. Such feedback loops, in the domain of Computer Science, are the object of Autonomic Computing [1], which emerged mainly from distributed and Cloud systems. One approach in designing feedback loops is naturally Control Theory, which is extremely widespread in all domains of engineering, but only quite recently and scarcely applied to regulation in computing systems [2]. It can bring to the systems designers methodologies to design and implement feedback loops with well-mastered and guaranteed behavior in order to obtain automated management with goals of optimization of resources or avoidance of crashes and overloads.

- Expected work

The research work will consist in proposing feedback loops for the runtime management of resources, targeting the particular computation platform CiGri [3], a lightweight, scalable and fault-tolerant grid system which exploits the unused resources of a set of computing clusters. It interacts with the computing clusters through the Resource and Job Management System OAR [4], a batch scheduler. Help with the creation of the control laws will be provided by control engineering scientists from Inria and Gipsa-lab Grenoble.

The design of the feedback loops will build upon the existing work of a 3rd-year engineering student from INP, and follow our methodology:
- analysis of the system and identification of its dynamics and the problems to be tackled;
- design of a feedback controller; we will start from an existing P (Proportional) regulation, which bounds the number of jobs sent to the cluster in response to the current number of jobs being processed. The current control objective is to maximize cluster utilization while avoiding overload.
- implementation and experimental results in simulation; comparing the new control scheme with the existing solution.
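The existing P regulation can be sketched in a few lines (a toy discrete-time model with illustrative constants, not the CiGri/OAR code):

```python
def simulate(target=100, capacity=10, kp=0.5, steps=50):
    """Toy P controller: each step, submit kp * (target - queue) new jobs;
    the cluster then drains up to `capacity` jobs."""
    queue, history = 0, []
    for _ in range(steps):
        submitted = max(0, int(kp * (target - queue)))  # proportional action
        queue += submitted
        queue -= min(queue, capacity)                   # cluster processes jobs
        history.append(queue)
    return history

h = simulate()
assert max(h) <= 100     # the load never exceeds the reference
assert h[-1] == h[-2]    # ... and settles to a steady state
```

The internship would replace this toy model with an identified model of CiGri's dynamics and explore richer controllers (adaptive, discrete) and objectives.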

Following this sequence of steps, new characteristics of the platform will be considered, e.g., other overload problems, such as those concerning the storage architecture.
Controllers going beyond the simple P regulation will be designed, and alternative objectives and control schemes will be explored, e.g., adaptive and discrete control.
Implementations and experimental validation will be consolidated.



[1] J. O. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1):41-50, Jan. 2003.

[2] M. Litoiu, M. Shaw, G. Tamura, N. M. Villegas, H. A. Müller, H. Giese, R. Rouvoy and E. Rutten. What can control theory teach us about assurances in self-adaptive software systems? In R. de Lemos, D. Garlan, C. Ghezzi, and H. Giese, editors, Software Engineering for Self-Adaptive Systems, volume 9640 of LNCS, Springer.
https://hal.inria.fr/hal-01281063, 2016.

[3] http://ciment.ujf-grenoble.fr/CiGri/dokuwiki/doku.php

[4] N. Capit, G. Da Costa, Y. Georgiou, G. Huard, C. Martin, G. Mounié, P. Neyron, and O. Richard. A batch scheduler with high level components. In IEEE Int. Symp. on Cluster Computing and the Grid, CCGrid, pages 776-783, 2005.


URL sujet détaillé : :

Remarques : Co-advising with Olivier Richard (Datamove Inria/Lig) and Bogdan Robu (Gipsa-lab)

We plan to continue on this research topic with a PhD.



SM207-43 Convergence between automotive and avionics networks  
Admin  
Encadrant : Marc BOYER
Labo/Organisme : ONERA
URL : https://www.onera.fr/staff/marc-boyer
Ville : Toulouse

Description  

Cyber-physical systems (cars, aircraft, UAVs...) embed hundreds of sensors and dozens of computers, all communicating through a shared network (CAN, Ethernet, AFDX, TSN...). For the whole system to behave correctly, every exchanged message must cross the network within a maximum time; one therefore has to prove, at design time, that this deadline is met (a bound on the latency).
Network Calculus [NC01,DNC18] has been used to compute such bounds on the AFDX network backbones of several civil aircraft (A380, A350...). Following these successes, several research directions have opened up: improving the accuracy of the results, reducing computation times, and modelling new kinds of networks (in particular the emerging Time Sensitive Networking technology, TSN).

Since our activities range from research on the foundations of the model to its implementation, the internship can, depending on the profile and interests of the student, focus on:
- the theoretical bases of the model itself, in particular the modelling of the decomposition into packets in the (min,plus) dioid [ETFA16],
- the modelling of the TSN technology, a real-time Ethernet solution developed by the IEEE [HAL18],
- efficient implementation techniques (algorithms on piecewise-linear functions, graph traversals, heuristics).
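
As a toy example of the kind of bound Network Calculus produces (the curves and parameters below are illustrative, not taken from the subject): for a token-bucket arrival curve alpha(t) = b + r*t and a rate-latency service curve beta(t) = R*max(t - T, 0) with r <= R, the worst-case delay is the horizontal deviation between the two curves, which here has the closed form T + b/R.

```python
def arrival(t, b=10.0, r=2.0):
    # Token-bucket arrival curve alpha(t) = b + r*t (illustrative values).
    return b + r * t

def service(t, R=5.0, T=1.5):
    # Rate-latency service curve beta(t) = R * max(t - T, 0).
    return R * max(t - T, 0.0)

def delay_bound(alpha, beta, horizon=10.0, step=0.01):
    # Horizontal deviation h(alpha, beta) = sup_t inf{d >= 0 : beta(t+d) >= alpha(t)},
    # evaluated numerically on a grid. Assumes the long-term arrival rate
    # does not exceed the service rate (otherwise the delay is unbounded).
    worst = 0.0
    for k in range(int(round(horizon / step)) + 1):
        t = k * step
        d = 0.0
        while beta(t + d) < alpha(t):
            d += step
        worst = max(worst, d)
    return worst
```

For these parameters the closed form gives T + b/R = 1.5 + 10/5 = 3.5, which the numerical sweep recovers up to the grid step.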



[NC01] "Network Calculus", J.-Y. Le Boudec and P. Thiran, Springer, 2001, http://ica1www.epfl.ch/PS_files/NetCal.htm
[DNC18] "Deterministic Network Calculus: From Theory to Practical Implementation", Anne Bouillard, Marc Boyer, Euriell Le Corronc, Oct 2018, Wiley-ISTE
[ETFA16] "Embedding network calculus and event stream theory in a common model", Marc Boyer, Pierre Roux. Proc. of the 21st IEEE Int. Conf. on Emerging Technologies and Factory Automation (ETFA 2016). https://hal.archives-ouvertes.fr/hal-01311502v1
[HAL18] "Modelling in network calculus a TSN architecture mixing Time-Triggered, Credit Based Shaper and Best-Effort queues", Hugo Daigmorte, Marc Boyer, Luxi Zhao. Technical Report.

URL sujet détaillé : http://w3.onera.fr/formationparlarecherche/sites/w3.onera.fr.formationparlarecherche/files/tis-dtis-2018-10_m._boyer.pdf

Remarques : Doing a PhD on the subject is possible.



SM207-44 Automated reasoning  
Admin  
Encadrant : Pascal FONTAINE
Labo/Organisme : LORIA
URL : https://team.inria.fr/veridis/, http
Ville : Nancy

Description  

Many applications, notably in the context of verification (of critical systems in transportation, energy, etc.), rely on checking the satisfiability of logic formulas. Satisfiability-modulo-theories (SMT) solvers handle large formulas in expressive languages with built-in and custom operators (e.g. arithmetic and data-structure operators). These tools combine a SAT (propositional satisfiability) solver, which handles the Boolean structure of the formula, with theory reasoners that tackle the atomic formulas (e.g. x > y + z for the theory of arithmetic). The veriT SMT solver, developed in Nancy, is a state-of-the-art reasoner of this kind.
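
The SAT/theory cooperation just described can be caricatured by the "lazy" loop below: enumerate Boolean models of the propositional abstraction, then ask a (here trivial, single-variable) arithmetic reasoner whether the chosen atoms are jointly satisfiable. Everything in this sketch is didactic and assumed, not veriT's actual architecture: a real solver uses SAT search with conflict clauses instead of enumeration, and negations are approximated as strict inequalities.

```python
from itertools import product

# Atoms of a toy theory over one rational variable x: each atom is a
# strict half-line constraint (an illustrative stand-in for real theories).
ATOMS = {"A": (">", 3.0), "B": ("<", 2.0)}

def theory_consistent(assignment):
    # Conjunction of the (possibly negated) atoms: intersect half-lines.
    lo, hi = float("-inf"), float("inf")
    for name, truth in assignment.items():
        op, c = ATOMS[name]
        if (op == ">") == truth:    # x > c, or the negation of x < c
            lo = max(lo, c)
        else:                       # x < c, or the negation of x > c
            hi = min(hi, c)
    return lo < hi

def lazy_smt(bool_formula):
    # "Lazy" SMT loop: enumerate Boolean models of the propositional
    # abstraction and keep the first one whose atoms are jointly
    # consistent in the theory; return None if the formula is unsat.
    names = sorted(ATOMS)
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if bool_formula(assignment) and theory_consistent(assignment):
            return assignment
    return None
```

For instance, A and B (x > 3 and x < 2) is Boolean-satisfiable but theory-inconsistent, while A or B has a model.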

When it comes to handling pure quantified first-order logic with equality, the superposition calculus gives the best results. This calculus is a complete set of rules extending resolution to first-order logic with equality. It is extremely good at finding intricate proofs, even on large sets of axioms. Superposition-based saturation theorem proving has been steadily progressing for decades. One of the best provers of this kind is the E prover, developed in Stuttgart.

The main objective of this subject is to use the best of both SMT and superposition for efficient reasoning on large logic formulas with quantifiers, in presence of interpreted symbols. There are both theoretical and practical aspects related to this question.

The work will be conducted under the supervision of Stephan Schulz (DHBW, Stuttgart), Jasmin Blanchette (VU Amsterdam) and Pascal Fontaine (Loria).

We already have first experiments both integrating the superposition calculus in SMT and making use of SAT and SMT techniques in the context of superposition. We suggest that, as an introductory step, the student get to know these prototypes and evaluate their strengths and weaknesses on known benchmark sets. Another direction is to get acquainted with the AVATAR technique successfully used in the Vampire prover.

As a second step, we envision incrementally generalizing the approach in the prototypes for a tighter integration of both reasoning techniques. In parallel, the student will design an abstract calculus to study completeness issues, and termination for decidable fragments (notably the Bernays-Schönfinkel-Ramsey fragment). Depending on the interest of the student in formally verified algorithms, this calculus might be the object of formal verification using a proof assistant.

Our ultimate goal is to propose powerful reasoners combining SMT and superposition techniques to discharge proof obligations from proof assistants (interactive theorem provers). It is thus crucial to produce proofs that can be understood and replayed by external tools. The student will describe, provide the theoretical foundations for, and implement proof reconstruction for the formulas delegated to the hybrid superposition-SMT solver.

The subject can and will be adjusted according to the interests of the student.
URL sujet détaillé : https://team.inria.fr/veridis/files/2018/10/internship-AR.pdf

Remarques : Joint internship with Jasmin Blanchette (VU Amsterdam) and Stephan Schulz (DHBW Stuttgart). Geographic mobility between the sites is to be discussed. The subject naturally extends into a PhD subject, which will best be defined in detail with the student.
Do not hesitate to contact us for informal inquiries.




SM207-45 Monitoring des échanges applicatifs avec un système IoT  
Admin  
Encadrant : Jannik LAVAL
Labo/Organisme : DISP Lab - University of Lyon
URL : https://www.disp-lab.fr/
Ville : BRON

Description  

To ensure communication between its applications and/or with the outside world, Berger-Levrault implements data exchanges within flexible, scalable, loosely coupled architectures, such as service-oriented and event-driven architectures (SOA, EDA). These architectures rely on several communication means to route the data. In this work we focus on application protocols used in the Internet of Things (IoT).
The IoT is considered one of the main pillars of future smart-technology developments. Connected objects are spreading very quickly, capturing information in many domains (environment, industry, everyday management of homes and cities...), which makes the IoT an indispensable means of control and prevention.
Within a set of applications deployed to monitor indicators related to building management and predictive maintenance, Berger-Levrault exploits the measurements collected from connected objects through interfaces and connectors that implement messaging mechanisms. These allow messages to be sent from a source publisher to one or more recipients using specific protocols such as MQTT and CoAP.
The multiplicity of these data exchanges creates complexity and raises control needs that can be addressed by setting up a monitoring and analysis system. We propose to analyse the infrastructures used to implement these exchanges, namely the MQTT broker or the CoAP server, and to collect meta-information in order to:
- Trace the exchanged messages,
- Simplify the visualization of the exchanges,
- Improve security: ensure the absence of violations, interceptions, or data leaks,
- Strengthen maintainability: detect exceptions (e.g. a failed message transfer), pinpoint the context and origin of a problem, raise alerts and notifications.
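
Message tracing of this kind can be sketched as follows (the record format, with a correlation identifier and a hop index, is a simplifying assumption of ours, not Berger-Levrault's data model):

```python
def trace_paths(events):
    # events: (correlation_id, hop_index, node) records captured at the
    # broker, in arbitrary arrival order; rebuild, for each message, the
    # ordered path it followed through the system.
    paths = {}
    for cid, hop, node in events:
        paths.setdefault(cid, []).append((hop, node))
    return {cid: [node for _, node in sorted(hops)]
            for cid, hops in paths.items()}
```

Such per-message paths are the raw material for both the visualization and the leak-detection goals above.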

The internship is based at the DISP laboratory of the University of Lyon 2, in partnership with Berger-Levrault (a company specialized in software for education, health, social care, and local government). You will be surrounded by a team of researchers and will work closely with the company on a practical case. You will be supervised so as to plan all the expected results over the duration of the internship.

URL sujet détaillé : https://disp-lab.fr/sites/default/files/2019_-_disp_m2_-_monitoring_des_echanges_dun_systeme_iot.pdf

Remarques : Internship based at the DISP laboratory of the University of Lyon, in partnership with the ICUBE CSIP team, INSA Strasbourg, which can lead to a PhD thesis.



SM207-46 Suivi et évaluation des échanges dans un système d'information distribué  
Admin  
Encadrant : Jannik LAVAL
Labo/Organisme : DISP Lab - University of Lyon
URL : https://www.disp-lab.fr/
Ville : BRON

Description  

To ensure interoperability between its communicating applications and/or with the outside world, Berger-Levrault implements data exchanges within flexible, scalable, loosely coupled architectures, such as service-oriented and event-driven architectures (SOA, EDA). These architectures rely on several communication means to route the data, such as message exchange. These exchanges are operationalized through a message broker, which sends messages from a source to one or more recipients using specific routing schemes.
The multiplicity of these data exchanges creates complexity and raises control needs that can be addressed by setting up monitoring and analysis means. We propose to analyse these exchanges by supervising the infrastructure used to implement them, namely the RabbitMQ broker. Based on a structured data collection, the intern is asked to:
- Study and extend the data model reflecting the established messaging architecture and the exchanges carried out on the message broker,
- Propose indicators and acceptable thresholds based on predefined requirements on the operation of the exchanges (emitter/receiver activity, absence of violations, message rate per type...),
- Identify the causes of exchange malfunctions and define their impact,
- Define an analysis sequence for checking potential causes of malfunction,
- Evaluate the observed situations and indicators and propose the corresponding corrective actions, thereby improving the decision-making process.
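
The indicator-and-threshold evaluation can be sketched as follows (the log fields and threshold values are illustrative assumptions of ours, not the actual RabbitMQ data model):

```python
from collections import Counter

def evaluate_exchanges(log, thresholds):
    # log: (emitter, receiver, msg_type, status) tuples captured on the
    # broker; thresholds: maximum tolerated error rate per message type.
    # Returns the per-type error rates and the types breaching them.
    totals = Counter(msg_type for _, _, msg_type, _ in log)
    errors = Counter(msg_type for _, _, msg_type, status in log
                     if status == "error")
    rates = {t: errors[t] / totals[t] for t in totals}
    breached = sorted(t for t, r in rates.items()
                      if r > thresholds.get(t, 1.0))
    return rates, breached
```

A breached threshold would then trigger the deeper cause analysis described above.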

The internship is based at the DISP laboratory of the University of Lyon 2, in partnership with Berger-Levrault (a company specialized in software for education, health, social care, and local government). You will be surrounded by a team of researchers and will work closely with the company on a practical case. You will be supervised so as to plan all the expected results over the duration of the internship.

URL sujet détaillé : https://disp-lab.fr/sites/default/files/2019_-_disp_m2_-_suivi_et_evaluation_des_echanges_dans_un_systeme_dinformation_distribue.pdf

Remarques : Internship based at the DISP laboratory of the University of Lyon



SM207-47 Microscopy by sequencing  
Admin  
Encadrant : Nicolas SCHABANEL
Labo/Organisme : LIP (ENS de Lyon) and Gulliver (ESPCI, Paris)
URL : http://perso.ens-lyon.fr/nicolas.schabanel/doku.php
Ville : Lyon ou Paris

Description  

Recent advances in DNA next-generation sequencing (NGS) technologies have been extremely rapid. Second- or third-generation machines routinely read millions of individual strands and produce gigabytes of sequencing data, enough for example to encode full YouTube videos. A number of recent research efforts have looked at possible applications for such massive extraction of molecular information. In this project, we want to explore a new possibility, outside of biology: to extract the shape of an unknown microscopic object through the combination of DNA gels and NGS. The idea is to create a gel around an object of interest using randomly barcoded DNA monomers, and then to extract the connectivity of the individual molecular subunits forming the gel by ligation followed by sequencing. This data will then inform a shape-reconstruction algorithm.
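
A one-dimensional toy version of this connectivity-to-shape pipeline (entirely illustrative: a linear "gel", 32-bit barcodes, perfect reads): the order of the monomers can be recovered from the unordered ligation pairs alone, since interior monomers have degree 2 and the two ends have degree 1.

```python
import random

def simulate_reads(n, seed=0):
    # A 1D "gel": n monomers in a line, each carrying a random barcode.
    # Ligation + sequencing reports the adjacent barcode pairs, in
    # arbitrary order (an idealized, noise-free model).
    rng = random.Random(seed)
    barcodes = [format(rng.getrandbits(32), "08x") for _ in range(n)]
    pairs = [(barcodes[i], barcodes[i + 1]) for i in range(n - 1)]
    rng.shuffle(pairs)
    return barcodes, pairs

def reconstruct_chain(pairs):
    # Rebuild the monomer order from the unordered adjacency pairs:
    # start at a degree-1 endpoint and walk the chain.
    adj = {}
    for a, b in pairs:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    start = next(x for x in adj if len(adj[x]) == 1)
    chain, prev = [start], None
    while len(chain) < len(adj):
        cur = chain[-1]
        nxt = next(x for x in adj[cur] if x != prev)
        prev = cur
        chain.append(nxt)
    return chain
```

Real gels are of course 2D/3D networks with noisy reads, which is what makes the actual reconstruction problem algorithmically interesting.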

URL sujet détaillé : http://perso.ens-lyon.fr/nicolas.schabanel/Sujet%20de%20stage%20microscopy%20by%20sequencing.pdf

Remarques : Co-advisor: Yannick RONDELEZ (Gulliver, ESPCI, Paris) http://www.yannick-rondelez.com

Internship split between Lyon and Paris.

Main location either at ESPCI in Paris or at ENS de Lyon, according to the intern's preference.




SM207-48 Towards a Parameterized Distributed Component Model  
Admin  
Encadrant : Ludovic HENRIO
Labo/Organisme : LIP, ENS Lyon
URL : http://www.ens-lyon.fr/LIP/
Ville : Lyon

Description  

Context:

The internship is placed in the context of the design of abstractions for defining parameterized topologies of software components for distributed computing in general, and Cloud/Fog computing in particular.

More precisely, we want to extend the Madeus component model with collective operations and with the configuration of systems of variable size (with a size chosen at deployment time at the earliest).

Madeus accurately describes the life cycle of each component by a Petri net structure, and is able to finely express the dependencies between components. The overall dependency graph it produces is then used to reduce deployment time by parallelizing deployment actions.
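
The gain from this dependency graph can be sketched with a toy critical-path computation (our own illustration, not the Madeus implementation): with enough parallelism, deployment time drops from the sum of the action durations to the length of the longest dependency chain.

```python
def deployment_time(duration, deps):
    # Earliest finish time of each deployment action = its own duration
    # plus the latest finish among its dependencies; the deployment ends
    # at the maximum finish time, i.e. the critical path of the graph.
    finish = {}
    def end(action):
        if action not in finish:
            finish[action] = duration[action] + max(
                (end(d) for d in deps.get(action, ())), default=0.0)
        return finish[action]
    return max(end(a) for a in duration)
```

For example, starting a database (3s) and a cache (2s) in parallel before an application (2s) takes 5s instead of 7s sequentially.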

For the moment, it is not possible to describe naturally in Madeus a system whose size is chosen at deployment time. Communication patterns in such variable-size systems are impossible or very difficult to express.

Objectives:

The objective of this internship is to study the adaptation and possible extensions of the Madeus model to better take into account large component architectures, focusing particularly on architectures of parametric size. More precisely, the objective is to design constructs for parametric communication and topologies, either expressible in Madeus or provided as a minimal extension of the model. The first extension to be considered is the ability to trigger not only two-sided communications but also communications towards a set of components. The duality between the transmission of control tokens, characteristic of Madeus, and the data-transmission aspect will have to be taken into account. In particular, the interplay between data and communication is crucial for collections of components, because the size of the system should be a parameter of the system, or could even be transmitted as data during configuration. This influences the communication patterns, at least concerning the number of targets of collective communications.

The objectives of the internship are both to define sound extensions of the model and to implement the corresponding extensions in the Madeus framework (https://mad.readthedocs.io/). The internship can thus focus either on theoretical aspects, or on design and implementation aspects.
URL sujet détaillé : https://lhenrio.github.io/Docs/m2.pdf

Remarques : Co-advisors: Christian Perez (Avallon project, LIP, ENS Lyon) and Ludovic Henrio (Cash project, LIP, ENS Lyon)



SM207-49 Analyse d'un algorithme de reconstruction surfacique  
Admin  
Encadrant : Dobrina BOLTCHEVA
Labo/Organisme : LORIA
URL : https://members.loria.fr/DBoltcheva/
Ville : Nancy

Description  

The topic of the internship is surface reconstruction, where one has to turn a point cloud into a "meaningful" meshed surface. One of the advisors co-authored an algorithm that performs well in practice (and is used in industry). The goal of the internship is to study the theoretical guarantees offered by that algorithm, both in terms of quality of reconstruction (geometry and topology) and in terms of computational complexity.
URL sujet détaillé : https://members.loria.fr/XGoaoc/sujet-stage-reconstruction.pdf

Remarques : The internship is co-advised by Xavier Goaoc:

xavier.goaoc.fr
https://members.loria.fr/XGoaoc




SM207-50 Delaunay triangulation on surfaces  
Admin  
Encadrant : Monique TEILLAUD
Labo/Organisme :
URL : http://gamble.loria.fr/
Ville : Nancy

Description  

Context
Delaunay triangulations in the Euclidean space Rd have been extensively studied [1]. Their mathematical properties are well understood, and many algorithms to construct them have been proposed and analyzed.
The following simple incremental algorithm [2] allows for very efficient implementations:
For each new point p
- Find the simplices in conflict with p, i.e. the simplices whose circumscribing ball contains p
- Remove these simplices from the triangulation. This creates a hole.
- Fill the conflict hole by "starring" it from p.
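
In 2D, the conflict test of the first step reduces to the classical incircle predicate, a sign of a 3x3 determinant. A sketch (floating-point only; a robust implementation such as CGAL's uses exact predicates):

```python
def in_circumcircle(a, b, c, p):
    # Sign of the standard "incircle" determinant on lifted points:
    # positive iff p lies strictly inside the circle through a, b, c,
    # assuming a, b, c are given in counter-clockwise order.
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0
```

For instance, the center of the circumcircle is in conflict with its triangle, while a far-away point is not.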
Periodic structures arise in many fields, ranging from (nano-)materials to astrophysics. Modeling such periodic structures can be achieved through the use of Delaunay triangulations of the flat torus, which is homeomorphic to the quotient of Rd under the action of a group of translations.
An algorithm was proposed for general closed flat manifolds, i.e. compact quotient spaces of the Euclidean space under the action of a discrete group of isometries [3]. The algorithm is based on the above-mentioned incremental algorithm, which assumes that the conflict hole is a topological disk for any new point p. This is a priori not always the case on a surface.
This has led to the necessity of ensuring that the Delaunay triangulation is maintained as a simplicial complex throughout the incremental construction, i.e., its edges do not form cycles of length 1 (loops, i.e., edges whose two vertices are equal) or 2 (two edges sharing the same two vertices) [3].
Packages [4,5] of CGAL, the Computational Geometry Algorithms Library, implement this incremental algorithm [3] in the special case of the 2D (resp. 3D) flat torus with square (resp. cubic) domain, homeomorphic to R2/Z2 (resp. R3/Z3). The 3D package has already found users in various fields including particle physics and material engineering. This work also led to a fruitful collaboration with astrophysicists [6].

Topic
A variant [7,1] of the incremental algorithm works as follows in R2:
For each new point p
- Split the triangle containing p into three triangles
- Flip non-Delaunay edges
- Propagate flips until there are no more non-Delaunay edges.
The algorithm can be extended to higher dimensions [8].

The question to study is whether this variant extends to the flat torus in 2D and 3D and still works when there are loops or double edges in the triangulation.
The question will be studied theoretically and may be completed by a prototype implementation.

The work may extend to a PhD on a related topic.

References
[1] Mark de Berg, Otfried Cheong, Marc van Kreveld, and Marc Overmars. Computational Geometry - Algorithms and Applications, Springer-Verlag.
[2] Adrian Bowyer. Computing Dirichlet tessellations. The Computer Journal, 24:162-166, 1981.
[3] Manuel Caroli and Monique Teillaud. Delaunay triangulations of closed Euclidean d-orbifolds, Discrete and Computational Geometry, 55(4):827-853, 2016.
[4] Nico Kruithof. 2D periodic triangulations
[5] Manuel Caroli, Aymeric Pellé, Mael Rouxel-Labbé, and Monique Teillaud. 3D periodic triangulations
[6] Johan Hidding, Rien van de Weygaert, Gert Vegter, Bernard J.T. Jones, and Monique Teillaud. Video: The Sticky Geometry of the Cosmic Web, Proceedings 28th Annual Symposium on Computational Geometry, pages 421-422, 2012.
[7] Charles L. Lawson. Software for C1 Surface Interpolation. Technical report, Jet Propulsion Laboratory, 1977.
[8] Herbert Edelsbrunner and N.H. Shah. Incremental topological flipping works for regular triangulations. Algorithmica, 15:223-241, 1996.
URL sujet détaillé : https://members.loria.fr/Monique.Teillaud/positions/master-DT-surfaces.html

Remarques : Co-advising with Vincent Despré.
Statutory internship allowance ("gratification légale").



SM207-51 Incremental topological sort for a reactive programming language  
Admin  
Encadrant : Mathieu MAGNAUDET
Labo/Organisme : ENAC/Equipe Informatique Interactive
URL : http://smala.io
Ville : Toulouse

Description  

The Interactive Informatics Team at ENAC is developing a reactive programming framework, named djnn[1], dedicated to the design of interactive systems. This framework combines an execution engine with a new programming language named Smala[2]. It proposes a new way to unify data-flow primitives with state machines based on the concept of process activation. In a way similar to reactive synchronous languages, such as Lustre [3], the execution of a djnn program rests on the ordering of the nodes of an interactive components graph. However, in the present implementation, the graph is completely re-ordered at each addition/removal of a component, which can strongly affect the execution time and the reactivity of the interactive system. Thus, the aim of this internship is to define and implement an incremental topological sort for the execution graph. The first step will be a literature survey, then the definition and implementation of the algorithm. Finally the student will have to verify the benefit of this algorithm over the previous one.
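
A classical starting point in the literature is the incremental topological-order maintenance of Pearce and Kelly, which, after inserting an edge x -> y, re-sorts only the "affected region" between the two endpoints instead of the whole graph. A pure-Python didactic sketch (our own illustration, not djnn's graph; a real implementation would also maintain the reverse adjacency incrementally instead of rebuilding it):

```python
def add_edge(adj, order, x, y):
    # adj: node -> set of successors; order: node -> unique position.
    # Maintains order as a valid topological order after adding x -> y.
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set())
    if order[x] < order[y]:
        return  # the existing order already respects the new edge
    lb, ub = order[y], order[x]
    # Forward search: nodes reachable from y inside the affected region.
    fwd, stack, seen = [], [y], {y}
    while stack:
        n = stack.pop()
        fwd.append(n)
        for m in adj.get(n, ()):
            if m == x:
                raise ValueError("edge would create a cycle")
            if m not in seen and order[m] <= ub:
                seen.add(m)
                stack.append(m)
    # Backward search: nodes reaching x inside the affected region.
    radj = {}
    for n, succs in adj.items():
        for m in succs:
            radj.setdefault(m, set()).add(n)
    bwd, stack, seen = [], [x], {x}
    while stack:
        n = stack.pop()
        bwd.append(n)
        for m in radj.get(n, ()):
            if m not in seen and order[m] >= lb:
                seen.add(m)
                stack.append(m)
    # Reassign only the affected positions: the block reaching x first,
    # then the block reachable from y, reusing the same set of slots.
    moved = sorted(bwd, key=order.get) + sorted(fwd, key=order.get)
    for node, slot in zip(moved, sorted(order[n] for n in moved)):
        order[node] = slot
```

The point of the technique is exactly the one raised above: nodes outside the affected region keep their positions, so insertion cost depends on the size of that region rather than on the whole graph.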

This internship may be followed by a PhD on the optimisation and/or certification of a reactive language dedicated to Human-Machine Interaction.

[1] M. Magnaudet, S. Chatty, S. Conversy, S. Leriche, C. Picard, and D. Prun. Djnn/Smala: A Conceptual Framework and a Language for Interaction-Oriented Programming. Proc. ACM Hum.-Comput. Interact. 2, EICS, Article 12 (June 2018)
[2] http://smala.io
[3] N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous dataflow programming language Lustre. In Proceedings of the IEEE, pages 1305-1320, 1991
URL sujet détaillé : :

Remarques :



SM207-52 Fast Solvers for High Frequency Aeroacoustics  
Admin  
Encadrant : Guillaume SYLVAND
Labo/Organisme : Inria Bordeaux Sud-Ouest
URL : https://team.inria.fr/hiepacs/
Ville : BORDEAUX

Description  

The industrial context (given by Airbus Central R&T) is the simulation of aeroacoustic wave propagation in heterogeneous media. It leads to solving mixed dense/sparse linear systems of very large size. We intend to develop high-performance parallel solvers for this kind of problem. In particular, we have developed a hierarchical matrix (H-matrix) solver for dense linear systems, using task-based programming in C++, and we want to extend it to sparse and mixed systems and to interface it with other existing solvers.
The subject is very wide (cf. the Google Doc description) and its exact content can be discussed.
A continuation as a PhD is possible.
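
The compression principle behind hierarchical matrices can be illustrated with adaptive cross approximation (ACA), a generic textbook technique for building the low-rank blocks (a pure-Python sketch of ours, not the team's solver): an admissible block is factored as a short sum of rank-1 terms while evaluating only a few of its rows and columns.

```python
def aca(entry, n, m, tol=1e-8, max_rank=None):
    # Partially pivoted Adaptive Cross Approximation: approximate an
    # n x m block A (given entry-wise by entry(i, j)) as
    # A ~ sum_k u_k v_k^T, touching only a few rows and columns.
    if max_rank is None:
        max_rank = min(n, m)
    us, vs = [], []
    i = 0  # current pivot row
    for _ in range(max_rank):
        # Residual of row i w.r.t. the current approximation.
        row = [entry(i, j) - sum(u[i] * v[j] for u, v in zip(us, vs))
               for j in range(m)]
        j = max(range(m), key=lambda jj: abs(row[jj]))
        pivot = row[j]
        if abs(pivot) < tol:
            break  # the block is numerically exhausted
        v = [x / pivot for x in row]
        col = [entry(k, j) - sum(u[k] * w[j] for u, w in zip(us, vs))
               for k in range(n)]
        us.append(col)
        vs.append(v)
        # Next pivot row: largest residual entry in the chosen column.
        i = max((k for k in range(n) if k != i), key=lambda k: abs(col[k]))
    return us, vs
```

On a smooth kernel block such as 1/(1 + i + j), the numerical rank is far below the block size, which is exactly what makes H-matrix solvers fast.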
URL sujet détaillé : https://docs.google.com/document/d/1TXWBJxGOX3EiDihiS-QKF76hGL17eVg47NfHpRLnEAQ/edit?usp=sharing

Remarques : Co-advised by Emmanuel AGULLO (emmanuel.agullo.fr).
Paid internship.
Continuation as a PhD is hoped for.



SM207-53 Dynamique symbolique et croissance dans les groupes  
Admin  
Encadrant : Pascal VANIER
Labo/Organisme : LACL
URL : http://www.lacl.fr/pvanier
Ville : Créteil

Description  

The aim of this internship is to investigate the possible growths of recursively presented groups. Techniques will be inspired by symbolic dynamics and computability theory.
URL sujet détaillé : https://www.lacl.fr/pvanier/rech/M2_croissance.pdf

Remarques : Funding possible; possible continuation as a PhD.



SM207-54 Test of Embedded Multi-Agent Systems (EMAS)  
Admin  
Encadrant : Oum-el-kheir AKTOUF
Labo/Organisme : LCIS
URL : http://lcis.grenoble-inp.fr/
Ville : Valence

Description  

1. Description:

Embedded systems are used more and more in many areas of everyday life: transport, smart houses, medical applications... For deploying such systems, Multi-Agent Systems (MAS) provide a large-scale solution, involving agents that work locally to achieve a common global task. Recent uses of such applications integrate varied mobile devices such as mobile phones, tablets, etc. Consequently, multi-agent-based applications are more and more complex and difficult to validate.
Indeed, many constraints should be considered:
- Huge quantities of generated data and their heterogeneity;
- A multitude of possible scenarios, making the reproduction of an error scenario difficult or even unfeasible;
- Cooperation and interactions between agents may introduce errors, even though the agents are themselves valid.
Existing works mainly rely on general approaches for verifying and validating MAS. This makes it difficult to take into account the specificities of the various hardware and/or software platforms on which the agents can execute. Furthermore, the geographical dimension introduced by mobile devices adds more complexity to MAS runtime environments.
A preliminary work, supported by the TRUST Chair (industrial chair of Grenoble INP on dependability and security of systems), obtained the following results:
- A state of the art on existing works on MAS design and testing;
- General guidelines related to the design cycle of EMAS in order to take agent testing into account, especially through the addition of testing agents.
This work has been published ("Toward an Embedded Multi-Agent System Methodology and Positioning on Testing", Camille Barnier, Oum-El-Kheir Aktouf, Annabelle Mercier and Jean-Paul Jamont) at the IWSF workshop held in Toulouse in October 2017, co-located with ISSRE 2017.
The proposed strategy for testing multi-agent systems consists of three phases: agent testing, acceptance testing and society testing.

2. Objectives:

The initial model has been implemented in the MASH platform, allowing a testing multi-agent system to observe the MAS and the agents under test. More precisely, two types of testing agents have been introduced: a local test agent that measures the behavior of an agent, and a society test agent that aims at observing the collective functionalities of the MAS.
The study focuses on the WMAC model, a multi-agent model developed at the LCIS lab. The proposed project can focus on the consolidation of the model for the detection of disjoint groups in the self-organization phase of the system.
The model and its current implementation can be revised to make them more generic and to integrate them into the Diamond MAS design methodology and the MASH platform.
An update of the bibliographic study on EMAS and MAS testing is also expected.
URL sujet détaillé : http://lcis.grenoble-inp.fr/themes/securite-des-systemes-embarques-et-distribues-critiques/offres-de-these-post-doc-stage/

Remarques : - Co-advising: A. Mercier and J.-P. Jamont
- Internship benefits (indemnités de stage)
- This project is part of a collaboration with the Fraunhofer IESE research institute in Kaiserslautern (Germany)



SM207-55 Test Coverage analysis for Mobile and Context-Aware Applications  
Admin  
Encadrant : Oum-el-kheir AKTOUF
Labo/Organisme : LCIS
URL : http://lcis.grenoble-inp.fr/
Ville : Valence

Description  

Project description:

In recent years, more and more mobile applications (mobile apps) have been developed to support different applications in social, news, tourism, health, business, and other domains. Hundreds of new apps are released and downloaded daily. By the end of 2020, the global revenue from the mobile app market is expected to reach 79 billion USD, with almost 378 billion mobile app downloads [1].
Testing mobile apps is a hot and challenging research topic, as testing a mobile app requires validating it in many different contexts, including different hardware platforms, operating systems, Web browsers, and many geographical locations for location-dependent apps.
Crowdsourcing-based testing is a recent approach where testing is operated by volunteer users through the cloud. This approach is particularly suited to mobile applications, since various users operating in various contexts can be involved.
The TMACS project (Testing of Mobile Apps using Crowdsourcing) at the LCIS lab aims at studying crowdsourcing-based testing for mobile apps. A web-based testing platform, called TMACS, is under development. It provides important features for crowdsourced testing of mobile apps by means of the following functionalities:
- It allows mobile app providers to register and upload mobile apps for testing;
- It allows volunteering Internet users to register and test uploaded mobile apps.
The expected behavior is that uploaded mobile apps are tested by many different Internet users, in order to cover different runtime platforms and meaningful geographical locations.
An issue related to outsourced testing such as crowdsourced testing concerns the evaluation of test coverage to assess the application's validation. This first implies determining test coverage criteria suitable for such applications.

Project objectives:

There are many open questions in mobile application testing, as shown in [2]. More generally, there are no standard testing approaches for such context-aware applications [3]. An interesting idea that has been explored in similar situations [2][4] is to identify the program variables that implement the context-related data of the mobile application and to adopt data-flow-based code coverage criteria [5] to assess the thoroughness of a test suite with respect to the mobility context.
In this project, a literature survey focusing on test coverage evaluation for mobile and context-aware applications is first required. It is expected that at least one coverage assessment approach will be proposed and then precisely defined. Finally, the proposed approach will be implemented and integrated into the TMACS platform. This integration would make an experimental evaluation of the approach possible.
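
The data-flow idea can be sketched on a trace (the event format below is our own toy assumption, not the TMACS design): a def-use pair for a context variable counts as covered when a use reads the value written by a given definition along a def-clear path.

```python
def covered_du_pairs(trace):
    # trace: execution events ("def", var, loc) or ("use", var, loc).
    # A pair (var, d, u) is covered when the use at location u reads the
    # value defined at d with no intervening redefinition of var.
    last_def = {}
    covered = set()
    for kind, var, loc in trace:
        if kind == "def":
            last_def[var] = loc
        elif kind == "use" and var in last_def:
            covered.add((var, last_def[var], loc))
    return covered
```

The coverage ratio of a test suite is then the number of covered pairs over the number of statically identified def-use pairs.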

Prerequisites:

Java and UML technologies are important basics for the project.

References

1. https://dazeinfo.com/2016/04/20/global-mobile-app-download-revenue-market-2016-2020-report/
2. T. Zhang, J. Gao, O. Aktouf, T. Uehara. Test Model and Coverage Analysis for Location-based Mobile Services. The 27th International Conference on Software Engineering and Knowledge Engineering, SEKE 2015, Wyndham Pittsburgh University Center, Pittsburgh, USA, July 6-8, 2015.
3. S. Matalonga, F. Rodrigues, G.H. Travassos. Characterizing testing methods for context-aware software systems: Results from a quasi-systematic literature review. The Journal of Systems and Software, 131 (2017) 1-21.
4. H. Lu, W. K. Chan, T. H. Tse. Testing context-aware middleware-centric programs: A data flow approach and an RFID-based experimentation. Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT '06/FSE-14).
5. P. G. Frankl, E. J. Weyuker. An Applicable Family of Data Flow Testing Criteria. IEEE Transactions on Software Engineering, v.14 n.10, p.1483-1498, October 1988.
URL sujet détaillé : http://lcis.grenoble-inp.fr/themes/securite-des-systemes-embarques-et-distribues-critiques/offres-de-these-post-doc-stage/

Remarques : - Co-advising: I. Parissis
- Internship benefits (indemnités de stage)
- This project is part of a collaboration with Danang University (Viet-Nam)



SM207-56 An Algorithmic Approach to Animal Behavior  
Admin  
Encadrant : Amos KORMAN
Labo/Organisme : IRIF
URL : https://amoskorman.com/
Ville : Paris

Description  

An internship is available under the guidance of Amos Korman, supported by an ERC Consolidator Grant. The internship will be held at IRIF, CNRS and Université Paris Diderot, a leading center for theoretical computer science in France. It will include one long visit (or two) to Israel, as part of a collaboration with a bat laboratory at Tel-Aviv University (see http://www.yossiyovel.com/).

The topic concerns the study of algorithmic game-theoretic principles with a special focus on animal contexts, and particularly the behavior of bats. Of main interest are competition issues related to tasks such as search (or foraging), food consumption, and territoriality. The research will include a large, purely theoretical component but will also strive to connect to experiments in biology.

This unconventional approach is highly demanding, and we are seeking exceptionally gifted candidates. Candidates should have a solid background in the following subjects, ordered by importance:

- Probability theory
- Programming skills
- Interest in game theory
- Love of animals
- Ability to connect theory with experiments

Applications should include a CV, ideally together with 1-2 letters of recommendation, and should be sent to amos.korman.com

Some references:

- A very nice survey on evolutionary game theory and potential directions for further research: http://rsif.royalsocietypublishing.org/content/10/88/20130544

- A recent article of mine, studying the impact of competition on the dispersal of foraging players: https://arxiv.org/pdf/1805.01319
URL sujet détaillé : :

Remarques : The internship is intended, hopefully, to lead to a Ph.D.

It is possible to do the internship at Tel-Aviv University, hosted by Yossi Yovel. In either case, the main guiding advisor will be Amos Korman.



SM207-57 Interference analysis for the new Kalray MPPA3 many-core  
Admin  
Encadrant : Matthieu MOY
Labo/Organisme : LIP or Verimag
URL : http://www.ens-lyon.fr/LIP/CASH/
Ville : Lyon or Grenoble

Description  

In automotive and avionics, some programs are subject to critical timing constraints.
In these real-time critical systems, the timing properties must be guaranteed.
Among them is the worst-case execution time (WCET), which provides a guaranteed upper bound on the execution time of a program.
On a multi-core platform, this worst-case execution time is highly influenced by all concurrently running programs.
In our work, we studied the Kalray MPPA2 platform (a set of multi-core communicating through a network on chip comprising 256 compute cores, designed by a company located in Montbonnot, near Grenoble), and designed an algorithm able to tightly bound the interference, taking into account both the specificities of the software application and the low-level details of the hardware architecture.
Our interference analysis models the interference on the memory bus and the communications through the network-on-chip.
In the new generation of the chip, the MPPA3 (announced in 2017), considerable changes have been made to the architecture (fewer clusters, but faster communications and memory accesses). The generic parts of our algorithm remain valid, but the mathematical model of the hardware architecture must be redesigned.
The goal of the internship is to analyze how far our previous work may be adapted to this new platform and propose modified/new interference analysis.


URL sujet détaillé : http://www.ens-lyon.fr/LIP/CASH/wp-content/uploads/2018/10/m2-wcet-mppa3.pdf

Remarques : Co-supervised between Matthieu Moy (LIP) and Claire Maiza (Verimag). The internship can be physically hosted either in Grenoble or in Lyon.



SM207-58 Definition and basis of the verification of a compiler for a new reactive language  
Admin  
Encadrant : Celia PICARD
Labo/Organisme : ENAC research
URL : http://enac.fr/fr/equipe-ii-informatique-interactive
Ville : Toulouse

Description  

The Interactive Informatics (II) Team of ENAC's research laboratory develops an environment called djnn [1]. It allows the development and execution of software that supports human-machine interaction. This environment contains an original language dedicated to the description of such software (component structuring, control mechanisms, data flow, input and output management...): smala [2]. Compiled and linked to specific libraries, an application described in smala can then be executed.

The smala language is reactive: it provides a way to describe applications that respond to external events (user inputs from the interface, timers, etc.) instantaneously. Unlike other languages, smala unifies the concepts of event processing, state machines and data-flow management. We are particularly interested in the description of critical systems for air transportation (cockpit, control position, etc.) that require certification.

Work on the formalization of the semantics of the smala language and the definition of an associated compiler was begun during an internship in the summer of 2018; in particular, the minimum elements of the language were defined. This internship is the continuation of that work along two axes:

- definition of the compiler using a functional language
- verification of this compiler using the Coq tool [3], building on the work done on CompCert [4], a certified C compiler, and on the verification of a compiler for Lustre [5]

This work could be followed by a PhD thesis in the continuity or around the verification of interactive properties on components.


[1] M. Magnaudet, S. Chatty, S. Conversy, S. Leriche, C. Picard, and D. Prun. Djnn/Smala: A Conceptual Framework and a Language for Interaction-Oriented Programming. Proc. ACM Hum.-Comput. Interact. 2, EICS, Article 12 (June 2018)
[2] http://smala.io
[3] https://coq.inria.fr/
[4] X. Leroy. Formal verification of a realistic compiler. Communications of the ACM, 52(7):107-115, 2009
[5] Bourke, T., Brun, L., Dagand, P. É., Leroy, X., Pouzet, M., & Rieg, L. (2017, June). A formally verified compiler for Lustre. In ACM SIGPLAN Notices (Vol. 52, No. 6, pp. 586-601). ACM.
URL sujet détaillé : :

Remarques :



SM207-59 Blockchain Incentives  
Admin  
Encadrant : Vincent GRAMOLI
Labo/Organisme : University of Sydney / Data61 - CSIRO
URL : http://gramoli.redbellyblockchain.io/web/php/index.php
Ville : Sydney

Description  

Blockchains promise to disrupt various industries with the ability to simplify transactions between individuals by limiting the intermediaries, hence increasing speed and lowering costs. This momentum has produced an increase in the number of blockchain proposals coming from industry, sometimes claiming both performance and security.

Unfortunately, there is in comparison very little work in the research literature that can support these claims. One example is the well-known impossibility of agreeing on a unique block at a given index when a third of the participants misbehave. While a few companies claim to have successfully integrated Byzantine fault-tolerant solutions without apparent bugs, the performance of their solutions does not scale with the number of participants, making them of limited interest to blockchains.

The University of Sydney and CSIRO have developed the Red Belly Blockchain, a secure blockchain that scales to a large number of participants without compromising security. It does not solve the classic consensus problem but a variant of it [1], which allows performance to scale to a large number of participants. It implements a community blockchain [2] that neither relies on a predetermined set of participants, like consortium blockchains, nor incentivises all participants to create the same blocks, as in public blockchains, which avoids wasting resources. The topic of this research project is to improve the accountability of the Red Belly Blockchain by designing algorithms that incentivise participants to contribute in the interest of the system.
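The "third of misbehaving participants" threshold mentioned above is the classical Byzantine resilience bound n >= 3f + 1. A minimal sketch of the arithmetic (purely illustrative, not part of the Red Belly design):

```python
def max_byzantine_faults(n):
    """Largest number f of Byzantine participants tolerable among n,
    from the classical resilience bound n >= 3f + 1."""
    return (n - 1) // 3

def min_participants(f):
    """Smallest n able to tolerate f Byzantine participants."""
    return 3 * f + 1
```

For instance, 4 participants are needed to tolerate a single Byzantine one, and 100 participants tolerate at most 33.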

[1] DBFT: Efficient Leaderless Byzantine Consensus and its Applications to Blockchains.
Tyler Crain, Vincent Gramoli, Mikel Larrea, Michel Raynal. Proceedings of the 17th IEEE International Symposium on Network Computing and Applications. IEEE NCA 2018.

[2] ComChain: Bridging the Gap Between Public and Consortium Blockchains. Guillaume Vizier, Vincent Gramoli. IEEE Blockchain 2018.
URL sujet détaillé : :

Remarques : A per-month living allowance is provided.



SM207-60 Compression de graphes  
Admin  
Encadrant : Hamida SEBA
Labo/Organisme : LIRIS
URL : https://liris.cnrs.fr
Ville : Villeurbanne

Description  

Graph compression is gaining popularity as a valuable tool for identifying both structure and meaning in massive data. The goal of this master project is to study selected graph properties in order to identify those that enable simple, reduced representations of graphs. The obtained summaries are intended to be used without decompression for complex graph operations such as learning, classification and querying.
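As a toy illustration of a property-based reduction (not the project's actual method; names are hypothetical), vertices with identical neighbourhoods ("false twins") can be merged into super-nodes of a summary graph:

```python
from collections import defaultdict

def twin_classes(adj):
    """Group vertices with identical neighbourhoods ("false twins").

    adj maps each vertex to the set of its neighbours; each returned class
    could be collapsed into a single super-node of a summary graph.
    """
    groups = defaultdict(list)
    for v, nbrs in adj.items():
        groups[frozenset(nbrs)].append(v)
    return sorted(sorted(g) for g in groups.values())

# Complete bipartite graph K_{2,2}: each side collapses to one super-node.
classes = twin_classes({1: {3, 4}, 2: {3, 4}, 3: {1, 2}, 4: {1, 2}})
```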


URL sujet détaillé : https://liris.cnrs.fr/~hseba/Master/Stage_Master2_Recherche_Graphes_et_Algorithmes.pdf

Remarques : This internship can be followed by a PhD thesis in LIRIS
co-encadrement avec Mohammed Haddad,LIRIS, Lyon 1



SM207-61 Solving equations with Chebyshev approximations  
Admin  
Encadrant : Guillaume MOROZ
Labo/Organisme : Inria
URL : http://gamble.loria.fr/positions.html
Ville : Nancy

Description  

The problem of solving polynomial equations of the form p(x)=0 appears in robotics, biology and many other engineering applications. State-of-the-art software can already solve polynomial equations of degree 10 000 in several minutes.

On the other hand, some results in approximation theory show that if n is the degree of p, then there exists a polynomial q of degree O(sqrt{n k}), such that |q(x)-p(x)| < 10^{-k} for all x in the interval [-1; 1].

The goal of this internship is to use approximation methods to speed up the existing root isolation methods for generic polynomials.
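The degree-reduction result above can be illustrated numerically with numpy's Chebyshev tools (an illustrative sanity check, not the internship's root-isolation method): truncating the Chebyshev expansion of a polynomial with decaying coefficients yields a much lower-degree approximation whose uniform error on [-1, 1] is bounded by the magnitudes of the dropped coefficients, since |T_j(x)| <= 1 there.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# A degree-200 polynomial given by (decaying) Chebyshev coefficients.
rng = np.random.default_rng(0)
n = 200
coeffs = rng.standard_normal(n + 1) / (np.arange(n + 1) + 1.0) ** 3
p = C.Chebyshev(coeffs)

# Truncate to degree 40; since |T_j(x)| <= 1 on [-1, 1], the uniform error
# is bounded by the sum of the dropped coefficients' magnitudes.
m = 40
q = C.Chebyshev(coeffs[: m + 1])

x = np.linspace(-1.0, 1.0, 2001)
err = float(np.max(np.abs(p(x) - q(x))))
bound = float(np.sum(np.abs(coeffs[m + 1:])))
```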

URL sujet détaillé : https://members.loria.fr/GMoroz/internships/isolation_chebyshev.pdf

Remarques :



SM207-62 Scalability of the interference analysis for a multi-core platform  
Admin  
Encadrant : Matthieu MOY
Labo/Organisme : LIP or Verimag
URL : http://www.ens-lyon.fr/LIP/CASH/
Ville : Lyon or Grenoble

Description  

In automotive and avionics, some programs are subject to critical timing constraints.
In these real-time critical systems, the timing properties must be guaranteed.
Among them is the worst-case execution time (WCET), which provides a guaranteed upper bound on the execution time of a program.
On a multi-core platform, this worst-case execution time is highly influenced by all concurrently running programs.
In our work, we studied the Kalray MPPA2 platform (a set of multi-core communicating through a network on chip comprising 256 compute cores, designed by a company located in Montbonnot, near Grenoble), and designed an algorithm able to tightly bound the interference.
Our interference analysis models the interference on the memory bus and the communications through the network-on-chip.
This analysis showed very good results in terms of precision, but is limited in terms of scalability (the overall algorithm is O(n^4) in the number of tasks). We already have ideas on how to reduce the complexity and improve the scalability (by changing the way we iterate to find a solution), and on how to improve the precision further (through a better study of communication, memory accesses and scheduling). The aim of this internship is to explore one of these improvement axes and implement it in our analysis tool.



URL sujet détaillé : http://www.ens-lyon.fr/LIP/CASH/wp-content/uploads/2018/10/m2-wcet-scalability.pdf

Remarques : Co-supervised between Matthieu Moy (LIP) and Claire Maiza (Verimag). The internship can be physically hosted either in Grenoble or in Lyon.



SM207-63 Programming models for optimised compilation  
Admin  
Encadrant : Ludovic HENRIO
Labo/Organisme : LIP, ENS Lyon
URL : http://www.ens-lyon.fr/LIP/
Ville : Lyon

Description  

The goal of this internship is to explore, through examples, optimization and parallelization opportunities.

The expected outcomes of the internship are:
- Identify primitives needed for efficient parallelism (e.g. efficient FIFO communication), and implement them as a library.
- Identify patterns in the application that map to parallel constructs, and prototype the transformation algorithms to these constructs by applying them manually.
- Generalize the concepts above as a programming language.

The internship could proceed through the following steps (it will most likely not cover them all, but rather focus on a few):
1. Understand a small HPC application (e.g. a kernel of a few tens of lines).
2. Apply optimizations by hand. Optimizations of interest here are mainly the ones using data-flow between parts of the program. For instance, a program containing two consecutive loops may in some case be transformed into one process per loop, with a limited-size buffer between processes.
3. Design high-level programming constructs, for example inspired by actor-based models, that would allow splitting the program into multiple sub-programs and help optimization. This would make explicit the composition and interactions of sub-programs.
4. Rework the optimization phase (2) using these constructs.
5. Specify formally the semantics of the newly proposed constructs.
6. Specify formally the optimizations and prove their correctness with respect to the semantics.
7. Generalization: use the new model on other programs.
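The loop-to-process transformation of step 2 can be sketched as follows (a pure-Python illustration with hypothetical names and a toy computation; a real implementation would target an HPC language): two consecutive loops become a producer and a consumer connected by a limited-size FIFO buffer.

```python
import queue
import threading

def sequential(data):
    """Two consecutive loops over the data (the original form)."""
    tmp = [x * 2 for x in data]       # first loop
    return [x + 1 for x in tmp]       # second loop

def pipelined(data):
    """Same computation, one process (here: thread) per loop,
    communicating through a limited-size buffer."""
    buf = queue.Queue(maxsize=4)      # bounded FIFO between the two stages
    out = []

    def producer():
        for x in data:
            buf.put(x * 2)
        buf.put(None)                 # end-of-stream marker

    def consumer():
        while (x := buf.get()) is not None:
            out.append(x + 1)

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return out
```

Because the buffer is a FIFO with a single producer and a single consumer, the pipelined version preserves the order and result of the sequential one.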

This proposal is very flexible, and may evolve into:
- a more theoretical study, focusing on points 5 and 6 and on the design of the programming model (but still starting with an example program and its optimization);
- or a more applied study, applied to several programs and focused on the feasibility of the optimizations (implementation of a simple optimizing compiler).
URL sujet détaillé : http://www.ens-lyon.fr/LIP/CASH/wp-content/uploads/2018/10/m2-intermediate-model.pdf

Remarques : Co-advisors: Matthieu Moy and Christophe Alias (Cash team, LIP)



SM207-64 Extracting Information from Moments  
Admin  
Encadrant : Jean-bernard LASSERRE
Labo/Organisme : CNRS LAAS
URL : http://homepages.laas.fr/lasserre/
Ville : Toulouse

Description  

Certification and validation of computational results is a major issue for modern sciences raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of certification and information extraction. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or more recently the collision of Google's self-driving car in 2016.

These issues give rise to many mathematical problems. Polynomial optimization (which consists in computing the infimum of a polynomial function under algebraic constraints) is one of the most important and challenging of them.
The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering. The backbone of this powerful methodology is the "moment-SOS" approach, also known as "Lasserre hierarchy".

This research project aims at designing polynomial optimization algorithms to extract useful information from moment data.

In the context of global optimization, we are concerned with the design of solution extraction procedures, providing, for instance, the minimizer(s) of a given polynomial.
For a finite set of minimizers, this corresponds to the support of a discrete measure. To improve upon existing extraction procedures, we intend to rely on the Christoffel polynomial associated to the matrix whose entries are the moments of such a measure.
Previous research focused on approximating the set of minimizers for linear functionals coming from positive measures (e.g. convex combination of Dirac measures).
To find the set of minimizers of general nonnegative polynomials, we intend to study the Christoffel function associated to linear functionals coming from signed measures.

Several applications lead to approximating functions corresponding to measures supported on indicator functions or trajectories.
However it is typically hard to approximate them with polynomials of increasing degrees in order to obtain a point-wise convergence behavior. Such non-uniform convergence behaviors are related to an effect known as the Gibbs phenomenon.
Another goal of this project is to explain and attenuate this phenomenon. The starting point will be to investigate classical optimization problems, with available solutions.
URL sujet détaillé : http://www-verimag.imag.fr/~magron/sujets/extract.pdf

Remarques : Co-advised by Jean-Bernard Lasserre and Victor Magron.

The Master student will be hosted by the Mac team in the LAAS laboratory, located at Toulouse.

Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, signal processing, control, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.

Regular internship salary.



SM207-65 Exact Certificates for Noncommutative Polynomial Optimization  
Admin  
Encadrant : Victor MAGRON
Labo/Organisme : CNRS LAAS
URL : http://www-verimag.imag.fr/~magron
Ville : Toulouse

Description  

Certification and validation of computational results is a major issue for modern sciences raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of exact certification. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or more recently the collision of Google's self-driving car in 2016. Since these systems often involve nonlinear functions, such as polynomials, it is highly desirable to design exact polynomial optimization methods.
In 2001, Lasserre introduced a hierarchy of convex relaxations for approximating polynomial infima. Each relaxation is solved with a semidefinite programming (SDP) solver, implemented in finite-precision, thus providing only approximate certificates. The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering.

This project intends to design hybrid symbolic/numeric algorithms that output exact certificates for optimization problems with polynomial data involving non-commutative (NC) variables, e.g. matrices. NC polynomials make it possible to handle certain control systems (e.g. orbit problems or H-infinity control) and quantum computation problems (e.g. the violation level of Bell inequalities or bosonic energy computations). That is, one relies on expressions obtained by adding and multiplying matrices. A wide class of engineering problems has motivated major developments in this recently emerging area.
Connections between linear control systems may produce highly complicated systems involving polynomial matrix coefficients. These hard systems can be analyzed thanks to an NC variant of the Lasserre hierarchy. This hierarchy shares the same interesting properties as in the commutative case, but also comes with certification and scalability issues.


The challenging goal of this internship is to design algorithms to compute exact certificates for non-commutative polynomial optimization problems, while controlling the bit complexity of the algorithmic procedures.
Preliminary work will consist of studying the existing algorithms to obtain exact decompositions of nonnegative polynomials. In the commutative case, univariate polynomials have been recently studied by means of classical techniques from symbolic computation (root isolation, square-free decomposition). An extension to multivariate polynomials has been derived thanks to a perturbation/compensation algorithm.

A promising strategy is to suitably adapt this algorithm to the NC case.
After designing the certification framework, further efforts should provide the related bit-complexity estimates, both on runtime and output size. Practical experiments shall be performed by implementing a tool within the Maple library RealCertify. One expected goal is to compare the performance of the tool with existing frameworks.

URL sujet détaillé : http://www-verimag.imag.fr/~magron/sujets/multivsohs.pdf

Remarques : Co-advised by Jean-Bernard Lasserre and Victor Magron.

The Master student will be hosted by the Mac team in the LAAS laboratory, located at Toulouse.

Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, signal processing, control, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.

Regular internship salary.



SM207-66 Exact Semidefinite Approximations of Reachable Sets  
Admin  
Encadrant : Victor MAGRON
Labo/Organisme : CNRS LAAS
URL : http://www-verimag.imag.fr/~magron
Ville : Toulouse

Description  

Certification and validation of computational results is a major issue for modern sciences raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of exact certification. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or more recently the collision of Google's self-driving car in 2016.

These issues give rise to many mathematical problems. Polynomial optimization (which consists in computing the infimum of a polynomial function under algebraic constraints) is one of the most important and challenging of them.
The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering.

Consider for instance the problem of minimizing 4x^4 + 4x^3y - 7x^2y^2 - 2xy^3 + 10y^4 over R^2. One way to certify that its minimum is 0 is to decompose this polynomial as a sum of squares (SOS), which is the core subject of study in real algebra. Here the decomposition is (2xy + y^2)^2 + (2x^2 + xy - 3y^2)^2 >= 0. In general, one can compute such SOS decompositions by solving a semidefinite program (SDP), which is a standard tool in applied mathematics and convex optimization. In SDP, one optimizes a linear function under the constraint that a given matrix is positive semidefinite, i.e. has only nonnegative eigenvalues. One particular issue arising when relying on SDP solvers is that they are often implemented using floating-point arithmetic, and thus output only approximate certificates.
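The decomposition above can be checked exactly with integer arithmetic: both sides have degree at most 4 in each variable, so agreement on a 5 x 5 grid of points proves the polynomial identity, and a sum of squares is trivially nonnegative.

```python
def p(x, y):
    # The polynomial to certify nonnegative.
    return 4*x**4 + 4*x**3*y - 7*x**2*y**2 - 2*x*y**3 + 10*y**4

def sos(x, y):
    # Its sum-of-squares decomposition.
    return (2*x*y + y**2)**2 + (2*x**2 + x*y - 3*y**2)**2

# Degree <= 4 in each variable: agreement on a 5 x 5 integer grid proves
# p == sos everywhere, hence min p = p(0, 0) = 0.
identity_holds = all(p(a, b) == sos(a, b) for a in range(5) for b in range(5))
```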

Given a dynamical polynomial system described by a continuous-time (differential) equation, the reachable set (RS) is the set of all states that can be reached from a set of initial conditions under general state constraints. This set appears in different fields such as optimal control, hybrid systems and program analysis. In general, computing or even approximating the RS is a challenge.

The ambitious goal of this project is to compute exact enclosures of the RS for continuous-time systems.

Preliminary work will consist of studying the existing algorithms to obtain exact SOS decompositions of nonnegative polynomials. In particular, the case of multivariate polynomial optimization has recently been handled thanks to a perturbation/compensation algorithm.

A promising research track is to design a similar algorithm to compute certified approximations for the reachable set of continuous-time polynomial systems. The first step shall be to derive a characterization of the RS, following the scheme developed for discrete-time systems.

Practical experiments shall be performed by implementing a tool within the Maple library RealCertify. A further expected goal is to compare its performance with existing frameworks.
URL sujet détaillé : http://www-verimag.imag.fr/~magron/sujets/reachode.pdf

Remarques : The internship will be co-advised by Victor Magron (CNRS) and Khalil Ghorbal (Hycomes group, INRIA Rennes, France). The Master student will be hosted either at Inria (Rennes, France) in the Hycomes group, or at LAAS-CNRS (Toulouse, France) in the MAC team.

Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, computer arithmetic, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.

Regular internship salary.



SM207-67 Coinductive equivalences for extended lambda-calculi  
Admin  
Encadrant : Daniel HIRSCHKOFF
Labo/Organisme : LIP, ENS de Lyon
URL : http://perso.ens-lyon.fr/daniel.hirschkoff/
Ville : Lyon

Description  

Coinductive proof techniques have been developed to reason about various kinds of foundational calculi, notably concurrent process calculi such as CCS and the pi-calculus, as well as the lambda-calculus and some of its extensions.

In such a setting, the (coinductive) bisimulation proof method can be enhanced with various techniques, which make the development of proofs of equivalences between programs easier.

The goal of this internship is to study coinductive proof techniques for non-deterministic extensions of the lambda-calculus.
See the PDF file for a short bibliography, which gives an idea of the kind of work expected during this internship.
URL sujet détaillé : http://perso.ens-lyon.fr/daniel.hirschkoff/Stages/dh-lambda.pdf

Remarques : The internship will be co-supervised with Davide Sangiorgi (Univ. Bologna and INRIA).
The internship can be funded.
A PhD after the internship is among the options.



SM207-68 Big Data architecture for an agriculture application  
Admin  
Encadrant : Raja CHIKY
Labo/Organisme : LISITE-ISEP
URL : www.isep.fr
Ville : Issy-les-Moulineaux

Description  

The LISITE laboratory of ISEP is working with an industrial partner (Cap2020) on an innovative smart-farming project, "IDEAC". The IDEAC project aims to help farmers make precise and predictive decisions for maximum productivity, greater profits and more sustainability.
An open source architecture has been proposed to address the different challenges (heterogeneous data collection (sensors, server APIs, social networks...), big data storage, data analytics...). In this context, we are looking for a highly motivated intern willing to join the project.
The goal of this internship is to build a use-case scenario to validate the architecture. The intern will have the following responsibilities:
- Installing the architecture components on a virtual machine (documentation is provided)
- Understanding Apache NiFi's architecture and dataflow concepts
- Ingesting data using the provided APIs and identifying how to parse heterogeneous data into a normalized format
- Creating a script to query HBase
Key words:
Apache NiFi, Hadoop, HBase...
Requirements:
- Master, Bac +5 in computer science / engineering or equivalent
- Familiar with Big Data technologies (Big Data processing and management) and distributed systems: Hadoop, Spark, HBase
- Good scripting and coding skills (Python, Java or C++)
- Autonomy and ability to work in a team

URL sujet détaillé : :

Remarques : Duration: 4 to 6 months
Start Date: as soon as possible
Grant: yes
Hosting team: LISITE, Institut Supérieur d'Électronique de Paris (ISEP) (https://www.isep.fr/)
Application instructions: To apply, please send a mail and CV to raja.chiky.fr





SM207-69 Automatic Verification of an Abstract Machine for Synthesized Operating System Schedulers  
Admin  
Encadrant : Xavier RIVAL
Labo/Organisme : DIENS/INRIA (ENS Paris)
URL : https://www.di.ens.fr/~rival/
Ville : Paris

Description  

The goal of this internship is to build a verifier for implementations of schedulers derived by compiling high-level descriptions in a domain-specific language. Schedulers are complex and critical components of all operating systems, and they are very challenging to verify in general. However, this internship will rely on knowledge of the domain-specific language and its runtime in order to identify the key properties to verify and the abstractions that make the verification achievable. An abstract interpreter for programs manipulating complex data structures will be used as a basis to set up abstract domains specific to the scheduling problem.
URL sujet détaillé : https://www.di.ens.fr/~rival/interns/stage-sched.pdf

Remarques : The internship is taking place as part of a funded ANR project (so that funding is available both for the internship and for a PhD thesis).



SM207-70 Separation Logic with Summary Predicates Described by Set Universal Quantification  
Admin  
Encadrant : Xavier RIVAL
Labo/Organisme : DIENS/INRIA (ENS Paris)
URL : https://www.di.ens.fr/~rival/
Ville : Paris

Description  

The goal of this internship is to extend separation logic with a novel summarization technique that involves no recursion, and is thus adapted to data structures without an inductive definition, such as graphs or union-find structures. Summaries should make it possible to describe memory areas of unbounded size. Furthermore, they should serve to create an abstract domain able to support the automatic verification of programs that manipulate unbounded, non-recursive data structures. The implementation will be based on MemCAD, an existing abstract-interpretation-based static analyzer.
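Union-find is a good example of such a non-inductive structure: it is a flat array of parent pointers with no head/tail decomposition for a recursive predicate to follow. A minimal sketch:

```python
class UnionFind:
    """Flat array-of-parents representation: there is no inductive
    (head/tail) structure for a recursive summary predicate to follow."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root

    def find(self, x):
        # Path compression: point visited nodes closer to the root.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

uf = UnionFind(6)
uf.union(0, 1)
uf.union(1, 2)
connected = uf.find(0) == uf.find(2)
```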
URL sujet détaillé : https://www.di.ens.fr/~rival/interns/stage-set-ext.pdf

Remarques : The internship is taking place as part of a funded ANR project (so that funding is available both for the internship and for a PhD thesis).



SM207-71 Reversible Causal Graph Dynamics  
Admin  
Encadrant : Pablo ARRIGHI
Labo/Organisme : LIS (or IXXI)
URL : http://www.lis-lab.fr/
Ville : Marseille (or Lyon)

Description  

Cellular Automata consist of a grid of identical cells, each of which may take a state from a
finite set. The state of each cell at time t + 1 is computed by applying a fixed local rule to the
state of that cell at time t and that of its neighbours, synchronously and homogeneously across space.
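
For concreteness (an illustration of the definition above, not part of the internship subject), a one-dimensional, two-state cellular automaton can be simulated in a few lines of Python; the same local rule is applied synchronously at every cell:

```python
# One-dimensional, two-state cellular automaton (Wolfram rule 110).
# Each cell's next state depends on itself and its two neighbours,
# applied synchronously and identically across a periodic grid.

RULE = 110

def step(cells):
    n = len(cells)
    return [
        # encode the (left, self, right) neighbourhood as a 3-bit index
        # and look up the corresponding bit of the rule number
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

cells = [0] * 15
cells[7] = 1  # single live cell in the middle
for _ in range(5):
    cells = step(cells)
```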

Causal Graph Dynamics [1] extend Cellular Automata from fixed grids to time-varying graphs. That is, the whole graph evolves in discrete time steps, and this global evolution is required to have a number of physics-like symmetries: shift-invariance (it acts everywhere in the same way) and causality (information has a bounded speed of propagation). One could also demand reversibility (the whole evolution is a bijection over the set of labelled graphs). That way we obtain a Computer Science-inspired toy model that has features mimicking both Quantum Theory (reversibility) and General Relativity (evolving topology).
URL sujet détaillé : https://drive.google.com/file/d/1j07HykXuEo4qrx4qTtC3eG01U_YtaraN/view?usp=sharing

Remarques : Usual remuneration. Co-supervision: see the detailed subject.



SM207-72 Integration of structured V2V components via CBL clustering into a vehicular network infrastructure - evaluation on an ADAS case study  
Admin  
Encadrant : Martine WAHL
Labo/Organisme : LEOST team of the IFSTTAR/COSYS
URL : http://www.ifsttar.fr/?id=134&fe_users[uid]=497&no_cache=1
Ville : Villeneuve-d'Ascq

Description  

In the context of a research project on Advanced Driver-Assistance Systems (ADAS), we are developing a simulation testbed for evaluating various scenarios of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. The goal of this internship is to design and realize the scenarios that will allow evaluating a clustering algorithm, initially proposed for structuring a Vehicular Ad Hoc Network based only on V2V communications, in a context where an infrastructure also allows V2I communications.

After an introduction to network and traffic modeling with tools such as OPNET, MATLAB and SUMO, the research work will start with a study of the Chain-Branch-Leaf (CBL) clustering scheme, designed at IFSTTAR in collaboration with ULCO [1], and of the existing scenarios and applications already modeled in OPNET Riverbed Modeler.

Using real-world road traffic data, the trainee will design and model a deployment of 802.11p Road Side Units (RSUs) representing the infrastructure in the area where 802.11p-equipped vehicles will be moving. They will then propose various strategies for an appropriate distribution of the traffic generated by the vehicles between V2V and V2I communications.

Finally, they will apply these strategies to the transmission of the traffic of some ADAS applications, and evaluate their efficiency using the simulation testbed they will have built.

Please note that a PhD subject related to this internship subject is proposed in the same team.
URL sujet détaillé : https://www.abg.asso.fr/en/candidatOffres/show/id_offre/80946

Remarques : This work will be co-supervised by
Patrick Sondi, MCF at ULCO (LISIC)
http://www-lisic.univ-littoral.fr/spip.php?article50&membre...

Payment:
The internship is paid €26.25/day.



SM207-73 Arithmetic and casting in Lean  
Admin  
Encadrant : Robert LEWIS
Labo/Organisme : Vrije Universiteit Amsterdam
URL : https://lean-forward.github.io/
Ville : Amsterdam, NL

Description  

Proof assistants (also called interactive theorem provers) make it possible to develop computer-checked, formal proofs of theorems. Lean is a new proof assistant based on a dependent type theory similar to Coq's. Constructing formal proofs by hand can be tedious, especially when dealing with simple numeric or arithmetic facts. We often want the system to manage such proofs for us. The goal of this project is to adapt tools for integer arithmetic and casting to Lean. Using Lean's metaprogramming framework, these tools can be implemented in the language of Lean itself.
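
A tiny example of the kind of goal such tools target (a sketch only; syntax assumed to be Lean 3 with mathlib, whose cast tactics play this role today):

```lean
-- Mixing ℕ and ℤ forces coercions (casts) into goals; tactics like
-- mathlib's `exact_mod_cast` discharge the cast bookkeeping automatically.
example (n : ℕ) (h : (n : ℤ) ≥ 2) : n ≥ 2 :=
by exact_mod_cast h
```
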
URL sujet détaillé : https://lean-forward.github.io/internships/arith_cast.pdf

Remarques : The project will be co-advised by Jasmin Blanchette. In addition to any compensation offered by the intern's home institution, we offer €250 per month and will reimburse travel expenses to Amsterdam.



SM207-74 Designing High Performance Parallel Algorithms for Tensor Computations  
Admin  
Encadrant : Oguz KAYA
Labo/Organisme : Laboratoire de Recherche en Informatique (LRI) - Université Paris-Sud
URL : www.oguzkaya.com, https://www.lri.fr/equipe.php?eq=27
Ville : Orsay

Description  

A tensor is a multi-dimensional array that generalizes matrices and vectors (i.e., a matrix is a 2-D tensor and a vector
is a 1-D tensor) to express data/objects of high dimensions (e.g., shopping habits on Amazon can be modelled
using a 3-D tensor with dimensions user × product × time). Tensors have gained immense popularity in the recent
past with fundamental applications in numerous domains including signal processing, quantum chemistry/physics,
scientific computing, big data analysis, and machine learning. In parallel, there has been a significant
effort to provide fast tensor libraries with high performance parallel implementations of tensor kernels in various
settings (shared memory, distributed memory, and GPU parallelism) and even hardware implementations (e.g., Tensor
Cores in Nvidia Turing/Volta and the Tensor Processing Unit of Google). High performance tensor computation is an
ongoing research area with enormous impact and many challenges to overcome.

Sparse tensors have been adopted by many applications in data analytics, including recommender systems, anomaly
detection, graph analytics and healthcare data analysis, in which data is sparse (meaning most entries in the tensor
representing the data are zero). In these applications, datasets typically contain millions to billions of entries, which
makes analyzing them computationally very expensive; hence the need for very efficient parallel algorithms and
implementations. Sparse tensor algorithms operate only on the nonzero elements of the tensor, and the irregularity
of the sparsity pattern introduces many challenges, such as irregular memory accesses, load imbalance, and unbalanced
communication costs, which make scalable parallel algorithms difficult to obtain.
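
For illustration (a toy sketch with made-up data, not from the subject), a sparse 3-D tensor in coordinate (COO) format stores only its nonzero entries, and kernels iterate over that list rather than over the full dense array:

```python
# A 3-D sparse tensor in coordinate (COO) format: one (i, j, k, value)
# record per nonzero. Kernels touch only the stored entries, so their
# cost scales with the number of nonzeros rather than with I*J*K.

# toy user x product x time ratings, as in the Amazon example
nnz = [(0, 1, 0, 5.0), (2, 0, 1, 3.0), (0, 1, 1, 4.0)]

def mode0_sums(nnz, n_rows):
    """Sum of each horizontal slice T[i, :, :] -- an O(nnz) kernel."""
    sums = [0.0] * n_rows
    for i, j, k, v in nnz:
        sums[i] += v
    return sums

print(mode0_sums(nnz, 3))  # [9.0, 0.0, 3.0]
```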

Dense tensor decompositions have been increasingly used in the solution of linear systems (Ax = b) and eigenvalue
problems (Ax = λx) that appear in scientific simulations, providing substantial gains with respect to matrix-based
techniques. They enable expressing the problem (the matrix A and the vectors x, b, etc.) in a compressed tensor
form, then provide a framework that enables carrying out all matrix and vector operations in this compressed form
(matrix-vector and -matrix multiplication, vector inner product, matrix or vector addition and scaling, etc.). Carrying
out these operations as fast as possible with modern HPC methods is indispensable for executing large-scale scientific
simulations in a timely manner.

In both scenarios, bringing HPC techniques into tensor computations is an utmost need in big data analysis and
scientific computing, and this will be the principal focus of the internship. The internship can lead to a PhD subject for good candidates.
URL sujet détaillé : https://drive.google.com/file/d/0BxUrvV1GfK0wM3RGM2pzMHg2MTdfaVRxejNla2pMVk54emVV/view?usp=sharing

Remarques : The internship will be co-advised by Prof. Marc Baboulin (https://www.lri.fr/~baboulin/). There is also a possibility of funding for the internship (need to contact Prof. Baboulin and Kaya).



SM207-75 Variations on the cake-cutting theorem  
Admin  
Encadrant : Frédéric MEUNIER
Labo/Organisme : Cermics - École des Ponts
URL : https://cermics.enpc.fr/~meuniefr/
Ville : Marne-la-Vallée

Description  

Fair division is an active research area in combinatorics, social choice theory, computer science, etc. One of the main topics in this area is that of envy-free cake cutting where the players have different preferences and the goal is to find a division of a cake (identified with the interval [0,1]) into connected pieces so that it is possible to assign the pieces to the players without making any of them jealous.

The existence of such an envy-free division is ensured under mild conditions. Recently, the question has arisen of whether these conditions could be simplified even further. For instance, what can be said if some players dislike the cake? In the traditional setting, they are all "hungry", and always prefer to get something instead of nothing.

Another question is related to a discrete version of the cake cutting problem. What if the cake can be cut only at some specific positions (e.g., the cake is a necklace and the beads cannot be cut)? Is it possible to find an "approximate" envy-free division?

All these questions also come with algorithmic challenges.

The objective of this research project is to study possible extensions and variations of the envy-free cake cutting theorem and to start with very concrete questions such as the ones mentioned above.
URL sujet détaillé : :

Remarques : Stipend possible.



SM207-76 Formalization of polyhedra in the proof assistant Coq  
Admin  
Encadrant : Xavier ALLAMIGEON
Labo/Organisme : INRIA Saclay and CMAP, Ecole Polytechnique
URL : http://www.cmap.polytechnique.fr/~allamigeon
Ville : Palaiseau

Description  

Convex polyhedra are the subsets of R^n defined by systems of linear (affine) inequalities. They play a major role in pure mathematics (algebraic geometry, discrete maths, combinatorics), in applied mathematics (optimization, operations research, control theory) as well as in computer science (computational geometry, software verification, compilation, program optimization, constraint solving). The growing role of mathematical software in the study of open problems in mathematics, as well as the critical nature of the aforementioned applications, provides a strong motivation to formalize the theory of polyhedra in a proof assistant. A first milestone in this direction has been reached in the recent work [1]. This has led to the development of the library CoqPolyhedra (https://github.com/nhojem/Coq-Polyhedra).

The internship is motivated by the project of formalizing the theory of polyhedra in Coq and its applications to mathematical problems. We propose two examples of subjects, which can be adapted according to the taste of the candidate. Both naturally lead to a PhD on the topic.
The first subject aims at developing the computational abilities of the library CoqPolyhedra, i.e., making it possible to formally prove mathematical properties on instances of polyhedra of large size. The objective is to limit the slowdown incurred by doing computations within Coq (compared with an informal polyhedral computation library), in order to make an integration of CoqPolyhedra into mathematical software practically possible.
The second subject is about the abstract representation of polyhedra in the proof assistant. Indeed, polyhedra admit many possible representations, and the choice of a good representation depends on the context. The goal is to find a formalization allowing one to manipulate the different representations of polyhedra as easily as in a pen-and-paper proof.

[1] Xavier Allamigeon and Ricardo D. Katz. A formalization of convex polyhedra based on the simplex method. Journal of Automated Reasoning, Aug 2018.
URL sujet détaillé : http://www.cmap.polytechnique.fr/~allamigeon/M2_internship_ENS_Lyon.pdf

Remarques :



SM207-77 Local computations on distributed data  
Admin  
Encadrant : Xavier URBAIN
Labo/Organisme : LIRIS
URL : https://perso.liris.cnrs.fr/xavier.urbain/
Ville : Villeurbanne

Description  

Automatic proof of properties of distributed data systems.
URL sujet détaillé : https://perso.liris.cnrs.fr/xavier.urbain/ens/M2IF/1819_bd.pdf

Remarques : Co-supervised by E. Coquery, LIRIS,
and S. Tixeuil, LIP6



SM207-78 Formal models for probabilistic distributed algorithms  
Admin  
Encadrant : Xavier URBAIN
Labo/Organisme : LIP6/LIRIS
URL : https://perso.liris.cnrs.fr/xavier.urbain/
Ville : Paris/Villeurbanne

Description  

Introduction, in a formal framework, of probabilistic components for the behaviour of autonomous robot swarms. Case study: extension of current solutions for scattering.
URL sujet détaillé : https://perso.liris.cnrs.fr/xavier.urbain/ens/M2IF/1819_proba.pdf

Remarques : Co-supervised by S. Tixeuil, LIP6,
and P. Courtieu, Cedric



SM207-79 Backward Abstract Interpretation using Over and Under-Approximations  
Admin  
Encadrant : Antoine MINÉ
Labo/Organisme : LIP6
URL : https://www-apr.lip6.fr/~mine/mopsa
Ville : Paris

Description  

The goal of the internship is to develop backward value analyses in a static analyzer by abstract interpretation in a realistic setting for C-like programs featuring notably functions and pointers. The intern will consider both over-approximations and under-approximations.

The theory of abstract interpretation allows the design of effective and efficient static analyzers able to compute approximations of program semantics. The overwhelming majority of analyzers compute over-approximations:
- A classic forward analysis computes an over-approximation of the states reachable from the program entry.
- A classic backward analysis computes the co-reachable states from some desired states. This set is also over-approximated, so as to infer necessary conditions for the program to reach the desired states.
- Over-approximating forward and backward analyses can be iterated, either to provide automatically some context for the false positives of an imprecise analysis, or to provide interactive abstract debugging.

In theory, an under-approximating backward analysis could instead provide sufficient conditions for a target program state to be reached. Applications include:
- Proving that an alarm is a true error and not a false alarm by inferring a counter-example execution
- Inferring procedure contracts: sufficient assertions to insert at the beginning of a procedure to ensure that its execution will never fail.
- Evaluating the uncertainty of an analysis by combining both over-approximations and under-approximations, and quantifying the distance between them. Such analyses may be iterated to improve their precision.

The lack of effective under-approximating infinite-state abstract domains has limited the development of Abstract Interpretation techniques to solve these problems. Effective under-approximations have been proposed for classic numeric domains (intervals, polyhedra) to infer sufficient conditions for the absence of array bound errors in simple programs, and later found applications in proving liveness properties. These can serve as the basis for the internship, but will require extensions to ensure precision, scalability, and support for non-numeric variables such as pointers.
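
To fix ideas (a toy sketch in the interval domain, for exposition only), a backward analysis propagates a target state against the flow of execution; the simplest case is refining the state before a test:

```python
# Toy backward step in the interval domain: given a target set of states
# after a test, compute a refined (over-approximated) set of states
# before it, by intersecting with the states satisfying the test.

def meet(a, b):
    """Intersection of two intervals; None denotes bottom (unreachable)."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# Forward: from x in [0, 100], the branch `if x < 10` yields x in [0, 9].
# Backward: to reach the branch body, a necessary condition on the
# pre-state is x in [0, 100] intersected with (-inf, 9], i.e. [0, 9].
pre = (0, 100)
target = (float("-inf"), 9)   # integer states satisfying `x < 10`
print(meet(pre, target))      # (0, 9)
```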

The intended work will include a theoretical side: developing abstract semantics and proving formally their soundness. It will also include a practical side: implementing the semantics and validating their benefit experimentally.

The host team is developing a static analysis platform, Mopsa, that includes an analysis of C programs using several, ready-to-use abstractions, and a framework to easily extend it to new abstractions. A first step will be to add support for backward iterations in Mopsa, and evaluate classic over-approximating backward analyses on programs featuring numeric and pointer variables. Then, the intern will implement and evaluate the novel over-approximating and under-approximating backward operators developed during the internship. Feedback from experimentations throughout the internship will guide the design of new abstractions tailored to concrete problems in realistic settings.

URL sujet détaillé : https://www-apr.lip6.fr/~mine/enseignement/mpri/2018-2019/stages/ai_back.pdf

Remarques : Internship proposed as part of the ERC Mopsa project.



SM207-80 Static Analysis of Ethereum Smart Contracts by Abstract Interpretation  
Admin  
Encadrant : Antoine MINÉ
Labo/Organisme : LIP6
URL : https://www-apr.lip6.fr/~mine/mopsa
Ville : Paris

Description  

The goal of the internship is to develop, prove correct, and implement novel static analyses by Abstract Interpretation to verify the correctness of smart contracts.

The internship focuses on Ethereum, a cryptocurrency framework that employs blockchain techniques and supports the execution of smart contracts. Smart contracts can be written in a variety of languages, which are compiled into bytecode that is run on the Ethereum Virtual Machine (EVM). The EVM provides a Turing-complete, stack-based machine with a byte-array based memory and an associative storage (with 256-bit keys and values). The EVM is completely specified formally, giving it an unambiguous semantics. Hence, the internship will focus on analyzing EVM bytecode.

Application of formal methods to smart contracts is a recent area of research, with few results yet. Examples include preliminary analyses using type and effect systems with F*, and using model-checking with Spin. We are not aware of works based on Abstract Interpretation.

Several causes of vulnerabilities in Ethereum smart contracts have been uncovered. Given the novelty of the subject, a preliminary study must be performed to determine precisely which vulnerabilities can be efficiently detected by static analysis, and which properties must be inferred to detect them. Nevertheless, one promising direction is to focus on vulnerabilities related to exceptional behaviors (i.e., run-time errors). The intern may focus primarily on the following analyses:

- Checking exceptions. Erroneous operations in the execution of a contract cause exceptions to be raised and the contract to be terminated. Some exceptions, though, are propagated back to the caller contract, while others result in return codes, which may not be checked appropriately, causing vulnerabilities. A static analysis could detect that possible exceptions are correctly handled, even through calls to contracts.

- Call stack depth. The EVM call stack depth is limited to 1024, and exceeding this limit causes an exception that may not be properly handled by a contract and can be the basis of attacks. A static stack depth analysis can alleviate this kind of error.

- Atomicity vulnerabilities. Contracts do not operate atomically: a contract can call another contract, which can access and exploit, for ill effects, the intermediate state the calling contract is in. For instance, a reentrancy bug, where a contract is called again from within its own execution, is the basis of the attack on The DAO, which caused a 60 million US dollar loss in June 2016.

- Transaction fees. Every operation in the EVM comes with a cost, measured in an abstract unit called gas. The cost of a contract is specified up-front, and the execution stops (with an exception) whenever this cost is exceeded. A cost analysis can check statically whether a contract will meet its cost target. Such an analysis can leverage the large body of literature on worst case execution time of binary programs.
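
The gas mechanism can be illustrated with a minimal sketch (opcode names and costs below are simplified stand-ins, not the real EVM fee schedule):

```python
# Toy gas metering: every operation consumes gas, and running out raises
# an out-of-gas exception -- exactly the exceptional behavior a static
# cost analysis would try to rule out. Costs are illustrative only.

COST = {"PUSH": 3, "ADD": 3, "SSTORE": 20000}

class OutOfGas(Exception):
    pass

def run(program, gas):
    for op in program:
        gas -= COST[op]
        if gas < 0:
            raise OutOfGas(op)  # execution halts with an exception
    return gas  # remaining gas on normal termination

print(run(["PUSH", "PUSH", "ADD"], 100))  # 91
```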

The intern will leverage existing abstractions, such as numeric abstractions, and develop new abstractions if needed. Most static analyses target source-level programs. Analyzing bytecode-level programs poses additional challenges due to the very low level of the instructions, and the tendency of fine-grained abstract transfer functions to accumulate imprecision. Methods including developing specific relational domains or high-level expression reconstruction may be required to reach the intended level of precision.

The intended work will include a theoretical side: developing abstract semantics and proving formally their soundness. It will also include a practical side: implementing the semantics in a static analyzer and validating their benefit experimentally on sample smart contracts. The host team is developing a static analysis platform, Mopsa, that includes several ready-to-use abstractions, and a framework that allows easily extending the analyses both to new abstractions, and to new languages. For experimental purposes, the intern will thus write an EVM bytecode frontend for Mopsa, reuse existing Mopsa abstractions, and add the necessary abstractions to carry the static analysis.

URL sujet détaillé : https://www-apr.lip6.fr/~mine/enseignement/mpri/2018-2019/stages/ai_contract.pdf

Remarques : Internship proposed as part of the ERC Mopsa project.



SM207-81 Dataflow analysis of malicious binary codes. Toward a study of the cartography of functionalities and their correlations.  
Admin  
Encadrant : Jean-yves MARION
Labo/Organisme : LORIA
URL : http://www.loria.fr/fr/
Ville : Nancy

Description  

Thanks to its High Security Lab (LHS), the Carbone team at Loria has developed an innovative method called morphological analysis. This method can detect code similarities, detect functionalities embedded in a binary code, and detect malware. The objective is to rebuild the dataflow graph in order to map the set of functionalities used inside a malicious code.

URL sujet détaillé : https://members.loria.fr/JYMarion/internship-subject-sujet-stage-m2/

Remarques : The internship will be paid, and accommodation can be covered.




SM207-82 Multi-Flows and Multi-Cuts in Planar Graphs  
Admin  
Encadrant : Chien-chung HUANG
Labo/Organisme : TALGO, DIENS
URL : https://www.di.ens.fr/~cchuang/
Ville : Paris

Description  

Given a graph G=(V,E) with edge capacities and pairs of terminals (s_i, t_i), the min-cut problem asks to remove a minimum-capacity set of edges so that each pair of terminals is separated. The maximum integral flow problem asks to send the maximum amount of integral flow between the pairs of terminals while respecting the capacities. The latter problem includes the edge-disjoint path problem as a special case.

The two problems are classical problems in theoretical computer science. This project will focus on the special case of planar graphs, with the aim of designing exact, approximation, and fixed-parameter algorithms for the two problems.
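
The single-pair special case can be sketched compactly (an illustration, not the multi-terminal problems of the project): by the max-flow/min-cut theorem, the value computed below also equals the minimum capacity of an s-t cut.

```python
# Single-pair max-flow by Edmonds-Karp (BFS augmenting paths).
from collections import deque

def max_flow(cap, s, t):
    """cap: dict {u: {v: capacity}}; returns the max s-t flow value."""
    res = {u: dict(vs) for u, vs in cap.items()}  # residual capacities
    for u, vs in cap.items():                      # add reverse edges
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}                         # BFS for a shortest
        q = deque([s])                             # augmenting path
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                            # no augmenting path left
        path, v = [], t                            # recover the path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)      # bottleneck capacity
        for u, v in path:
            res[u][v] -= aug
            res[v][u] += aug
        flow += aug

cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
print(max_flow(cap, "s", "t"))  # 4
```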

Knowledge of graph theory, especially of planar graphs, will be much appreciated in candidates interested in this project.
URL sujet détaillé : :

Remarques : Salary is possible



SM207-83 Natural Language Processing  
Admin  
Encadrant : Irina ILLINA
Labo/Organisme : LORIA/INRIA Nancy, Multispeech Team
URL : http://team.inria.fr/multispeech/fr/
Ville : Nancy

Description  

According to the 2017 International Migration Report, the number of international migrants worldwide has grown rapidly in recent years, reaching 258 million in 2017, among whom 78 million in Europe. A key reason for the difficulty of EU leaders to take a decisive and coherent approach to the refugee crisis has been the high level of public anxiety about immigration and asylum across Europe. There are at least three social factors underlying this attitude (Berri et al, 2015): the increase in the number and visibility of migrants; the economic crisis that has fed feelings of insecurity; the role of mass media. The last factor has a major influence on the political attitudes of the general public and the elite. Refugees and migrants tend to be framed negatively as a problem. This translates into a significant increase of hate speech towards migrants and minorities. The Internet seems to be a fertile ground for hate speech (Knobel, 2012).

The goal of this Master internship is to develop a new methodology to automatically detect hate speech, based on machine learning and neural networks. Human detection of this material is infeasible since the contents to be analyzed are huge. In recent years, research has been conducted to develop automatic methods for hate speech detection in the social media domain. These typically employ semantic content analysis techniques built on Natural Language Processing (NLP) and Machine Learning (ML) methods (Schmidt et al., 2017). Although current methods have reported promising results, their evaluations are largely biased towards detecting content that is non-hate, as opposed to detecting and classifying real hateful content (Zhang et al., 2018). Current machine learning methods use only certain task-specific features to model hate speech. We propose to develop an innovative approach that combines these pieces of information into a multi-feature approach, so that the weaknesses of the individual features are compensated by the strengths of other features (explicit hate speech, implicit hate speech, contextual conditions affecting the prevalence of hate speech, etc.).
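
A minimal baseline for text classification (a bag-of-words Naive Bayes sketch with invented toy data; the internship targets much richer multi-feature neural models) looks like this:

```python
# Minimal bag-of-words multinomial Naive Bayes text classifier.
# A baseline sketch only; labels and training sentences are toy data.
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (text, label). Returns per-class statistics."""
    counts, totals, priors = defaultdict(Counter), Counter(), Counter()
    for text, label in docs:
        words = text.lower().split()
        counts[label].update(words)
        totals[label] += len(words)
        priors[label] += 1
    return counts, totals, priors

def predict(model, text):
    counts, totals, priors = model
    vocab = {w for c in counts.values() for w in c}
    best, best_lp = None, float("-inf")
    for label in priors:
        lp = math.log(priors[label] / sum(priors.values()))
        for w in text.lower().split():
            # add-one (Laplace) smoothing for unseen words
            lp += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train([("you are awful", "hate"), ("have a nice day", "ok")])
print(predict(model, "you are awful"))  # hate
```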

References

Baroni, M., Dinu, G., and Kruszewski, G. (2014). "Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 238-247.

Berri, M., Garcia-Blanco, I., and Moore, K. (2015). Press Coverage of the Refugee and Migrant Crisis in the EU: A Content Analysis of Five European Countries. Report prepared for the United Nations High Commission for Refugees, Cardiff School of Journalism, Media and Cultural Studies.

Dai, A. M. and Le, Q. V. (2015). "Semi-supervised sequence learning". In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages 3061-3069. Curran Associates, Inc.

Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., and Xu, K. (2014). "Adaptive recursive neural network for target-dependent Twitter sentiment classification". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Baltimore, MD, USA, Volume 2, pages 49-54.

Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daumé, H. (2015). "Deep unordered composition rivals syntactic methods for text classification". In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 1681-1691.

Johnson, R. and Zhang, T. (2015). "Effective use of word order for text categorization with convolutional neural networks". In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 103-112.

Kim, Y. (2014). "Convolutional neural networks for sentence classification". In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746-1751.

Knobel, M. (2012). L'Internet de la haine. Racistes, antisémites, néonazis, intégristes, islamistes, terroristes et homophobes à l'assaut du web. Paris: Berg International.

Mikolov, T., Yih, W.-t., and Zweig, G. (2013a). "Linguistic regularities in continuous space word representations". In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 746-751.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). "Distributed representations of words and phrases and their compositionality". In Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc.

Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., and Fürnkranz, J. (2014). "Large-scale multi-label text classification - revisiting neural networks". In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD-14), Part 2, volume 8725, pages 437-452.

Schmidt, A. and Wiegand, M. (2017). A Survey on Hate Speech Detection using Natural Language Processing. Workshop on Natural Language Processing for Social Media.

Zhang, Z. and Luo, L. (2018). Hate speech detection: a solved problem? The challenging case of long tail on Twitter. arxiv.org/pdf/1803.03662
URL sujet détaillé : https://team.inria.fr/multispeech/master-r2-internship-in-natural-language-processing-online-hate-speech-against-migrants/

Remarques : Supervisors: Irina Illina, MdC, Dominique Fohr, CR CNRS

Required skills: background in statistics, natural language processing, and programming skills (Perl, Python). Candidates should email a detailed CV with diplomas.



SM207-84 Natural Language Processing  
Admin  
Encadrant : Irina ILLINA
Labo/Organisme : LORIA/INRIA Nancy, Multispeech Team
URL : http://team.inria.fr/multispeech/fr/
Ville : Nancy

Description  

Motivations and context

Semantic and thematic spaces are vector spaces used for the representation of words, sentences or textual documents. The corresponding models and methods have a long history in the field of computational linguistics and natural language processing. Almost all models rely on the statistical semantics hypothesis, which states that statistical patterns of word occurrence (the context of a word) can be used to describe the underlying semantics. The most common method to learn these representations is to predict a word using the context in which it appears [Mikolov et al., 2013b, Pennington et al., 2014], and this can be realized with neural networks. These representations have proved their effectiveness for a range of natural language processing tasks [Baroni et al., 2014]. In particular, the Skip-gram and CBOW models of Mikolov et al. [Mikolov et al., 2013b, Mikolov et al., 2013a] have become very popular because of their ability to process large amounts of unstructured text data with reduced computing costs. The efficiency and the semantic properties of these representations motivate us to explore them for our speech recognition system.

Robust automatic speech recognition (ASR) remains a very ambitious goal. Despite constant efforts and some dramatic advances, the ability of a machine to recognize speech is still far from equaling that of a human being. Current ASR systems see their performance decrease significantly when the conditions under which they were trained differ from those in which they are used. The causes of variability may be related to the acoustic environment, sound capture equipment, microphone change, etc.

Objectives

The speech recognition (ASR) stage will be supplemented by a semantic analysis to detect the words of the processed sentence that could have been misrecognized, and to find words with similar pronunciation that better match the context. For example, the sentence "Silvio Berlusconi, prince de Milan" can be recognized by the speech recognition system as "Silvio Berlusconi, prince de mille ans". A good semantic representation of the sentence context could help to find and correct this error.

The Master internship will be devoted to the innovative study of taking semantics into account through predictive representations that capture the semantic features of words and their context. Research will be conducted on combining semantic information with information from denoising to improve speech recognition. As deep neural networks (DNNs) can model complex functions and achieve outstanding performance, they will be used throughout our modeling.
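
The underlying idea can be sketched with cosine similarity between word vectors (the 3-d vectors below are toy stand-ins, not trained embeddings): among acoustically confusable hypotheses, prefer the one whose embedding best matches the sentence context.

```python
# Scoring recognition hypotheses by semantic coherence with the context:
# pick the candidate whose embedding is closest (in cosine similarity)
# to the context vector. Vectors are toy 3-d stand-ins for embeddings.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

context = [0.9, 0.1, 0.0]  # e.g. averaged embeddings around "prince de ..."
candidates = {"Milan": [0.8, 0.2, 0.1], "mille ans": [0.1, 0.9, 0.3]}

best = max(candidates, key=lambda w: cosine(context, candidates[w]))
print(best)  # Milan
```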

References

[Deng, 2014] Deng, L. Deep learning: Methods and applications. Foundations and Trends in Signal Processing, 7(3-4), 197-387, 2014.

[Goodfellow et al., 2016] Goodfellow, I., Bengio, Y., & Courville, A. Deep Learning. MIT Press. http://www.deeplearningbook.org, 2016.

[Mikolov et al., 2013a] Mikolov, T. Chen, K., Corrado, G., and Dean, J. Efficient estimation of word representations in vector space, CoRR, vol. abs/1301.3781, 2013.

[Mikolov et al., 2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. Distributed representations of words and phrases and their compositionality, in Advances in Neural Information Processing Systems, 2013, pp. 3111-3119.

[Pennington et al., 2014] Pennington, J., Socher, R., and Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532-1543.

[Povey et al., 2011] Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlíček, P., Qian, Y., Schwarz, P., Silovský, J., Stemmer, G., Veselý, K. The Kaldi Speech Recognition Toolkit, Proc. ASRU, 2011.

[Sheikh, 2016] Sheikh, I. Exploitation du contexte sémantique pour améliorer la reconnaissance des noms propres dans les documents audio diachroniques, PhD thesis in Computer Science, Université de Lorraine, 2016.

[Sheikh et al., 2016] Sheikh, I. Illina, I. Fohr, D. Linares, G. Learning word importance with the neural bag-of-words model, in Proc. ACL Representation Learning for NLP (Repl4NLP) Workshop, Aug 2016.
URL sujet détaillé : https://team.inria.fr/multispeech/master-r2-internship-in-natural-language-processing-introduction-of-semantic-information-in-a-speech-recognition-system/

Remarques : Supervisors: Irina Illina, MdC, Dominique Fohr, CR CNRS

Required skills: background in statistics, natural language processing, and computer programming (Perl, Python). Candidates should email a detailed CV with their diploma.



SM207-85 Compilation schemes for high-level fault-tolerant distributed protocols  
Admin  
Encadrant : Cezara DRAGOI
Labo/Organisme : ENS Paris/INRIA Paris team Antique
URL : http://www.di.ens.fr/~cezarad/
Ville : PARIS

Description  

Fault-tolerant distributed data structures are at the core of distributed systems.
One of the fundamental problems in the distributed setting is consensus: multiple processes must agree on a value. Consensus is unsolvable in the presence of asynchrony (unbounded message delay) and faults (message loss, process crashes).
Therefore, algorithms make assumptions about the network in order to solve consensus, leading to extremely intricate, highly error-prone control structures. The development of blockchain technologies (which solve variants of consensus) recalls once more the importance of these systems and how hard it is to get them right.

We will investigate existing programming abstractions that simplify reasoning about consensus algorithms, and extend them if needed to capture solutions to other problems, e.g., state machine replication, k-set agreement, etc.
We focus on partially synchronous programming abstractions, that uniformly model asynchrony and faults using an adversarial environment that drops messages, leading to simpler control structure and simpler proof arguments.

We will define a domain-specific language that has a high-level (partially synchronous) semantics amenable to automated verification and an asynchronous semantics defining the runtime executions. Partial synchrony reduces and simplifies the system's behaviors, allowing the programmer to focus on the algorithmic task rather than on low-level network and timer code. We will define algorithms that derive asynchronous code from the partially synchronous one, while preserving its safety and liveness properties.
The domain-specific approach has the advantage of simplifying the control structure, leading to fewer global states and a more understandable algorithm, making programming less error-prone while also helping the application of automated verification techniques to the domain-specific code.
URL sujet détaillé : http://www.di.ens.fr/~cezarad/compilation.pdf

Remarques :



SM207-86 Abstract interpretation for distributed system  
Admin  
Encadrant : Cezara DRAGOI
Labo/Organisme : ENS Paris/INRIA Paris team Antique
URL : http://www.di.ens.fr/~cezarad/
Ville : PARIS

Description  

Fault-tolerant distributed data structures are at the core of distributed systems; however, there are very few (semi-)automated verification techniques that tackle this class of systems. Consensus, i.e., having multiple processes agree on a value, is one of the oldest and most important problems in the distributed setting. However, to date there are no automated verification techniques that can prove these properties beyond toy examples, that is, for algorithms used in practice. Consensus algorithms are still an active research area, for both systems and algorithm designers, and the recent blockchain technologies show, once more, their difficulty.

The internship aims to develop automated verification techniques, based on abstract interpretation, to prove safety and liveness properties of fault-tolerant systems, in particular of consensus algorithms.

Abstract interpretation is an analysis technique that can be used for automated verification.
For sequential programs, the automated verification problem made significant progress, either by using abstract interpretation or using other techniques: satisfiability solvers, model checking, etc.
For distributed systems, automatic verification is still in its early stages, despite the massive use of cloud technologies and distributed databases. The internship's goal is to define abstract interpretation techniques that enable the automated verification of consensus algorithms. Consensus algorithms ensure coherence among independent processes running over a network whose topology varies (ring, tree, dynamic DAGs, etc.) and in which different types of failure may happen (process crashes, message loss or corruption).
The internship requires basic knowledge of programming languages. Notions of abstract interpretation and/or distributed systems are a plus, but not mandatory. The internship can be carried out in French or English.
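A classic first contact with abstract interpretation is the interval domain. The toy sketch below (our invented example, far simpler than the distributed-systems analyses targeted here) infers a loop invariant using joins and the standard widening operator:

```python
# Tiny interval-domain sketch of abstract interpretation (a classic toy
# example, not the internship's analysis): infer an invariant at the
# head of the loop "x = 0; while x < 10: x += 1" over intervals.

def join(a, b):
    """Least upper bound of two intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Standard widening: jump to +/- infinity on unstable bounds,
    forcing the fixpoint iteration to terminate."""
    lo = a[0] if b[0] >= a[0] else float("-inf")
    hi = a[1] if b[1] <= a[1] else float("inf")
    return (lo, hi)

def analyze_loop():
    x = (0, 0)  # abstract value of x at loop entry
    while True:
        # one loop iteration in the abstract: filter x < 10, then add 1
        body = (min(x[0], 9) + 1, min(x[1], 9) + 1)
        new = widen(x, join(x, body))
        if new == x:
            return x  # fixpoint reached: invariant at the loop head
        x = new

print(analyze_loop())  # (0, inf): sound, though widening loses precision
```

The interest of the internship lies in designing domains and iteration strategies of this kind for the much richer state spaces of consensus algorithms.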
URL sujet détaillé : https://www.di.ens.fr/~cezarad/AutomatedVerif.pdf

Remarques : Remuneration is planned for ENS students (auditors).



SM207-87 Scheduling Replica Requests in Key-Value Stores  
Admin  
Encadrant : Loris MARCHAL
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/loris.marchal/
Ville : Lyon

Description  

Many distributed systems have been proposed to optimize information storage and retrieval, in particular key-value stores such as Apache Cassandra or MongoDB. Even on well-provisioned systems, avoiding latency, that is, minimizing the delay of all data requests, is still challenging because of the unbalanced load among servers. In particular, the 5% slowest requests may experience much larger latencies than the others, which is known as the ``tail latency'' problem. Data replication is commonly used to overcome this problem, which, in turn, calls for good replica selection algorithms. When data items have heterogeneous sizes, requests scheduled behind requests for large data (and thus with large service times) may suffer an additional delay: this is the ``head-of-line blocking'' problem.

Héron is a recently proposed replica selection algorithm that overcomes this problem. It recognizes requests for large values using Bloom filters, so as to avoid scheduling other requests behind them. Experiments show that it outperforms state-of-the-art algorithms and reduces both median and tail latencies.
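For readers unfamiliar with the data structure, here is a minimal Bloom filter sketch illustrating how such an algorithm can cheaply flag keys known to map to large values. The parameters (m bits, k hashes) and the key names are illustrative, not Héron's actual configuration:

```python
# Minimal Bloom filter sketch: a compact set membership structure with
# no false negatives and a tunable false-positive rate. Parameters are
# illustrative, not tuned.
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _indices(self, key):
        # k independent hash positions derived from SHA-256
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, key):
        for idx in self._indices(key):
            self.bits[idx] = True

    def might_contain(self, key):
        # No false negatives; false positives are possible.
        return all(self.bits[idx] for idx in self._indices(key))

large_values = BloomFilter()
large_values.add("video:4k:intro")  # hypothetical key of a large value

print(large_values.might_contain("video:4k:intro"))  # True
print(large_values.might_contain("thumb:icon"))      # very likely False
```

The replica selector can thus test, in constant time and a few kilobytes of memory, whether an incoming request probably targets a large value.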

However, we do not know whether such a replica selection algorithm could be further optimized. More precisely, we would like to know how far Héron is from an optimal selection algorithm. A natural way to answer this question is to abstract the problem and model the distributed system by selecting key characteristics (while simplifying others) in order to obtain a simpler but tractable scheduling problem. In fact, optimal or guaranteed algorithms have been designed for similar problems in the scheduling literature: for homogeneous request sizes with total replication, it is known that the FIFO policy gives optimal results for both average and maximum response time minimization. The data locality induced by the replication policy may be precisely modeled using ``restricted availabilities''.
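The head-of-line blocking effect can be reproduced with a toy single-server FIFO simulation. The service times and the 95th-percentile tail measure below are illustrative choices, not the internship's model:

```python
# Toy single-server FIFO simulation of head-of-line blocking: one large
# request queued ahead of many small ones inflates every later latency.
# All requests are assumed to arrive at time 0; service times are invented.

def fifo_response_times(service_times):
    """Response time of each request when served FIFO from time 0."""
    out, clock = [], 0.0
    for s in service_times:
        clock += s
        out.append(clock)
    return out

def tail(latencies, q=0.95):
    """q-th percentile latency (the 'tail latency')."""
    return sorted(latencies)[int(q * len(latencies)) - 1]

small = [1.0] * 99
with_large_first = fifo_response_times([100.0] + small)
small_only = fifo_response_times(small + [100.0])

print(tail(with_large_first))  # 194.0: everyone waits behind the large request
print(tail(small_only))        # 95.0: serving the large request last helps
```

Even this trivial model shows a 2x difference in tail latency from ordering alone, which is the kind of gap the lower-bound study aims to quantify rigorously.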


The objective of the internship is both to propose tight lower bounds on the achievable tail latency for the replica selection problem based on scheduling studies, and to identify which characteristics of the workload are crucial to design efficient replication strategies.

We have already identified two noteworthy specificities of the problem whose use in scheduling studies to design better bounds and algorithms would be pioneering:

- Thanks to the dynamic monitoring and scaling of the platform, the average load of each server is kept between 50% and 80%. This excludes extreme cases such as under-loaded or overloaded scenarios.

- The popularity of data items can be learned during the execution. Popular items receive a much larger volume of requests, which may be scheduled specifically.

The internship consists of two parts, whose respective weights can be adapted to the student's skills and inclination:

- Theoretical studies to model the problem, explore related scheduling works, design scheduling and replication selection algorithm, prove algorithm optimality or approximation ratios;

- Experimental studies based on the statistical characterisation of workloads from actual storage systems and the simulation of replication selection algorithms, to understand the key features of the workload.

URL sujet détaillé : http://perso.ens-lyon.fr/loris.marchal/docs/internship-replica-en.pdf

Remarques : Co-advised by Sonia Ben Moktar (LIRIS Lyon) and Étienne Rivière (UC Louvain)

Opportunity for the intern to spend one month in UC Louvain at the end of the internship, provided the intern is interested (specific funding may be available for this).



SM207-88 Inverse homogenization of implicit functions  
Admin  
Encadrant : Jonas MARTINEZ
Labo/Organisme : INRIA
URL : https://mfx.loria.fr/inverse-homogenization-internship/
Ville : Nancy

Description  

The objective of the internship is to perform inverse homogenization of periodic microstructures defined by an implicit function. Specifically, we aim to study implicit functions defined by skeletal structures. Skeletal structures are a powerful tool used in a broad number of scientific fields. Skeletons reduce the dimension of shapes, capture their geometric and topological properties, and seek to understand how these structures encode local and global features. The above properties provide a tight control of the topology and the geometry, which turns out to be crucial for enforcing fabrication constraints for additive manufacturing.
URL sujet détaillé : https://members.loria.fr/Monique.Teillaud/positions/2018-microtructures-internship.pdf

Remarques : Co-advised by Monique Teillaud (monique.teillaud.fr)



SM207-89 Computability in topological spaces  
Admin  
Encadrant : Mathieu HOYRUP
Labo/Organisme : Loria, Inria
URL : https://members.loria.fr/MHoyrup/
Ville : Villers-lès-Nancy

Description  

Computability theory over natural numbers and finite objects can be extended to infinite objects, represented by infinite symbolic sequences written on the input tape of a Turing machine. The choice of the representation is significant and raises the following problem: what are the computational tasks that can be performed by a Turing machine, for a given representation of input objects? Topology is very useful to understand this problem. For instance, every computable function is continuous for a suitable topology induced by the representation. Therefore, the natural way to extend computability theory to arbitrary mathematical objects is to work in a topological space.

Computability is well understood on some classes of topological spaces, such as Polish spaces. However, it is less well understood in other spaces, especially spaces without a countable basis.

The goal of the internship is to study particular spaces, such as the space of real polynomials, and to understand computability on those spaces. For instance, what properties of points of the space can be recognized by a Turing machine, for several notions of recognizability?
URL sujet détaillé : :

Remarques :



SM207-90 Complexity of subsets of the plane  
Admin  
Encadrant : Mathieu HOYRUP
Labo/Organisme : Loria, Inria
URL : https://members.loria.fr/MHoyrup/
Ville : Villers-lès-Nancy

Description  

Computational complexity theory is the study of classes of problems over finite objects, such as the classes P or NP. It has been used to define and investigate the complexity of infinite objects such as real numbers or subsets of the Euclidean plane. Two important notions of complexity of a subset of the plane have been considered.

In definition 1, a set is polynomial-time computable if there is a program that draws the set in polynomial time; more precisely, given a pixel, the program decides whether the pixel intersects the set within a time bounded by a polynomial in the size of the pixel.

In definition 2, a set is polynomial-time computable if there is a program computing the distance of an arbitrary point to the set in polynomial time.

Definition 2 is known to be stronger than definition 1, and strictly so if P is different from NP.

The goal of the internship is to better understand the relationship between these two notions. One direction is to find classes of sets for which the two definitions are equivalent (a candidate is the class of convex sets). Another direction is to understand how far the two notions are from each other. For instance, for a set satisfying definition 1, what is the size of the set of points whose distance to the set is difficult to compute?
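To make the two definitions concrete, here is a sketch for the closed unit disk, an easy case where both notions are efficiently computable. Function and parameter names are ours, for illustration only:

```python
# Sketch of the two notions of polynomial-time computability of a plane
# set, instantiated on the closed unit disk (both are easy here).
import math

def clamp(v, lo, hi):
    return max(lo, min(v, hi))

def pixel_meets_disk(cx, cy, half_side):
    """Definition 1: does the square pixel centered at (cx, cy) with
    the given half side length intersect the closed unit disk?"""
    # nearest point of the pixel to the disk's center
    nx = clamp(0.0, cx - half_side, cx + half_side)
    ny = clamp(0.0, cy - half_side, cy + half_side)
    return math.hypot(nx, ny) <= 1.0

def distance_to_disk(x, y):
    """Definition 2: Euclidean distance from (x, y) to the unit disk."""
    return max(0.0, math.hypot(x, y) - 1.0)

print(pixel_meets_disk(1.5, 0.0, 0.6))  # True: the pixel touches the disk
print(distance_to_disk(3.0, 4.0))       # 4.0
```

Note that definition 2 immediately yields definition 1 (compare the distance at the pixel center with the pixel radius), while the converse direction is exactly where the P vs NP separation mentioned above appears.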
URL sujet détaillé : :

Remarques :



SM207-91 Sequential complexity of optimization  
Admin  
Encadrant : Aurélien GARIVIER
Labo/Organisme : UMPA/LIP
URL : https://www.math.univ-toulouse.fr/~agarivie
Ville : Lyon

Description  


The problem of "Best arm identification" is the following: a finite set of probability distributions is given, and an agent may sample them sequentially as she wishes; her goal is to identify, with as few samples as possible, the distribution with highest expectation.
Since [GK16], the complexity of this problem has been known: there is an explicit lower bound on the number of samples required, obtained by information-theoretic arguments, together with a strategy that asymptotically reaches this lower bound. Some preliminary results are also known for the case where a distribution with only an *approximately* optimal expectation is to be found. In all cases, the lower bound exhibits an optimal proportion of arm draws which ensures efficiency.

The purpose of this internship is to investigate the possibility of building on these results in order to identify the complexity of optimizing a *continuous* (say, Lipschitz) function observed with noise, that is, the function-dependent number of samples required by *any* strategy (stochastic gradient descent or otherwise) to find a point close to the optimum with high probability. Additionally, this should make it possible to find the optimal density of draws over the domain that leads to optimality.

We will proceed by discretizing the continuous problem, leading to a finite set of distributions whose expectations are linked by the Lipschitz assumption. Hence, a first step of this work will be to identify the complexity of best-arm identification when the difference between the means of two successive distributions is bounded.
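As a concrete starting point, a minimal successive-elimination sketch for best-arm identification is given below. Its Hoeffding-style confidence radius and stopping rule are standard textbook choices, not the asymptotically optimal strategy of [GK16]:

```python
# Minimal successive-elimination sketch for best-arm identification.
# The confidence radius is a standard Hoeffding-style bound; this is an
# illustrative baseline, not the optimal strategy of [GK16].
import math, random

def successive_elimination(arms, delta=0.05, max_rounds=10000):
    """arms: list of callables, each returning a reward sample in [0, 1].
    Returns the index of the arm believed to have the highest mean,
    with error probability at most delta."""
    active = list(range(len(arms)))
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for t in range(1, max_rounds + 1):
        for i in active:           # pull every surviving arm once
            sums[i] += arms[i]()
            counts[i] += 1
        radius = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {i: sums[i] / counts[i] for i in active}
        best = max(means.values())
        # drop arms that are provably suboptimal at this confidence level
        active = [i for i in active if means[i] + 2 * radius >= best]
        if len(active) == 1:
            return active[0]
    return max(active, key=lambda i: sums[i] / counts[i])

# Bernoulli arms with means 0.2, 0.5, 0.8 (seeded for reproducibility)
rng = random.Random(42)
arms = [lambda p=p: float(rng.random() < p) for p in (0.2, 0.5, 0.8)]
print(successive_elimination(arms))
```

Under a Lipschitz link between the means of successive arms, the elimination step can share information between neighbouring arms, which is exactly the structure the internship proposes to exploit.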

Ref: [GK16] Optimal Best Arm Identification with Fixed Confidence, by Aurélien Garivier and Emilie Kaufmann, in Conference On Learning Theory (COLT), Jun. 2016, pp. 998-1027.

URL sujet détaillé : :

Remarques : for co-advising: please contact me



SM207-92 Coloration distribuée dans les graphes structurés / Distributed coloring in structured graph classes  
Admin  
Encadrant : Louis ESPERET
Labo/Organisme : G-SCOP
URL : http://oc.inpg.fr/esperet
Ville : Grenoble

Description  

This internship will be devoted to distributed algorithms for graph coloring. Graph coloring is a fundamental symmetry breaking problem in distributed algorithms, and improving the complexity of known algorithms is a major challenge in the area. Here we will focus on graph classes with a specific structure (which may arise in practical applications), like unit disk graphs, or graphs coming from partial orders. The study of structured graphs in the context of distributed algorithm is fairly recent and there are many interesting open questions.
URL sujet détaillé : http://oc.inpg.fr/sujetM2.pdf

Remarques : remuneration possible



SM207-93 Vérification formelle performante pour la logique temporelle du premier ordre  
Admin  
Encadrant : Julien BRUNEL
Labo/Organisme : ONERA
URL : https://www.onera.fr/staff/julien-brunel
Ville : Toulouse

Description  

Alloy (D. Jackson et al., MIT) is a formal specification language that has met with some success thanks to the relevant trade-off it offers between ease of modelling for the user and efficiency of the formal analyses. In practice, specification is done in bounded relational logic (a first-order logic with relational terms, whose interpretation domains are bounded by the user), and the automated verification of properties is performed by translation into a propositional satisfiability problem (taking advantage of the significant recent progress of SAT solvers). Alloy lends itself very well to the analysis of models where the verified properties are essentially structural. However, many problems also have a behavioural dimension, which can be addressed in Alloy but with a number of shortcomings.

DTIS, in collaboration with the HasLab of the University of Minho (Portugal), recently designed a formal language named Electrum [FSE 2016], which can be seen as a behavioural extension of Alloy. In practice, the language builds on bounded relational logic extended with linear temporal logic with past operators (PLTL). Two verification techniques have been developed: in both cases, the domain of first-order terms is bounded; however, one tool relies on an analysis with a bounded temporal horizon (bounded model checking), while the second, designed at ONERA, works over an unbounded horizon.

The objective of this internship is to improve the current analyses along some of the following lines:
- A promising direction is to exploit the symmetries present in Electrum models, which considerably reduce the state space. At present, we only take into account certain symmetries on data, but not temporal symmetries. More generally, the use of automatic abstraction techniques looks promising.
- The current analyses do not take enough advantage of the distinction between the description of the "system" on the one hand, and the specification of the properties the system must satisfy on the other. A recent extension of Electrum [ABZ 2018] with actions allows precisely a clearer distinction between these two elements within the language. New analyses can therefore be devised, in particular inspired by recent algorithms based on an SMT (Satisfiability Modulo Theories) representation of the system and its properties.
- In [ATVA 2016], we began a study of the finite model property (FMP) for the logic underlying Electrum (FO-LTL). Identifying fragments of the logic that have the FMP opens the way to complete verification (without bounds on first-order terms).

References:
[ABZ 2018] Julien Brunel, D. Chemouil, A. Cunha, Th. Hujsa, N. Macedo, J. Tawa. Proposition of an Action Layer for Electrum. In Proc. of 6th Int. Conf. on ASM, Alloy, B, TLA, VDM, Z (ABZ), 2018.
[FSE 2016] N. Macedo, J. Brunel, D. Chemouil, A. Cunha, and D. Kuperberg. Lightweight Specification and Analysis of Dynamic Systems with Rich Configurations. In Proc. ACM SIGSOFT Intl Symp. on the Foundations of Software Engineering (FSE), 2016.
[ATVA 2016] D. Kuperberg, J. Brunel, and D. Chemouil. On Finite Domains in First-Order Linear Temporal Logic. In 14th Interl Symp. on Automated Technology for Verification and Analysis (ATVA), 2016.

URL sujet détaillé : http://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis-2019-44.pdf

Remarques :



SM207-94 Vérification formelle de systèmes distribués sur domaine infini  
Admin  
Encadrant : David CHEMOUIL
Labo/Organisme : ONERA
URL : https://www.onera.fr/staff/david-chemouil
Ville : Toulouse

Description  

The formalisation of distributed systems is an interesting object of study in itself, but also because of the challenge such systems pose to "natural" formal specification and to formal verification with a high degree of automation.
Ivy [PLDI 2016] makes it possible to prove properties of infinite-size systems with a high degree of automation (compared with approaches based on proof assistants). The description of the system and the specification of the properties to verify must be written in a rather restricted fragment of first-order logic, for which the satisfiability problem is decidable. Ivy has notably been applied to the analysis of distributed systems (e.g., protocols of the Paxos family).
Recent work [POPL 2018] proposes to analyse richer temporal properties (liveness) with Ivy.
Electrum [FSE 2016], a language developed at ONERA since 2015, takes a different approach: the specification of the system and of its properties benefits from the full expressiveness of first-order linear temporal logic (FO-LTL). Verification is fully automatic, but requires bounding the size of the first-order domain. Electrum was also applied in [FMCAD 2018] to the analysis of distributed systems, through the study of the Chord protocol, which maintains a distributed hash table in a peer-to-peer network.
This internship takes place in this context. Ultimately, the goal is to determine a formalism or techniques that make it possible to specify distributed systems "comfortably" and then to verify their properties (including liveness) using techniques with a high degree of automation. To this end, we propose the following work programme:
- Study the Chord protocol as well as the Electrum model of Chord
- Propose a model of Chord (or of a subset of Chord) in Ivy (liveness included)
- Study the theoretical propositions underlying Ivy
- Derive possible improvements for Ivy & Electrum.

This internship can lead to a PhD thesis.

URL sujet détaillé : http://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis-2019-50.pdf

Remarques :



SM207-95 Type-checking rewrite rules in Lambdapi  
Admin  
Encadrant : Frédéric BLANQUI
Labo/Organisme : LSV
URL : http://rewriting.gforge.inria.fr/
Ville : Cachan

Description  

Lambdapi is a new proof assistant based on a logical framework
called the λΠ-calculus modulo rewriting. Thanks to rewriting, Lambdapi allows the formalization of proofs that
cannot be done in other proof assistants. However, before adding a new rewrite
rule l → r to the system, it is necessary to check that it preserves typing. The goal of this internship is to study this problem and implement an algorithm in Lambdapi.
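The issue can be illustrated on a toy simply-typed first-order signature (this is plain Python, not Lambdapi, and the signature is invented): a rule is accepted only if both sides have the same type under a common typing of the rule's variables. The real problem in the λΠ-calculus modulo rewriting is much harder, since types are dependent and themselves subject to rewriting.

```python
# Toy illustration of type preservation for rewrite rules: infer the
# type of both sides of l -> r in a simply-typed first-order signature
# and require that they agree. NOT Lambdapi; the signature is invented.
SIG = {  # function symbol -> (argument types, result type)
    "0":    ([], "nat"),
    "succ": (["nat"], "nat"),
    "plus": (["nat", "nat"], "nat"),
    "nil":  ([], "list"),
}

def typeof(term, var_types):
    """term is a variable name (str) or a pair (symbol, [subterms])."""
    if isinstance(term, str):
        return var_types[term]
    sym, args = term
    arg_types, result = SIG[sym]
    actual = [typeof(a, var_types) for a in args]
    if actual != arg_types:
        raise TypeError(f"ill-typed application of {sym}")
    return result

def preserves_typing(lhs, rhs, var_types):
    try:
        return typeof(lhs, var_types) == typeof(rhs, var_types)
    except TypeError:
        return False

# plus(0, n) -> n : accepted, both sides have type nat
ok = preserves_typing(("plus", [("0", []), "n"]), "n", {"n": "nat"})
# plus(0, n) -> nil : rejected, nat on the left vs list on the right
bad = preserves_typing(("plus", [("0", []), "n"]), ("nil", []), {"n": "nat"})
print(ok, bad)  # True False
```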
URL sujet détaillé : http://rewriting.gforge.inria.fr/divers/sr.pdf

Remarques :



SM207-96 Formal Proofs of Security Protocols using Oblivious Transfers  
Admin  
Encadrant : David BAELDE
Labo/Organisme : LSV, ENS Paris-Saclay
URL : http://www.lsv.fr/axes/SECSI/secsi
Ville : Cachan

Description  

Our group is working on formal security proofs of cryptographic protocols. Such protocols are
distributed programs that rely on cryptographic primitives such as encryption, hash functions,
signatures, etc. They are used in many applications: secure transactions on the Internet, mobile phones, smartcards, RFID chips, electronic voting, etc.

Caroline Fontaine, who recently joined our group, has proposed a protocol for privacy-preserving fingerprinted image distribution that aims at achieving several goals, including traitor tracing and buyer- and item-unlinkability [6]. The formal proof of such a protocol is
challenging for several reasons, some of which are the starting point of the proposed research project(s).
URL sujet détaillé : http://www.lsv.fr/~baelde/OT-security-internship.pdf

Remarques : This internship will be jointly supervised by David Baelde, Hubert Comon and Caroline Fontaine. If relevant, we have funding to support this project.



SM207-97 Automated verification of cryptographic protocols  
Admin  
Encadrant : Steve KREMER
Labo/Organisme : LORIA/INRIA Nancy - Grand Est
URL : https://team.inria.fr/pesto/
Ville : Nancy

Description  

Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and, more recently, electronic elections. As these properties need to be ensured even if the protocol is executed over untrusted networks (such as the Internet), these protocols have proven extremely difficult to get right. Formal methods have proven very useful to detect errors and ensure their correctness.

Equational theories are used to describe algebraic properties of cryptographic primitives. The Tamarin prover [1] is a state-of-the-art protocol verification tool, co-developed between ETH Zürich, Saarbrücken and Inria Nancy. It has for instance been successfully used to verify standards such as TLS 1.3 and 5G AKA, as well as industrial protocols such as OPC UA. Tamarin has recently been extended to allow the specification of more complex cryptographic primitives, by means of more general equational theories (any user-specified equational theory that can be oriented into a convergent rewrite system). However, for some examples non-termination issues arise in practice, in particular for equational theories used in e-voting protocols.

The objective of this internship is to improve Tamarin's handling of such equational theories by identifying optimizations that avoid the redundant and unnecessary cases causing the non-termination issues. A particular example we aim at is a theory describing re-encryption, which makes it possible to re-randomize a ciphertext. This requires a theoretical analysis and a correctness proof of the proposed optimizations, followed by their implementation in the tool and a practical evaluation.
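To give an idea of the kind of theory involved, re-encryption can be oriented into a convergent rewrite system. The sketch below is plain Python, not Tamarin syntax, and the presentation used (a symbol f combining the two randomnesses) is an assumed, illustrative one:

```python
# Sketch (NOT Tamarin syntax) of a convergent rewrite presentation of
# re-encryption: re-randomizing a ciphertext rewrites to a ciphertext
# with combined randomness. Terms are tuples (symbol, args...).

def rewrite_step(t):
    """One outermost-leftmost rewrite step, or None if t is in normal form."""
    if not isinstance(t, tuple):
        return None
    head, *args = t
    # renc(enc(m, k, r), r2) -> enc(m, k, f(r, r2))
    if head == "renc" and isinstance(args[0], tuple) and args[0][0] == "enc":
        _, m, k, r = args[0]
        return ("enc", m, k, ("f", r, args[1]))
    # dec(enc(m, k, r), k) -> m
    if head == "dec" and isinstance(args[0], tuple) and args[0][0] == "enc":
        _, m, k, r = args[0]
        if k == args[1]:
            return m
    # otherwise, try to rewrite a subterm
    for i, a in enumerate(args):
        s = rewrite_step(a)
        if s is not None:
            return (head,) + tuple(args[:i]) + (s,) + tuple(args[i + 1:])
    return None

def normalize(t):
    while True:
        s = rewrite_step(t)
        if s is None:
            return t
        t = s

c = ("enc", "m", "k", "r1")
rc = ("renc", c, "r2")
# decrypting a twice re-randomized ciphertext recovers the plaintext
print(normalize(("dec", ("renc", rc, "r3"), "k")))  # prints: m
```

The non-termination issues targeted by the internship arise not in this naive normalization, but in Tamarin's symbolic backward search, where such rules interact with unification.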

The internship may also lead to a PhD thesis on similar topics.

Expected qualifications are:

- solid knowledge of logic, proofs and/or formal verification techniques,
- solid programming experience, ideally with functional programming in OCaml or Haskell.

Knowledge in security is not required, but a plus.

[1] http://tamarin-prover.github.io/

URL sujet détaillé : :

Remarques : Co-supervised with Jannik Dreier (jannik.dreier.fr)

Payment of a stipend is possible, but is often incompatible with the salary of ENS students. Some expenses can nevertheless be covered.




SM207-98 Design of an algorithm to compute the fluid limit of a discrete dynamical system with heterogeneous objects.  
Admin  
Encadrant : Bruno GAUJAL
Labo/Organisme : LIG
URL : https://team.inria.fr/polaris/members/bruno-gaujal/
Ville : Grenoble

Description  

Discrete dynamical systems of large dimension are often well approximated by their fluid limit, or even more accurately by their second-order size expansion.
When the system is made of heterogeneous objects, the existing algorithms computing this expansion are too costly, and the goal of this internship is to design new, effective algorithms to do so.
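For intuition, here is a minimal illustration of a fluid limit. The two-rate on/off model is our invented example, unrelated to the internship's algorithms: N objects switch off→on at rate a and on→off at rate b, and as N grows the fraction x(t) of "on" objects follows the ODE dx/dt = a(1-x) - bx.

```python
# Illustrative fluid-limit sketch (invented example): the fraction of
# "on" objects in a large population with flip rates a (off->on) and
# b (on->off) follows the deterministic ODE dx/dt = a(1-x) - bx.

def fluid_trajectory(x0, a, b, t_end, dt=1e-3):
    """Euler integration of the fluid ODE."""
    x, t = x0, 0.0
    while t < t_end:
        x += dt * (a * (1.0 - x) - b * x)
        t += dt
    return x

# The fluid fixed point is a / (a + b); with a=2, b=1 it is 2/3.
x_inf = fluid_trajectory(0.0, a=2.0, b=1.0, t_end=20.0)
print(abs(x_inf - 2.0 / 3.0) < 1e-6)  # True
```

The second-order expansion mentioned above refines this deterministic limit with a Gaussian correction of order 1/sqrt(N); computing it efficiently for heterogeneous objects is the hard part the internship addresses.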
URL sujet détaillé : :

Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.




SM207-99 Average complexity of reinforcement learning.  
Admin  
Encadrant : Bruno GAUJAL
Labo/Organisme : LIG
URL : https://team.inria.fr/polaris/members/bruno-gaujal/
Ville : Grenoble

Description  

Reinforcement learning algorithms are useful to optimize dynamical systems while they are running.
Even for simple cases, their average complexity is still unknown. The goal of the internship is to
study this question on simple cases (Q-learning, for example).
URL sujet détaillé : :

Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.




SM207-100 Potential games with partial interactions  
Admin  
Encadrant : Bruno GAUJAL
Labo/Organisme : LIG
URL : https://team.inria.fr/polaris/members/bruno-gaujal/
Ville : Grenoble

Description  

Consider a matrix potential game where the utility function of a player (say, player i) does not depend
on the actions of another player (say, player j). In this case, players i and j are said to be indifferent to each other. The indifference relation creates a graph among the players.
The general question that will be addressed during the internship is how to exploit this graph to design efficient algorithms to compute Nash equilibria of the game.
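One natural baseline that exploits this structure is best-response dynamics, where each player's best response only scans the actions of the players it actually interacts with (its neighbours in the complement of the indifference graph). The sketch below uses an invented "prefer to disagree" (max-cut-like) potential game on a 3-player path, purely as an illustration:

```python
# Sketch of best-response dynamics exploiting partial interactions:
# each player's utility depends only on its neighbours in the
# interaction graph, so a best response scans neighbours only.
# Toy potential game (invented): players pick 0/1 and prefer to
# disagree with their neighbours; the potential is the number of
# disagreeing edges, so every improving step terminates.

def best_response_dynamics(neighbours, actions):
    """neighbours: dict player -> list of interacting players.
    actions: dict player -> 0/1. Iterates best responses until a
    pure Nash equilibrium is reached; returns the final actions."""
    changed = True
    while changed:
        changed = False
        for p, nbrs_p in neighbours.items():
            def util(a):
                # utility: number of neighbours playing differently
                return sum(actions[q] != a for q in nbrs_p)
            best = max((0, 1), key=util)
            if util(best) > util(actions[p]):
                actions[p] = best
                changed = True
    return actions

# A path 0-1-2: at equilibrium, actions alternate along the path.
nbrs = {0: [1], 1: [0, 2], 2: [1]}
eq = best_response_dynamics(nbrs, {0: 0, 1: 0, 2: 0})
print(all(eq[p] != eq[q] for p in nbrs for q in nbrs[p]))  # True
```

On sparse interaction graphs, each step costs only the degree of the updated player, which hints at the kind of speedups the internship could formalize.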
URL sujet détaillé : :

Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.




SM207-101 Automated bisimulation checking in DEEPSEC  
Admin  
Encadrant : Steve KREMER
Labo/Organisme : LORIA/INRIA Nancy - Grand Est
URL : https://team.inria.fr/pesto/
Ville : Nancy

Description  

With the ubiquity of smartphones and other portable, connected devices, the Internet has become the main communication medium. In many cases, e.g., e-commerce, home banking, electronic voting, etc., communications need to be secured. This can be achieved using cryptographic protocols. Cryptographic protocols, such as TLS, are distributed programs which use cryptographic primitives, such as encryption or digital signatures. However, even if the underlying primitives are perfectly secure, protocols may be attacked.

Symbolic, automated verification has been a very successful approach for revealing vulnerabilities or proving the absence of entire families of attacks. DEEPSEC (https://deepsec-prover.github.io/) is a recent verification tool for automatically analysing security protocols, able to check security properties (expressed as behavioural equivalences) of cryptographic protocols (specified in the applied pi calculus, the tool's input language). This makes it possible to model a large variety of security properties, including strong definitions of secrecy, anonymity and other privacy-related properties.

The tool currently implements a procedure for checking trace equivalence. The paper describing the theory of DEEPSEC [1] also outlines a "theoretical procedure" for the verification of a stronger bisimulation equivalence, used to establish a complexity result (coNEXP completeness). However, this theoretical procedure relies on a non-deterministic guess over a large search space, which prevents an efficient implementation. The aim of the internship is to design an effective, i.e., deterministic, procedure. This includes both theoretical aspects (algorithm design, correctness proof) and an implementation part to include this procedure in the DEEPSEC tool.

The internship may also lead to a PhD thesis on similar topics.

Expected qualifications are:

- solid knowledge of logic, proofs and/or formal verification techniques,
- solid programming experience, ideally with functional programming in OCaml.

Knowledge in security is not required, but a plus.

[1] Vincent Cheval, Steve Kremer, Itsaka Rakotonirina:
DEEPSEC: Deciding Equivalence Properties in Security Protocols Theory and Practice. IEEE Symposium on Security and Privacy 2018: 529-546

URL sujet détaillé : :

Remarques : Co-supervised with Vincent Cheval (vincent.cheval.fr)

Payment of a stipend is possible, but is often incompatible with the salary of ENS students. Some expenses can nevertheless be covered.




SM207-102 Runtime verification of temporal properties  
Admin  
Encadrant : Matthieu ROY
Labo/Organisme : LAAS-CNRS & ONERA
URL : http://homepages.laas.fr/mroy/
Ville : Toulouse

Description  

To claim that a given critical system is dependable, its correct behaviour, in particular its timing behaviour, must be certified. As a complement to static verification before execution, this study considers runtime verification, which consists in monitoring the system while it runs and checking that it conforms to a given specification.

In previous work, we developed a formally proved system that "compiles" a specification given as a timed automaton into a runtime monitor.
This solution has the drawback of not scaling to large specifications.

To overcome this limitation and apply runtime verification to more complex systems, this study will assess the suitability of machine learning for runtime verification.

In a first phase, a deep-learning system will be trained on timed-automaton specifications and on execution traces labelled as correct or incorrect.

In a second phase, this system will be evaluated on other instances of the problem, in order to assess its accuracy (false positives and false negatives) and its runtime performance.

A comparative study against the already implemented solution will provide a complete assessment of the value of machine learning for runtime verification of properties.
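As an illustration of the training data involved, here is a minimal sketch (the bounded-response property and the event vocabulary are my assumptions, not the monitor compiled from a timed automaton in the prior work) of labelling timestamped traces as correct or incorrect:

```python
# Minimal illustrative sketch: label a timestamped trace as correct or
# incorrect for the bounded-response property "every 'req' is followed
# by a 'grant' within `deadline` time units".
def check_trace(trace, deadline):
    """trace: list of (timestamp, event) pairs with non-decreasing timestamps."""
    pending = []  # timestamps of requests not yet granted
    for t, ev in trace:
        # a pending request whose deadline has already passed is a violation
        if any(t - t0 > deadline for t0 in pending):
            return False
        if ev == "req":
            pending.append(t)
        elif ev == "grant" and pending:
            pending.pop(0)  # a grant answers the oldest pending request
    # for a complete trace, requests never granted are also violations
    return not pending
```

Traces labelled this way would constitute the supervised training set for the first phase.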




URL sujet détaillé : /Users/mroy/ownCloud/Stages/TrustMeIA/pagetti-roy-2018.tex :

Remarques : Co-supervised with Claire Pagetti (ONERA)



SM207-103 Building Robust Learning Systems  
Admin  
Encadrant : Pablo PIANTANIDA
Labo/Organisme :
URL : http://www.l2s.centralesupelec.fr
Ville : Gif-sur-Yvette

Description  

Research context. The aim of this internship is to study advanced information-theoretic tools to detect the inadequacy of data during the main phases of training and classification in deep neural networks. The fundamental outcome of this research is to provide practical learning algorithms with theoretical guarantees ensuring the origin and integrity of data. More specifically, the goal is to provide learning algorithms that jointly learn not only deep representations (mappings from the data space to representations) and classifiers, but also a "soft-discriminator" providing an indicator with statistical guarantees (e.g., confidence intervals) of the functioning of an Artificial Intelligence (AI) system under the potential influence of new data (e.g., when the data itself evolves over time and so does the nature of the classification problem), during both the training and classification phases. Robust learning systems are expected to be able to decide whether new incoming data can be correctly classified (e.g., according to statistical guarantees) or should be used to update (retrain) the state of the current system.

Scientific approach. An interesting starting point will be to approach the algorithmic design using an information-theoretic paradigm, based on the fundamental idea of Learning as Data Compression. This idea relies on the following insight: any regularity in the data (e.g. features and labels) can be used to compress the labels given the features, i.e., to describe them using fewer symbols than are needed to describe the labels literally. The more regularity there is, e.g. the easier it is to predict labels from features, the more the labels can be compressed given the features. Equating 'learning' with 'finding regularity', we can therefore say that the more we are able to compress the labels knowing the features, the more we have learned about the data. Formalizing this idea leads to a general theory of coding with side information and learning, with several attractive properties.
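This compression view can be made concrete with empirical entropies; the following toy sketch (my formulation, not the internship's actual formalism) compares the bits per label needed with and without a discrete feature:

```python
# Sketch of "learning as compression": H(Y) is the bits per label needed to
# encode labels alone; H(Y|X) is the bits per label needed once the feature
# is known. The gap H(Y) - H(Y|X) measures the regularity that was learned.
from collections import Counter
from math import log2

def entropy(values):
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * log2(c / n) for c in counts.values())

def conditional_entropy(features, labels):
    # H(Y|X) = sum over x of p(x) * H(Y | X = x)
    n = len(labels)
    by_x = {}
    for x, y in zip(features, labels):
        by_x.setdefault(x, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in by_x.values())
```

When the labels are a function of the features, H(Y|X) drops to zero: the labels cost nothing to describe once the features are known.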

URL sujet détaillé : https://www.dropbox.com/s/l9yoie88dolk0mw/Topic%20-%20IBM%20-%202018%20-%20itership.pdf?dl=0

Remarques : An international R&D research environment. The intern will be hosted by the Laboratoire des Signaux et Systèmes (L2S, UMR8506) at CentraleSupélec (Université Paris Saclay), and the internship is motivated by a collaboration with IBM France Lab. Moreover, the L2S leads the CNRS International Associate Laboratory (LIA) on "Information, Control and Learning", of which the Montreal Institute for Learning Algorithms (MILA) at Université de Montréal, carrying out top research in deep learning, is a member. The candidate can therefore take the opportunity, within the framework of this LIA, to visit MILA and acquire complementary and transferable expertise in implementing advanced algorithms for training deep neural networks (e.g., by working in an international environment with top researchers in the field).

Background of the candidate. We are looking for serious and motivated candidates with an interest in, and preferably prior training oriented towards, research work in the area of machine learning and its applications. Advanced notions in machine learning, computer vision, optimization, information theory or other related research areas, and good familiarity with scientific research, are very welcome.

To apply, candidates must transmit a digital application package including:
- Detailed curriculum vitae
- University transcripts
- Contact information of two references
to:
- Prof. Florence Alberge, Laboratoire des Signaux et Systèmes (CentraleSupélec-CNRS-UPS)
Email: florence.alberge.centralesupelec.fr
- Prof. Pablo Piantanida, Laboratoire des Signaux et Systèmes (CentraleSupélec-CNRS-UPS) -
MILA (Université de Montréal)
Email: pablo.piantanida.com




SM207-104 Exploiting Smart Grid data with confidentiality guarantees  
Admin  
Encadrant : Pierre DUHAMEL
Labo/Organisme :
URL : http://www.l2s.centralesupelec.fr
Ville : Gif-sur-Yvette

Description  


========================================1. Societal, economic and industrial context
========================================
Energy management is a worldwide problem with major applications and implications: high energy demand and environmental concerns. Smart grids proactively address the challenges of electricity demand and supply, improve energy management, and secure power-grid communications using information and communication technologies. The energy impact and the deployment of smart technologies for automation, information exchange, analysis, and dynamic optimisation of grid operations and resources are major challenges for industry in terms of decision-intelligence systems. Like many sectors, the electricity sector is undergoing its data revolution. The transformation of energy networks and the emergence of new services, new actors (consumers) and new models such as self-consumption change the operating needs of the networks and require managing increasingly massive data, which would be unusable without AI. All activities are affected: generation, transmission, distribution and supply; the networks carry not only electrons but also data: this is the smart grid.
In France, a key component of the smart grid is the Linky communicating meter, a project led by Enedis, an autonomous subsidiary of the EDF group. This meter collects measurements down to the low-voltage level of the public distribution network. Information also flows from the consumer to the operator, giving the latter an effective means of monitoring the network and quickly detecting anomalies. The rise of renewable energies, the development of electric mobility, changing consumption patterns: the Linky meter supports these changes while guaranteeing the safety of the electrical system, and thus the continuity of supply.


========================================2. Smart grids: useful but particularly sensitive data
========================================
Communicating "smart meters" are an integral part of the smart grid and thus provide valuable, but also sensitive, data, in particular the load curve, which measures the variations of electricity consumption within a single day (typically with a 30-minute time step). For this reason, the French data protection authority (CNIL) has taken care to regulate the conditions of its collection and use. In practice, the load curve is today only exceptionally collected, and its dissemination is tightly regulated and limited by the energy regulator (CRE).
The privacy stakes of smart grids, real or sometimes fantasised, currently hold back the potential of these data. The objective of our work is to address this issue, exploring in more detail the space between utility and confidentiality.
A good trade-off between utility and confidentiality:

In this context, it seems essential to think today about sharing and exploiting smart-grid data with strong confidentiality guarantees. For sharing, one avenue under consideration is to study techniques based on statistical processing to address two antagonistic objectives:
1. Accurately simulate individual smart-grid measurements, typically load curves.
2. Guarantee a level of confidentiality, so that the simulated data do not allow (or make difficult) re-identification; typically, a simulated load curve must not "stick" too closely to a real one.

========================================3. Planned work on individual electricity consumption: data simulation, modelling and prediction with confidentiality guarantees
========================================
The objectives of this internship are broad and correspond to the high potential of learning methods for better confidentiality management of power-grid data. Our work plan is structured in three actions detailed below.
- Take as input individual load curves described in high dimension (for example one week with a half-hourly step, or one year with a daily step),
- Use these individual load curves to build a generator of realistic curves (preserving the main statistical properties of load curves) based on a convolutional auto-encoder (variational auto-encoder, generative adversarial networks, or a generative auto-encoder model),
- Work towards a more or less strong guarantee that this generator does not disclose all or part of the individual curves used to build it.
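The disclosure concern in the last action above can be caricatured by a nearest-neighbour test; this is a sketch under my own assumptions (Euclidean distance, a fixed threshold), not the guarantee the project aims for:

```python
# Naive re-identification check (illustrative only): flag a simulated load
# curve that "sticks" too closely to some real training curve.
def min_distance(curve, training_curves):
    def dist(a, b):  # Euclidean distance between two curves
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(dist(curve, c) for c in training_curves)

def too_close(simulated, training_curves, threshold):
    # a generated curve closer than `threshold` to any real curve is flagged
    return min_distance(simulated, training_curves) < threshold
```

Formal privacy guarantees (e.g. of the differential-privacy kind) are much stronger than such a distance test, which only illustrates the failure mode to avoid.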

URL sujet détaillé : https://www.dropbox.com/s/nmuc0osgkyo7k27/Stage%20-%20EDF%20-%20CS.pdf?dl=0

Remarques :
========================================An international network and work experience with EDF R&D
========================================
Close collaboration between the Laboratoire des Signaux et Systèmes (L2S) at CentraleSupélec (Paris Saclay) and the relevant EDF R&D teams at Paris Saclay is a real bonus, guaranteeing access to the data and constraints of the industrial problem, as well as the development of suitable tools. This project could also fit within the CNRS International Associate Laboratory (LIA) "Information Learning and Control - The Ingredients of Intelligent Grids", involving several institutions in Québec (McGill University, Université de Montréal, ...), some strongly tied to smart grids (the Hydro-Québec Industrial Research Chair at McGill University) and to AI (the MILA institute for learning algorithms is the largest laboratory in the world in this field). A stay in Montréal could be arranged to facilitate interaction between the researchers and the intern, as well as the sharing of relevant expertise.

Academic members:
- Pablo PIANTANIDA (Associate Professor), Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris Saclay
- Pierre DUHAMEL (DR CNRS), Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris Saclay

Industrial members:
- Benoît Grossin (EDF R&D), Research Engineer, Dept. ICAME
- Amandine Pierrot (EDF R&D), Data Scientist

Contact : Dr. Pierre Duhamel (pierre.duhamel.centralesupelec.fr)
Pr. Pablo Piantanida (pablo.piantanida.fr)




SM207-105 Functional analysis of the OP-TEE Secure Monitor  
Admin  
Encadrant : Cécile BRAUNSTEIN
Labo/Organisme : Laboratoire de Paris 6 - Sorbonne Université
URL : https://www-soc.lip6.fr/index.php?id=432
Ville : Paris

Description  

Description
==========
Today, users have a growing demand for security. In addition, systems are increasingly complex, open and connected. A major stake is to be able to guarantee the security of systems and of the (sensitive) data of embedded systems and other connected objects (IoT).
In this context, trusted execution environments (TEEs) were designed to take advantage of secure architectures such as ARM's TrustZone. These secure platforms allow applications offering different levels of security to coexist on the same architecture. For example, on a smartphone, the TEE can be used to secure services such as electronic wallets or biometric authentication. OP-TEE ([6]) is an open source project that offers a full implementation of a trusted execution environment.
The TEE must therefore be able to guarantee the authenticity of the code that runs in secure mode, the integrity of the execution state (the state of the registers, the memory, ...) and the confidentiality of the code and the data (see [8]). OP-TEE distinguishes two worlds: a non-secure one, used for common applications, and a secure one, used by services that handle sensitive data. The intern will study possible data leaks from the secure world to the non-secure world.
One of the key elements of a TEE is the Secure Monitor. It is the program that is called at every switch between the non-secure and the secure world. Calling a secure service therefore induces specific system calls. Following the internship of Maxime Ayrault, we have a model of the OP-TEE secure monitor [7]. Our study aims at enriching this model (for example with the interruption mechanism) and then performing its functional verification.
Formal methods offer the possibility to automatically (or partially) check properties of the analysed code (for medium-size systems). These methods are generally used to verify the behaviour of parts of applications. More recently, approaches have emerged that focus on the security properties of systems [3]. As part of this internship, we are interested on the one hand in the functional verification of the OP-TEE model with the help of the TEE specification [4, 5]. This specification is developed by GlobalPlatform, a consortium of industrial and academic partners working on the standardization of software interfaces. On the other hand, we would like to check security properties, to guarantee information confidentiality: we want to show that a piece of information cannot become known to unauthorized third parties.
The model checker CPAchecker ([1]) is a verification tool that implements a large number of model checking algorithms [2]. One goal of the internship will also be to evaluate different verification strategies and find the one or ones that suit our model.

Goals
====
The objective of the internship is to model and verify part of a TEE. As a
first step, the model of the Secure Monitor and its operational
environment will be enriched. Then, starting from the specification
of the TEE, functional properties will be verified on the model.
Model checking algorithms will then be compared. Finally, the
specification can be completed by adding security properties and
strategies to verify them.


Work plan
========
- Bibliographic study on OP-TEE and TEE specification
- Analysis of the code of Maxime Ayrault's secure monitor model
- Proposal to add behaviors in the model
- Proposal of functional properties
- Study of model checking algorithms implemented in CPAchecker
- Model checking performance analysis
- Perspective on the modeling of security properties
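As a toy preview of the kind of property at stake, here is a sketch of an exhaustive reachability check on a tiny two-variable abstraction of world switching; the model and its register-scrubbing rule are assumptions of mine for illustration, not OP-TEE's actual behaviour:

```python
# Toy model (assumed abstraction, unrelated to the real OP-TEE code): explore
# all reachable states of a two-world machine and check the confidentiality
# property "no state reachable in the non-secure world holds secret data".
from collections import deque

def reachable(initial, step):
    """Breadth-first exploration of all states reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        for t in step(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

# state = (world, register_holds_secret)
def step(state):
    world, secret = state
    if world == "secure":
        yield ("non-secure", False)  # the monitor scrubs registers on exit
        yield ("secure", True)       # secure code may load a secret
    else:
        yield ("secure", secret)     # SMC call into the secure world
```

Real model checkers such as CPAchecker work on program code with far richer abstractions, but the shape of the check (reachability of bad states) is the same.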


URL sujet détaillé : https://www-soc.lip6.fr/index.php?id=432

Remarques : Co-supervised with Emmanuelle Encrenaz




SM207-106 CNN inference approximation  
Admin  
Encadrant : Silviu FILIP
Labo/Organisme : IRISA
URL : https://www.irisa.fr
Ville : Rennes

Description  

Deep Learning, and in particular Convolutional Neural Networks (CNNs), is currently one of the most intensively and widely used predictive modelling approaches in the field of machine learning. CNNs have been shown to give very good results for many complex tasks such as object recognition in images/videos, drug discovery, natural language processing and autonomous driving, up to playing complex games.

Despite these benefits, the computational workload involved in CNNs is often out of reach for low-power embedded devices, and/or is still very costly when running on datacenter hardware platforms. Thus, a lot of research effort from both industry and academia has been concentrated on defining and designing custom hardware platforms supporting this type of algorithm, with the goal of improving performance and/or energy efficiency [Chao17, Hsin17, Zhi17].

In recent years, Approximate Computing (AxC) has become a major field of research to improve both speed and energy consumption in embedded and high-performance systems [Mit16]. By relaxing the need for fully precise or completely deterministic operations, approximate computing substantially improves energy efficiency. Various techniques for approximate computing augment the design space by providing another set of design knobs for performance-accuracy tradeoffs. For instance, the gain in energy between a low-precision 8-bit operation suitable for vision and a 64-bit double-precision floating-point operation necessary for high-precision scientific computation can reach up to 50x by considering storage, transport and computation. The gain in energy efficiency (the number of computations per Joule) is even higher. The challenge is then to find adequate (no more and no less) number representations and word-lengths of data/computations that are compatible with application constraints.

CNNs show inherent resilience to small errors due to their iterative nature and learning process. They therefore exhibit an intrinsic tolerance to inexact computation, and using the approximate computing paradigm to improve power and delay characteristics is relevant [Sung15]. Indeed, CNNs lend themselves well to AxC techniques, especially fixed-point arithmetic or low-precision floating-point implementations. They are therefore ideally suited for hardware acceleration using FPGA and/or ASIC implementations, as acknowledged by the large body of work on this topic.

The goal of this project is to explore how approximation techniques can improve the performance of CNNs in deep-learning applications. In particular, the candidate will study how custom floating-point and fixed-point arithmetic, adequate number representations, and even algorithmic-level transformations, can improve performance and energy efficiency of CNN computation while keeping classification and learning phases at a very high success rate. The aim is to go further than current state-of-art research studies in which only the inputs and outputs of the neural network layers are quantized to low precision.

The candidate will investigate the tradeoffs and benefits that different numeric formats and arithmetic operators bring in terms of performance and energy efficiency when doing Deep Neural Network inference. The goal is for the candidate to contribute to the development of a framework for automatic precision exploration and weight value quantization for CNN inference suited for hardware acceleration using FPGAs.

One particular direction of study we might pursue is that of numerical format optimization when doing inference in network architectures based on structured block-circulant weight matrices [Ding17]. Such architectures offer the promise of small storage requirements (i.e., linear in the number of neurons of each layer vs quadratic in classic architectures) and improved performance in training and inference (i.e., quasi-linear vs quadratic cost in the classic setting), critical elements needed for a wide adoption of deep learning methods on low-power embedded devices.
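The fixed-point side of this design space can be sketched in a few lines; the format parameters below (a signed n_bits-wide code with n_frac fractional bits) are one common convention, not a prescription of the project:

```python
# Minimal sketch of fixed-point quantization: map a real-valued weight to the
# nearest representable value of a signed n_bits format with n_frac fractional
# bits, saturating at the format's range. The difference w - quantize(w, ...)
# is the round-off error such an approximation introduces.
def quantize(w, n_bits, n_frac):
    scale = 2 ** n_frac
    lo = -(2 ** (n_bits - 1))      # smallest representable integer code
    hi = 2 ** (n_bits - 1) - 1     # largest representable integer code
    code = max(lo, min(hi, round(w * scale)))
    return code / scale
```

For instance, with 8 bits and 6 fractional bits, 0.3 becomes 19/64 = 0.296875, and values outside the representable range saturate; precision exploration then consists in choosing (n_bits, n_frac) per layer so that the accumulated error keeps classification accuracy high.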


References:

[Chao17] W. Chao, L. Gong, Q. Yu, X. Li, Y. Xie, and X. Zhou, "DLAU: A scalable deep learning accelerator unit on FPGA", IEEE Trans. on Computer-Aided Design of Int. Circ. and Syst., vol. 36, no. 3, 2017, pp. 513-517.
[Ding17] C. Ding et al., "CirCNN: Accelerating and compressing deep neural networks using block-circulant weight matrices", in Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 395-408.
[Hsin17] C. Yu-Hsin, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep conv. neural networks", IEEE Journal of Solid-State Circuits, vol. 52, no. 1, 2017, pp. 127-138.
[Mit16] S. Mittal, "A Survey of Techniques for Approximate Computing", ACM Computing Surveys (CSUR), vol. 48, no. 4, 2016, pp. 1-33.
[Sung15] W. Sung, S. Shin, and K. Hwang, "Resiliency of Deep Neural Networks under Quantization", CoRR, vol. abs/1511.06488, 2015.
[Zhi17] L. Zhiqiang, Y. Dou, J. Jiang, J. Xu, S. Li, Y. Zhou, and Y. Xu, "Throughput-Optimized FPGA Accelerator for Deep Conv. Neural Networks", ACM Trans. on Reconfig. Tech. and Syst., vol. 10, no. 3, 2017, p. 17.
URL sujet détaillé : http://perso.ens-lyon.fr/silviuioan.filip/files/master_approxDL.pdf

Remarques : The internship will take place at IRISA, in Rennes, co-advised by Olivier Sentieys and Silviu Filip. The usual internship remuneration applies.



SM207-107 Lebesgue Measure and Integral in Coq  
Admin  
Encadrant : Micaela MAYERO
Labo/Organisme : LIPN - Université Paris 13
URL : https://lipn.univ-paris13.fr/MILC/
Ville : Villetaneuse

Description  

Finite elements are at the heart of numerous simulation programs used in industry. The verification methods developed in this project will be a breakthrough for the finite element method, and more generally for the reliability of critical software relying on complicated numerical algorithms.
The goal is to develop the basis of correctness proofs for the FELiScE library, which implements the finite element method for approximating solutions to partial differential equations.

These proofs are based on measure theory, the Lebesgue integral, distributions and Sobolev spaces. The goal of this work is to formally prove in Coq some well-known mathematical results, such as the construction of the L^2 space and several of its properties, such as the completeness of the norm associated with the inner product.
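In standard notation, the targeted statements are (my phrasing of well-known definitions, to be formalized in Coq):

```latex
\[
  \langle f, g \rangle = \int f \, \bar{g} \, d\mu ,
  \qquad
  \|f\|_2 = \sqrt{\langle f, f \rangle} ,
\]
together with the Riesz--Fischer completeness property: every Cauchy
sequence $(f_n)$ in $L^2(\mu)$ converges,
\[
  \big( \forall \varepsilon > 0 \;\exists N \;\forall m, n \ge N,\
  \|f_n - f_m\|_2 < \varepsilon \big)
  \;\Longrightarrow\;
  \exists f \in L^2(\mu),\ \|f_n - f\|_2 \to 0 .
\]
```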

Obviously, this is not only a ``translation'' from mathematical books to a proof assistant. The candidate must also be able to understand the proofs and how to factorize and generalize them.

Note that there exist references regarding real analysis [BLM15] and the formal proof of the Lax-Milgram theorem [CPP2017].

Références:

[CPP2017] Sylvie Boldo, François Clément, Florian Faissole, Vincent Martin and Micaela Mayero. A Coq formal proof of the Lax-Milgram theorem. In Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs (CPP 2017), pages 79-89, 2017.

[BLM15] Sylvie Boldo and Catherine Lelay and Guillaume Melquiond. Coquelicot: A User-Friendly Library of Real Analysis for Coq. In Mathematics in Computer Science, volume 9, number 1, pages 41-62, 2015.
URL sujet détaillé : https://www-lipn.univ-paris13.fr/~mayero/?content=master

Remarques : Co-supervisors:
Sylvie Boldo; Inria Saclay
François Clément, Vincent Martin; Inria Paris

The intern may be required to travel to Inria Paris or Saclay.

Remuneration: yes.



SM207-108 Geometrical and topological mesh filtering for fabrication  
Admin  
Encadrant : Dobrina BOLTCHEVA
Labo/Organisme : LORIA - INRIA Grand Est
URL : https://members.loria.fr/DBoltcheva/
Ville : Nancy

Description  

3D printing has become many people's hobby at home, thanks to the fast developments of the past decades.
Today, fabricating an appropriate 3D model using a low-cost 3D printer is nearly as easy as
printing a text document, but creating a 3D model which is actually "appropriate" for printing is definitely *complicated*.

Most of the 3D models flooding the Internet are not meant to be fabricated, and usually
exhibit many defects that make them unsuitable for printing, such as open boundaries, self-intersections, zero-thickness walls, incomplete geometry, many high-frequency small details, etc.
Which of them can really be printed? How can we repair and filter the unprintable models to adapt them to fabrication requirements?

We currently lack a geometrical and topological mesh-filtering approach that can cope with *any* STL file while preserving all visible surfaces with no or minimal distortion, and while managing the material properties and domain-specific information contained in the input model.
The goal of this project is to design such a framework.
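One of the simplest of these defect checks, open boundaries, can be sketched as follows (the mesh representation, one vertex-index triple per triangle, is an assumption for illustration):

```python
# Sketch of a watertightness check: in a closed (printable) surface mesh,
# every edge is shared by exactly two triangles; an edge used by only one
# triangle is an open boundary.
from collections import Counter

def boundary_edges(triangles):
    """triangles: list of (i, j, k) vertex-index triples."""
    edges = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[(min(u, v), max(u, v))] += 1  # orientation-independent key
    return [e for e, count in edges.items() if count == 1]
```

A closed tetrahedron yields no boundary edges, while an isolated triangle reports all three of its edges; edges with a count above two would similarly flag non-manifold configurations.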
URL sujet détaillé : https://members.loria.fr/DBoltcheva/files/MeshFiltering.pdf

Remarques : The internship will be co-supervised by Cédric Zanni:

https://mines-nancy.univ-lorraine.fr/collaborator/cedric-zanni/



SM207-109 Theory of isotropic tilings  
Admin  
Encadrant : Benjamin HELLOUIN
Labo/Organisme : Laboratoire de Recherche en Informatique
URL : www.lri.fr/~hellouin
Ville : Orsay

Description  


Tiling theory is at the border between theoretical computer science (decidability, models of computation), discrete mathematics (combinatorics, statistical physics), and the mathematical theory of dynamical systems.

Those communities study tilings of rather different natures and complexities, which means that some results do not carry over as well as they should. The objective of this internship is to study a newly-introduced subclass, isotropic tilings, which seems both to include most models of physical and combinatorial interest and to break all known undecidability proofs. We can therefore hope for nontrivial algorithms for problems that are undecidable in general, or for new methods for proving undecidability.

You will revisit the classical tiling literature and adapt it to isotropic tilings, in order to identify potentially interesting questions and settle on the best formalism. This would provide a first overview of these objects. Beyond that, I propose various open problems to study, to which you will add your own: please refer to the detailed description.
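For concreteness, the finite version of the central decision problem, stated here for classical Wang tiles rather than the isotropic subclass, can be solved by plain backtracking; this sketch is illustrative only (the undecidable question is whether the whole plane can be tiled):

```python
# Decide by backtracking whether an n x n square can be tiled by Wang tiles
# so that the colours of adjacent edges match.
def tile_square(tiles, n):
    """tiles: list of (north, east, south, west) colour tuples."""
    grid = [[None] * n for _ in range(n)]

    def fits(t, r, c):
        if r > 0 and grid[r - 1][c][2] != t[0]:  # south of above = my north
            return False
        if c > 0 and grid[r][c - 1][1] != t[3]:  # east of left = my west
            return False
        return True

    def place(k):  # fill cells in row-major order
        if k == n * n:
            return True
        r, c = divmod(k, n)
        for t in tiles:
            if fits(t, r, c):
                grid[r][c] = t
                if place(k + 1):
                    return True
        grid[r][c] = None  # backtrack
        return False

    return place(0)
```

A single uniformly-coloured tile tiles any square, while a tile whose east and west colours disagree cannot even fill a 2 x 2 square on its own.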
URL sujet détaillé : https://www.lri.fr/~hellouin/Stage.pdf

Remarques : This internship can come with a stipend (around 500€). Depending on funding availability, a follow-up PhD (co-supervised with a senior researcher) can be considered.



SM207-110 Building recognition from 3D point clouds  
Admin  
Encadrant : Dobrina BOLTCHEVA
Labo/Organisme : LORIA - INRIA Grand Est
URL : https://members.loria.fr/DBoltcheva/
Ville : Nancy

Description  

The digitization of real objects is increasingly used in areas such as urban design and architecture.
Acquisition tools such as lidars and photogrammetry make it possible to produce digital representations of entire cities in the form of massive 3D point clouds sampling the surfaces of objects in the environment.

Nowadays, the pipeline for creating digital 3D models from point clouds, photogrammetry and land-registry (cadastre) information remains largely manual and is recognized as time-consuming, tedious, subjective and demanding in skills.
The designer's workflow is to first select areas of interest (for example, buildings and houses) from the point cloud, then create 3D models starting from simple parallelepipeds and gradually adding details such as roofs, windows, doors, cornices, stairs, etc.

The long-term goal is to develop a tool that assists the designer in creating 3D models in a semi-manual way.
During this internship, we will focus on a still-open problem: the detection of roofs in point clouds.
We therefore plan to study a particular sub-problem, namely the detection of gable roofs (toitures à deux versants), by first injecting priors and then possibly extending the method to a detection based on a catalogue of components.
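The prior-injection step can be caricatured as plane detection: below is a minimal pure-Python RANSAC sketch that detects the dominant plane in a point set; a gable roof would then be found by removing its inliers and fitting a second plane on the remainder (tolerance, iteration count and this peeling strategy are assumptions of mine):

```python
# Basic RANSAC plane detection: repeatedly fit a plane through 3 random
# points and keep the plane supported by the most inliers.
import random

def plane_from(p, q, r):
    # normal n = (q - p) x (r - p); plane equation: n . x = d
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    return n, sum(n[i] * p[i] for i in range(3))

def ransac_plane(points, tol=0.05, iters=200, seed=0):
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        n, d = plane_from(*rng.sample(points, 3))
        norm = sum(c * c for c in n) ** 0.5
        if norm == 0:
            continue  # degenerate sample (collinear points)
        inliers = [p for p in points
                   if abs(sum(n[i] * p[i] for i in range(3)) - d) / norm <= tol]
        if len(inliers) > len(best):
            best = inliers
    return best
```

A gable prior would additionally constrain the two detected planes to share a horizontal ridge line and to have opposite-signed slopes.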
URL sujet détaillé : https://members.loria.fr/DBoltcheva/files/rhinocerros.pdf

Remarques : The internship will be co-supervised by Dmitry Sokolov:
https://members.loria.fr/DSokolov/

If our first results prove promising, the project may be continued as a CIFRE PhD thesis.



SM207-111 Block-structured hexahedral mesh generation  
Admin  
Encadrant : Nicolas RAY
Labo/Organisme : LORIA - équipe ALICE
URL : http://alice.loria.fr/
Ville : villers les nancy

Description  

Block-structured hexahedral meshes are considered to be the "holy grail" of the hexahedral remeshing problem. These meshes have a very constrained structure that can be exploited by numerical simulations, but that also makes them really difficult to produce. This specific structure can be represented by the Spatial Twist Continuum (STC): a set of surfaces that are dual to the layer sheets of hexahedra. Unfortunately, it is very difficult to produce such surfaces. The objective of this project is to extract the Spatial Twist Continuum from the result of another hexahedral remeshing algorithm, based on global parameterization. Indeed, the results of this algorithm often capture well the most important features of the volume, which could be used as a heuristic to define the STC.
URL sujet détaillé : https://members.loria.fr/NRay/projets/

Remarques :



SM207-112 Frame Field 2.5D  
Admin  
Encadrant : Nicolas RAY
Labo/Organisme : LORIA - équipe ALICE
URL : http://alice.loria.fr/
Ville : villers les nancy

Description  

Frame fields are objects that associate, to each point of space, a set of six unit vectors that are pairwise either opposite or orthogonal. Such fields are used to remesh volumes with hexahedra: they provide the orientation of the hexahedra. They are generated by minimizing their spatial variation (rotation), subject to the constraint of being aligned with the domain's boundary. The objective of this internship is to add extra constraints to this process, so that it becomes possible to tile the domain with hexahedra aligned with the resulting fields. The main difficulty comes from the singularities of the field, i.e. curves around which a frame following an elementary cycle has an infinite speed of rotation.
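To make the definition of a frame concrete, here is a small sketch (the function name and the restriction to rotations about the z axis are illustrative choices, not part of the subject):

```python
import math

# Sketch: the frame attached to a point is the orbit of the three coordinate
# axes under a rotation R: six unit vectors, pairwise opposite or orthogonal.
def frame_from_rotation_z(theta):
    """Frame for a rotation of angle theta about the z axis (toy example)."""
    c, s = math.cos(theta), math.sin(theta)
    axes = [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]
    # each axis contributes itself and its opposite
    return [v for a in axes for v in (a, tuple(-x for x in a))]
```

Checking all pairwise dot products of such a frame gives only 1, -1 or 0, which is exactly the "opposite or orthogonal" condition above.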
URL sujet détaillé : https://members.loria.fr/NRay/projets/

Remarques :



SM207-113 Texture atlases with invisible seams  
Admin  
Encadrant : Nicolas RAY
Labo/Organisme : LORIA - équipe ALICE
URL : http://alice.loria.fr/
Ville : villers les nancy

Description  

For image synthesis, geometric objects are mostly represented by triangulated surfaces. Their visual aspect can be improved by mapping an image (called a texture) onto the surface. This mapping is defined by a texture atlas, where the frontiers between charts produce rendering artifacts. To overcome this problem, one can generate global grid-preserving parameterisations whose seams can be made invisible. The objective here is to implement an algorithm that generates such texture atlases, and to edit the textures associated with the object so that rendering no longer produces visual artifacts. The first step will be to implement an existing algorithm ["Invisible Seams", EGSR 2010]; we already have a parameterisation algorithm, from which a texture atlas must be derived, as well as a set of constraints to be applied to the textures to make them seamless. The second step will be to propose improvements to this method to make it more robust.
URL sujet détaillé : https://members.loria.fr/NRay/projets/

Remarques :



SM207-115 Representation of elastic concepts --- Application to search in a digital humanities corpus  
Admin  
Encadrant : Olivier BRUNEAU
Labo/Organisme : AHP-PReST et LORIA
URL : http://poincare.univ-lorraine.fr/
Ville : Nancy

Description  

The execution of an exact query on a corpus does not necessarily give a satisfactory answer to the user, for at least two reasons: (A) a close match, though interesting, may be missed; (B) an answer that is not close but offers a relevant viewpoint can also be of interest. This motivates the issue of inexact search in a corpus, which has been an ongoing research question for a few years now, with an application to the Henri Poincaré correspondence. The notion of elastic query is introduced to address this issue: an elastic query is defined by a pair (Q, R) where Q is a classical query and R is a set of query transformation rules associated with costs and explanations. An item x matches (Q, R) with cost c if there exists a sequence of transformations of Q, by rules r1, ..., rn, into a query Q' that x matches, with c = cost(r1) + ... + cost(rn).
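The cost definition above can be sketched as a uniform-cost search over query rewritings. All rule names, queries and costs below are invented for illustration; the real system works on semantic-web queries, not bare strings:

```python
from heapq import heappush, heappop

# Hypothetical elastic-query matching: a rule rewrites a query term and
# carries a cost; an item matches (Q, R) at the minimal total cost of a
# rule sequence turning Q into a query the item satisfies.
RULES = {  # term -> [(replacement, cost, explanation)]
    "geometry": [("mathematics", 1.0, "generalisation")],
    "mathematics": [("science", 2.0, "generalisation")],
}

def elastic_match_cost(query, item_terms, rules=RULES):
    """Uniform-cost search over transformations; None if no match exists."""
    frontier, seen = [(0.0, query)], set()
    while frontier:
        cost, q = heappop(frontier)
        if q in item_terms:
            return cost  # cheapest rule sequence found first
        if q in seen:
            continue
        seen.add(q)
        for repl, c, _why in rules.get(q, []):
            heappush(frontier, (cost + c, repl))
    return None
```

Here an item tagged "science" matches the query "geometry" at cost 3.0 (two generalisation steps), while an exact match costs 0.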

Three issues can be considered in this work (or in the PhD thesis that may follow it). The first is the study of a language of query transformations: is it sufficient to represent the user's needs, and how should it be extended? The second is how human-machine interaction should be handled in this framework. The last, but not least, issue aims at going further with the notion of elastic set, in particular: how can classical operations on sets (inclusion, intersection, union, etc.) be extended to elastic sets?

Keywords: symbolic artificial intelligence, digital humanities, semantic web technologies

(The details of this work are given in the pdf file; don't hesitate to contact us with any questions!)

URL sujet détaillé : http://bruneauolive.free.fr/images/sujet_bruneau_lieber.pdf

Remarques : This work will be co-supervised by Olivier Bruneau (a historian of mathematics at AHP-PReST) and Jean Lieber (a computer scientist at Loria). The trainee will receive the standard stipend (about 500 euros per month). A PhD thesis can follow this work.

Please contact Olivier at olivier.bruneau-lorraine.fr and Jean at jean.lieber.fr



SM207-116 Reducing energy consumption in heterogeneous mobile networks  
Admin  
Encadrant : Isabelle GUÉRIN LASSOUS
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/isabelle.guerin-lassous/
Ville : Lyon

Description  

The objective of the internship is to design solutions that reduce the energy consumption of access points and base stations in heterogeneous mobile networks, while ensuring a given quality of service to the users. The solution will select, online, which access points or base stations to switch off, how to associate the users with the active access points / base stations, and which power and physical transmission rate to use, without sacrificing network and user performance. The solution will likely be based on an optimization problem, designed using models that predict performance and energy consumption. These models can be constructive or data-driven.
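As a toy illustration of the kind of joint on/off and association decision involved, here is a brute-force sketch. All names, the capacity model and the sizes are invented; realistic instances require a proper optimization solver, not enumeration:

```python
from itertools import product

# Toy sketch: keep an access point active only if some user is associated
# with it; minimise total power of active APs subject to per-AP capacity.
def best_configuration(ap_power, user_aps, ap_capacity):
    """ap_power: power cost of each active AP; user_aps: for each user, the
    list of reachable APs. Brute force over user->AP associations."""
    best = None
    for assoc in product(*user_aps):  # one AP choice per user
        load = {}
        for ap in assoc:
            load[ap] = load.get(ap, 0) + 1
        if any(load[ap] > ap_capacity[ap] for ap in load):
            continue  # quality of service violated
        energy = sum(ap_power[ap] for ap in load)  # only active APs consume
        if best is None or energy < best[0]:
            best = (energy, assoc)
    return best
```

With two users both reachable from a high-power AP "A" and a low-power AP "B" of sufficient capacity, the sketch switches "A" off and serves everyone from "B".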
URL sujet détaillé : http://perso.ens-lyon.fr/isabelle.guerin-lassous/stage-green-mobile-networks.pdf

Remarques : The internship will be co-supervised with Thomas Begin.



SM207-117 Undecidability in perturbed dynamical systems  
Admin  
Encadrant : Mathieu SABLIK
Labo/Organisme : IMT
URL : https://www.math.univ-toulouse.fr/~msablik/index.html
Ville : Toulouse

Description  


Dynamical systems are a very natural and widely used model to capture systems evolving over time. Generally the dynamics are computable, in the following informal sense: given a configuration, it is possible to compute the configuration reached after some finite time. In many contexts, however, the properties of true interest concern the long-term evolution.
While bounded-time simulations are usually possible, the computational outlook for "infinite" time-horizon problems is grim: the long-term behavior of many interesting systems is uncomputable.

A notable feature, shared by prior works on computational intractability in dynamical systems [Moore90, Braverman-Yampolsky-2007, Asarin-Bouajjani-2001], is that the non-computability phenomenon is not robust: uncomputability disappears once one introduces even a small amount of noise into the system. Thus, if one believes that natural systems are inherently noisy, one would not be able to observe such non-computability phenomena.

A natural approach is to study the invariant measures of perturbed systems. In [Braverman-Grigo-Rojas-2012], the authors show that while simple dynamical systems defined by transformations of the interval have uncomputable invariant measures, perturbing such systems makes the invariant measures efficiently computable. Thus, noise that makes the short-term behavior of the system harder to predict may make its long-term statistical behavior computationally tractable.

This type of result is not known for dynamical systems on discrete spaces such as cellular automata. In this case it is possible that noise cannot break uncomputability, since some computation can be implemented in noisy cellular automata [Gacs-2001]. The purpose of this internship is to identify where the techniques developed in [Braverman-Grigo-Rojas-2012] fail to apply, and to try to construct a dynamical system with uncomputable asymptotic behavior even when it is perturbed.
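For intuition, here is a minimal sketch of a perturbed discrete system of the kind discussed: one synchronous step of an elementary cellular automaton (standard Wolfram rule numbering) where each cell's output is flipped independently with probability eps. The function is purely illustrative:

```python
import random

# Sketch: one step of an elementary CA on a cyclic configuration, with each
# output bit flipped with probability eps (eps = 0 recovers the exact CA).
def noisy_ca_step(cells, rule, eps, rng=random.Random(0)):
    n = len(cells)
    out = []
    for i in range(n):
        # neighborhood (left, self, right), read as a 3-bit number
        neigh = (cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        bit = (rule >> (neigh[0] * 4 + neigh[1] * 2 + neigh[2])) & 1
        if rng.random() < eps:
            bit ^= 1  # the perturbation
        out.append(bit)
    return out
```

With eps = 0 the step is the usual computable dynamics; the question raised above is what survives of the long-term (un)computability once eps > 0.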


URL sujet détaillé : https://www.math.univ-toulouse.fr/~msablik/sujet_M2_SystemeDynamiqueCalculable.pdf

Remarques :



SM207-118 Models and planning for human-robot collaboration.  
Admin  
Encadrant : Vincent THOMAS
Labo/Organisme : LORIA/Inria Nancy GrandEst - Equipe LARSEN
URL : https://team.inria.fr/larsen/
Ville : Nancy

Description  

Collaboration between humans and robots is a high-stakes current research subject with numerous application areas (smart factories, therapeutic robot companions, ...).
The internship we propose here is linked to the ANR project "Flying co-worker" (accepted in 2018), whose aim is to build a collaborative flying robot to help human workers. More precisely, the proposed subject focuses on building the behavior (or the conditional plan) of an autonomous robot assisting a human worker so that the worker's tasks are fulfilled in the best possible way (that is, minimizing a cost to be defined). This is a complex issue since (1) the robot has to estimate the objective of the human worker through partial observations of their activity, (2) the robot has to make decisions based on this partial information, and (3) the human behavior might also depend on the actions undertaken by the robot.

This internship aims at addressing this issue by using models from the "decision making under uncertainty" research field (Markov Decision Processes, Partially Observable Markov Decision Processes [1]) and at investigating several questions:
- how to model the human worker's behavior, objectives and the task sequence they try to accomplish (with the help of Markov chains);
- how to use this model to infer distributions on the current objective of the human worker based on his actions (Bayesian inference, HMM [2]);
- how to decide actions to gather more information about the human task (with the help of active sensing models and algorithms [3]);
- how to help the human worker while considering uncertainties about his state and his evolving objectives (by using POMDP models).
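The inference step in the second bullet can be sketched as an HMM forward recursion over the worker's hidden objective. The two objectives, the actions and all probabilities below are invented for illustration:

```python
# Hypothetical model: the worker's objective is hidden; the robot only
# observes actions and maintains a belief (a distribution over objectives).
OBJECTIVES = ["assemble", "inspect"]
TRANS = {  # P(next objective | current objective): objectives tend to persist
    "assemble": {"assemble": 0.9, "inspect": 0.1},
    "inspect":  {"assemble": 0.2, "inspect": 0.8},
}
OBS = {  # P(observed action | objective)
    "assemble": {"grasp": 0.7, "look": 0.3},
    "inspect":  {"grasp": 0.1, "look": 0.9},
}

def update_belief(belief, action):
    """One HMM filtering step: predict with TRANS, correct with OBS, normalise."""
    predicted = {o2: sum(belief[o1] * TRANS[o1][o2] for o1 in OBJECTIVES)
                 for o2 in OBJECTIVES}
    unnorm = {o: predicted[o] * OBS[o][action] for o in OBJECTIVES}
    z = sum(unnorm.values())
    return {o: p / z for o, p in unnorm.items()}

belief = {"assemble": 0.5, "inspect": 0.5}
for action in ["look", "look"]:  # two observations typical of inspection
    belief = update_belief(belief, action)
```

After the two "look" observations the belief shifts towards "inspect"; an active-sensing layer [3] would then choose robot actions that sharpen this belief further before committing to assistance.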

In a first step, the student will study Markov models before proposing a formalization of a situation where an autonomous agent has to help a human attain a current objective that is initially unknown to the agent. For instance, the autonomous agent has to decide which tools (among several available) to bring to an isolated human worker by considering the probabilistic workflow of their activity. In a second step, the internship will address the questions cited previously: first by finding a way to model the human objective [4][5] and their adaptation to the robot's actions [6], and then by proposing algorithms to build the optimal behaviour of the agent based on these models [7] and on reasoning about human-robot joint action [8].

During the internship, this work will be mostly done in simulation to investigate algorithms and their efficiencies, but, if time allows it, a simple scenario (to be built during the internship) could be tested with a real robot.

Keywords : artificial intelligence, decision making under uncertainty, Bayesian inference.

References :
[1] Olivier Sigaud, Olivier Buffet. Markov Decision Processes in Artificial Intelligence. (Lavoisier - Hermes Science Publications 2008).
[2] Lawrence R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition (PROCEEDINGS OF THE IEEE - 1989)
[3] Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet. A POMDP Extension with Belief-dependent Rewards (NIPS - 2010)
[4] Arsène Fansi Tchango, Vincent Thomas, Olivier Buffet, Fabien Flacher, Alain Dutech. Simultaneous Tracking and Activity Recognition (STAR) using Advanced Agent-Based Behavioral Simulations. (ECAI - 2014)
[5] Alan Fern, Sriraam Natarajan, Kshitij Judah, Prasad Tadepalli. A Decision-theoretic Model of Assistance. (JAIR - 2014)
[6] Stefanos Nikolaidis. Mathematical Models of Adaptation in Human-Robot Collaboration. (PhD thesis - 2017)
[7] Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russel. Cooperative Inverse Reinforcement Learning (NIPS - 2016)
[8] Severin Lemaignan, Mathieu Warnier, Emrah Sisbot, Aurélie Clodic, Rachid Alami. Artificial cognition for social human-robot interaction: An implementation (Artificial Intelligence, Elsevier, 2017)
URL sujet détaillé : https://team.inria.fr/larsen/master-subjects/

Remarques : - This internship will be co-supervised by François Charpillet and Francis Colas (Inria Nancy Grand Est, LARSEN team).

- This internship is proposed in the context of the ANR project "Flying co-worker", accepted in July 2018, whose objective is to develop a flying drone able to collaborate with humans. The subject may continue as a PhD thesis within the ANR project.




SM207-119 Independence ratio & fractional chromatic number  
Admin  
Encadrant : Jean-sébastien SERENI
Labo/Organisme : ICube-CSTB
URL : http://lbgi.fr/~sereni
Ville : Strasbourg

Description  

This internship focuses on the fractional chromatic number, an exciting graph
invariant that appears under various guises: as a probability distribution on
the independent sets of a graph that guarantees at least a fixed probability
that an arbitrary vertex belongs to a random independent set chosen according
to the distribution; as an (a:b)-colouring of the graph such that the ratio a/b
is as small as possible; and as the optimal value of a linear program.

The student will study various questions about this invariant and learn, apply
and possibly broaden the techniques developed for its study. Depending on the
student's preferences, the questions can pertain more to structural graph
theory or to extremal graph theory. Due to the different ways to express the
fractional chromatic number, these techniques cover a wide range of discrete
methods: graph structural arguments, colouring techniques, discrete probability
distributions, probabilistic methods, linear programming.

The internship can also focus more on the independence ratio of a graph,
defined as the ratio of the size of a largest independent set to the number
of vertices: the inverse of the independence ratio is always a lower bound
on the fractional chromatic number, and there are some graph classes where
equality always holds, e.g. that of vertex-transitive graphs.
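These notions can be checked on a toy graph; the sketch below (function name ours, brute force, so only for tiny graphs) computes the independence number, from which the independence ratio and, for vertex-transitive graphs, the fractional chromatic number follow:

```python
from itertools import combinations

# Sketch: alpha(G) by brute force. For a vertex-transitive graph such as the
# 5-cycle, the fractional chromatic number equals n / alpha(G), the inverse
# of the independence ratio.
def alpha(n, edges):
    """Size of a largest independent set of the graph on {0..n-1}."""
    for k in range(n, 0, -1):  # try sizes from largest down
        for s in combinations(range(n), k):
            if all((u, v) not in edges and (v, u) not in edges
                   for u, v in combinations(s, 2)):
                return k
    return 0
```

For the 5-cycle, alpha = 2, so the independence ratio is 2/5 and, by vertex-transitivity, the fractional chromatic number is 5/2.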

Continuing towards a PhD is possible. Thanks to the background built during
this internship, the student, if desired, will be able to move towards a large
variety of topics in graph theory: colouring, structure, extremal questions,
graph limits and flag algebras.
URL sujet détaillé : http://lbgi.fr/~sereni/stageM2_ENS-Lyon.pdf

Remarques :



SM207-120 Immersive Analytics in VR/AR  
Admin  
Encadrant : Christophe HURTER
Labo/Organisme : ENAC/DataVis
URL : http://recherche.enac.fr/~hurter/
Ville : Toulouse

Description  

Our society has entered a data-driven era, in which not only enormous amounts of data are being generated every day, but also growing expectations are placed on their analysis [1]. Analyzing these massive and complex datasets is essential for making new discoveries and creating benefits for people, but this remains a difficult task.
In this internship, we will consider immersive devices such as mixed-reality headsets (Oculus Rift, HTC Vive) or augmented-reality devices (e.g. HoloLens, https://www.microsoft.com/en-us/hololens ) to address data exploration issues.

This ambitious research project is at the crossroads of Information Visualization, Visual Analytics, Computer Graphics and Human-Computer Interaction, and it will contribute to research efforts on big data issues. This project will investigate the following challenges: data visualization, manipulation, multiple uses and algorithm understanding.

This project will extend FiberClay concepts:
http://recherche.enac.fr/~hurter/FiberClay.html


This internship is a research internship with a possible PhD thesis extension.

URL sujet détaillé : http://recherche.enac.fr/~hurter/FiberClay.html

Remarques : Paid internship (the statutory internship gratuity is mandatory)



SM207-121 Automated CTL proofs in proof assistants  
Admin  
Encadrant : Guillaume BUREL
Labo/Organisme : Laboratoire Spécification et Vérification (ENS Paris-Saclay)
URL : http://web4.ensiie.fr/~guillaume.burel/
Ville : Cachan

Description  

Context
-------
Formal methods are more and more commonly used to ensure the
correctness of industrial systems, in particular critical
ones. Several techniques can be used, among which model checking,
static analysis using abstract interpretation, mechanized theorem
proving (automated theorem provers and proof assistants), etc. For a
given system, different techniques can be used to prove the
correctness of different subsystems. The question that naturally arises
is whether the correctness of the subsystems implies the correctness
of the whole. To show this, one needs to relate the different proof
techniques together.

In recent years, Ji [1] showed how to perform model
checking proofs based on the modal logic CTL in an automated theorem
prover, namely iProverModulo, with performances comparable to
state-of-the-art model checkers. On the other hand, iProverModulo is
able to produce proofs in Dedukti
format. Dedukti (https://deducteam.github.io/) is a
universal proof checker, which is able to check proofs coming from a
variety of automated provers and proof assistants. Conversely,
Thiré [2] showed how to translate proofs from a
subset of Dedukti to various proof assistants, namely Coq, PVS, Lean,
Matita and provers of the HOL family. In these proof assistants,
various embeddings of CTL have been developed, but they lack
automation.

Subject
-------

The goal of this internship is to study how to automate proof search
of CTL formulas in proof assistants by using iProverModulo and
Dedukti. First, one has to make sure that the proofs produced by
iProverModulo correspond to the fragment translatable to the proof
assistants. Then, the embedding of CTL in iProverModulo, and its
translation via Dedukti, should be aligned with how CTL has been
defined in the various proof assistants. Finally, tactics should be
developed in the proof assistant to make the whole process easy to
use.


References
----------

[1] Kailiang Ji. CTL model checking in deduction modulo. In Amy P. Felty and Aart Middeldorp, editors, CADE-25, volume 9195 of LNCS, pages 295-310. Springer, 2015.

[2] François Thiré. Sharing a library between proof assistants: Reaching out to the HOL family. In Frédéric Blanqui and Giselle Reis, editors, LFMTP 2018, volume 274 of EPTCS, pages 57-71, 2018.

URL sujet détaillé : http://web4.ensiie.fr/~guillaume.burel/stages/Master2_ctlsttfa.pdf

Remarques : The internship will take place in Inria project-team Deducteam, and will be supervised by Guillaume Burel, assistant professor at the
Ensiie, temporarily assigned at Inria.




SM207-122 Verification of security protocols  
Admin  
Encadrant : Véronique CORTIER
Labo/Organisme : LORIA, team PESTO
URL : https://members.loria.fr/VCortier/
Ville : Nancy

Description  

*Context*. Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and, more recently, electronic elections.
Formal methods have demonstrated their usefulness when designing and analyzing security protocols. They indeed provide rigorous frameworks and techniques that have made it possible to discover new flaws. In particular, ProVerif [1] is a state-of-the-art tool dedicated to the security analysis of protocols. It has been successfully applied to numerous protocols of the literature, providing security proofs or discovering new attacks.
However, ProVerif falls short when it comes to analysing voting protocols. Indeed, voting protocols should achieve vote privacy (no one knows my vote) and verifiability (the result corresponds to the votes cast by voters). The latter requires counting the votes, which is beyond ProVerif's capabilities. Instead, current analyses devise sufficient conditions that imply verifiability and that are easier to prove. However, this means that part of the analysis has to be done by hand (on paper).

*Objectives of the internship*. The goal of the internship is to enhance ProVerif with counting operations. This would make it possible to analyse voting protocols, but also any protocol with counting operations. This work has a practical impact but first requires developing theoretical results in the area of automated deduction. Namely, ProVerif abstracts protocols by Horn clauses. The success of ProVerif lies in the development of a very efficient resolution strategy for Horn clauses.
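To give the flavour of the Horn-clause abstraction, here is a toy propositional saturation by forward chaining. ProVerif's actual clauses are first-order (with terms modelling attacker knowledge) and its resolution strategy is far more refined; this sketch only illustrates the idea of deriving all consequences of a clause set:

```python
# Toy sketch: saturate a set of propositional Horn clauses.
# rules: list of (premises, conclusion) pairs, e.g. ({"a", "b"}, "c")
# means "a and b imply c".
def saturate(facts, rules):
    """Return all facts derivable from the initial facts by the rules."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known
```

In the protocol setting, "facts" stand for messages the attacker can derive, and the question of interest is whether a secrecy-violating fact becomes derivable.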
Very recently, ProVerif has been enhanced with natural numbers together with a + operation [2]. The first goal of the internship is to develop a resolution procedure for clauses that contain a + operation on integers, especially when the + operator is applied between two variables. This is necessary to deal with verifiability in voting. The new resolution procedure will be integrated into ProVerif and tested on voting protocols of the literature.
On a medium-term, the goal is to devise decision procedures for operators with arithmetic properties like the exclusive or and the modular exponentiation. One direction is to split the properties into properties that can be handled directly by ProVerif and properties that require a dedicated resolution procedure.
One interesting aspect of the internship is that it mixes foundational work together with the study of practical protocols (in evoting).

This internship may lead to a PhD thesis on similar topics.

References
[1] Bruno Blanchet. Modeling and verifying security protocols with the applied pi calculus and ProVerif. Foundations and Trends in Privacy and Security, pages 1-135, 2016.
[2] Vincent Cheval, Véronique Cortier, and Mathieu Turuani. A little more conversation, a little less action, a lot more satisfaction: Global states in ProVerif. In Proceedings of the 31st IEEE Computer Security Foundations Symposium (CSF'18), pages 344-358, 2018.
URL sujet détaillé : https://members.loria.fr/VCortier/files/sujet-vote-ProVerif.pdf

Remarques : Interns may receive a stipend.

This work will be co-supervised by Vincent Cheval (PESTO team, Loria, Nancy).



SM207-123 Perception and interpretation of human activity  
Admin  
Encadrant : Francis COLAS
Labo/Organisme : Loria-Inria
URL : https://team.inria.fr/larsen/
Ville : Nancy

Description  

For a long time, industrial robots were secluded in safety cages and autonomous robots were mainly designed and programmed to work without humans or while avoiding them (as obstacles). There is a growing trend towards environment sharing and collaboration between humans and robots with the objective of having robots assist humans (for manufacturing but also in our daily lives). One of the main issues for human-robot interaction is, for the robot, to be able to adequately perceive and interpret what the human is doing.

This internship is in the context of the "Flying co-worker" ANR project, which aims at building a collaborative flying robot to help human workers. The objective of this subject is to provide information about the current and future status of the human worker. This can mean the precise motion of the arms, hands and head of the human, but also their posture and even the higher-level task they are performing.

There are several aspects to be tackled in the long term, each of which is open for the present internship:
- human perception by a flying robot,
- human activity recognition,
- gesture and intention estimation,
- human activity prediction.

A possible work plan would be to start by studying and using state-of-the-art techniques for activity recognition based on probabilistic approaches such as Hidden Markov Models [1]. It would then be possible to develop new activity models that maximize longer-term prediction rates, for instance based on more accurate modeling of each task's duration. A particular challenge is to learn the parameters of such a new model. An additional step could be to leverage short-term motion prediction [2], coupled with the new activity model, to further improve accuracy.

It is expected to balance theoretical contributions with experimental validation. To this end, various mobile robots (Tiago, Pepper, and several other ground robots) with different sensors are available as test platforms.


URL sujet détaillé : https://team.inria.fr/larsen/sujet2018-perception-human-activity/

Remarques : This internship takes place at Inria in Nancy in the Larsen team and is funded by the "Flying co-worker" ANR project. It is co-supervised by Serena Ivaldi, Vincent Thomas and François Charpillet.
PhD funding will be available for the continuation of this project.





SM207-124 Topological bounds on the chromatic number  
Admin  
Encadrant : Matej STEHLIK
Labo/Organisme : Laboratoire G-SCOP
URL : https://pagesperso.g-scop.grenoble-inp.fr/~stehlikm/
Ville : Grenoble

Description  

Scientific context

The chromatic number is arguably the most widely-studied graph invariant, and as a result there is a plethora of conjectures and open problems involving it (see [1]). Because of the difficulty of these conjectures, a fruitful approach has been to prove simpler versions, for instance by replacing the chromatic number by the fractional chromatic number.

An interesting family of lower bounds on the chromatic number (completely independent of the fractional chromatic number) comes from an application of the Borsuk--Ulam theorem: these are known as topological bounds on the chromatic number (see [2]).

Subject of the internship

The aim of this internship is to consider weaker versions of conjectures involving the chromatic number, by replacing the chromatic number by one of the topological bounds. This was already done [3] for the conjectures of Behzad-Vizing, Hedetniemi and Hadwiger, but many interesting conjectures remain: for instance, the Erdos-Lovasz-Faber conjecture, Reed's conjecture, and the double-critical graph conjecture.

The internship will be an opportunity to learn about the area of graph colouring, as well as topological methods. Given the large number of conjectures, the student will have ample scope for independent research.

The student is expected to have a basic knowledge of graph theory, as well as an interest in topology.

The internship may lead to a PhD thesis on similar topics.

References
[1] T. R. Jensen and B. Toft, Graph Coloring Problems, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, Inc., New York, 1995.
[2] J. Matousek. Using the Borsuk-Ulam theorem, Universitext, Springer-Verlag, Berlin, 2003.
[3] G. Simonyi and A. Zsban, On topological relaxations of chromatic conjectures, European J. Combin. 31 (2010) 2110-2119.
URL sujet détaillé : :

Remarques :



SM207-125 Active Search for Information with a Mobile Robot  
Admin  
Encadrant : Olivier BUFFET
Labo/Organisme : INRIA / LORIA
URL : https://team.inria.fr/larsen/
Ville : Nancy

Description  

POMDPs (Partially Observable Markov Decision Processes) allow describing control/planning problems where the state of the system is only "guessed" (in a Bayesian sense) through indirect observations.

Recent algorithms allow solving POMDPs with large or continuous state, action and observation spaces. The objective of this research work is to adapt these algorithms to rho-POMDPs, i.e., an extension of POMDPs that allows defining information-oriented performance criteria, e.g., localizing a robot (rather than getting it to a certain location) or a victim in a disaster (assuming that this is a prior step before any other intervention). The resulting solution(s) shall be evaluated on a simple mobile robot equipped with proximity sensors.

The work program consists in:
1- becoming familiar with prior work on rho-POMDPs and continuous POMDPs, probably implementing some related algorithms;
2- becoming familiar with the robot and related software;
3- proposing an algorithm for continuous rho-POMDPs (possibly with asymptotic convergence properties); and
4- evaluating it empirically on some scenarios involving the robot.
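To make the rho-POMDP idea concrete, here is a minimal sketch of a belief-dependent reward, using negative entropy as the information criterion (the function name is ours, and this is only one of many possible choices of rho):

```python
import math

# Sketch: in a rho-POMDP the reward is a function of the belief b itself,
# not of the hidden state. rho(b) = -H(b) rewards concentrated beliefs,
# i.e. the agent is paid for being well localised.
def neg_entropy_reward(belief):
    """rho(b) = sum_s b(s) log b(s); maximal (0) when all mass is on one state."""
    return sum(p * math.log(p) for p in belief if p > 0)
```

A belief concentrated on one state yields reward 0, the maximum; a uniform belief yields the minimum, so maximising expected cumulative rho drives the agent to gather information.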

URL sujet détaillé : https://team.inria.fr/larsen/sujet2018-rhopomdp-robot/

Remarques : Co-advised with Vincent THOMAS.

This work requires good programming and math skills (including basic probability theory), and willingness to work on both theoretical and practical questions.



SM207-126 Static Selective Instrumentation for Parallel Programs Verification  
Admin  
Encadrant : Emmanuelle SAILLARD
Labo/Organisme : Inria
URL : http://emmanuellesaillard.fr
Ville : Talence

Description  

High Performance Computing (HPC) plays an important role in many fields like health, materials science, security or environment. The current supercomputer hardware trends lead to more complex HPC applications (heterogeneity in hardware and combinations of parallel programming models) that pose programmability challenges. As indicated by a recent US DOE report, progress to Exascale stresses the requirement for convenient and scalable debugging methods to help developers.

The PARallel COntrol flow Anomaly CHecker (PARCOACH) aims at helping developers in their debugging phase. It combines static and dynamic analyses to detect misuse of collectives (i.e., collective operations mismatch in MPI) in parallel applications. At compile-time, an interprocedural static analysis studies the control and data flow of a program to find potential deadlocks. During this step, a program is tagged as statically correct (no possible deadlock) or not verifiable (a deadlock may occur at runtime). For statically not verifiable programs, all collectives and exit statements are instrumented with Check Collective (CC) functions, starting from the first collectives that may deadlock in the program. The first "safe" collectives are not instrumented.
At execution-time, the CC functions verify which collectives will be called at different steps of execution. In case of an actual deadlock situation, the program is stopped before the deadlock occurs.
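The check-collective idea can be sketched outside of MPI as follows (the real CC functions run inside the parallel program and use MPI communication to compare the processes' next collectives; everything below is a standalone illustration):

```python
# Toy sketch of a check-collective (CC): before executing a collective,
# the processes first agree on which collective each is about to call;
# a mismatch is reported before the real (deadlocking) call is issued.
def check_collective(next_collectives):
    """next_collectives: the collective name each process is about to call."""
    kinds = set(next_collectives)
    if len(kinds) > 1:
        raise RuntimeError(f"collective mismatch: {sorted(kinds)}")
    return kinds.pop()
```

If every process is about to call MPI_Barrier the check passes; if one process heads for MPI_Reduce instead, the error is raised before the actual deadlock occurs, which is exactly when PARCOACH stops the program.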

The goal of this internship is to improve the selective instrumentation of PARCOACH.
The student will work on developing an analysis that builds an automaton of the program to compute the language of remaining collectives at different points of the program.
The analysis will combine the static information in PARCOACH with the automaton to reduce the number of checks at runtime.

The internship will first focus on MPI applications and will then be extended to other parallel programming models.
The work will be implemented as an LLVM pass and fully integrated into the PARCOACH tool.
URL sujet détaillé : http://emmanuellesaillard.fr/resources/M2_Intru.pdf

Remarques : Knowledge of compilation and parallel programming is desirable but not required.



SM207-127 Combining SOG and Partial Order Reduction for Efficient LTL Model Checking  
Admin  
Encadrant : Kais KLAI
Labo/Organisme : LIPN
URL : https://lipn.univ-paris13.fr
Ville : Paris

Description  

The goal of the internship is to combine the symbolic observation graph (SOG) and the partial order reduction technique in the context of LTL model checking.
URL sujet détaillé : https://www-lipn.univ-paris13.fr/~klai/stages/SOG_PO

Remarques : This internship will be co-supervised by Sami Evangelista.



SM207-128 Development of a framework for the automated testing of Android applications.  
Admin  
Encadrant : Fabrizio PASTORE
Labo/Organisme : University of Luxembourg / SnT Centre for Security, Reliability and Trust
URL : :
Ville : Luxembourg (LUXEMBOURG)

Description  

This internship project fits into a larger research program on the automated testing of Android applications. More precisely, the internship concerns the extension of a framework for the automated testing of Android systems through their GUI.

The framework is capable of automatically exercising the features of an Android system, thus mimicking the behavior of a human operator performing manual testing. The framework is characterized by an extensible architecture that enables engineers and researchers to specify the exploration strategies to consider when performing testing and to define the automated oracles to use to automatically identify failures.

This internship concerns the extension of the framework to enable the automated detection of regression failures caused by changes in the Android OS. To this end, the internship will consist of two major activities: (Activity 1) development of exploration strategies for changes in the Android OS; (Activity 2) development of automated oracles for interaction protocols.

The first activity concerns the development of exploration strategies that facilitate the identification of regression failures that emerge after updates of the Android OS. The developed strategy may leverage different types of information, such as code coverage (e.g., to determine which Android APIs are used), system models (to determine dependencies not visible at the code level), input generation strategies (e.g., random, based on dictionaries, or based on user-collected data), and requirements and software specifications in natural language (e.g., to determine new features and behavioural changes).

The second activity concerns the identification of oracles that spot regression failures by identifying anomalies in the behavior of the software interfaces exercised during software testing. Anomalies at the interface level (e.g., return values never produced by previous versions of an API function) often lead to regression failures that are difficult to identify at testing time because they may not immediately affect the results visualized in the GUI. For example, failures depending on erroneous interactions with the Android backup service API can be observed from the GUI only when the Android system is restored from the backup, while anomalous behaviors of the API service (e.g., unexpected return values) might be observed at every invocation of the backup service. To develop oracles for interaction-protocols and effectively identify these kinds of anomalies the Research Program will investigate the use of different machine learning solutions to determine nominal software behaviors, spot anomalies and identify anomalies that indicate the presence of failures.
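The core idea of such interface-level oracles can be sketched in a few lines (illustrative Python; the names and the learning step are hypothetical, not the framework's actual API): learn the nominal return values of an API call from passing runs, then flag values never seen before.

```python
# Minimal sketch of an interface-level oracle: record the nominal return
# values of each API call during passing runs, then flag never-seen values
# as candidate regression anomalies. All names are illustrative.

class ReturnValueOracle:
    def __init__(self):
        self.nominal = {}  # api name -> set of observed return values

    def train(self, api, value):
        """Record a return value observed during a passing run."""
        self.nominal.setdefault(api, set()).add(value)

    def is_anomalous(self, api, value):
        """True if this value was never observed for this API."""
        return value not in self.nominal.get(api, set())

oracle = ReturnValueOracle()
for v in (0, 0, 1):  # return codes seen on the previous Android version
    oracle.train("backupService.dataChanged", v)

print(oracle.is_anomalous("backupService.dataChanged", 1))   # False
print(oracle.is_anomalous("backupService.dataChanged", -7))  # True: candidate regression
```

A real oracle would of course use richer models than a set of literals (the description mentions machine learning over interaction protocols), but the train/flag structure is the same.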

The student doing the internship will study the state-of-the-art regarding the techniques required to complete the internship:
(1) automated testing of GUI systems;
(2) model inference techniques;
(3) anomaly detection;
(4) natural language processing.


URL sujet détaillé : :

Remarques : The student doing the internship will receive an allowance of 800 EUR/month.

The student is entitled to receive accommodation from the university (costing 400 EUR/month).

The student will work in the group of Prof. Lionel Briand under the supervision of Dr. Fabrizio Pastore.

Students showing interest and appropriate research attitude will be offered a position for enrollment under the PhD program at the University of Luxembourg/SnT.



SM207-129 Generalised updating modes of Boolean networks  
Admin  
Encadrant : Sylvain SENÉ
Labo/Organisme : LIS (Laboratoire d'informatique et systèmes / Laboratory of Computer Science and Systems)
URL : https://pageperso.lis-lab.fr/sylvain.sene
Ville : Marseille

Description  

Abstract
Since the end of the sixties, Boolean networks, introduced by McCulloch and Pitts, have become central mathematical models of genetic regulation networks, for which many theoretical and applied results have been obtained. However, one question that remains open concerns the influence of the way Boolean entities update their states (1 or 0, expressed or not) over time. Two standpoints have been the most studied, on the basis of more or less reasonable biological arguments: (i) block-sequential updates, which separate the entities into disjoint blocks so that updates are parallel inside a block but sequential between blocks; (ii) asynchronous updates, which consider all the possible sequential updates. Biologically speaking, these two approaches do not succeed in reflecting the reality of biological clocks at the level of genetic regulation. In this project, we propose to study more complex updates in order to reach a better understanding of the richness and intricacies of biological regulation, and to highlight them in real biological networks.
Keywords: Discrete dynamical systems theory and combinatorics, Boolean networks, updating modes, biological regulation networks.

Objectives
The main objectives are to consider a definition of updating modes complying with biological heterogeneities (e.g., authorising repetitions and avoiding non-fair updates, i.e., updates enabling some entities to change state infinitely more often than others), to characterise them from an enumerative combinatorics point of view, and to discover which kinds of networks are robust/sensitive to them (creation/suppression of attractors, even fixed points). Depending on the background of the candidate, one objective could also consist in finding and developing new adapted tools to study their influence in real biological networks.

Proposed approach
The proposed approach is mainly theoretical and takes place at the frontier of discrete mathematics and theoretical computer science: it deals with dynamical systems and graph theory, enumerative combinatorics, and complexity and computability on Boolean networks, each of these domains being used to analyse deeply and subtly the behaviours of biological networks. More precisely, on the basis of general periodic updating modes that can be viewed as finite sequences of finite subsets of entities (and sub-families of these that include, for instance, block-sequential and asynchronous updating modes), the first work will consist in finding explicit formulae for the number of such updating modes, and relating them to already proven results. The second part will be dedicated to the study of specific features that allow creating/removing fixed points (we know that creation is possible; we do not know anything about removal). A possible third part would be to tackle the adaptation of useful tools (like update graphs) to this framework.
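The "finite sequence of finite subsets of entities" view of updating modes is easy to make concrete. Below is a minimal sketch (the three-entity network is an illustrative toy, not a real regulation network) of one period of such a mode; block-sequential modes are the special case where the subsets partition the entities, and a subset sequence may repeat an entity.

```python
# One period of a Boolean network under a periodic updating mode given as a
# finite sequence of subsets of entities. Inside a block, updates are
# simultaneous; blocks are applied sequentially. Toy 3-entity network.

functions = {                    # local update functions f_i(state)
    0: lambda s: s[1],           # x0 <- x1
    1: lambda s: s[0] and s[2],  # x1 <- x0 AND x2
    2: lambda s: not s[0],       # x2 <- NOT x0
}

def step(state, mode):
    """Apply the blocks of `mode` in order; within a block, every entity
    is updated from the same snapshot of the state."""
    s = list(state)
    for block in mode:
        snapshot = tuple(s)
        for i in block:
            s[i] = functions[i](snapshot)
    return tuple(s)

state = (True, False, False)
print(step(state, [{0, 1, 2}]))       # fully parallel mode
print(step(state, [{0}, {1}, {2}]))   # a fully sequential (block-sequential) mode
print(step(state, [{0, 2}, {0, 1}]))  # a mode repeating entity 0 within one period
```

Note that the three modes already reach different states from the same start, which is precisely the sensitivity-to-updating phenomenon the internship studies.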

Interdisciplinarity
Whilst essentially anchored in mathematics and computer science, this project is driven by strong biological motivations: first, studying the impact of updating modes is a way to reach a better understanding of the chromatin dynamics which gives rise to the genetic expression clock; second, previous studies showed that classical updating modes are not powerful enough to model biological systems exhibiting some kinds of synchronizing behaviours, like Zeitgebers (i.e. timers), which has been shown to be possible with the update modes mentioned above (and notably highlighted in plant growth and physiology). Moreover, analysing robustness/sensitivity of the asymptotic dynamics (creation/removal of fixed points for instance) of biological networks can lead to an adaptable and plastic framework to do cell reprogramming and model the emergence of new (even potentially pathological) phenotypes.
URL sujet détaillé : https://pageperso.lis-lab.fr/sylvain.sene/rech/M2_updates.pdf

Remarques : The candidate/student will integrate the CANA (calcul naturel / natural computation) team of the LIS (Laboratoire d'informatique et systèmes / Laboratory of Computer Science and Systems) and will be co-supervised by Laurent Tichit through a strong collaboration with the MABIOS (Mathématiques et algorithmique pour la biologie des systèmes / Mathematics and algorithms for systems biology) team of the I2M (Institut de mathématiques de Marseille / Institute of Mathematics of Marseille).

Internship duration: From 5 to 6 months.

Internship location: LIS, Luminy, Marseille, France.

Compensation: Legal internship compensation implemented at Aix-Marseille University.



SM207-130 Scaling up the method of  
Admin  
Encadrant : - Encadrants
Labo/Organisme : Inria Bordeaux Sud Ouest / LaBRI
URL : https://team.inria.fr/hiepacs/
Ville : Bordeaux (Talence)

Description  

The methodological and software components of high performance data analytics (HPDA) serve as building blocks for algorithmic developments towards simulation applications or the analysis of massive data. Thus, a crucial activity on the road to convergence between high performance computing (HPC) and HPDA is the extension of high-performance computing methods to the processing of data coming from biology.

The objective of the internship is to explore the opportunities offered by emerging "data sparse" numerical techniques in linear and multilinear algebra, and their capacity for efficient implementation as task-based algorithms (using the StarPU/NewMadeleine runtime systems developed at Inria Bordeaux), in order to parallelise the MDS algorithm.
URL sujet détaillé : https://team.inria.fr/hiepacs/files/2018/11/stage-2019-hpda.pdf

Remarques : - Supervisors: Emmanuel Agullo (Inria HiePACS), Olivier Coulaud (Inria HiePACS), Alain Franc (INRA / Inria Pleiade)
- Phone: EA 05 24 57 41 50 - OC 05 24 57 40 80 - AF 05 35 38 53 53
- Emails: emmanuel.agullo.fr; olivier.coulaud.fr; alain.franc.fr
- PhD continuation: funding already secured



SM207-131 Deploying AI Go bot on Mobile Phone  
Admin  
Encadrant : Hervé POIRIER
Labo/Organisme : NAVER LABS Europe
URL : http://www.europe.naverlabs.com/
Ville : Meylan

Description  

The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics while paving the way for these innovations into NAVER products and services.

The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting users' privacy. Devices such as mobile phones or SoC platforms are increasingly powerful, which allows data to be processed close to where it is generated.

The goal of this internship is to port an AI bot that plays Go (like Elf OpenGo) on a mobile phone.

In this internship, you will focus on some of the following tasks in collaboration with other team members:

- Select and port the relevant AI algorithm on a mobile device
- Port and evaluate performances from high-end to low-end devices
- Optimisation using GPU and/or NPU code

Requirements and competencies expected:

- Master's student in Computer Science.
- Development skills: Python, with some experience of mobile platforms (iOS or Android).
- Knowledge in computer vision and machine learning techniques is a plus
- Knowledge of the Go game is a plus
- Highly motivated.

During the internship you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable in Agile development methodologies (e.g., SCRUM, Continuous Integration).

URL sujet détaillé : :

Remarques :



SM207-132 Deploying Computer Vision Algorithms on Embedded Devices  
Admin  
Encadrant : Hervé POIRIER
Labo/Organisme : NAVER LABS Europe
URL : http://www.europe.naverlabs.com/
Ville : Meylan

Description  

The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics while paving the way for these innovations into NAVER products and services.

The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting users' privacy. Devices such as mobile phones or SoC platforms are increasingly powerful, which allows data to be processed close to where it is generated.

The goal of this internship is to enable new types of applications by bringing advanced computer vision algorithms to devices with CPU, memory or bandwidth constraints.

In this internship, you will focus on some of the following tasks in collaboration with the other team members:

- Select relevant computer vision algorithms such as object detection and classification
- Target embedded devices based on their power consumption, from 10-20W (Nvidia Tegra TX2, Drive PX, AGX Xavier) to a few watts (Raspberry Pi) or even less (GAP8)
- Port and evaluate performances
- Develop applications demonstrating the performance of some of the embedded platforms

Requirements and competencies expected:

- Master's student in Computer Science
- Development skills: C/C++/C#, Java, Python, with some experience of mobile platforms.
- Knowledge in computer vision and machine learning techniques is a plus
- Highly motivated

During the internship, you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable in Agile development methodologies (e.g., SCRUM, Continuous Integration).

URL sujet détaillé : :

Remarques :



SM207-133 Indoor 3D reconstruction with a HoloLens headset  
Admin  
Encadrant : Hervé POIRIER
Labo/Organisme : NAVER LABS Europe
URL : http://www.europe.naverlabs.com/
Ville : Meylan

Description  

The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics while paving the way for these innovations into NAVER products and services.

The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting users' privacy. Devices such as mobile phones or SoC platforms are increasingly powerful, which allows data to be processed close to where it is generated.

The goal of this internship is to design and develop an augmented reality application running on an AR headset (i.e., HoloLens) to perform real-time 3D reconstruction of the environment with active feedback.

The internship will be divided into several phases in collaboration with the other team members:

- Evaluation of the state of the art in augmented reality and 3D reconstruction algorithms
- Design of a prototype, design of experiments and data collection
- Model training and performance evaluation
- Optimisations to achieve a real-time, pleasant user experience

Requirements and competencies expected:

- Master's student in Computer Science.
- Development skills: C/C++/C#, Java, Python, with some experience of mobile platforms.
- Knowledge in computer vision and machine learning techniques is a plus
- Highly motivated.

During the internship you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable in Agile development methodologies (e.g., SCRUM, Continuous Integration).

URL sujet détaillé : :

Remarques :



SM207-134 Resilience of numerical linear algebra algorithms for large-scale simulation  
Admin  
Encadrant : - Encadrants
Labo/Organisme : Inria Bordeaux Sud Ouest / LaBRI
URL : https://team.inria.fr/hiepacs/
Ville : Bordeaux (Talence)

Description  

To contain the energy budget of "extreme scale" parallel machines, certain technological choices are unavoidable, such as drastically increasing the number of compute cores and reducing their operating voltage. A direct consequence of these choices is that the fault rate will grow, including hardware faults leading to the loss of cores or nodes (hard faults). Checkpoint/restart approaches will be too costly to scale, and alternative solutions must be considered. These require rethinking the numerical algorithms and their parallel implementation.

In the context of numerical methods in linear algebra, the objective of this internship is to revisit asynchronous fixed-point methods, which allow weakly synchronising techniques to be used when recovering from a fault.
URL sujet détaillé : https://team.inria.fr/hiepacs/files/2018/11/2019_resilienceAsynchrone.pdf

Remarques : - Supervisors: Emmanuel Agullo, Luc Giraud
- Phone: E. Agullo 05 24 57 41 50 - L. Giraud 05 61 19 30 25
- Emails: emmanuel.agullo.fr; luc.giraud.fr




SM207-135 Use of Machine Learning Techniques to Help Find Proofs in the Coq Proof Assistant  
Admin  
Encadrant : David DELAHAYE
Labo/Organisme : LIRMM UMR 5506
URL : http://www.lirmm.fr/~delahaye/
Ville : Montpellier

Description  

In this internship, we propose to build, within the framework of the Coq system, a proof-assistance method which, given a goal, could recommend a tactic to apply. The idea is to extract this recommendation by applying machine learning algorithms to a large corpus of proofs (typically the standard library of Coq, for example). To do so, we could consider recent work done in the framework of Isabelle/HOL, or first experiments that were carried out within the framework of Coq.
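The recommendation idea can be illustrated with a deliberately naive sketch (illustrative Python; the feature extraction, the similarity measure and the tiny corpus are all hypothetical, far simpler than what the internship would build on top of Coq):

```python
# Toy tactic recommendation by nearest neighbour over goal features,
# given a corpus of (goal, tactic-that-closed-it) pairs. Illustrative only.

from collections import Counter

def features(goal):
    """Bag-of-tokens representation of a goal statement."""
    return Counter(goal.replace("(", " ").replace(")", " ").split())

def similarity(a, b):
    """Number of shared tokens between two feature bags."""
    return sum((a & b).values())

def recommend(goal, corpus):
    """Return the tactic of the most similar goal in the corpus."""
    feats = features(goal)
    return max(corpus, key=lambda gt: similarity(feats, features(gt[0])))[1]

corpus = [
    ("forall n : nat, n + 0 = n", "induction n"),
    ("forall A B : Prop, A /\\ B -> A", "intros; destruct H"),
    ("1 + 1 = 2", "reflexivity"),
]
print(recommend("forall n : nat, 0 + n = n", corpus))  # -> induction n
```

Real approaches replace the bag-of-tokens features with learned goal representations and the nearest-neighbour lookup with a trained classifier, but the pipeline (featurise goal, rank candidate tactics from a proof corpus) is the same.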
URL sujet détaillé : http://www.lirmm.fr/~delahaye/docs/ml-coq.pdf

Remarques : The internship will take place in the MaREL team at LIRMM. The internship is paid. It will be supervised by David Delahaye (Université de Montpellier, LIRMM, David.Delahaye.fr).



SM207-136 Deductive Approaches for Process Verification  
Admin  
Encadrant : David DELAHAYE
Labo/Organisme : LIRMM UMR 5506
URL : http://www.lirmm.fr/~delahaye/
Ville : Montpellier

Description  

Process algebras, such as Milner's CCS, can model and verify communicating parallel systems. Here, we consider algebras such as CCS or LOTOS; we exclude mobile algebras like the π-calculus. Milner defined a set of equivalence relations (strong equivalence ~, observational equivalence ≈, observational congruence =). These relations are defined on the operational semantics of the given processes, given as labelled transition systems (LTS), and in practice they are also verified on LTS. The objective of this internship is to build an automated deduction tool to verify these relations, without translating the processes into transition systems.
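For contrast, here is the LTS-based approach the internship precisely wants to avoid, on a classic example (illustrative Python, naive partition refinement; a real tool would reason on the process terms themselves): a.b.0 + a.c.0 is not strongly bisimilar to a.(b.0 + c.0).

```python
# Naive check of strong bisimilarity (~) on a small labelled transition
# system via partition refinement. Illustrative sketch only.

def strong_bisimulation(states, transitions):
    """transitions: set of (source, label, target) triples. Returns a map
    state -> block id of the coarsest strong-bisimulation partition."""
    def signature(s, blocks):
        return frozenset((a, blocks[t]) for (p, a, t) in transitions if p == s)

    blocks = {s: 0 for s in states}  # start with a single block, then split
    while True:
        keys = {s: (blocks[s], signature(s, blocks)) for s in states}
        ids = {k: i for i, k in enumerate(sorted(set(keys.values()), key=repr))}
        refined = {s: ids[keys[s]] for s in states}
        if len(ids) == len(set(blocks.values())):  # no block was split: stable
            return refined
        blocks = refined

# P = a.b.0 + a.c.0   vs   Q = a.(b.0 + c.0)
states = {"P", "P1", "P2", "Q", "Q1", "0"}
transitions = {("P", "a", "P1"), ("P", "a", "P2"), ("P1", "b", "0"),
               ("P2", "c", "0"), ("Q", "a", "Q1"),
               ("Q1", "b", "0"), ("Q1", "c", "0")}
blocks = strong_bisimulation(states, transitions)
print(blocks["P"] == blocks["Q"])  # -> False: P and Q are not strongly bisimilar
```

The split key always includes the current block, so refinement only ever splits blocks; the partition is stable exactly when the block count stops growing.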
URL sujet détaillé : http://www.lirmm.fr/~delahaye/docs/proc-coq.pdf

Remarques : The internship will take place in the MaREL team at LIRMM (in Montpellier), in collaboration with the LGI2P at IMT Mines Alès. It will be supervised by David Delahaye (Université de Montpellier, LIRMM, David.Delahaye.fr) and Thomas Lambolais (IMT Mines Alès, LGI2P, Thomas.Lambolais-ales.fr).



SM207-137 Familial monads and structural operational semantics  
Admin  
Encadrant : Tom HIRSCHOWITZ
Labo/Organisme : LAMA - Univ. Savoie Mont Blanc
URL : https://www.lama.univ-savoie.fr/pagesmembres/hirschowitz
Ville : Chambéry

Description  

Structural operational semantics is a method for specifying the
dynamics of programming languages by induction on their syntax. A
crucial issue is to find sufficient conditions for the produced
language to behave well. E.g., is program equivalence a congruence?
Such questions have led to quite a few 'formats', whose diversity has
been mostly unified by 'functorial operational semantics', an
abstract, categorical framework. However, functorial operational
semantics does not scale well to languages with variable binding.



In recent work (to appear at POPL '19), we define a new categorical
framework based on 'familial monads', which cover languages with
variable binding, and in which we smoothly prove two important,
standard properties of the obtained semantics: 'congruence of
bisimilarity' and soundness of 'bisimulation up to context'. The goal
of the internship/thesis is to contribute to the development of this
framework, either by looking into formats, or by extending it beyond
congruence of bisimilarity and soundness of bisimulation up to context,
to related techniques like Howe's method, weak bisimulation,
environmental bisimulation, or finding solutions to process equations.
In the longer run, we could investigate new formats based on familial
monad theory.

URL sujet détaillé : https://www.lama.univ-savoie.fr/pagesmembres/hirschowitz/stages/familial-sos-en.pdf

Remarques : Compensation possible.



SM207-138 Honest Product Reviews  
Admin  
Encadrant : Matthias GALLÉ
Labo/Organisme :
URL : http://www.europe.naverlabs.com/Research/Natural-Language-Processing
Ville : Meylan

Description  

An increasing amount (and value) of information about products and places is stored in collections of user reviews. This information is messy, often contradictory or redundant, but taken together it hides troves of insights that users, who lack the time to read all the reviews, are interested in.

In this internship you will explore recent summarization techniques and apply them to the problem of creating a unified review which reflects the wisdom of the crowd.
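As a point of reference for what "recent techniques" improve upon, here is a minimal extractive baseline (illustrative Python, toy reviews): score each sentence by the average corpus frequency of its words, and keep the top ones.

```python
# Minimal extractive summarization baseline over a set of reviews:
# frequent words mark the themes the crowd agrees on, so sentences dense
# in frequent words are kept. Toy data; illustrative only.

import re
from collections import Counter

def summarize(reviews, n_sentences=2):
    sentences = [s.strip() for r in reviews
                 for s in re.split(r"[.!?]", r) if s.strip()]
    freq = Counter(w for s in sentences for w in s.lower().split())
    def score(s):
        words = s.lower().split()
        return sum(freq[w] for w in words) / len(words)
    return sorted(sentences, key=score, reverse=True)[:n_sentences]

reviews = [
    "Battery life is great. The screen is dim.",
    "Great battery life! Camera is mediocre.",
    "The battery life is great but the phone is heavy.",
]
print(summarize(reviews))  # -> ['Battery life is great', 'Great battery life']
```

A frequency baseline like this surfaces redundancy but cannot reconcile contradictions or produce new sentences, which is where the neural abstractive techniques targeted by the internship come in.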

At Naver Labs Europe you will interact with researchers working on the latest developments in machine learning and natural language processing, and apply them to our data using our GPU cluster. We regularly publish in top conferences, release datasets, and otherwise interact with our academic collaborators.


Requirements
● PhD or Master (research-oriented) level
● Knowledge of deep learning as applied to NLP
● Good coding skills, including at least one of the major deep learning toolkits (preferably PyTorch)

Duration
5-6 months
Application instructions
To apply, please send a mail with CV and cover letter to matthias.galle.com





URL sujet détaillé : http://www.europe.naverlabs.com/NAVER-LABS-Europe/Internships/Honest-Product-Reviews

Remarques :



SM207-139 Scheduling for Sell-Order Management in a Blockchain Technology Context  
Admin  
Encadrant : Hadrien CROUBOIS
Labo/Organisme : iexec
URL : https://iex.ec/
Ville : Lyon

Description  

Study the dynamics of compute tasks in a distributed computing infrastructure in order to produce an algorithm governing the issuing of sell orders by the scheduler.


Blockchain is an innovative technology that promises to revolutionise many domains such as finance, logistics, the management of connected objects, insurance, video games, etc. Beyond cryptocurrencies such as Bitcoin, the development of smart contracts, introduced by Ethereum, makes it possible to imagine blockchain applications in many public or private contexts. The distributed and Byzantine-fault-tolerant nature of such a structure makes it a perfect tool for organising decentralised infrastructures. iExec uses this technology to organise a decentralised cloud in which anyone can make computing resources available and is paid for doing so. At the heart of this infrastructure, the workers are organised into workerpools managed by ad-hoc entities called schedulers.

Beyond task scheduling problems per se, a scheduler is also responsible for aggregating the workers' resources in order to publish them on the marketplace. This mechanism requires an evaluation of the computing capacity of distributed, unreliable workers, as well as an understanding of the replication dynamics put in place by iExec to guarantee the reliability of the results.

The objective of this internship is to study the dynamics of compute tasks in a distributed computing infrastructure in order to produce an algorithm governing the issuing of sell orders by the scheduler. This algorithm will have to take into account the unreliability of the workers, as well as all the certification mechanisms that would penalise the scheduler if it failed to produce a consensus.

iExec is a Lyon-based company that develops distributed cloud solutions linked to the blockchain. This cloud targets both traditional users of cloud resources and smart contracts requiring computing power that the blockchain does not have.

Avalon (http://avalon.ens-lyon.fr) is one of the Inria teams hosted at the LIP (ENS Lyon). Avalon's long-term objective is to contribute to the design of programming models supporting a large number of architectures, to implement them while mastering the underlying algorithmic concepts, and to study their impact on algorithms at the application level. To reach these objectives, the team intends to contribute at different levels, such as distributed algorithms, programming models, service discovery and deployment, service organisation and composition, the management of large volumes of data, cloud computing, etc. Our theoretical results are validated on software prototypes aimed at different scientific fields such as bioinformatics, physics, astrophysics, etc. The iExec start-up was born within the Avalon team.

At the end of this internship, it will be possible to continue this work as a PhD.
URL sujet détaillé : :

Remarques : This internship is co-supervised by iExec and E. Caron of ENS de Lyon.



SM207-140 Scheduling and mapping strategies for reducing energy consumption of edge applications and services  
Admin  
Encadrant : Laurent LEFEVRE
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/laurent.lefevre/
Ville : Lyon

Description  

Deploying edge processing capabilities close to users is an original approach to reducing the latency of digital services. This internship will address the scenario of applications deployed across cloud, edge and fog infrastructures. Such services and applications can generate a large environmental footprint, with enormous electrical consumption and non-optimised usage of resources. Several models, heuristics and approaches will be explored during this internship in order to reduce the electrical footprint of such large applications. The impact and flexibility offered by new-generation networks (SDN, NFV) will be included in this exploration.

This internship will take place in the Avalon team of the LIP laboratory (ENS Lyon), where several members work on GreenIT and on improving the energy efficiency of large digital systems (datacenters, clouds, networks). Interested students must contact Laurent Lefevre (laurent.lefevre-lyon.fr) for more details on the subject.
URL sujet détaillé : :

Remarques : This internship will be co-advised by Laurent Lefevre (laurent.lefevre-lyon.fr) and Olivier Gluck (Olivier.Gluck-lyon1.fr). It can lead to a PhD continuation.



SM207-141 Training data for a certifiable Artificial Neural Network based algorithm  
Admin  
Encadrant : Christophe GABREAU
Labo/Organisme : AIRBUS
URL : https://www.airbus.com/
Ville : Toulouse

Description  

As part of the certification process, AIRBUS has to demonstrate that a safety-critical software item complies with its intended use in its operational context and that any residual problem does not jeopardize the safety requirements. Machine learning, and specifically Artificial Neural Network (ANN) technology, will be widely used in software design in the near future.
The main objective of this internship will be to investigate the properties of the training data set, of the ANN-based algorithm, and of the related training process that could lead to a measurable and acceptable result.
URL sujet détaillé : https://company.airbus.com/careers/jobs-and-applications/search-for-vacancies~jobid=001A4B0A914A1EE8B79F42E431B2C500~.html

Remarques : Dates: first half of 2019. Paid internship. Possible continuation as a PhD.



SM207-142 Stable sets and coloring via convex relaxations  
Admin  
Encadrant : Alantha NEWMAN
Labo/Organisme : G-SCOP
URL : https://pagesperso.g-scop.grenoble-inp.fr/~newmana/
Ville : Grenoble

Description  

This internship will focus on algorithms for finding stable sets and coloring graphs with few colors using approaches from convex programming. In general, these problems are hard to approximate to within any non-trivial factor, and therefore these approaches will not be useful. However, in many instances, it has been demonstrated that assuming some structure on a graph leads to algorithms with guarantees that are out of reach for general graphs. For example, it has been conjectured that a graph without a fixed forbidden induced subgraph and with no large clique has a large stable set. Another example is the problem of coloring 3-colorable graphs. While the best known algorithms yield proper colorings with n^{epsilon} colors for some fixed constant epsilon < 1, forbidding a P_k for k at most 7 allows this problem to be solved in polynomial time. For the class of perfect graphs, one can use convex relaxations to compute the maximum stable set and chromatic number in polynomial time. The goal of this project is to study how to apply tools from convex optimization to these problems. For example, a natural approach to finding a stable set in a graph is through a semidefinite relaxation (i.e., vector program). The program can be strengthened by adding constraints; for example, constraints can be derived using hierarchies. While, in general, the bounds obtained are unlikely to be very informative, we will focus on graph classes whose structure can be used to obtain better bounds and simpler algorithms.
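For intuition on the gap the relaxations are meant to close, here is an exact brute-force maximum stable set on a tiny graph (illustrative Python; by the Lovász sandwich theorem, the theta function of the semidefinite relaxation lies between the stability number and the chromatic number of the complement, e.g. 2 ≤ θ(C5) = √5 ≤ 3 on the 5-cycle):

```python
# Exact maximum stable set by exhaustive search. Fine for tiny graphs; the
# convex relaxations discussed in the internship aim at something tractable
# on structured graph classes.

from itertools import combinations

def max_stable_set(n, edges):
    """Return a maximum stable set of the graph on vertices 0..n-1."""
    adj = set(map(frozenset, edges))
    for size in range(n, 0, -1):  # try large sets first
        for subset in combinations(range(n), size):
            # stable <=> no pair of chosen vertices is an edge
            if all(frozenset(p) not in adj for p in combinations(subset, 2)):
                return set(subset)
    return set()

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(max_stable_set(5, c5))  # a maximum stable set of the 5-cycle, size 2
```

The exhaustive search is exponential in n; the point of the semidefinite approach is that for graph classes such as perfect graphs, the relaxation is tight and computable in polynomial time.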
URL sujet détaillé : :

Remarques : compensation possible



SM207-143 GPU strategies for clustering large graphs  
Admin  
Encadrant : Laurent RISSER
Labo/Organisme : Institut de Mathématiques de Toulouse
URL : http://laurent.risser.free.fr/
Ville : Toulouse

Description  

The goal of this project is to explore the use of Graphic Processing Units (GPUs) for the clustering of large graphs (eg: the road network of very large urban areas or small countries).

GPU computing is indeed strongly related to the success of various machine learning methods, such as deep learning or XGBoost. There is now a growing interest in using it for other machine learning strategies (see, e.g., NVIDIA RAPIDS). Here, it will be applied to graph clustering, where the information is not regularly organized in memory.

The trainee will first have to gain a deep understanding of how GPU computing works, as well as of a classic graph clustering algorithm. The heart of his or her work will then be to explore different ways of modeling the graph clustering problem so that it can be solved efficiently on GPUs.
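As a minimal illustration of a propagation-style scheme with GPU-friendly access patterns, here is a pure-Python sketch in which every sweep applies the same independent update to every node, exactly the pattern that maps well to GPU threads. The algorithm shown is the degenerate case that merely finds connected components; real clustering algorithms (label propagation, Louvain, ...) refine this idea.

```python
def propagate_min_labels(adj, max_sweeps=100):
    """Each node repeatedly adopts the smallest label in its closed
    neighbourhood.  Every sweep is a uniform per-node update that can
    run in parallel; at convergence, labels identify connected
    components."""
    labels = {v: v for v in adj}
    for _ in range(max_sweeps):
        new = {v: min([labels[v]] + [labels[u] for u in adj[v]])
               for v in adj}
        if new == labels:
            break
        labels = new
    return labels

# Two disjoint triangles: two clusters expected.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
     3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
labs = propagate_min_labels(g)
print(sorted(set(labs.values())))  # [0, 3]
```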
URL sujet détaillé : http://laurent.risser.free.fr/TMP_SHARE/StageGraphsAnalysis.pdf

Remarques :



SM207-144 Research internship in machine learning theory  
Admin  
Encadrant : Sébastien GERCHINOVITZ
Labo/Organisme : Institut de Mathématiques de Toulouse
URL : https://www.math.univ-toulouse.fr/~sgerchin/
Ville : Toulouse

Description  

I have one Master 2 research internship opportunity in machine learning theory, on one of the following two problems. The first is about approximation theory for deep feedforward neural networks; the goal is to read three very recent papers and address related open problems. The second possible topic is an open problem in sequential learning theory, namely optimal bounds for online linear regression with adversarial data. For more details, please see the attached pdf document.
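For the second topic, a minimal sketch of the online protocol may help: at each round the learner predicts with a linear function, observes the outcome, suffers the squared loss, and updates. The gradient-descent learner and step size below are illustrative choices, not the forecasters whose optimal bounds the internship studies.

```python
def online_gradient_descent(data, dim, eta=0.1):
    """Online linear regression with online gradient descent: predict
    w.x, observe y, suffer squared loss, update w.  Returns the final
    weights and the learner's cumulative loss."""
    w = [0.0] * dim
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
        g = 2 * (pred - y)          # gradient of the squared loss
        w = [wi - eta * g * xi for wi, xi in zip(w, x)]
    return w, total

# Data generated by w* = (1, -2) with no noise, so the regret against
# w* is just the learner's own cumulative loss here.
data = [((1.0, 0.0), 1.0), ((0.0, 1.0), -2.0)] * 50
w, loss = online_gradient_descent(data, 2)
print(round(w[0], 2), round(w[1], 2))  # approximately 1.0 -2.0
```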
URL sujet détaillé : https://www.math.univ-toulouse.fr/~sgerchin/docs/internship-learning-Gerchinovitz.pdf

Remarques : If the internship is successful, we will apply for a PhD grant on the same research topic.



SM207-145 On the success of Expectation-Maximization  
Admin  
Encadrant : Vincent Cohen-Addad
Labo/Organisme : LIP6
URL : https://www.di.ens.fr/~vcohen/
Ville : Paris

Description  

Consider a set of points X in R^d generated by a mixture of k Gaussians. How can one recover the parameters of the distribution (the means and the variances)? This is a fundamental problem that arises in many applications stemming from machine learning and data analysis. A very popular approach for solving it is the expectation-maximization (EM) heuristic (https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm). This heuristic works very well in practice, both in terms of the quality of the solution output and in terms of running time. However, from a theoretical standpoint, very little is known about it. How fast is the convergence? How good is the solution? Even for simple cases such as k=3 or d=2, not much is known.

The goal of the internship is to make a step toward understanding the practical success of the EM heuristic by proving that the algorithm outputs the "correct" solution in "reasonable" time, starting with simple scenarios.
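A minimal sketch of the EM heuristic in one of the simple settings mentioned above (k = 2, d = 1), assuming unit variances, equal weights and well-separated components, so that only the means are estimated:

```python
import math
import random

def em_gmm_1d(xs, iters=100):
    """EM for a two-component 1D Gaussian mixture with unit variances
    and equal weights: alternate E-steps (responsibilities) and
    M-steps (weighted means)."""
    mu = [min(xs), max(xs)]            # crude initialisation
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point
        r0 = []
        for x in xs:
            p0 = math.exp(-0.5 * (x - mu[0]) ** 2)
            p1 = math.exp(-0.5 * (x - mu[1]) ** 2)
            r0.append(p0 / (p0 + p1))
        # M-step: responsibility-weighted means
        s0 = sum(r0)
        mu[0] = sum(r * x for r, x in zip(r0, xs)) / s0
        mu[1] = sum((1 - r) * x for r, x in zip(r0, xs)) / (len(xs) - s0)
    return sorted(mu)

random.seed(0)
xs = ([random.gauss(-3, 1) for _ in range(300)]
      + [random.gauss(3, 1) for _ in range(300)])
mu = em_gmm_1d(xs)
print(mu)  # approximately [-3, 3]
```

With this much separation EM recovers the means quickly; the open questions concern what can be proved about its speed and accuracy in less favourable regimes.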

The internship will be in Sorbonne Université, in Jussieu, Paris, advised
by Vincent Cohen-Addad. Collaborations and research visits to work with
Varun Kanade in Oxford and with Aurélien Garivier in ENS Lyon will be possible.


URL sujet détaillé :

Remarques :



SM207-146 Deep learning for semantic information access  
Admin  
Encadrant : Lynda TAMINE-LECHANI
Labo/Organisme : IRIT
URL : https://www.irit.fr/~Lynda.Tamine-Lechani/
Ville : Toulouse

Description  

One internship followed by one PhD position at the University of Toulouse, France.

The Renault Labs France and the Informatics Research Institute of Toulouse (IRIT) invite applications for a fully
funded PhD position on "Deep learning for texts and knowledge bases access." The PhD will be co-supervised
by François-Paul Servant (Renault), Prof. Lynda Tamine and Dr. Jose Moreno.


Thesis description
=================
The thesis targets two main objectives:
1) the semantic representation of documents that mention entities from different external resources;
2) the categorization of documents by family of mentioned entities, and the retrieval of documents meeting entity-related needs.
To achieve these objectives, we plan to move towards an approach based on deep learning to solve both the representation and the access problems (categorization, information retrieval), constrained by the content and structure of multiple external resources (terminologies, thesauri, knowledge graphs, etc.).

From the point of view of representation, we are in line with recent work based on the joint regularization of neural embeddings augmented with resources [Faruqui2014; Yu2014; Wang2014; Yamada2016]. This work is based on the hypothesis that learned representations are interpretable if they are aligned with entities derived from resources, so that the representations of entities obtained in the latent space are all the closer as the entities are semantically related in the external resources. These representations, extended to sentences and texts, can be exploited in an information retrieval task [Nguyen2018], in the identification of entity mentions [Moreno2017], or in the categorization of short textual documents [Kim2014]. Although distributional representations exist for words/texts, structured resources and their combinations, no existing work addresses the constrained regularization of multiple resources, nor the multi-level structuring of entities in these resources. One of the first works in this direction uses Poincaré geometry to represent the hierarchies in resources [Nickel2017], but completely ignores the representation of relationships between entities. However, relationships are omnipresent in today's widely used knowledge bases, including those considered at Renault.
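As an illustration of the retrofitting idea of [Faruqui2014] cited above, here is a toy sketch of its iterative update with uniform edge weights; the vocabulary and vectors are invented for the example.

```python
def retrofit(vectors, lexicon, alpha=1.0, iters=10):
    """Retrofitting in the spirit of [Faruqui2014]: pull each embedding
    towards its neighbours in a semantic lexicon while staying close to
    its original (distributional) position.
    vectors: word -> list of floats; lexicon: word -> set of neighbours."""
    new = {w: list(v) for w, v in vectors.items()}
    for _ in range(iters):
        for w, nbrs in lexicon.items():
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            for d in range(len(new[w])):
                num = alpha * vectors[w][d] + sum(new[n][d] for n in nbrs)
                new[w][d] = num / (alpha + len(nbrs))
    return new

# Toy example: "car" and "auto" are lexicon synonyms but far apart in
# the original space; retrofitting pulls them together.
vecs = {"car": [1.0, 0.0], "auto": [0.0, 1.0], "fruit": [5.0, 5.0]}
lex = {"car": {"auto"}, "auto": {"car"}}
r = retrofit(vecs, lex)
d_before = sum((a - b) ** 2 for a, b in zip(vecs["car"], vecs["auto"]))
d_after = sum((a - b) ** 2 for a, b in zip(r["car"], r["auto"]))
print(d_after < d_before)  # True
```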

The thesis project faces new scientific challenges related to the definition of adequate neural architectures
and associated cost functions, capable of learning compositionality (semantic compositionality) both in
the local context (text) and global contexts (resources).

The envisioned starting date is September 2018 (starting in early 2019 is also possible).


Requirements
===========
We are looking for one candidate with a strong focus on information retrieval/NLP and machine learning,
with the following profile:

+ Good Master's degree in Computer Science, Statistics, Mathematics or related disciplines (essential)
+ Good programming skills in Python/Keras|TensorFlow|Torch (essential)
+ Advanced knowledge in algorithms and data structures (optional)
+ Ability to work independently and be self-motivated (essential)
+ Excellent communication skills in English (essential - minimum score of 750 in TOEIC)


Applying
=======
To apply, please email your application to: françois-paul.servant.com, lynda.lechani.fr, and jose.moreno.fr.

The application should consist of the following:
+ a curriculum vitae
+ transcript of marks according to M1-M2 profile or last 3 years of engineering school (with indication on the ranking if possible)
+ covering letter
+ letter(s) of recommendation including at least one letter drawn up by a university referent

Potential candidates will be invited for an interview with the supervisors.
The application file should be sent


Conditions of employment
=======================
Internship: stipend of 540 euros
Thesis: you will be hired on a fixed-term contract (3-year CIFRE contract) at Renault, a world leader in car manufacturing. Salary commensurate with academic profile.


Working at Toulouse (IRIT/Renault)
=================================
You will be integrated into two teams with academic and industrial profiles: the IRIS team of IRIT, recognized for its research activities in the field of information retrieval and information synthesis with a focus on deep learning technologies, and the Renault team.

The IRIT lab is one of the major forces of French research in computer science, with a workforce of more than 700 members, including 272 researchers and teachers, 204 PhD students, 50 post-docs and researchers under contract, and 32 engineers and administrative employees.

Toulouse is located on the banks of the Garonne River, 150 kilometres from the Mediterranean Sea, 230 km from the Atlantic Ocean
and 680 km from Paris. It is the fourth-largest metro area in France, with 1,312,304 inhabitants as of January 2014.
Toulouse is the centre of the European aerospace industry, with the headquarters of Airbus, the Galileo positioning system, the SPOT
satellite system, ATR and the Aerospace Valley. It also hosts the European headquarters of Intel and CNES's Toulouse
Space Centre (CST), the largest space centre in Europe. Thales Alenia Space, and Astrium Satellites also have a significant presence in Toulouse.
The University of Toulouse is one of the oldest in Europe (founded in 1229) and, with more than 103,000 students, it is the
fourth-largest university campus in France, after the universities of Paris, Lyon and Lille.
The city was the capital of the Visigothic Kingdom in the 5th century and the capital of the province of Languedoc in the
Late Middle Ages and early modern period, making it the unofficial capital of the cultural region of Occitania (Southern France).

References
=========
[Faruqui2014] Faruqui M., Dodge J., Jauhar S. K., Dyer C., Hovy E., Smith N. A., Retrofitting Word Vectors to Semantic Lexicons, NAACL, 2014.
[Moreno2017] Moreno J. G., Besançon R., Beaumont R., D'hondt E., Ligozat A.-L., Rosset S., Grau B., Combining Word and Entity Embeddings for Entity Linking, Extended Semantic Web Conference (ESWC), pp. 337-352, 2017.
[Nickel2017] Nickel M., Kiela D., Poincaré Embeddings for Learning Hierarchical Representations, Advances in Neural Information Processing Systems, pp. 6341-6350, 2017.
[Nguyen2018] Nguyen G., Tamine L., Soulier L., Souf N., A Tri-Partite Neural Document Language Model for Semantic Information Retrieval, Extended Semantic Web Conference (ESWC), 2018.
[Yu2014] Yu M., Dredze M., Improving Lexical Embeddings with Semantic Knowledge, ACL, pp. 545-550, 2014.
[Wang2014] Wang Z., Zhang J., Feng J., Chen Z., Knowledge Graph and Text Jointly Embedding, EMNLP, pp. 1591-1601, 2014.
[Yamada2016] Yamada I., Shindo H., Takeda H., Takefuji Y., Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation, CoNLL, pp. 250-259, 2016.
URL sujet détaillé :

Remarques :



SM207-147 Deep learning methods for the sensor-based activity recognition tasks  
Admin  
Encadrant : Veronique DESLANDRES
Labo/Organisme : LIRIS-SMA
URL : https://perso.liris.cnrs.fr/vdesland/doku.php?id=start
Ville : Villeurbanne

Description  

Evaluation of machine learning methods, based on a dataset from a smart home (co-working space) describing activities observed over 20 working days.
The dataset is provided by Orange and INRIA (paper "A Dataset of Routine Daily Activities in an Instrumented Home", UCAmI 2017).
The idea is to evaluate different deep learning algorithms, for different purposes:
- Recognition of atomic activities;
- Learning scenarios;
- Sensor clustering for given activities / scenarios;
- Learning similar sequences
The learning process could be complemented with more or less predefined expert knowledge (e.g., labeling, location of the sensors).

Depending on the candidate's motivation and results, the work could be extended with, for example, the following tasks:
- Identification of abnormal situations;
- Prediction of further activities and of future sensor values.
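As a hedged sketch of a typical preprocessing step for such sensor streams (the sensor names and window sizes below are invented, not those of the UCAmI dataset), events can be turned into sliding-window count vectors before any learning:

```python
def window_features(events, sensors, window, step):
    """Convert a stream of (time, sensor_id) events into fixed-length
    count vectors over sliding windows, a common preprocessing step
    before feeding sensor data to a (deep) sequence model."""
    if not events:
        return []
    t, t_end = events[0][0], events[-1][0]
    feats = []
    while t <= t_end:
        counts = {s: 0 for s in sensors}
        for ts, s in events:
            if t <= ts < t + window:
                counts[s] += 1
        feats.append([counts[s] for s in sensors])
        t += step
    return feats

events = [(0, "door"), (1, "motion"), (2, "motion"), (10, "coffee")]
f = window_features(events, ["door", "motion", "coffee"],
                    window=5, step=5)
print(f)  # [[1, 2, 0], [0, 0, 0], [0, 0, 1]]
```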
URL sujet détaillé :

Remarques :



SM207-148 Active and semi-supervised learning using topological data analysis  
Admin  
Encadrant : Emilie DEVIJVER
Labo/Organisme : Ce stage sera co-encadré, avec Massih-Reza Amini (Laboratoire d'Informatique de Grenoble) et Rémi Molinier (Institut Fourier)
URL : http://ama.liglab.fr/~devijver
Ville : Grenoble

Description  


With the tremendous growth of data available electronically, the constitution of labeled training sets for learning, which often requires a skilled human agent or a physical experiment, becomes unrealistic.
One alternative is to gather a small set of labeled training examples Z_l = ((x_1, y_1), ..., (x_l, y_l)), drawn i.i.d. from a distribution D, and to try to take advantage of a huge amount of unlabeled observations X_u = (x_{l+1}, ..., x_{l+u}), drawn from the marginal D_X with u >> l, in order to find a prediction function more efficient than the one found based on the labeled training set alone.


However, in some cases the set of labeled training examples is not well constructed, in the sense that it does not reveal the complexity of the data set. This happens, for instance, if the labeled observations were originally drawn uniformly at random from a huge set of unlabeled data, without revealing the intrinsic structure of the whole set.


In this project, we propose to use topological data analysis (TDA) to improve the selection of informative labeled examples, in conjunction with semi-supervised algorithms (Amini et al., NIPS 2009). TDA is a growing mathematical field which is starting to be widely applied in many different contexts (see Chazal and Michel, arXiv 2017 for a survey on the subject). It consists in using computable algebraic tools to detect topological features of the data under consideration.
TDA has already been used for machine learning algorithms, as introduced in Chazal et al., J. ACM 2013 and Wasserman, Annual Review of Statistics and Its Application 2018, among others.


The aim here is to detect observations that are interesting with regard to the topology of the data, ask for their labels (active learning step), and then use semi-supervised learning algorithms to improve the performance of the prediction function. The resulting algorithm will be tested on real-world applications using data extracted from different information retrieval problems. We also have in mind a specific data set coming from a biology application (ongoing collaboration with I2BC).
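A small illustration of the kind of topological signal involved: in dimension 0, persistence can be computed with a union-find pass over sorted pairwise distances (Kruskal's algorithm), since the connected components of a growing union of balls merge exactly at the edge lengths of a minimum spanning tree. Long bars then indicate well-separated clusters. The point cloud below is invented for the example.

```python
import math
from itertools import combinations

def zero_dim_persistence(points):
    """Death times of 0-dimensional persistence classes of a point
    cloud, i.e. the minimum-spanning-tree edge lengths, computed with
    Kruskal's algorithm and path-compressed union-find."""
    n = len(points)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:            # two components merge at distance d
            parent[ri] = rj
            deaths.append(d)
    return deaths               # n-1 values; one component never dies

# Two tight clusters far apart: one death is much larger than the rest.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
deaths = zero_dim_persistence(pts)
print(max(deaths) > 5 * sorted(deaths)[-2])  # True: one long bar
```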

URL sujet détaillé :

Remarques : Ce stage sera co-encadré, avec Massih-Reza Amini (Laboratoire d'Informatique de Grenoble) et Rémi Molinier (Institut Fourier)



SM207-149 Effects in Type Theory  
Admin  
Encadrant : Nicolas TABAREAU
Labo/Organisme : Inria - LS2N
URL : tabareau.fr
Ville : Nantes

Description  


We propose to incorporate effects into the theory of proof assistants. At stake is not only the certification of programs with effects: this also has implications in semantics and logic.
Type theory is based on the lambda calculus, which is purely functional. Noticing that any realistic program contains effects (state, exceptions, input-output...), many works focus on certified programming with effects: notably Ynot, and more recently F⋆ and Idris, which propose various ways of encapsulating effects and of restricting the dependency of types on effectful terms. Effects are either specialised, with monads carrying Hoare-style pre- and post-conditions as found in Ynot or F⋆, or more general, with the algebraic effects implemented in Idris. On the other hand, there is a deficit of logical and semantic investigations.
Yet a type theory that integrates effects would have logical, algebraic and computational implications through the Curry-Howard correspondence.
The goal is to develop a type theory with effects that accounts both for practical experiments in certified programming and for clues from denotational semantics and logical phenomena, starting from the unifying setting of (linear) call-by-push-value [1, 2].
Several extensions of call-by-push-value with dependent types have been proposed [3, 4]. The difficulty resides in expressing the dependency of types on effectful terms, and severe and artificial restrictions are imposed. The goal is to understand and relax these restrictions through the idea of semantic value uncovered in recent works. Our distinctive approach is to seek a proper extension of type theory, one that would give foundations to a classical type theory [5] but also give a foundational role to termination checking in traditional (intuitionistic) proof assistants.
My recent works with Pierre-Marie Pédrot [6, 7] suggest that it is possible to define a monadic translation in type theory by a careful combination of the call-by-name decomposition into call-by-push-value and the use of the Eilenberg-Moore adjunction that decomposes any monad. This can be seen as a monadic translation that works in a dependently typed setting. The only assumption on the monad is that the universe of all types forms an algebra for this monad.
The preliminary results are encouraging (1), although further investigations remain to be done. With this approach, it should be possible to define many effects in CIC, if not all (because the free monad over an algebraic effect is covered by our setting).
Team
The internship will be held in the CoqHoTT ERC project, advised by Nicolas Tabareau, inside the Gallinette Inria team in Nantes.
References
[1] P. B. Levy, Call-By-Push-Value: A Functional/Imperative Synthesis, ser. Semantic Structures in Computation. Springer, 2004, vol. 2.
(1) A Coq plugin is already available at https://github.com/CoqHott/coq-effects/.
[2] P.-L. Curien, M. Fiore, and G. Munch-Maccagnoni, "A Theory of Effects and Resources: Adjunction Models and Polarised Calculi," in Proc. POPL, 2016.
[3] D. Ahman, N. Ghani, and G. D. Plotkin, "Dependent Types and Fibred Computational Effects," in Proc. FoSSaCS, 2015.
[4] M. Vákár, "A Framework for Dependent Types and Effects," arXiv preprint arXiv:1512.08009, 2015.
[5] H. Herbelin, "A Constructive Proof of Dependent Choice, Compatible with Classical Logic," in LICS 2012 - 27th Annual ACM/IEEE Symposium on Logic in Computer Science. Dubrovnik, Croatia: IEEE Computer Society, Jun. 2012, pp. 365-374. [Online]. Available: https://hal.inria.fr/hal-00697240
[6] P.-M. Pédrot and N. Tabareau, "An Effectful Way to Eliminate Addiction to Dependence," in 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pp. 1-12.
[7] P.-M. Pédrot and N. Tabareau, "Failure is Not an Option: An Exceptional Type Theory," in ESOP 2018 - 27th European Symposium on Programming, ser. LNCS, vol. 10801. Thessaloniki, Greece: Springer, Apr. 2018, pp. 245-271. [Online]. Available: https://hal.inria.fr/hal-01840643
URL sujet détaillé : http://tabareau.fr/sujet_stage_meven_bertrand.pdf

Remarques :



SM207-150 Rényi entropy and concentration of measure  
Admin  
Encadrant : Frédéric DUPUIS
Labo/Organisme : LORIA
URL : https://members.loria.fr/FDupuis/
Ville : Nancy

Description  

Information theory studies the fundamental limits of data-processing tasks such as compression and transmission over a noisy channel. The theory rests on the notion of the Shannon entropy of a random variable, which quantifies its degree of randomness and characterizes these tasks: for example, n copies of a random variable X can be compressed into roughly nH(X) bits, where H(X) denotes the entropy of X.

However, when one studies measure-concentration questions related to these tasks (for example: if one tries to compress n copies of X into fewer than nH(X) bits, does the error probability tend to 1 exponentially fast?), another notion of entropy comes into play: the Rényi entropy. It makes it possible to prove an "information-theoretic" version of the Chernoff bound that suffices for the simplest cases, and some more complex results can also be understood from this angle.

This approach is nevertheless surprisingly underdeveloped, and some more complex cases, for instance those involving dependent random variables, still cannot be handled with this method. The goal of this project is therefore to deepen this line of research; several more specific questions can be addressed depending on the interests of the intern.

URL sujet détaillé : http://members.loria.fr/FDupuis/stage-renyi-concentration.pdf

Remarques :



SM207-151 High Performance Deep Reinforcement Learning  
Admin  
Encadrant : Bruno RAFFIN
Labo/Organisme : INRIA DataMove
URL : https://team.inria.fr/datamove
Ville : Saint Martin d'Hères -Grenoble

Description  

Deep learning algorithms need high computational power to deal with increasingly large datasets. The goal of reinforcement learning is to learn a task autonomously by maximizing a reward (winning a game, for instance) while interacting with simulations. Recently, researchers have successfully introduced deep neural networks into this loop, enabling more complex problems to be addressed; this is often referred to as Deep Reinforcement Learning (DRL). DRL managed, for instance, to play many ATARI games. Very recently, AlphaGo Zero was a leap forward in AI, as it outperformed the best human players (and its predecessor) after being trained without any data from human games, solely through reinforcement learning. The process requires an advanced infrastructure for the training phase. To speed up the learning process and enable a wide exploration of the parameter space, the neural network interacts in parallel with several instances of the simulation. For instance, AlphaGo Zero trained for more than 70 hours using 64 GPU workers and 19 CPU parameter servers, generating 4.9 million self-play games with 1,600 Monte Carlo Tree Search simulations per move.

The goal of this internship is to work on developing a high-performance infrastructure for deep reinforcement learning. We will first target the game of Go, following the architecture proposed for AlphaGo Zero.
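A toy sketch of the parallel self-play pattern described above, with threads standing in for GPU workers and a trivial random-walk game standing in for Go; every name and the one-parameter "policy" below are invented for the illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def play_episode(policy_weight, seed):
    """One self-play episode of a toy 1D random-walk game: the agent
    starts at 0, wins on reaching +3, loses at -3.  The 'policy' is a
    single bias toward moving right; a stand-in for a Go simulator."""
    rng = random.Random(seed)
    pos, transitions = 0, []
    while abs(pos) < 3:
        step = 1 if rng.random() < policy_weight else -1
        transitions.append((pos, step))
        pos += step
    reward = 1 if pos == 3 else -1
    return [(s, a, reward) for (s, a) in transitions]

def parallel_self_play(policy_weight, n_episodes, n_workers=4):
    """Collect self-play transitions with a pool of workers: the
    pattern the internship targets at much larger scale, with GPU
    workers and parameter servers instead of threads."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        batches = list(pool.map(lambda s: play_episode(policy_weight, s),
                                range(n_episodes)))
    return [t for batch in batches for t in batch]

data = parallel_self_play(0.7, n_episodes=20)
print(len(data) > 0)  # True
```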
URL sujet détaillé : https://team.inria.fr/datamove/job-offers/#DRL

Remarques : Co-advised with Philippe Preux (Sequel team at Lille - https://team.inria.fr/sequel/)



SM207-152 Static placement for task scheduling on heterogeneous high-performance platforms  
Admin  
Encadrant : Olivier BEAUMONT
Labo/Organisme : Inria Bordeaux Sud Ouest
URL : http://www.labri.fr/perso/obeaumon/
Ville : Talence

Description  




Task scheduling and resource allocation are increasingly essential operations in large-scale systems, as the cost gap (in time and energy) between computation and communication grows with each new generation of HPC machines, and as the data-access hierarchy becomes more complex.

Traditionally, large-scale systems have been programmed using the standard MPI programming interface. This is, however, fairly low-level, and such programming requires a manual description of the communications. Optimizing them, to go beyond the BSP (Bulk-Synchronous Parallelism) paradigm, which poses serious load-balancing problems, thus requires a significant engineering cost. Several new paradigms have been proposed to raise the level of abstraction. Some rely on a master-slave paradigm, which limits their scalability; others require rewriting the code in an object paradigm or describing the whole application computation algebraically. The approach proposed here is based on StarPU [1] and consists of preserving the existing "sequential code" aspect of the applications while targeting large-scale systems, by automatically generating a suitable task graph. This approach has already been successfully applied in the case of a single computing node (with multi-cores and GPUs) in [2,3,4,5], where the relative advantages of static scheduling strategies (computed before launch) and dynamic ones (computed at runtime) were discussed.

The objective of the internship is to study the following problem: how should the data of an application described as a task graph be placed in order to minimize communications while guaranteeing good load balancing? The internship will begin with a study of the scientific literature on the subject, then focus on the design and analysis of algorithms for this problem. Extensions can be considered in a second step: the possibility of replicating computations, taking into account the heterogeneity of the platform (presence of GPUs), etc.
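A toy sketch of the kind of placement decision at stake: greedy earliest-finish-time list scheduling of a small task graph on two heterogeneous resources with a uniform communication cost. The graph, speeds and costs are invented for the example; this is not the StarPU scheduler.

```python
def list_schedule(tasks, deps, speeds, comm):
    """Greedy earliest-finish-time list scheduling: each task, taken in
    topological order, goes to the resource where it finishes first,
    paying `comm` whenever a predecessor ran on another resource."""
    ready = {r: 0.0 for r in speeds}       # next free time per resource
    placed, finish = {}, {}
    for t in tasks:                        # tasks assumed topologically ordered
        best = None
        for r, speed in speeds.items():
            start = ready[r]
            for p in deps.get(t, []):
                arrival = finish[p] + (comm if placed[p] != r else 0.0)
                start = max(start, arrival)
            end = start + tasks[t] / speed
            if best is None or end < best[0]:
                best = (end, r)
        end, r = best
        placed[t], finish[t] = r, end
        ready[r] = end
    return placed, max(finish.values())

# A tiny fork-join graph on one fast GPU-like and one slow CPU-like resource.
tasks = {"a": 2.0, "b": 4.0, "c": 4.0, "d": 2.0}   # work amounts
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
placed, makespan = list_schedule(tasks, deps,
                                 speeds={"gpu": 2.0, "cpu": 1.0},
                                 comm=1.0)
print(makespan)  # 6.0
```

Here the communication cost makes it cheaper to keep the whole fork-join chain on the fast resource; static placement strategies trade off exactly this kind of locality against load balancing.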



Keywords: resource allocation, static scheduling, dynamic scheduling, StarPU, heterogeneity, distributed algorithms


Comments: PhD funding has already been secured for a topic extending this internship, with the thesis starting in September 2019.




References :

[1] http://starpu.gforge.inria.fr

[2] Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar Fast Approximation Algorithms for Task-Based Runtime Systems, Concurrency and Computation: Practice and Experience, Wiley, 2018, 30 (17)

[3] Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs, IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2017, Orlando, United States.

[4] Olivier Beaumont, Terry Cojean, Lionel Eyraud-Dubois, Abdou Guermouche, Suraj Kumar Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources International Conference on High Performance Computing, Data, and Analytics (HiPC 2016), Dec 2016, Hyderabad, India . IEEE, 2016, Proceedings of the IEEE International Conference on High Performance Computing (HiPC 2016)

[5] Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar Are Static Schedules so Bad? A Case Study on Cholesky Factorization IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), May 2016, Chicago, IL, United States. IEEE, 2016



URL sujet détaillé :

Remarques : Topic co-supervised with Lionel Eyraud-Dubois




SM207-153 Characterization of navigation traces  
Admin  
Encadrant : Lionel TABOURIER
Labo/Organisme : LIP6 (Sorbonne Université / CNRS)
URL : http://www.complexnetworks.fr/
Ville : Paris

Description  

This internship aims at investigating the characteristics of users' navigation traces on an "infotainment" website (www.melty.fr).

In previous work, we observed that some types of sessions are associated with low diversity in the content consumed. Our assumption is that the structure of the site, as well as the nature of the recommendation links, tends to isolate the user within a specific topic.

The main goal of the internship is to test this assumption by exploring navigation traces corresponding to different recommendation techniques.
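One simple way to quantify the "content diversity" of a session, assuming each page view carries a topic label (the labels below are invented, not melty.fr categories), is the Shannon entropy of the session's topic distribution:

```python
import math

def topic_diversity(session):
    """Shannon entropy of the topic distribution of one navigation
    session: 0 when the user stays within a single topic, higher when
    the consumed content is spread over many topics."""
    counts = {}
    for topic in session:
        counts[topic] = counts.get(topic, 0) + 1
    n = len(session)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

focused = ["tv", "tv", "tv", "tv"]
mixed = ["tv", "music", "games", "cinema"]
print(topic_diversity(focused), topic_diversity(mixed))  # 0.0 2.0
```

Comparing such a measure across sessions generated by different recommendation techniques is one way to test the isolation assumption above.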

URL sujet détaillé : http://www.complexnetworks.fr/wp-content/uploads/2018/11/sujet_algodiv.pdf

Remarques : co-encadrement : Pedro Ramaciotti (LIP6), Christophe Prieur (Télécom Paris Tech)




Last modification : 2018-12-04 11:24:46 laurent.lefevre@ens-lyon.fr