SM2071 Deep Learning for Cell Tracking in 3D+time Live Embryos


Description


Machine learning, and more specifically deep learning, gives very promising results in various domains of biological research. The aim of the internship is to apply deep learning to the field of biological imaging, and more specifically to 4D (3D plus time) cell segmentation: the recognition and tracking of each cell and its progeny in live developing embryos. The internship will start from a pre-existing database of more than 10 fully segmented embryos (more than 10,000 tracked cells). We would like to create an end-to-end method for complex 3D cell tracking using deep learning. In this internship, depending on the profile and interests of the intern, we can:
- Implement and adapt to our dataset existing end-to-end deep learning methods for instance segmentation (the identification of individual objects).
- Explore theoretical aspects of the adaptation of the memory concept of Recurrent Neural Networks (LSTM) to cell tracking in 3D+time images.
- Use deep learning to predict and define the rules of cell division during embryogenesis.
This 5-6 month internship can lead to a PhD project, potentially combining computational approaches with the experimental validation of predictions. URL sujet détaillé :
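Deep-learning tracking methods are usually compared against a simple overlap-based baseline. As a purely illustrative sketch (not part of the subject description, and assuming segmentation masks are already available as labelled arrays), frame-to-frame cell linking by intersection-over-union can be written as:

```python
import numpy as np

def link_cells(labels_t0, labels_t1, iou_threshold=0.3):
    """Greedily match labelled cells between two (3D) frames by IoU.

    labels_t0, labels_t1: integer arrays where 0 is background and
    each positive integer identifies one segmented cell.
    Returns a dict mapping cell ids in frame t0 to ids in frame t1.
    """
    matches = {}
    ids0 = [i for i in np.unique(labels_t0) if i != 0]
    ids1 = [j for j in np.unique(labels_t1) if j != 0]
    for i in ids0:
        mask0 = labels_t0 == i
        best_j, best_iou = None, iou_threshold
        for j in ids1:
            mask1 = labels_t1 == j
            inter = np.logical_and(mask0, mask1).sum()
            union = np.logical_or(mask0, mask1).sum()
            iou = inter / union if union else 0.0
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j is not None:
            matches[i] = best_j
    return matches
```

An end-to-end deep model would replace both the segmentation and this greedy linking step; the baseline mainly serves to quantify the improvement.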
Remarques : PROFILE AND RESEARCH SKILLS:
The internship is mainly intended for students with an initial training in maths and computer science, though biologists with very good skills in computer programming and statistical analysis could be considered. An interest in biology is welcome. Team spirit, autonomy, dynamism, and creativity would be appreciated. Proficiency in technical English or French must be sufficient to write technical documentation and to interact verbally on a daily basis.
Required programming language: python
The selected student should have basic knowledge of at least one of the following topics: machine learning theory, image processing, high-performance computing, or the use of deep learning libraries (e.g. Keras).
CONTEXT:
The intern will be based in Montpellier, and the supervision will be provided by a computer scientist, Emmanuel Faure (CNRS), and a biologist, Patrick Lemaire (CNRS).
Mails : emmanuel.faure.fr and patrick.lemaire.cnrs.fr
Emmanuel Faure (LIRMM, Montpellier, from Fall 2018; https://www.lirmm.fr/icar/) is a data scientist using recent machine learning techniques to understand quantitative biology and medicine.
Patrick Lemaire's team (CRBM, Montpellier, France; http://www.crbm.cnrs.fr/en/recherche/equipes/) focuses on animal embryonic development, using ascidians as a model system.





SM2072 Verification of machine learning models


Description


Neural networks and other machine learning models have proved to be very successful in a wide range of applications. However, to be safely used in critical scenarios, we need guarantees: how accurate and robust is your model? A classical example shows an image-classifying neural network being fooled by carefully crafted noise, asserting with high confidence that a panda is a gibbon.
Verification of machine learning models is a growing field, fostering ideas from automata theory, program verification, invariant synthesis, and dynamical systems.
The goal of the internship is to study and construct algorithms for verifying neural networks with simple architectures. This in particular requires defining and comparing correctness and robustness notions. The outcome of the internship can be either theoretical or practical through implementation. URL sujet détaillé : https://nathanaelfijalkow.github.io/pdf/internship_learning.pdf
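One simple verification technique such an internship could start from is interval bound propagation, which soundly (if coarsely) bounds a network's outputs over a whole box of inputs. A minimal numpy sketch, assuming a fully connected ReLU network given as hypothetical weight/bias lists:

```python
import numpy as np

def interval_forward(W_list, b_list, lo, hi):
    """Propagate an input box [lo, hi] through a ReLU network.

    Returns guaranteed lower/upper bounds on every output logit
    (interval bound propagation: sound but not tight)."""
    for k, (W, b) in enumerate(zip(W_list, b_list)):
        center = (lo + hi) / 2.0
        radius = (hi - lo) / 2.0
        c = W @ center + b            # image of the box center
        r = np.abs(W) @ radius        # worst-case deviation
        lo, hi = c - r, c + r
        if k < len(W_list) - 1:       # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

def certified_robust(W_list, b_list, x, eps, label):
    """True if every input within L-inf distance eps of x provably
    keeps `label` as the strictly largest logit."""
    lo, hi = interval_forward(W_list, b_list, x - eps, x + eps)
    others = np.delete(hi, label)
    return bool(lo[label] > others.max())
```

If the certified lower bound of the correct logit beats every other logit's upper bound, no perturbation in the box can flip the prediction; the converse does not hold, since the bounds are conservative.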
Remarques : A PhD scholarship is available for starting in September 2019.





SM2073 Smooth Path Generation in Dimension 2 or 3


Description


A path is the geometric part of a motion, without its timing. Smooth paths (C2 or more), if they respect the motion constraints of a robot (given as a differential equation), can be followed by this robot more precisely than simpler paths (C1 or less). Writing motion constraints as a differential equation is usually quite simple. Integrating this differential equation makes it possible to find the motion associated with a given command starting from an initial state, as well as the end state of this motion. Conversely, it is much harder to find the right command corresponding to a motion connecting an initial state to an end state, as the existence and uniqueness of such a motion are not guaranteed.
As far as paths are concerned, the ISEEML library remains one of the most efficient tools to solve this problem. While it was initially designed for car-like robots' paths, it can be extended to those of any kind of mobile robot, in dimension 2 (wheeled or legged robots) or 3 (planes or drones): clothoid pieces just have to be replaced by pieces of any other kind of spiral, the computations remaining unchanged. It is also possible to handle different motion constraints for the initial and end parts of the paths. Lastly, connecting configurations (made of a position, an orientation and a curvature) whose curvature is not zero (or extreme) should be possible. For this last point, only resolution clues are currently available. During this internship, the student will start by getting accustomed to the ISEEML library by programming the first two points, which are already fully formalised. They will then try to solve the last point and implement the solution found. Finally, experiments will be carried out, either in an ad-hoc application (using Qt) or in a control benchmark (based on ROS) switching easily from simulation to real robot handling. URL sujet détaillé : http://members.loria.fr/AScheuer/research/2018master2subject
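For intuition, a clothoid piece is exactly what comes out of integrating the car-like motion model with a constant curvature derivative (sharpness). A toy Euler integration, independent of ISEEML:

```python
import numpy as np

def integrate_path(sharpness, length, n=10000,
                   x=0.0, y=0.0, theta=0.0, kappa=0.0):
    """Integrate the planar motion model
        x' = cos(theta), y' = sin(theta), theta' = kappa,
        kappa' = sharpness
    over the given arc length from an initial configuration.
    With constant non-zero sharpness the result is a clothoid piece:
    curvature grows linearly with arc length."""
    ds = length / n
    for _ in range(n):
        x += np.cos(theta) * ds
        y += np.sin(theta) * ds
        theta += kappa * ds
        kappa += sharpness * ds
    return x, y, theta, kappa
```

Replacing the linear curvature law by another spiral's curvature profile changes only the `kappa` update, which is the extension point the subject mentions.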
Remarques : Internship allowance and accommodation (CROUS) may be obtained.





SM2074 Mean Field Optimal Control for Multi-Armed Bandit Problems


Description


Multi-armed bandits are classical models of sequential decision-making problems in which a controller (or learner) needs to decide at each time step how to allocate its resources to a finite set of alternatives (called arms). They are widely used in online learning today, but the complexity of computing optimal policies for Markovian bandits grows quickly with the number of arms. The goal of the internship is to study the use of recent progress from mean field theory to compute close-to-optimal allocation policies for such problems.
(see the detailed pdf for more details). URL sujet détaillé : https://team.inria.fr/polaris/files/2018/10/stage_bandits.pdf
Remarques : Co-advisor: Patrick Loiseau (Inria)





SM2075 Sequential Learning in Combinatorial Bandits under Delayed and Anonymous Feedback


Description


Background
The multi-armed bandit (MAB) problem is an elegant model for sequential prediction problems. A learner needs to choose at each time step an action (also called an arm) in a given set; they then suffer a loss and observe feedback corresponding to the chosen action. The objective of the learner is to guarantee that the accumulated loss is not much larger than that of the best action in hindsight (that is, to minimize the regret).
In this work, we focus on bandit problems with combinatorial structure and a linear loss function. Classical algorithms for K-armed bandit problems such as EXP3 [1] offer a regret upper bound that is inefficient in this case, because the considered action set typically contains an exponential number of actions. However, exploiting the combinatorial structure of the action set, we can use exponential average weights algorithms (e.g., EXP2 [4]) to bound the regret in terms of the dimension of the representative space (typically a much smaller number). These algorithms work, however, only under the classical feedback models: the learner receives either bandit feedback (observing only the loss incurred by the chosen action) or full-information feedback (observing the losses of all actions).
Work description
In this internship, the goal is to extend combinatorial bandit algorithms and results to more complex feedback structures, enabling the realistic modeling of applications such as advertising on social networks, scheduling data transmission on a network, or decision making in recommendation systems.
To that end, we will leverage several new feedback settings that were recently proposed in [2, 3]: (i) Delayed feedback: the loss of time t is only observed at time t + d, for a fixed d > 0. (ii) Anonymous composite feedback: at time t, the learner only observes the total sum of the losses from time steps t − d + 1, …, t, for a fixed d > 0. For these settings, techniques based on the classical EXP3 algorithm for MAB are presented in [2, 3]. The student will investigate theoretically whether these techniques can be extended to combinatorial bandits in order to obtain algorithms with good regret bounds for these cases. Then, he/she will propose an efficient implementation of the new extended algorithms and perform simulations to validate the results and to evaluate the actual performance of the algorithms.
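For reference, the classical EXP3 algorithm that the delayed and anonymous settings build upon fits in a few lines. A numpy sketch under plain bandit feedback (losses assumed in [0, 1]; `eta` is the learning rate):

```python
import numpy as np

def exp3(losses, eta, rng):
    """Run EXP3 on a loss matrix of shape (T, K): at each round, draw
    an arm from the exponential-weights distribution, observe only the
    chosen arm's loss, and update the weights with an
    importance-weighted (unbiased) loss estimate.
    Returns the chosen arms and the final arm distribution."""
    T, K = losses.shape
    weights = np.ones(K)
    chosen = []
    for t in range(T):
        p = weights / weights.sum()
        arm = rng.choice(K, p=p)
        chosen.append(arm)
        est = losses[t, arm] / p[arm]     # importance weighting
        weights[arm] *= np.exp(-eta * est)
    return chosen, weights / weights.sum()
```

The delayed variant of [3] would apply the update for round t only at round t + d; the composite variant of [2] must additionally split an aggregated loss among the last d rounds.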
Expected abilities of the student
Strong background in mathematics (in particular probability and statistics); ideally, some initial exposure to sequential learning; and basic programming experience (e.g., Python, R or C++).
References
[1] Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32 (2002), no. 1, 48–77.
[2] Nicolò Cesa-Bianchi, Claudio Gentile, and Yishay Mansour. Nonstochastic bandits with composite anonymous feedback.
[3] Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, and Alberto Minora. Delay and cooperation in nonstochastic bandits. Journal of Machine Learning Research 49 (2016), 605–622.
[4] Nicolò Cesa-Bianchi and Gábor Lugosi. Combinatorial bandits. Journal of Computer and System Sciences 78 (2012), no. 5, 1404–1422.
URL sujet détaillé : http://ligmembres.imag.fr/loiseapa/pdfs/InternshipBanditsDelayedFeedback.pdf
Remarques : Co-supervision with Dong Quan Vu (Nokia Bell Labs)





SM2076 Set-based co-simulation, uncertainties and guarantees


Description


Simulation tools are mainly used in the design steps of systems such as robots, ships, airplanes, and missiles. Simulation is also the best approach to develop the software of this kind of complex and critical system. In the case of a physical system (hybrid system or cyber-physical system), the vehicle, for example, moves in a dynamic environment (wind, current, etc.). It is then interesting to be able to simulate both the environment and the system. The major problem is that both are strongly coupled, by the physics but also by the software (control loop). In our lab, we develop a tool for set-based simulation, allowing us to consider many sources of uncertainty while nevertheless producing guarantees on the result. The subject of the internship is to develop the mathematical theory for set-based co-simulation (a new theorem is needed), implement it (in our C++ tool) and test the solution on a ship-ocean example (in simulation and perhaps on an actual system). URL sujet détaillé :
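The "set-based" idea can be illustrated with plain interval arithmetic: computing with whole intervals instead of points yields an enclosure of every possible result. A toy sketch (a hypothetical example, unrelated to the lab's actual tool; a truly guaranteed ODE integrator would also need a rigorous remainder term, which this discretized step omits):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))
    def contains(self, x):
        return self.lo <= x <= self.hi

def euler_step(x, h_lo, h_hi, decay):
    """One Euler step for x' = -decay*x with an uncertain step size
    h in [h_lo, h_hi], evaluated in interval arithmetic so the result
    encloses every concrete choice of state and step size."""
    h = Interval(h_lo, h_hi)
    return x - Interval(decay, decay) * h * x
```

Coupling two such simulators (ship and ocean) while keeping the enclosure property is precisely where a new theorem is needed.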
Remarques : Team: Safety of Hybrid Systems. Keywords: mathematics, computer science and basics in physics (ODEs, PDEs).
Salary following the law (around 550 € per month).
A PhD could be proposed to good students.





SM2077 Distance geometry problems


Description


Some systems are defined only by distances, as in chemistry, localization, sensor networks and robotics. Distance geometry problems are difficult to solve: from a complexity viewpoint, obtaining even an approximate solution to a distance geometry problem is NP-hard, and from an optimization viewpoint, these problems have many local minimizers, which makes solutions hard to trust. In particular, distance geometry problems are interesting mathematical problems with important applications in robotics for solving localization problems, UAV swarm organization, etc. Distances given by sensors are always uncertain; in these applications, the measurements can moreover be obtained with delay (conditioned by the dynamics of the UAVs and by communication) or even be completely wrong (due to outliers introduced by sensor faults). Beyond a distance solver able to cope with uncertainties, some more complicated issues can thus be addressed. We propose an internship to develop a more robust distance-solver algorithm using interval analysis in order to bound and certify the results. The Cayley-Menger distance matrix could be a good approach to address the lack of constraints. The relaxed intersection (or q-intersection) can also be used to counteract the effect of outliers. Finally, the dynamics of the UAVs can be taken into account to minimize the effect of delays in distance acquisition.
URL sujet détaillé :
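The Cayley-Menger machinery mentioned above can be illustrated directly: the Cayley-Menger determinant of a set of squared distances vanishes exactly when the points embed in a low enough dimension, which gives a consistency test for measured (possibly faulty) distances. A small numpy sketch:

```python
import numpy as np

def cayley_menger_det(D):
    """Cayley-Menger determinant of an n x n matrix D of *squared*
    pairwise distances between n points.  If the points embed in
    dimension k, the determinant built from any k+2 of them vanishes;
    for n points in dimension n-2 it is proportional to the squared
    volume of their simplex."""
    n = D.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0            # bordered matrix: 0, then a row/col of 1s
    M[1:, 1:] = D
    return np.linalg.det(M)
```

In an interval-analysis setting, the same determinant would be evaluated over distance intervals, and an interval not containing zero would certify that the measurements are inconsistent.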
Remarques : Team: Safety of Hybrid Systems. Keywords: constraint programming, mathematics, computer science.
Salary following the law (around 550 € per month).
A PhD could be proposed to good students.





SM2078 Avalanche Risk & 3D Visualisation


Description


The work required during this internship will be done in three stages:
1. A state of the art and a comparison of two approaches to the fusion of uncertain, multi-criteria information: fuzzy subsets and the possibilistic approach. The objective is to identify the use cases of each of these approaches and to study the opportunity of combining them.
2. The application of one or more of these approaches to estimating the level of vigilance to adopt in the field with regard to avalanche risk.
3. The visualization of the results obtained in a 3D (or even augmented-reality) environment already implemented in the laboratory. URL sujet détaillé : https://projects.listic.univsmb.fr/emploi/stage/2018_Sujet_Master_CIME.pdf
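For illustration, the simplest fuzzy fusion operators the first stage would compare are the min t-norm (conjunctive fusion) and the max t-conorm (disjunctive fusion). A toy sketch; the vigilance criteria names below are hypothetical:

```python
def fuse_conjunctive(memberships):
    """Conjunctive fusion with the min t-norm: the fused degree is
    limited by the least supportive source (all criteria must agree)."""
    return min(memberships)

def fuse_disjunctive(memberships):
    """Disjunctive fusion with the max t-conorm: one confident source
    is enough to raise the fused degree."""
    return max(memberships)

def vigilance_level(snowpack, weather, terrain):
    """Toy avalanche-vigilance score from three hypothetical
    membership degrees in [0, 1], fused conjunctively."""
    return fuse_conjunctive([snowpack, weather, terrain])
```

The possibilistic approach refines this picture by propagating possibility/necessity pairs instead of single membership degrees, which is exactly the comparison the internship asks for.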
Remarques : This internship will be paid according to the ministerial regulations in force. Depending on the results obtained and the motivation of the candidate, PhD funding guaranteed over three years at 1615 € per month may be proposed.





SM2079 Speculative execution based on GLR/GLL theory


Description


Given the current difficulty of feeding the compute units of today's CPUs with enough data without stalling on memory accesses, this internship intends to study the use of generalized context-free parsing theory to represent the speculative execution of a program, relying on the ability of GLL/GLR parsers to fork and merge speculative paths. The work will study how to abstract a computation over a sequence of instructions in this theory, and how to extract memory addresses that could be resolved this way.
A good understanding of compilation and processor architecture is required. URL sujet détaillé :
Remarques : Paid internship, following the CEA salary scale.





SM20710 Rust for Performance


Description


The analysis of large multidimensional spatiotemporal datasets poses challenging questions regarding storage requirements and query performance. Several data structures have recently been proposed to address these problems; they rely on indexes that precompute different aggregations from a dataset known a priori. Consider now the problem of handling streaming datasets, in which data arrive as one or more continuous data streams. Such datasets introduce new challenges for the data structure, which now has to support dynamic updates (insertions/deletions) and rebalancing operations to reorganize itself. We developed a Packed Memory Quadtree (PMQ) that efficiently supports dynamic data insertions and outperforms classical data structures such as R-trees or B-trees. The PMQ ensures good performance by keeping control over the density of the data internally stored in a large array. The PMQ is based on the Packed Memory Array [3,4,5] and shares with it its amortized cost of O(log^2(N)) per insertion. But today all processors are multicore, and data structures need to be parallelized to leverage the available compute power. The goal of this internship is to design, develop and test a parallel version of the PMQ using the Rust programming language (or C++). URL sujet détaillé : https://team.inria.fr/datamove/joboffers/#rust
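To give the flavor of the density control at the heart of the PMA/PMQ, here is a deliberately oversimplified sketch (in Python for brevity, though the internship itself targets Rust or C++). It rebuilds the whole array on each insertion, whereas a real PMA rebalances only a logarithmic-size window, which is what yields the amortized O(log^2(N)) bound:

```python
import bisect

class ToyPackedArray:
    """Toy stand-in for a Packed Memory Array: sorted keys stored in
    an array with gaps (None).  When the overall density exceeds a
    threshold, the array doubles and the keys are spread out evenly.
    (A real PMA rebalances local windows instead of the whole array.)"""

    def __init__(self, capacity=8, max_density=0.7):
        self.cells = [None] * capacity
        self.n = 0
        self.max_density = max_density

    def keys(self):
        return [c for c in self.cells if c is not None]

    def _spread(self, keys, capacity):
        """Rebuild `cells` at the given capacity with even gaps."""
        self.cells = [None] * capacity
        step = capacity / max(len(keys), 1)   # step > 1: no collisions
        for i, k in enumerate(keys):
            self.cells[int(i * step)] = k

    def insert(self, key):
        keys = self.keys()
        bisect.insort(keys, key)
        self.n += 1
        if self.n / len(self.cells) > self.max_density:
            self._spread(keys, 2 * len(self.cells))   # grow + rebalance
        else:
            self._spread(keys, len(self.cells))
```

The gaps are what make insertions cheap in the real structure; parallelizing the rebalancing windows without losing the density invariant is the crux of the internship.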
Remarques : Internship allowance.





SM20711 Data Assimilation at Large Scale


Description


Ensemble runs consist in running the same simulation many times with different parameter sets to get a robust analysis of the simulation's behavior in the parameter space. The simulation can be a high-resolution, large-scale parallel code or a smaller-scale lightweight version called a meta-model or surrogate model. This process is commonly used for uncertainty quantification, sensitivity analysis and data assimilation. It may require running the same simulation thousands to millions of times, making it an extremely compute-intensive process that will fully benefit from Exascale machines.
Existing approaches show limited scalability. Some rely on intermediate files: each simulation run writes its outputs to files that are then processed for result analysis. This makes for a flexible process, but writing these data to the file system and reading them back for the analysis step is a strong performance bottleneck at scale. The alternative approach consists in aggregating all simulation runs into one large monolithic MPI job. Results are processed as soon as they are available, avoiding intermediate files. However, by not taking advantage of the loose synchronization needs between the different runs, this over-constrains the execution regarding compute resource allocation, application flexibility and fault tolerance. Recent approaches like the Melissa framework adopt an elastic architecture. Simulations or groups of simulations (depending on the level of synchronization needed between the runs) are submitted to the machine's batch scheduler independently. Once these jobs start, they dynamically connect to a parallel data processing server. This server receives the data from the running simulations and processes them online, in parallel, as soon as they are available, thus avoiding intermediate files. The computed partial results can be fed back to the simulations (needed for data assimilation, for instance). They can also be used to support an adaptive sampling process, where the parameter sets for the next simulation runs are defined according to these partial results. Such a framework fully exploits the loose synchronization between simulation runs: simulations are submitted and allocated independently, enabling better use of the machine's resources and efficient fault tolerance mechanisms. URL sujet détaillé : https://team.inria.fr/datamove/joboffers/#GRPC
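The kind of online processing such a server performs can be illustrated with a single-pass statistic: each incoming simulation result updates the running mean and variance and is then discarded, so no intermediate files and no storage of past results are needed. A minimal Welford-update sketch (illustrative only, unrelated to the actual Melissa API):

```python
class OnlineStats:
    """Single-pass (Welford) mean/variance: results are folded in as
    soon as they arrive from any simulation run, in any order, and
    never need to be kept in memory or written to disk."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0       # sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

Because the update is order-independent and incremental, it tolerates runs that finish late or fail, which is what makes the elastic architecture viable.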
Remarques : Internship allowance.





SM20712 Certified Compilation of Skeletal Semantics


Description


Skeletal semantics (https://hal.inria.fr/INRIA/hal01881863) are a new approach to describe the semantics of programming languages in a highly generic and modular way. Such a semantics describes the control flow for the evaluation of each construct of the language without specifying the implementation of primitive constructs. For instance, the semantics for the addition of two expressions could be "evaluate the first expression, then evaluate the second expression, then add the results", without specifying how to add numbers. One may then provide several interpretations of such a semantics, such as a big-step semantics, a static semantics (i.e., typing), or an abstract semantics (for abstract interpretation). These interpretations are defined independently of the programming language, and thus may be reused for other languages.
The objective of the internship is to study the certified compilation of skeletal semantics. The student will choose a source language, for instance a simple WHILE language, and a target language, for instance a stack-based language. The student will define these languages as skeletal semantics, and will define the compilation from source to target as an interpretation of the source semantics. The student will then establish a set of conditions relating the behaviors of the primitive constructs that certify that the compilation is correct: the behavior of a compiled program corresponds to the behavior of the source program. URL sujet détaillé : http://people.rennes.inria.fr/Alan.Schmitt/internships/skeletons.pdf
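The idea of giving one control-flow skeleton several interpretations can be illustrated on a toy expression language: the same case structure is read once as an evaluator and once as a compiler to stack code, and certifying the compilation amounts to proving that the two agree. A Python sketch (purely illustrative; the real development would be carried out in a proof assistant):

```python
# Expressions: ("num", n), ("var", name), ("add", e1, e2)

def eval_expr(e, env):
    """Big-step interpretation of the skeleton."""
    tag = e[0]
    if tag == "num": return e[1]
    if tag == "var": return env[e[1]]
    if tag == "add": return eval_expr(e[1], env) + eval_expr(e[2], env)
    raise ValueError(tag)

def compile_expr(e):
    """Compilation interpretation of the *same* skeleton: the control
    flow ('evaluate e1, then e2, then combine') is shared; only the
    primitive steps (PUSH/LOAD/ADD) change."""
    tag = e[0]
    if tag == "num": return [("PUSH", e[1])]
    if tag == "var": return [("LOAD", e[1])]
    if tag == "add":
        return compile_expr(e[1]) + compile_expr(e[2]) + [("ADD",)]
    raise ValueError(tag)

def run_stack(code, env):
    """Execute the stack-machine target language."""
    stack = []
    for op in code:
        if op[0] == "PUSH": stack.append(op[1])
        elif op[0] == "LOAD": stack.append(env[op[1]])
        elif op[0] == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack[-1]
```

The certification goal is exactly the equality tested below, proved once and for all from conditions on the primitives rather than checked per program.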
Remarques : Payment possible. A grant is available within an ANR project for a follow-up PhD.





SM20713 Gradual Typing for dynamic languages


Description


We propose three main internships on gradual typing for dynamic languages (though others are possible).
CONTEXT
The idea behind gradual typing [1] is to integrate, in a single framework, both static typing (where the types in a program are checked against each other before execution) and dynamic typing (where types are only checked during execution).
Gradual typing reconciles static and dynamic typing by allowing the programmer to obtain the strong guarantees offered by static typing for certain parts of a program while keeping the flexibility of dynamic typing for other parts. Formally, this is done by adding a new type annotation, often called "dynamic", which acts as a fully dynamic type. That is, values of this type will pass any static type check, and their type will only be checked at runtime. What is important is that pre-existing dynamically-typed code is still valid, but can easily be upgraded with more type annotations. This is why gradual typing has been attracting more and more interest and has been adopted by some of the major actors in information technology, as shown by the development of the languages Dart by Google, TypeScript by Microsoft and Flow by Facebook.
Regarding static typing, one of the tools that proved to be among the most suited to Web development is the use of union and intersection types, also known as settheoretic types [3,4]. The aim of settheoretic types is to integrate the mathematical definitions, principles, and operations of settheory (such as the union, the intersection, or the complement of sets) in a type system. This allows for a much more precise control over types, thus providing stronger static guarantees than with simple types.
GOALS
We propose three different internships on gradual typing for polymorphic languages, which will build upon the work we have already developed on gradual typing with set-theoretic types [6,7]:
SUBJECT 1: Occurrence typing. Occurrence typing is a technique developed for Typed Racket that uses flow information for typing highly dynamic code [2]. The idea is that when some dynamic type test is performed on a variable, we can refine our type inference according to whether the test has succeeded or not. For instance, if in an if-expression we test whether a variable x has type integer, then we know that in the then branch x will have type integer (intersected with its static type) and that in the else branch it will have its static type minus the type integer. The goal of this internship is to formalize occurrence typing in the context of semantic subtyping so as to extract the maximum amount of information from the flow of a program.
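The narrowing described in Subject 1 can be mimicked concretely by modelling types as finite sets of basic tags, so that the then/else refinements are literal set intersection and difference. A toy Python sketch (deliberately much cruder than semantic subtyping, where types denote infinite sets of values):

```python
# Types modelled as finite sets of basic tags; union, intersection
# and difference of types are then ordinary set operations.
INT = frozenset({"int"})
STR = frozenset({"str"})

def narrow(static_type, tested_type):
    """Occurrence typing for `if x is an instance of tested_type`:
    in the then branch, x has the intersection of its static type
    with the tested type; in the else branch, the set difference."""
    return static_type & tested_type, static_type - tested_type
```

For example, testing a variable of type `int | str` against `int` yields `int` in the then branch and `str` in the else branch, which is exactly the refinement the formalization must produce for arbitrary set-theoretic types.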
SUBJECT 2: Code analysis for inferring intersection types. Intersection types are a key feature for a feasible typing of dynamic languages, where the same function can be used in contexts so different that they cannot be captured by common polymorphic type systems. The idea is that in each context the function is used with some given type; thus the function has all these types, and therefore it also has the intersection of all these types. While it is relatively easy to check whether a function has an intersection type specified by the programmer via an annotation, as can be done in Flow, it is very hard (actually, undecidable) to infer an intersection type for code that is partially annotated or not annotated at all. The goal of this internship is to use the results of a static analysis (and, time permitting, profiling information) to drive the inference of intersection types for functions defined in dynamic languages.
SUBJECT 3: Abstract machines for cast calculi. In order to provide strong safety guarantees, gradually-typed languages can be compiled into intermediate versions that contain explicit type casts that dynamically check the correctness of gradually-typed portions of a program. The implementation of such intermediate representations is the subject of active research and of a heated debate [5], but there does not exist any work on the compilation of casts when these involve set-theoretic types. The goal of the internship is to study the current literature on the implementation of cast calculi, use it to define an abstract machine for cast calculi with set-theoretic types, and, time permitting, implement it.
References
[1] Jeremy Siek and Walid Taha. Gradual typing for functional languages. In Scheme and Functional Programming Workshop (September 2006), pp. 81–92.
[2] Occurrence Typing, The Typed Racket Guide.
[3] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Pietro Abate: Polymorphic Functions with Set-Theoretic Types: Part 2: Local Type Inference and Type Reconstruction. POPL 2015.
[4] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Hyeonseung Im, Sergueï Lenglet, Luca Padovani: Polymorphic functions with set-theoretic types: part 1: syntax, semantics, and evaluation. POPL 2014.
[5] Asumu Takikawa, Daniel Feltey, Ben Greenman, Max New, Jan Vitek, Matthias Felleisen. Is Sound Gradual Typing Dead? POPL 2016.
[6] Giuseppe Castagna and Victor Lanvin. Gradual typing with union and intersection types. In International Conference on Functional Programming, 2017.
[7] Giuseppe Castagna, Victor Lanvin, Tommaso Petrucciani, and Jeremy Siek. Gradual typing: A new perspective (subtyping and implicit parametric polymorphism). POPL 2019, to appear. URL sujet détaillé : https://www.irif.fr/~gc/stageGradualTyping.en.html
Remarques : Part of this project is proposed in collaboration with Facebook Language Research. It will be pursued in collaboration with Jeremy Siek (Indiana University), on sabbatical leave at IRIF from January 2019 (possibility of a joint supervision).





SM20714 Hierarchical Parallelization


Description


Technological trends in circuit design force the duplication of computation units to obtain more performance. Splitting a program into parallel units and setting up the synchronizations is bug-prone and expensive. Hence the trend to rely on a parallelizing compiler to automate this task. Numerous problems must be addressed by such a compiler: Where is the parallelism in the program? Which parts must be parallelized, and how? How should the hardware resources be allocated?
The polyhedral model addresses these issues in a single unifying framework. Under certain conditions, the iterations of for loops may be summarized by polyhedra. Then, geometric reasoning allows one to build compiler analyses (e.g. dependence, scheduling) thanks to linear programming and geometric operations (union, intersection, projection). But the precision comes at a cost: the complexity of polyhedral analysis limits the size of compilable programs to a few tens of lines!
In this internship, we propose to study how to scale the polyhedral model. We will leverage a divide-and-conquer approach: the program is divided into subprograms, each subprogram is analyzed separately, then we "gather the pieces". We will focus on loop tiling, a polyhedral loop transformation that divides the iteration domain of nested for loops into atomic blocks.
Program: 1) Study the possible loop tilings on a composition of polyhedral programs (e.g. several matrix multiplications). Figure out how local loop tilings may be composed to form a global loop tiling. Specify the properties of a "good" composition. 2) Propose an algorithm for hierarchical loop tiling which fulfills the properties formalized in step 1. 3) Implement and test your algorithm. The approach will be validated experimentally on the benchmarks of the polyhedral community.
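Loop tiling itself is easy to demonstrate: the iteration domain of a loop nest is split into atomic blocks, changing the execution order (and hence locality and parallelism opportunities) without changing the result. A Python sketch on matrix multiplication:

```python
def matmul_tiled(A, B, tile=2):
    """Matrix product with the i/j/k loops tiled into blocks: the
    three outer loops enumerate tile x tile x tile blocks of the
    iteration domain, and the three inner loops sweep one block.
    The result is identical to the untiled triple loop; only the
    execution order (and thus cache behavior) changes."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        for k in range(kk, min(kk + tile, m)):
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

The question the internship asks is how the tile shapes and sizes chosen for several such kernels, analyzed separately, can be composed into one consistent global tiling.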
Requirements: notions of compilation and parallelism. URL sujet détaillé : http://perso.enslyon.fr/christophe.alias/stages/modulartilingen.pdf
Remarques :





SM20715 Automatic circuit transformations for fault-tolerance and their certification in Coq


Description


Circuit fault-tolerance is nowadays a major research issue for safety-critical applications, due to the increased risk of soft errors. This risk results from the continuous shrinking of transistor sizes, which makes components more sensitive to radiation.
The standard fault-tolerance technique is Triple Modular Redundancy (TMR), which consists in triplicating the circuit and inserting majority voters to filter out corrupted bits [1]. Recently, we have proposed several novel fault-tolerance techniques based on time redundancy [2], trading throughput for minimal hardware overhead. Circuits are represented using a functional hardware description language, and fault-tolerance techniques as program (circuit) transformations. Since fault-tolerance is typically used in critical domains (aerospace, nuclear power, etc.), the correctness of such transformations is essential. While there is little doubt about the correctness of simple transformations such as TMR, this is not the case for more intricate ones. We have used the Coq proof assistant to formalize the semantics of a hardware description language and several fault models, in order to formally prove the correctness of our time-redundant transformations [3].
With recent nanoscale technologies, a high-energy particle strike often produces several errors, called SEMTs ("single event multiple transients"), in adjacent flip-flops or wires. However, the techniques mentioned above can only handle a single error at a time.
The objective of the work is to study extensions of these techniques to tolerate SEMTs. After a bibliographic study of fault-tolerance transformations for digital circuits, the following topics could be considered:
- The description of SEMT fault models using information about the location of cells, gates and wires (e.g. what simultaneous corruptions are possible, how they propagate, etc.).
- The analysis of circuits to infer where the different cells, gates and wires should be placed relative to each other to prevent SEMTs from leading to actual failures.
- The design of a new fault-tolerance technique (possibly by combining several already known ones) able to mask SEMTs.
- The formalization of the SEMT fault models and masking techniques, and their certification in the Coq proof assistant.
Depending on his/her background and interests, the candidate may focus on only one or two of the above tasks.
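For intuition, the TMR baseline of [1] is tiny: triplicate the circuit and vote bitwise on the outputs. The sketch below (illustrative Python, not the team's hardware description language) masks any single-replica corruption; with an SEMT corrupting two replicas, the voter would pass the corrupted value through, which is exactly why new techniques are needed:

```python
def majority(a, b, c):
    """Bitwise majority voter: each output bit is the value held by
    at least two of the three replicas."""
    return (a & b) | (a & c) | (b & c)

def tmr(circuit, inputs, fault=None):
    """Run three copies of a combinational circuit and vote.
    `fault`, if given, is a (replica_index, bit_mask) pair that flips
    output bits of one replica (a single-event upset); the voter
    masks any such single-replica corruption."""
    outs = [circuit(inputs) for _ in range(3)]
    if fault is not None:
        replica, mask = fault
        outs[replica] ^= mask
    return majority(*outs)
```

Modeling which replicas a single particle strike can corrupt simultaneously, given placement information, is the first topic listed above.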
References :
[1] R. E. Lyons and W. Vanderkulk. The Use of Triple-Modular Redundancy to Improve Computer Reliability. IBM Journal, 1962.
[2] Dmitry Burlyaev, Pascal Fradet, and Alain Girault. Automatic time-redundancy transformation for fault-tolerant circuits. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015.
[3] Dmitry Burlyaev and Pascal Fradet. Formal verification of automatic circuit transformations for fault-tolerance. In Formal Methods in Computer-Aided Design, FMCAD, 2015.
Context :
The main goal of the Spades team is the safe design of realtime embedded systems. URL sujet détaillé :
:
Remarques :





SM207-16 Hardware Compilation


Description


A process network is a collection of pure functions communicating through channels. Process networks are a natural intermediate representation for hardware compilation: the frontend extracts the parallelism and derives the process network; the backend then maps the process network to hardware. In this internship, we propose to focus on the compilation of channels to hardware.
Compiler optimizations for parallelism and data locality deeply restructure the execution order of the program, and hence the read/write patterns in communication channels. This breaks most FIFO channels, which then have to be implemented with addressable buffers. Expensive hardware is required to enforce synchronizations, which often results in dramatic performance loss. We have proposed an algorithm to reorganize the channels so that the FIFOs broken by loop tiling can be recovered, and we have proved this algorithm to be complete on DPNs, a constrained family of process networks.
The goal of this internship is to study the completeness of FIFO recovery on general process networks. Does there exist an algorithm to recover the FIFOs broken by any tiling transformation? In the positive case (our belief), find such an algorithm and prove its completeness; otherwise, point out the fundamental reasons why such an algorithm cannot exist. Concretely, the internship will: study several polyhedral programs where our previously developed technique did not work; propose a complete algorithm together with a proof of completeness, or prove that no such algorithm can exist; and, in the positive case, implement and test the algorithm. The approach will be validated experimentally on the benchmarks of the polyhedral community.
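As a toy illustration (hypothetical access patterns, not the DPN formalism of the subject): loop tiling can reorder a consumer's reads so that they no longer match the producer's write order, which is exactly what breaks a FIFO channel and forces an addressable buffer.

```python
def producer_order(n):
    """The producer writes channel cells 0..n-1 in sequence."""
    return list(range(n))

def tiled_consumer_order(n, tile):
    """The tiled consumer reads tile by tile, here with a reversed
    access pattern inside each tile (an arbitrary tiling effect)."""
    order = []
    for t0 in range(0, n, tile):
        for i in reversed(range(t0, min(t0 + tile, n))):
            order.append(i)
    return order

def is_fifo(writes, reads):
    """A channel can be implemented as a FIFO iff values are read
    in exactly the order they were written."""
    return writes == reads

n, tile = 8, 4
assert is_fifo(producer_order(n), producer_order(n))
assert not is_fifo(producer_order(n), tiled_consumer_order(n, tile))
```

The recovery problem studied here is, informally, to re-split or reorganize such channels so that each resulting channel again satisfies `is_fifo`.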
Requirements: notions in compilers and parallelism. URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/fifoen.pdf
Remarques :





SM207-17 Types for reconfiguration


Description


Scientific context
Location graphs [1] are a new formal model for software components currently developed in the INRIA SPADES team. The model combines dynamic reconfiguration, graph control structures and isolation properties, and subsumes located variants of the pi-calculus [2] as well as more abstract computational models, such as Milner's bigraphs [3].
Subject of the internship
The goal of this internship is to develop a simple I/O type system for location graphs, inspired by notions of I/O types for the pi-calculus [4], of mobility types in Boxed Ambients [5], and of types for a dynamic component-based language [6]. Apart from communication, location graphs have capabilities for passivating locations and for modifying the location graph structure. An I/O type system for location graphs needs to take these capabilities into account to prevent simple but unusual forms of errors, such as unlawful update, binding and movement in a location graph. If successful, the internship can also begin to investigate more complex forms of types dealing with structural invariants, including conditions such as encapsulation or isolation between components, a topic related to the aliasing problem in object-oriented programming languages and in particular to the notions of ownership types that have been developed to cope with it [7].
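A minimal sketch of the flavor of such a capability-based type discipline (the capability names below, including `passivate` and `update`, are illustrative placeholders, not the actual location-graph typing rules): each name carries a set of granted capabilities, and a process is well-typed only if every action it performs is permitted by the type of the name it acts on.

```python
def well_typed(actions, cap_types):
    """actions: list of (capability, name) pairs performed by a process;
    cap_types: name -> set of capabilities its type grants.
    Capabilities here mix channel I/O ('in', 'out') with
    hypothetical location-graph-style ones ('passivate', 'update')."""
    return all(cap in cap_types.get(name, set()) for cap, name in actions)

types = {"a": {"in", "out"},       # channel a: full I/O capabilities
         "loc1": {"passivate"}}    # location loc1: may only be passivated
assert well_typed([("out", "a"), ("passivate", "loc1")], types)
assert not well_typed([("update", "loc1")], types)  # unlawful update rejected
```

A real type system would of course also track how capabilities are communicated and restricted, as in the I/O types of [4].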
References
[1] J.B. Stefani : "Components as location graphs", LNCS 8997, Springer, 2015.
[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.
[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.
[4] D. Sangiorgi, D. Walker : "The pi-calculus: A Theory of Mobile Processes", Cambridge University Press, 2001.
[5] M. Bugliesi, S. Crafa, M. Merro, V. Sassone: "Communication and mobility control in boxed ambients", Information and Computation 202(1), 2005.
[6] J.C. Seco, L. Caires : "Types for dynamic reconfiguration", LNCS 3924, Springer, 2006.
[7] D. Clarke, J. Ostlund, I. Sergey, T. Wrigstad: "Ownership Types: A Survey", LNCS 7850, Springer, 2013. URL sujet détaillé :
:
Remarques :





SM207-18 Delivering VoD services to vehicles along a highway


Description


Vehicular networks, bringing connectivity to vehicles for safety and entertainment purposes, are expected to be deployed in the upcoming years. While different technologies including LTE and WiMAX have been considered, IEEE 802.11p (a.k.a. WiFi) has attracted much attention.
In this work, we consider the following scenario. Passengers in a vehicle wish to use a VoD (Video on Demand) service in order to watch a movie during their journey. Several RSUs (Road Side Units) delivering a wireless connection to passing-by vehicles are scattered along the road. We assume that the network operator may adapt the quality of the video (changing its codec) or block downloads from cars having a bad connection, so that the passengers watch their movie without any disruption. To achieve this goal, one needs to download enough data and fill a buffer with frames that are kept to be played later, when the vehicle has no connection. It is worth keeping in mind that with IEEE 802.11, cars may download the video at different speeds (transmission rates) depending on their distance to the nearest RSU.
We have already developed a theoretical model to forecast the performance and Quality of Experience (expressed as a disruption time) of each vehicle. This model was validated against a discrete-event simulator [1].
The objective of this work is to design (and validate against simulations) an algorithm that selects the video codec or blocks the downloads of remote vehicles so that passengers experience seamless playback of their movie. The algorithm will be based on the predictive model presented in [1] (or an improved version of it), which takes into account the number of RSUs, the expected number of contending vehicles on each RSU, the bandwidth of the RSUs, the distance to travel, the length of the movie, etc.
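A minimal sketch of the buffer dynamics behind such an algorithm (hypothetical rates and a deliberately simpler model than [1]): the buffer fills while under RSU coverage and drains during playback, and disruption time accumulates whenever the buffer cannot sustain the play rate.

```python
def disruption_time(rates, play_rate, dt=1.0):
    """Total playback disruption along a journey (toy model).

    rates: download rate at each time step (0 outside RSU coverage);
    play_rate: frames consumed per time unit while playing."""
    buffered, disruption = 0.0, 0.0
    for r in rates:
        buffered += r * dt                 # frames downloaded this step
        if buffered >= play_rate * dt:
            buffered -= play_rate * dt     # playback proceeds
        else:
            disruption += dt               # buffer empty: playback stalls
    return disruption

# 10 s under a 30 frame/s RSU, then 10 s without coverage, 24 frame/s playback:
assert disruption_time([30] * 10 + [0] * 10, play_rate=24) == 8.0
```

A selection algorithm would use such a forecast to lower the codec rate (reducing `play_rate`) or block a download whenever the predicted disruption is nonzero.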
Keywords: selective algorithm; modeling; Video on Demand; 802.11 (WiFi); discrete-event simulator (ns-3); performance evaluation.
[1] Video on Demand in IEEE 802.11p-based Vehicular Networks: Analysis and Dimensioning. T. Begin, A. Busson, I. Guérin-Lassous, A. Boukerche. ACM MSWiM 2018, Montreal (Canada).
URL sujet détaillé : http://perso.ens-lyon.fr/thomas.begin/tmp/stageVoDvehicular.pdf
Remarques : Supervisors: Thomas Begin, Isabelle Guérin-Lassous and Anthony Busson. E-mails: thomas.beginlyon.fr; isabelle.guerinlassouslyon.fr; anthony.bussonlyon.fr





SM207-19 Logics for reconfiguration


Description


Scientific context
Location graphs [1] are a new formal model for software components currently developed in the INRIA SPADES team. The model combines dynamic reconfiguration, graph control structures and isolation properties, and subsumes located variants of the pi-calculus [2] as well as more abstract computational models, such as Milner's bigraphs [3].
Subject of the internship
The goal of this internship is to develop an adequate logic for location graphs. An adequate logic for a computational model provides a logical characterization of behavioral equivalence in the model. In the case of location graphs, such a logic should cover modalities à la Hennessy-Milner [4], as well as spatial modalities to deal with the underlying hypergraph structure of a location graph. Several works have looked into adequate logics for computational models with spatial modalities, such as Ambient calculi [5,6]. The difficulty with location graphs lies in taking into account encapsulation and isolation constraints. These would probably require considering constructs analogous to those found in separation logic [7,8,9].
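For concreteness, a minimal Hennessy-Milner model checker over a labelled transition system can be sketched as follows (illustrative only; an adequate logic for location graphs would add spatial and separation-style modalities on top of such behavioral modalities):

```python
def sat(lts, state, formula):
    """Check whether `state` satisfies a Hennessy-Milner formula.

    lts: dict mapping (state, action) to the set of successor states.
    formula: 'tt' (true), ('not', f), ('and', f, g),
             or ('dia', a, f) for the diamond modality <a>f."""
    if formula == 'tt':
        return True
    kind = formula[0]
    if kind == 'not':
        return not sat(lts, state, formula[1])
    if kind == 'and':
        return sat(lts, state, formula[1]) and sat(lts, state, formula[2])
    if kind == 'dia':   # <a>f: some a-successor satisfies f
        _, a, f = formula
        return any(sat(lts, s2, f) for s2 in lts.get((state, a), set()))
    raise ValueError(formula)

lts = {('p', 'a'): {'q'}, ('q', 'b'): {'p'}}
assert sat(lts, 'p', ('dia', 'a', ('dia', 'b', 'tt')))   # p |= <a><b>tt
assert not sat(lts, 'p', ('dia', 'b', 'tt'))             # p has no b-step
```

Adequacy means that two states satisfy exactly the same formulas of the logic iff they are behaviorally equivalent (bisimilar, for image-finite systems).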
References
[1] J.B. Stefani : "Components as location graphs", LNCS 8997, Springer, 2015.
[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.
[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.
[4] L. Aceto, A. Ingolfsdottir, K.G. Larsen, and J. Srba : "Reactive Systems", Cambridge University Press, 2007.
[5] L. Caires and L. Cardelli. A spatial logic for concurrency (part i). Inf. Comput., 186(2), 2003.
[6] L. Caires and L. Cardelli. A spatial logic for concurrency  ii. Theor. Comput. Sci., 322(3), 2004.
[7] P. O'Hearn : "Resources, concurrency and local reasoning", Theoretical Computer Science, 375(1-3), 2007.
[8] P. W. O'Hearn, H. Yang, and J. C. Reynolds : "Separation and information hiding", ACM Trans. Program. Lang. Syst., 31(3), 2009.
[9] A. Francalanza, J. Rathke, and V. Sassone : "Permissionbased separation logic for messagepassing concurrency", Logical Methods in Computer Science, 7(3), 2011. URL sujet détaillé :
:
Remarques :





SM207-20 Online assignment games – trading off mismatch and waiting costs


Description


Assignment markets are ubiquitous: from assigning sellers to buyers, workers to jobs, or workload to machines. Shapley and Shubik (1972) introduced a model where each seller is looking for exactly one buyer and vice versa. They show that, generically, there exists a unique optimal matching.
This project will study online assignment markets, where buyers and sellers arrive to the market over time. Such markets may be classic labor markets or market institutions that have emerged in recent years, such as ride-sharing applications. For any market, modern technology allows the market designer to vary (at low cost) at what points in time to clear the market (Emek et al., 2016; Akbarpour et al., 2017). At one end of the scale, she can immediately match buyers and sellers (and remove them from the market). At the other end, she can wait until all buyers and sellers have arrived and then apply the optimal matching. We call this design decision of the market designer the clearing schedule. This project seeks to understand: 1. How does the expected payoff depend on the clearing schedule? 2. What are (optimal) clearing schedules with respect to a specific cost function of the waiting time?
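A toy model of the trade-off between the two extreme clearing schedules (the cost structure below is an illustrative assumption, not the Shapley-Shubik model itself): agents are points on a line, mismatch cost is the distance between matched partners, and the designer either clears immediately in arrival order or waits and clears once with the optimal sorted pairing.

```python
def immediate_clearing_cost(buyers, sellers):
    """Clear on arrival: the k-th arriving buyer is matched with the
    k-th arriving seller; nobody waits, but mismatch can be poor."""
    return sum(abs(b - s) for b, s in zip(buyers, sellers))

def batch_clearing_cost(buyers, sellers, wait_cost):
    """Clear once at the end: pairing sorted values minimizes total
    mismatch on a line, but every agent pays a waiting cost."""
    mismatch = sum(abs(b - s)
                   for b, s in zip(sorted(buyers), sorted(sellers)))
    return mismatch + wait_cost * (len(buyers) + len(sellers))

# Two buyers/sellers with values 9 and 1, arriving in opposite orders:
buyers, sellers = [9, 1], [1, 9]
assert immediate_clearing_cost(buyers, sellers) == 16  # 8 + 8
assert batch_clearing_cost(buyers, sellers, wait_cost=1) == 4  # 0 + 4
```

Intermediate clearing schedules (clear every k arrivals) interpolate between these two costs, which is the design space the project investigates.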
The outcome of this research will ideally be presented at an international conference with both academics and market design practitioners.
Required skills: Mandatory knowledge of basic probability theory and combinatorics. In addition, simulation skills can be of use. URL sujet détaillé :
:
Remarques : The internship may be continued as a PhD. The internship is paid.





SM207-21 Pervasive platform for learning-based applications


Description


The purpose of the internship is to provide new pervasive platforms and architectures based on the fog paradigm in order to run smart, AI-based applications close to devices and users. This approach contrasts with current cloud-based solutions, which present significant drawbacks in terms of security, performance, and affordability.
The project will extend an existing serviceoriented platform (see http://selfstar.imag.fr) with new abstractions and mechanisms facilitating the development and evolution of smart applications. These extensions are meant to integrate individualized learning models and interact with remote platforms to distribute computing. The core of the work will be to better specify pervasive applications and their needs.
The work will be done in cooperation with Hong Kong Polytechnic University (a top-100 university). It will be applied to energy savings and crowd management in Hong Kong high-rise buildings, for which more than four years of data will be made available. Several learning-based applications have already been developed by the Hong Kong team. A trip to Hong Kong might be necessary. URL sujet détaillé :
:
Remarques : Highly motivated student needed to work in an international context.





SM207-22 Deep learning for anomaly detection


Description


We are seeking a self-motivated and enthusiastic machine learning student to join the Worldline R&D team.
Machine learning interns will have the opportunity to explore the application of deep learning algorithms to anomaly detection problems. In this internship, sequential information will be extracted automatically to help anomaly detection. Recurrent Neural Networks (RNNs) are often used for this purpose. The work consists in using different RNN extensions in parallel in order to deal with multi-source problems. In this context, you will have the opportunity to work on the latest general-purpose GPU technologies available on our in-house servers, with a variety of deep learning frameworks already installed.
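As a minimal illustration of sequence-based anomaly scoring (a moving-average predictor stands in for the RNN here; all names and thresholds are illustrative): each point is scored by its deviation from a prediction based on recent history, and an RNN/LSTM would replace the predictor to learn richer sequential structure.

```python
def anomaly_scores(series, window=3):
    """Score each point by its deviation from a short-history
    prediction (moving average as a stand-in for a learned model)."""
    scores = []
    for t in range(len(series)):
        hist = series[max(0, t - window):t]
        pred = sum(hist) / len(hist) if hist else series[t]
        scores.append(abs(series[t] - pred))
    return scores

def detect(series, threshold, window=3):
    """Indices whose anomaly score exceeds the threshold."""
    return [t for t, s in enumerate(anomaly_scores(series, window))
            if s > threshold]

# A spike in an otherwise stationary stream is flagged:
assert detect([1, 1, 1, 1, 9, 1, 1, 1], threshold=4) == [4]
```

With several input sources, one such scorer (or RNN branch) per source could run in parallel and the scores be fused, which is the multi-source setting mentioned above.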
Skills acquired at the end of the internship: program with efficient technologies for the case at hand; gain practical experience with machine learning; progress in AI algorithms and learn how to choose the right strategy and a suitable algorithm; produce demonstrative and documented results; develop technologies that are not restricted to a single use case; collaborate with researchers and engineers; participate in proposing new exploratory fields. URL sujet détaillé : https://jobs.atos.net/job/lyon/deeplearningforanomalydetectionhfenstage/5343/9159388
Remarques :





SM207-23 Evaluating the impact of cache control technologies on HPC workloads


Description


Many scientific processes can be expressed as a workflow graph where nodes are computational tasks (scientific simulations, analysis, visualization) and edges are data exchanges between tasks. Traditional workflow infrastructures exchange data through files. However, the increasing gap between I/O bandwidth and computational capabilities in current and future supercomputers requires a change in the way scientists are analyzing data produced by simulations. Filebased workflows must now be replaced by in situ workflows performing data extraction, data reduction, or online visualization before storing relevant data to the file system.
On current supercomputers, an in situ workflow execution can span thousands of nodes. Such an execution may exhibit single tasks distributed across multiple nodes, or multiple tasks sharing a single node. In the latter case, tasks must be isolated from each other to avoid potential performance degradation.
For shared resources like caches in many-core architectures, recent technologies have been made available to partition them among distinct groups of processes, for example the Cache Allocation Technology (CAT) on modern Intel processors. Using an interface similar to low-level container mechanisms, one can use this technology to split shared caches between competing workloads.
Unfortunately, the impact of such technology on the performance of HPC workloads is still not well understood, making it difficult to choose the right allocation policy, or to tune it on the fly during execution.
To address this issue, the purpose of this internship is to perform a wide experimental study of the impact of cache controls on HPC workloads, with a focus on runtime performance in the context of multiple workloads competing for resources. This study will result in the design of predictive models of how a cache control policy would impact a known workload at runtime, focusing on either cache capacity or cache bandwidth control. The evaluation will use currently available processors and existing software allowing the control of Intel's CAT.
This internship is part of a collaboration between Grenoble University and Argonne National Laboratory. A successful candidate will have the opportunity of visiting Argonne during the summer to further this work. URL sujet détaillé :
:
Remarques :





SM207-24 Online HPC workload data analysis for light and reactive control


Description


Many scientific processes can be expressed as a workflow graph where nodes are computational tasks (scientific simulations, analysis, visualization) and edges are data exchanges between tasks. Traditional workflow infrastructures exchange data through files. However, the increasing gap between I/O bandwidth and computational capabilities in current and future supercomputers requires a change in the way scientists are analyzing data produced by simulations. Filebased workflows must now be replaced by in situ workflows performing data extraction, data reduction, or online visualization before storing relevant data to the file system.
On current supercomputers, an in situ workflow execution can span thousands of nodes. Such an execution may exhibit single tasks distributed across multiple nodes, or multiple tasks sharing a single node. In the latter case, tasks must be isolated from each other to avoid potential performance degradation.
However, the behavior of each task and the performance of the HPC application itself depend on several external factors: the input dataset, the number of computing resources available, possible network contention, etc. In such a situation, the HPC runtime should adapt its placement and isolation policies online, in reaction to the actual execution it controls. The cost of running this control process can be high, so the runtime often chooses to perform it sporadically to lessen its global impact. The main issue is then to decide when it is most pertinent to run the control process in order to adapt the resource policies.
The purpose of this internship is to design a method to analyze an application online, that is, during its execution. The idea is to continuously collect a trace containing software and hardware metrics, and to analyze it in order to detect changes in the execution. We are mainly interested in detecting phase changes (the application switches from computation to communication or conversely), load changes (the input dataset is irregular and the computation is not balanced) and environmental changes (network contention, nodes going down). The objective is to sort out what is relevant (i.e., has an impact on performance) from what is not, and to trigger the runtime control algorithms only when a relevant change occurs.
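A minimal sketch of the kind of online change detection targeted here (window sizes, thresholds and the metric are illustrative assumptions): compare the mean of a metric over the most recent window against the previous window, and trigger the control algorithms only when the deviation is significant.

```python
def change_points(metric_stream, window=4, threshold=2.0):
    """Flag a change when the mean of the most recent `window` samples
    deviates from the mean of the preceding window by more than
    `threshold`; the runtime control would be triggered only at these
    points instead of periodically."""
    changes = []
    for t in range(2 * window, len(metric_stream) + 1):
        prev = metric_stream[t - 2 * window:t - window]
        curr = metric_stream[t - window:t]
        if abs(sum(curr) / window - sum(prev) / window) > threshold:
            changes.append(t - 1)  # index of the sample that triggered
    return changes

# A compute phase (high IPC) followed by a communication phase (low IPC):
stream = [8, 8, 8, 8, 8, 8, 8, 8, 1, 1, 1, 1, 1, 1]
cps = change_points(stream, window=4, threshold=2.0)
assert cps and cps[0] == 9   # the phase switch is detected shortly after t=8
```

An actual solution would work on multivariate traces (hardware counters, communication volumes) and use more robust statistics, but the triggering principle is the same.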
This internship is part of a collaboration between Grenoble University and Argonne National Laboratory. A successful candidate will have the opportunity of visiting Argonne during the summer to further this work. URL sujet détaillé :
:
Remarques :





SM207-25 Verified programming of stream functions


Description


In this internship, we study the verified programming language F* (Inria-Microsoft) in order to define a concurrency model supporting the generation of reactive code (preemptive execution, dynamic scheduling) or real-time code (non-preemptive execution, static scheduling), using F*'s Kremlin code generator and the Meta-F* metaprogramming tools. URL sujet détaillé : https://www.irisa.fr/prive/talpin/frp.pdf
Remarques : Internship co-supervised by Yuan Rui Zhang, postdoctoral researcher, in the context of the Composite associate team with UC San Diego (MESL)





SM207-26 A logical framework to verify requirements of hybrid system models


Description


The objective of the internship is to build upon previous work on the compositional verification of hybrid systems, which yielded a component-based approach for system design with an executable semantics of contracts [3], as well as a formalization of cyber-physical systems into components with a strong logical foundation grounded in differential dynamic logic (dL), together with an automatic composition theorem to prove system properties from individual component contracts. URL sujet détaillé : https://www.irisa.fr/prive/talpin/cps.pdf
Remarques : Internship co-supervised by Benoit Boyer, Mitsubishi Electric R&D Europe (MERCE)





SM207-27 Study and convergence proof of a deterministic Machine Learning algorithm


Description


Rigorous software development, based on formal methods, requires models of the program. Since such models are not always available, various inference methods have been proposed to reconstruct them. We assume that the system can be represented, at some level of abstraction, by a finite-state machine with certain inputs and outputs. The VASCO team at LIG has been interested for several years in "active" inference techniques, which build this state machine by observing its outputs in response to various inputs. These learning techniques rely on black-box testing and assume access neither to the source code nor even to the binary code of the software.
These techniques fit within the theoretical framework of grammatical inference: recovering a model of a grammar (or, here, of an automaton) from observed sequences of events (time series, viewed as words of a language over the alphabet of events).
They have been used for about a dozen years to recover models of systems tested in black-box mode (model reverse-engineering), whether for verification, documentation, the search for security flaws, etc.
STARTING POINT
The VASCO team at LIG in Grenoble, in collaboration with the University of São Paulo, has recently developed a new algorithm called hW, which is very efficient in the case where the system cannot be reset, so that it must be learned over the course of a single execution. The hW algorithm has very good performance on average, but its convergence is difficult to prove.
WORK TO BE DONE
The intern will study the hypotheses that enable or facilitate the convergence proof of the inference algorithm.
A generalization of the algorithm to infer non-strongly-connected automata while minimizing the number of resets could also be studied.
The theoretical work can be complemented by experimental analyses on benchmarks of software systems, and by applying the algorithm to one or more case studies (for example, reverse-engineering of applications embedded on smart cards).
BIBLIOGRAPHIC REFERENCES
[Dagstuhl2016] Amel Bennaceur, Reiner Hähnle, Karl Meinke. Machine Learning for Dynamic Software Analysis: Potentials and Limits. International Dagstuhl Seminar 16172, Dagstuhl Castle, April 24-27, Springer 2018.
[GSBO18] Roland Groz, Adenilso Simao, Nicolas Bremond, Catherine Oriat. Revisiting AI and Testing Methods to Infer FSM Models of Black-Box Systems. AST 2018 (Automated Software Testing), Göteborg, May 2018. URL sujet détaillé :
:
Remarques :





SM207-28 Towards Efficient Big Data Management with Transient Storage Systems


Description


In the High Performance Computing (HPC) domain, huge amounts of data are collected from complex simulations of physical phenomena (such as climate modeling, galaxy evolution, etc.), and data analysis techniques from the field of Big Data are used to extract relevant information. Needless to say, data storage is a cornerstone of Big Data processing. Whereas traditional HPC systems usually separate computational resources from storage, using parallel file systems, new supercomputers now often have local storage devices (such as SSDs or NVRAM) located on their compute nodes. This local storage allows users to deploy new types of distributed storage systems along with the applications. Such a storage system, deployed only for the duration of an application's execution, is called a transient storage system.
This internship focuses on answering the following research question: "How can we efficiently deploy and terminate a transient storage system?" . Indeed, a transient storage system only exists during the run time of the application that uses it, initially hosting no data. When the application terminates, data would be lost if not backed up. Thus, two steps are needed: loading data from the persistent storage at initialization time, and dumping the data onto the persistent storage before termination. Both operations involve large data transfers that may slow down the initialization and termination of such storage systems. Optimizing these transfers would lead to faster application deployments and termination.
Multiple relevant contributions can be achieved during this internship:
1. Theoretical contributions: establish a lower bound (mathematical model) of the duration of the operations of loading and dumping data in order to identify their bottlenecks, and have a baseline for the evaluation of implementations of such systems.
2. Algorithmic contributions: optimize the algorithms used to transfer the data between nodes. For instance, one can consider the network topology, or use AI to monitor the application's behavior and load/dump data in the background.
3. Experimental contributions: implement the algorithms in Pufferbench (a benchmark designed to evaluate commission and decommission algorithms) to experiment with the algorithms previously designed.
4. Validation contributions: implement and evaluate a microservice able to efficiently run the studied operations using Mochi (a set of libraries designed to quickly build distributed storage systems  https://www.mcs.anl.gov/research/projects/mochi/).
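The lower bound of contribution 1 can be sketched with a deliberately simple model (the parameters and formula below are illustrative assumptions, not the bound to be established): the load/dump duration is bounded below both by the slowest per-node transfer and by the aggregate bandwidth of the persistent file system.

```python
def transfer_lower_bound(data_per_node, n_nodes, node_bw, pfs_bw):
    """Illustrative lower bound on the time to load or dump a transient
    storage system: no node can move its share faster than its own
    link, and the aggregate transfer cannot exceed the bandwidth of
    the parallel file system (PFS)."""
    per_node = data_per_node / node_bw              # local-link bound
    aggregate = data_per_node * n_nodes / pfs_bw    # shared-PFS bound
    return max(per_node, aggregate)

# 64 nodes, 100 GB each, 10 GB/s links, 200 GB/s PFS:
# the PFS is the bottleneck (6400/200 = 32 s, vs 10 s per node).
assert transfer_lower_bound(100, 64, 10, 200) == 32.0
```

Identifying which of the two terms dominates for a given platform tells whether optimization effort should go into per-node I/O or into scheduling the shared PFS traffic.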
This subject will be done in close collaboration with Matthieu Dorier from Argonne National Laboratory (ANL, USA). In addition to the possibility to interact with top-level researchers and scientists from ANL, the work can involve the use of large-scale HPC experimental facilities available to Inria and to ANL.
Depending on the quality of the achievements, this internship could be followed by a PhD thesis in the same area, in the same collaboration context. URL sujet détaillé : http://master.irisa.fr/internship/uploaded/5.pdf
Remarques : Advisors:
Gabriel Antoniu  gabriel.antoniu.fr (main advisor)
Nathanaël Cheriere  nathanael.cheriere.fr





SM207-29 HPC-Big Data convergence at the processing level by bridging in-situ/in-transit processing with Big Data analytics


Description


In the High Performance Computing (HPC) area, the need to get fast and relevant insights from massive amounts of data generated by extreme-scale computations led to the emergence of in-situ and in-transit processing approaches. They allow data to be visualized and processed in real time, in an interactive way, as they are produced, as opposed to the traditional approach consisting of transferring data offsite after the end of the computation for offline analysis. In the Big Data Analytics (BDA) area, the search for real-time, fast analysis materialized through a different approach: stream-based processing. As the tools and cultures of HPC and BDA have evolved in divergent directions, motivated by different optimization criteria, it becomes clear today that leveraging together the progress achieved in the two areas can be an efficient means to address the Big Data challenges, which are now relevant for both HPC and BDA.
This Master internship will be hosted by the KerData team at IRISA/Inria Rennes Bretagne Atlantique (https://team.inria.fr/kerdata/). It aims to propose an approach enabling HPC-Big Data convergence at the data processing level, by exploring alternative solutions to architect a unified framework for extreme-scale data processing. The architecture of such a framework will leverage the extreme scalability demonstrated by in-situ/in-transit data processing approaches originating in the HPC area, in conjunction with Big Data processing approaches that emerged in the BDA area (batch-based, streaming-based and hybrid). The high-level goal of this framework is to enable the usage of a large spectrum of Big Data analytics techniques at extreme scales, to support precise predictions in real time and fast decision making.
In the process of designing the unified data processing framework, we will leverage the Damaris framework for scalable, asynchronous I/O and in-situ and in-transit visualization and processing, developed by KerData at Inria (https://project.inria.fr/damaris). Damaris has already demonstrated its scalability up to 16,000 cores on some of the top supercomputers of the Top500 list, including Titan, Jaguar and Kraken. For the purpose of this internship, Damaris will have to be extended to support Big Data analytics for data processing (e.g., based on the Apache Flink and Apache Spark engines and on their higher-level machine-learning libraries). The work will involve software architecture design, implementation and experimental validation. In particular, adding programmatic support for Big Data analytics on dedicated cores in Damaris will focus on distributed, high-performance, always-available, elastic and accurate data processing. Furthermore, we plan to design, implement and experimentally validate a mechanism for incrementally migrating running stream tasks from the in-situ processing backend to the in-transit one without stopping the query execution.
This subject will be done in close collaboration with Bogdan Nicolae and Pierre Matri from Argonne National Laboratory (ANL, USA). In addition to the possibility to interact with top-level researchers and scientists from ANL, the work can involve the use of large-scale HPC experimental facilities available at ANL, as well as Inria's Grid'5000 distributed testbed (https://www.grid5000.fr/). Depending on the quality of the achievements, this internship could be followed by a Ph.D. thesis in the same area, in the same collaboration context. URL sujet détaillé : http://master.irisa.fr/internship/uploaded/23.pdf
Remarques : Advisors:
Alexandru Costan  alexandru.costan.fr
Gabriel Antoniu  gabriel.antoniu.fr





SM207-30 Safe floating-point optimisation


Description


Satisfiability modulo theories (SMT) is an approach to solving first-order formulas in certain logical theories or combinations thereof. It is used in program verification and security analyses (concolic testing), as well as for certain kinds of optimization problems.
Originally, SMT solvers only allowed ideal arithmetic types (Z, Q), but some now also allow floating-point types. The idea is that if we wish to prove the absence of corner cases in numerical algorithms implemented with floating-point arithmetic, we must be able to specify problems with floating-point variables and floating-point operations.
One way to address this issue is to "bit-blast" floating-point operations into basic operations over bits, as in a hardware implementation of floating-point. This has been done in certain solvers (Z3, ...), but results in formulas that are very hard and slow to solve.
We instead wish to be fast in the "easy case" where a formula holds over floating-point values because it holds over the real numbers, using ideal operations, and still holds when rounding error is accounted for. For this, we will soundly take rounding error into account, with increasing precision.
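A minimal sketch of this "easy case" reasoning (using the standard (1+eps) rounding-error model for correctly rounded IEEE-754 operations; the bound below is an illustrative assumption, not the planned solver integration): a bound proved over the reals is inflated by the accumulated relative rounding error, and the floating-point property follows without any bit-blasting.

```python
def mul_bound_with_rounding(xmax, ymax, ops=1, u=2**-53):
    """If |x| <= xmax and |y| <= ymax over the reals, then after `ops`
    correctly rounded binary64 operations the floating-point result is
    bounded in magnitude by xmax * ymax * (1 + u)**ops, where u is the
    unit roundoff (standard (1+eps) model, underflow ignored)."""
    return xmax * ymax * (1 + u) ** ops

# Over the reals, |x*y| <= 1 for x, y in [-1, 1]; with one rounding,
# |fl(x*y)| <= 1 + 2^-53 < 2, so a property like |fl(x*y)| < 2 can be
# discharged from the real-valued bound alone:
bound = mul_bound_with_rounding(1.0, 1.0)
assert 1.0 <= bound < 2.0
```

When such an inflated real-valued bound is too coarse to decide the formula, the approach would fall back to tighter error analyses or, ultimately, to bit-precise reasoning.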
The goal of this project is to explore this approach. URL sujet détaillé : http://www-verimag.imag.fr/~monniaux/stages/2019/float_optimisation.pdf
Remarques :





SM207-31 Memory-aware scheduling for the StarPU runtime


Description


Parallel computing platforms are used, among others, to perform large-scale numerical simulations and are becoming increasingly complex, as they now include computing accelerators, deep memory hierarchies, and the combination of shared and distributed memories. One way to take advantage of their raw power despite this complexity is to rely on lightweight scheduling runtimes, such as the StarPU runtime. A scientific computing application is then described as a directed graph of tasks, where arcs represent dependencies between tasks. The scheduler allocates available tasks to the various computing resources of the platform. One of the difficulties in solving such a scheduling problem efficiently is linked to memory: we have to avoid the scenario where the platform runs out of memory, otherwise the application would crash (or experience very bad performance because of swapping).
Solutions to this problem have been proposed. First, schedulers taking into account the amount of available memory have been designed for the case of a single computing resource, or for specific task graphs on parallel resources. More recently, we have proposed a method to add new dependencies to the task graph such that any valid schedule of the resulting graph is guaranteed not to exceed a given amount of memory. The benefit of this method is to deal with memory bottlenecks before the execution, and to allow for runtime schedulers, which are able to dynamically adapt to changing conditions in the computing platform. Unfortunately, the complexity of the algorithm and the large number of added dependencies make this method hard to use in practice.
The goal of this internship is to improve this method by limiting the needed guarantees: instead of proving that no execution will exceed the memory (the actual guarantee), it is enough to make sure that no memory bottleneck will occur, as runtime schedulers such as StarPU are able to block tasks requesting memory until other tasks release enough of it. The objective is to dramatically limit the number of new dependencies added to the graph.
The internship will consist of the following steps:
- Model the problem using graph concepts, select the most appropriate memory model and express the objective as a graph property.
- Propose algorithmic solutions to the problem, with special care for their complexity.
- Test the proposed solutions first by simulation, on actual application graphs.
- Implement some of these solutions in the StarPU runtime.
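Why added dependencies reduce peak memory can be seen in a toy sequential model (a sketch of ours, not StarPU's actual memory model: one task at a time, a task's output is freed once its last consumer has run):

```python
def peak_memory(schedule, size, consumers):
    """Peak memory of a sequential execution of a task DAG (toy model):
    running t allocates size[t]; the output of p is freed once every task
    in consumers[p] has run (final outputs are kept)."""
    remaining = {t: set(consumers.get(t, ())) for t in schedule}
    mem = peak = 0
    done = set()
    for t in schedule:
        mem += size[t]
        peak = max(peak, mem)
        done.add(t)
        for p in done:
            if remaining[p] is not None and remaining[p] and remaining[p] <= done:
                mem -= size[p]          # last consumer of p's output has run
                remaining[p] = None
    return peak

# Two independent chains a1->a2 and b1->b2 with large intermediate data:
size = {'a1': 10, 'a2': 1, 'b1': 10, 'b2': 1}
cons = {'a1': {'a2'}, 'b1': {'b2'}}
print(peak_memory(['a1', 'b1', 'a2', 'b2'], size, cons))  # interleaved order
print(peak_memory(['a1', 'a2', 'b1', 'b2'], size, cons))  # order forced by adding a2->b1
```

Here a single added dependency (a2 before b1) serializes the two chains and lowers the peak from 21 to 12: the method described above systematizes this, and the internship asks how few such dependencies suffice.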
The distribution of the work between theory (designing algorithms and proving them) and practice (simulation and real implementation) is flexible and will depend on the skills and wishes of the candidate. URL sujet détaillé : http://perso.enslyon.fr/loris.marchal/sujetstageM2en.pdf
Remarques : Co-advised by Samuel Thibault (Univ. Bordeaux), leader of the StarPU development, who will be on leave at the LIP during the internship.





SM20732 Smoothed analysis of walking strategies in Delaunay triangulation


Description


The smoothed analysis technique allows a trade-off between worst-case complexity and average-case (random) complexity by varying a noise level: no noise is the worst case, while a lot of noise means fully random positions.
The aim of the internship is to apply this kind of analysis to algorithms that traverse a triangulation using the neighborhood relations between triangles. Such algorithms have applications in routing and point location. URL sujet détaillé : https://members.loria.fr/Olivier.Devillers/stagesmoothed.pdf
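The kind of traversal to be analyzed can be sketched as a visibility walk: from a start triangle, repeatedly cross an edge whose supporting line separates the current triangle from the query point, following neighbor links. The triangulation below (a 2x1 triangle strip with integer coordinates) is our own toy example:

```python
def orient(a, b, c):
    """Exact sign of twice the signed area of triangle abc (integer inputs)."""
    return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

def walk(pts, tris, neigh, start, q):
    """Visibility walk: from `start`, step to the neighbor across any edge
    that has q strictly on its outer side; stop when no such edge exists."""
    path, t = [start], start
    while True:
        v = tris[t]
        moved = False
        for i in range(3):
            a, b = pts[v[(i+1) % 3]], pts[v[(i+2) % 3]]   # edge opposite vertex i
            if orient(a, b, q) < 0 and neigh[t][i] is not None:
                t = neigh[t][i]
                path.append(t)
                moved = True
                break
        if not moved:
            return path   # q is inside (or on the boundary of) triangle t

pts = [(0, 0), (10, 0), (20, 0), (0, 10), (10, 10), (20, 10)]
tris = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4)]          # CCW triangles of the strip
neigh = [(1, None, None), (None, 0, 2), (3, 1, None), (None, 2, None)]
print(walk(pts, tris, neigh, 0, (18, 3)))   # walks across the whole strip
```

The quantity studied by the smoothed analysis is the length of such a path (here 4 triangles) as a function of the noise applied to the point positions.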
Remarques : Statutory internship stipend (gratification légale).





SM20733 DTIS201907 (to be quoted in all correspondence)


Description


Combinatorial optimization problems such as task allocation, scheduling and vehicle routing problems occur in many applications in the aerospace domain: when planning Earth or space observation tasks for satellite constellations, when deploying complex tasks on embedded, real-time manycore computing platforms, when planning the deployment of a robot fleet in a given zone, when planning tasks for aircraft assembly lines, etc. To solve such problems, two main classes of algorithms can be distinguished: complete search methods on the one hand, which are guaranteed to find a solution if one exists and can prove the optimality of a solution, and incomplete search methods on the other hand, which explore candidate solutions in a more heuristic manner, without the guarantees of complete methods, and are typically used for solving problems under limited computation time budgets, for solving very high-dimensional problems, or both. Major advances for complete methods have been brought by the use of conflict-driven techniques, whereby explanations for failures to satisfy the problem constraints are generated and recorded by the search process, so as to never reproduce the bad decisions and to steer the search process away from bad regions of the search space. Such techniques are used for instance for the Boolean satisfiability problem (SAT) in Conflict-Driven Clause Learning (CDCL) [1] algorithms, for satisfiability modulo theories (SMT) solvers with the CDCL(T) algorithm or the Model-Constructing Satisfiability Calculus [2], and in Constraint Programming with the Lazy Clause Generation approach [3]. Nevertheless, complete methods still have issues scaling to high-dimensional problems, which often leads to resorting to incomplete methods such as greedy search, local search or metaheuristic search [4], either as the main search engine, or to guide decisions performed by complete search methods by probing the search space.
The goal of this internship is to study to what extent conflict learning methods inspired by complete search methods can increase the performance of incomplete search methods. The first step of the work is to produce a state-of-the-art report on existing hybrid incomplete/conflict learning methods [5]. The second step is to propose novel ways to combine incomplete methods and conflict learning methods, possibly revisiting the nature of the conflict information handled by the algorithms and the way it is processed and used during search. Last, experiments will be conducted to assess the performance increase of the proposed hybrid method.
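As a toy illustration of the hybridization studied here (our own sketch, not the internship's intended algorithm), one can graft nogood learning onto a greedy local search for SAT: whenever the search is stuck in a non-solution local optimum, it records a clause forbidding that exact assignment before restarting, so the same dead end is never revisited:

```python
import random

def unsat_clauses(clauses, assign):
    """Clauses falsified by `assign`; literal v (resp. -v) means var v True (False)."""
    return [c for c in clauses if not any((l > 0) == assign[abs(l)] for l in c)]

def local_search_with_nogoods(clauses, nvars, max_steps=1000):
    """Greedy local search that learns a nogood at each local optimum, then restarts."""
    random.seed(0)                        # deterministic for the demo
    learned = []
    assign = {v: False for v in range(1, nvars + 1)}
    for _ in range(max_steps):
        if not unsat_clauses(clauses, assign):
            return assign                 # all original clauses satisfied
        current = clauses + learned
        best_v, best_score = None, len(unsat_clauses(current, assign))
        for v in assign:                  # best single flip w.r.t. original + learned
            assign[v] = not assign[v]
            s = len(unsat_clauses(current, assign))
            assign[v] = not assign[v]
            if s < best_score:
                best_v, best_score = v, s
        if best_v is None:                # local optimum: learn its negation, restart
            learned.append([-v if assign[v] else v for v in assign])
            assign = {v: random.random() < 0.5 for v in assign}
        else:
            assign[best_v] = not assign[best_v]
    return None

model = local_search_with_nogoods([[1, 2], [-1, 2], [1, -2]], 2)
print(model)
```

Full-assignment nogoods are the crudest possible conflict information; the internship's open question is precisely what richer conflict information an incomplete method could learn and exploit.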
Bibliography: [1] Conflict Driven Clause Learning, Chapter 4, Handbook of Satisfiability. Armin Biere, Marijn Heule, Hans van Maaren and Toby Walsh (Eds.) IOS Press, 2009.
[2] A Model-Constructing Satisfiability Calculus. Leonardo Mendonça de Moura and Dejan Jovanovic. Verification, Model Checking, and Abstract Interpretation, 14th International Conference (VMCAI), 2013.
[3] Solving RCPSP/max by lazy clause generation. Andreas Schutt, Thibaut Feydy, Peter J. Stuckey and Mark G. Wallace. Journal of Scheduling 16(3): 273-289 (2013).
[4] Handbook of Metaheuristics. Michel Gendreau and Jean-Yves Potvin. Springer Publishing Company, 2010.
[5] Integrating Conflict Driven Clause Learning to Local Search. Gilles Audemard, Jean-Marie Lagniez, Bertrand Mazure and Lakhdar Sais. Proceedings of the 6th International Workshop on Local Search Techniques in Constraint Satisfaction, 2009. URL sujet détaillé : https://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis201907.pdf
Remarques : Internship co-supervised with Rémi DELMAS (ONERA).





SM20734 Measuring information system alignment


Description


In the context of the flexible and agile factory of the future, it is crucial that information systems (IS), the cornerstone of modern organizations, evolve in harmony with the organization they support. The IS's ability to meet business and organizational needs is called alignment. It is regularly considered one of the main concerns of IS managers. In general, alignment is worked out during IS design, so that the designed IS supports both the business strategies and the business processes involved. However, the environment in which companies operate is itself evolving. This implies an adjustment of the alignment over time, and therefore changes to the IS.
Context and objectives: The ability of an organization to evolve is a key factor of its success, the control of the IS dynamics is therefore a main stake. The objective of the project is to set up models to measure the evolution of IS alignment over time.
To address the above barriers, we will work on three complementary scientific areas:
- Assessment of alignment at the different levels (strategic, tactical and project): the challenge is to model the different levels of alignment, then to define the corresponding parameters enabling the evaluation of a given level of alignment, and to articulate them. Metrics will be set up for each level. In addition, the evaluation should be as automated as possible.
- Alignment dynamics modelling: the main problem here is to develop a model that reflects the evolution of the alignment over time for the different levels of alignment.
- Dynamic alignment adjustment methodology: the goal here is to recommend what to do in order to control and adjust the IS alignment over time. In other words, the goal is to put together the alignment assessment models and the model of the alignment dynamics, and then to exploit them to support the management of the IS alignment. In addition, we hope to develop a software prototype implementing the proposed methodology. URL sujet détaillé : https://displab.fr/sites/default/files/disp2019a_m2_internship__va.pdf
Remarques : Internship based at the DISP laboratory of the University of Lyon, in partnership with the ICUBE CSIP team, INSA Strasbourg, which can lead to a PhD thesis.





SM20735 Transformation Proof From Madeus to Petri Nets


Description


Due to the complexity of distributed infrastructures, installation (deployment) and regular maintenance (e.g. updates, fault management, and other elements of normal software lifecycles) of distributed software has become a complex administrative task, composed of many steps and well-defined procedures. In the STACK research team at Inria (France), Helene Coullon works on Madeus, a novel distributed software deployment formal model, as well as its Python implementation in a product called MAD (Madeus Application Deployer). The overall approach that Madeus uses to support more efficient deployment is to expose and leverage opportunities for parallelism in the deployment process. Unfortunately, introducing parallelism also introduces more complexity for the end-user/administrator wanting to deploy a distributed software system. For this reason, Helene Coullon and Ludovic Henrio work in collaboration with other computer scientists to apply formal verification to deployments written with Madeus. Verification is of major importance for large-scale distributed systems, which introduce concurrency and uncertainty.
Among other ongoing work on formal verification, a Madeus deployment process can be translated into a Petri net. As a result, once the transformation is applied, model checkers can be used to perform verification. The formal semantics defining the translation from Madeus to Petri nets has already been defined. Petri nets are particularly well suited to modeling Madeus deployment processes, because the syntax and semantics of a Madeus deployment process are quite similar to those of a Petri net.
The intern will be responsible for proving that the transformation from a Madeus deployment process to a Petri net is correct, i.e. that the obtained Petri net has the same behavior as the deployment formally defined with Madeus. The missions of the intern are to:
- understand both the operational semantics of Madeus and its semantics as a translation to Petri nets,
- study which way of expressing the semantics of Petri nets is best adapted to the work,
- study the different possibilities for proving the correctness of the transformation and choose one; the main question to be answered here is: what kind of equivalence do we prove between the two semantics and what are its properties,
- perform the proof; eventually, the proof could be carried out with the Coq proof assistant, but a thorough paper proof would also be satisfactory. URL sujet détaillé : http://helenecoullon.fr/download/internshippositionens.pdf
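The Petri net semantics that the translation targets fits in a few lines: a marking assigns tokens to places, and a transition fires by consuming one token per input place and producing one per output place. The two-transition "deployment" net below is our own illustrative assumption, not Madeus's actual translation:

```python
class PetriNet:
    """Minimal place/transition net; a marking maps place -> token count."""
    def __init__(self, transitions):
        self.transitions = transitions   # name -> (input places, output places)

    def enabled(self, marking):
        return [t for t, (ins, _) in self.transitions.items()
                if all(marking.get(p, 0) >= 1 for p in ins)]

    def fire(self, marking, t):
        ins, outs = self.transitions[t]
        assert t in self.enabled(marking), f"{t} not enabled"
        m = dict(marking)
        for p in ins:
            m[p] -= 1                    # consume one token per input place
        for p in outs:
            m[p] = m.get(p, 0) + 1       # produce one token per output place
        return m

# Hypothetical deployment ordering: 'install' must complete before 'configure'.
net = PetriNet({
    'install':   (['ready'],     ['installed']),
    'configure': (['installed'], ['running']),
})
m0 = {'ready': 1}
m2 = net.fire(net.fire(m0, 'install'), 'configure')
print(net.enabled(m0), m2)
```

The correctness proof asked for above amounts to showing that the reachable markings of such a net correspond, under a suitable equivalence, to the reachable states of the Madeus operational semantics.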
Remarques : Co-advised by Ludovic Henrio (ludovic.henrio.fr), CNRS researcher in Lyon at the LIP, ENS Lyon.





SM20736 Rounding 3D meshes


Description


In most software, 3D polygonal meshes have vertices whose coordinates are represented with fixed-precision floating-point numbers, say doubles. Unfortunately, operations such as rotations and intersections create vertices whose coordinates cannot be exactly represented with doubles. A natural problem is thus to round the coordinates to doubles without creating self-intersections between the faces and without changing the geometry too much; the faces can, however, be subdivided.
We recently presented an algorithm that solves the problem in 3D on the integer grid. This solution is not yet known to be practical because its worst-case complexity, O(n^19), is very bad, but we conjecture a complexity of O(n sqrt(n)) in practice on non-pathological data.
The goal of this Master thesis, which could lead to a Ph.D. thesis, is to investigate and eventually develop a practical and efficient solution to this problem. The approach will be to implement a simplified version of our algorithm and, in parallel, to work on improving its efficiency, both in theory and in practice. URL sujet détaillé : https://members.loria.fr/Sylvain.Lazard/stagesnap.pdf
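The core difficulty can be seen on a single vertex (our own minimal illustration): an exactly computed point, such as an intersection, generally has coordinates that doubles cannot represent, and naively snapping it to doubles moves it off the surface it was on, which exact predicates detect:

```python
from fractions import Fraction

def orient(a, b, c):
    """Exact sign of twice the signed area of triangle abc (works on Fractions)."""
    d = (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])
    return (d > 0) - (d < 0)

a, b = (Fraction(0), Fraction(0)), (Fraction(3), Fraction(1))
c = (Fraction(1), Fraction(1, 3))            # exactly on segment ab: y = x/3
print(orient(a, b, c))                       # 0: collinear

# Snap c to double precision, then re-check with exact arithmetic.
snapped = (Fraction(1.0), Fraction(1/3))     # Fraction(float) represents the double exactly
print(orient(a, b, snapped))                 # -1: the vertex dropped below the segment
```

In 2D such a perturbation is tiny; in a 3D mesh, many such perturbations can make faces cross each other, which is exactly the self-intersection the rounding algorithm must avoid.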
Remarques :





SM20737 Build a platform for increasing the transparency of social media advertising


Description


Keywords: audit, privacy, online targeted advertising, Twitter, Facebook, browser extension, data collection and analysis, machine learning, inference, statistics
Lab: Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France (head: Eric Gaussier)
Team in the lab: SLIDE (head: Sihem Amer-Yahia)
Advisor: Oana Goga (CNRS & Univ. Grenoble Alpes, LIG) oana.gogasws.org https://people.mpisws.org/~ogoga/
Project Description: In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2]. The project consists in building a tool that provides explanations for why a user has been targeted with a particular ad, without needing the collaboration of the advertising platform. The key idea is to crowdsource the transparency task to users: the project consists in building a collaborative tool where users donate, in a privacy-preserving manner, data about the ads they receive. The platform needs to aggregate data from multiple users (using machine learning) to infer (statistically) why a user received a particular ad. Intuitively, the tool will group together all users that received the same ad and look at the most common demographics and interests of users in the group. The key challenge is to identify the limits of what we can statistically infer from such a platform. Throughout the project the student will be able to familiarize himself with online targeted advertising ecosystems, learn to collect data from online services, and apply machine learning and statistics concepts to real-world data.
Requirements: Strong coding skills and a strong background in statistics and data mining. URL sujet détaillé : https://people.mpisws.org/~ogoga/offers/Internship_2017_crowdsouring.pdf
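The aggregation intuition (group users by ad, keep attributes shared by most of the group) can be sketched in a few lines; the attribute names, the data, and the 80% support threshold below are all hypothetical:

```python
from collections import Counter

def likely_targeting(ad_id, observations, support=0.8):
    """observations: (ad_id, user_attributes) pairs donated by extension users.
    Returns attributes shared by at least `support` of the ad's recipients."""
    users = [attrs for aid, attrs in observations if aid == ad_id]
    counts = Counter(attr for attrs in users for attr in set(attrs))
    n = len(users)
    return [(attr, c / n) for attr, c in counts.most_common() if c / n >= support]

obs = [
    ('ad42', {'age:25-34', 'interest:running', 'city:Grenoble'}),
    ('ad42', {'age:25-34', 'interest:running', 'city:Lyon'}),
    ('ad42', {'age:35-44', 'interest:running', 'city:Grenoble'}),
    ('ad7',  {'age:18-24', 'interest:gaming'}),
]
print(likely_targeting('ad42', obs))
```

The hard statistical questions named in the subject start exactly here: how many donating users are needed before such a frequency is evidence of targeting rather than coincidence, and how to compute it without deanonymizing donors.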
Remarques : Remuneration possible.





SM20738 Audit the transparency mechanisms provided by social media advertising platforms


Description


Keywords: audit, privacy, online targeted advertising, Twitter, browser extension, data collection and analysis, machine learning, statistics
Lab: Laboratoire d'Informatique de Grenoble (LIG), Grenoble, France (head: Eric Gaussier)
Team in the lab: SLIDE (head: Sihem Amer-Yahia)
Advisor: Oana Goga (CNRS & Univ. Grenoble Alpes, LIG) oana.gogasws.org https://people.mpisws.org/~ogoga/
Project Description: In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2]. To enhance transparency, Twitter recently introduced a feature (called "why am I seeing this") that provides users with an explanation for why they have been targeted with a particular ad. While this is a positive step, it is important to check whether these transparency mechanisms are not actually deceiving users, leading to more harm than good. The goal of the project is to verify whether the explanations provided by Twitter satisfy basic properties such as completeness, correctness and consistency. The student will need to develop a browser extension or a mobile application able to collect the ad explanations received by real-world users on Twitter, and to conduct controlled ad campaigns so that we can collect the corresponding ad explanations. Throughout the project the student will be able to familiarize himself with online targeted advertising ecosystems, learn to conduct online experiments and measure their impact, and reflect on what makes a good explanation.
Requirements: Strong coding skills. Experience in working with data is a plus. URL sujet détaillé : https://people.mpisws.org/~ogoga/offers/Internship_2017_Twitter_explanations.pdf
Remarques : Remuneration possible; possible continuation with a PhD thesis.





SM20739 Energy management in the Linux kernel


Description


In the mobile domain, active thermal regulation, such as a fan, is not possible. To keep the hardware within its thermal envelope, software solutions are implemented in the kernel to regulate the temperature; this is the case of the CPU cooling device mechanism, which adjusts the processor frequency according to the temperature.
Two new strategies are to be studied: idle cycle injection [1] and the simultaneous (combo) use of idle cycle injection and frequency changes [2]. Since the number of available frequencies is limited, the idea is to create virtual frequencies by combining real frequencies with idle cycles.
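The arithmetic behind virtual frequencies is simple: a CPU held idle for a fraction r of the time while otherwise running at f_real delivers an average throughput of f_real * (1 - r). The sketch below (our own illustration; the frequency table is hypothetical, not a real SoC's OPP table) computes the idle ratio needed to emulate a frequency the hardware does not provide:

```python
def idle_ratio(f_real, f_target):
    """Fraction of time to spend in injected idle cycles so that a CPU running
    at f_real delivers, on average, the throughput of frequency f_target."""
    assert 0 < f_target <= f_real
    return 1 - f_target / f_real

def virtual_frequency(opp_freqs, f_target):
    """Pick the lowest real operating point able to emulate f_target,
    together with the idle ratio to inject."""
    f_real = min(f for f in opp_freqs if f >= f_target)
    return f_real, idle_ratio(f_real, f_target)

# Hypothetical OPP table (kHz); emulate 700 MHz, which no real OPP provides.
print(virtual_frequency([533000, 999000, 1402000, 1709000, 1844000], 700000))
```

The latency concern raised below comes from the same formula: the idle time must be injected in chunks, and the chunk length, not the ratio, is what user-visible latency depends on.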
However, this approach may prove too penalizing for the user if it introduces excessive latencies [3].
The objective of the internship is to build a driver, to experiment with the existing framework in order to measure the impact of these solutions, to propose improvements, and to formalize the reasoning mathematically.
The student will be supervised by an IRIT researcher and a kernel development expert.
The goal is for this driver to be merged into the Android (4.9 and 4.14) and Linux kernels.
Prerequisites:
- Be interested in and motivated by this kind of problem
- Want to work in the open-source domain
- Be comfortable with the OSS/GNU/Linux development environment
The student will be provided with a pre-installed laptop as well as a development board [4][5] for testing, for the duration of the internship.
As the work environment is distributed, there are no company premises; IRIT will host the student in its offices.
Without being mandatory, a good command of technical English would be a plus for communicating comfortably with the various stakeholders.
[1] https://git.linaro.org/people/daniel.lezcano/linux.git/commit/?h=thermal/idleinject&id=f31b7021f2910ff6969e998861d9fb269c782579
[2] https://git.linaro.org/people/daniel.lezcano/linux.git/commit/?h=thermal/idleinject&id=9a136388ef1ceddc35f569abc2b4bb5bc33ef21d
[3] https://www.youtube.com/watch?v=4V1ynaj6dq8
[4] https://github.com/96boards/documentation/blob/master/ConsumerEdition/HiKey960/HardwareDocs/HardwareUserManual.md
[5] http://www.cnetfrance.fr/news/hikey960huaweietgooglelancentunnanoordinateurfaconraspberrypi39851830.htm URL sujet détaillé :
:
Remarques : This work will be done in collaboration with (and paid by) Linaro, the company responsible for the thermal/frequency subsystem in Linux on ARM.





SM20740 Complexity of factorization into products of linear forms


Description


We have recently proposed new algorithms for the factorization of multivariate polynomials into products of linear forms. The goal of this internship will be to improve the proposed algorithms in directions such as: (i) better running time; (ii) derandomization; (iii) the ability to handle polynomials of very high degree.
This internship is suitable for a student interested in algorithms and complexity (and especially arithmetic circuit complexity). URL sujet détaillé : http://perso.enslyon.fr/pascal.koiran/stageM2_1819.pdf
Remarques :





SM20741 Integer and real octagons in constraint programming


Description


The goal of this internship is to study more precisely the abstract domain of octagons, as defined in abstract interpretation, in the setting of constraint programming, together with its potential applications to discrete problems. Octagons have already shown good performance on problems with real variables [PTB2014]; moreover, the closure algorithm on octagons [M2006] is very close to dedicated consistency algorithms such as temporal constraint networks [DP1987]. The octagon abstract domain, already well studied in the case of real variables, therefore seems a good candidate for solving problems with integer variables, or mixed integer and real variables, involving temporal constraints (timetabling, scheduling, etc.). URL sujet détaillé : https://www.normalesup.org/~truchet/Docs/sujet2018.pdf
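The closure algorithm mentioned above is, at its core, a shortest-path computation on a difference-bound matrix. A minimal sketch (our own toy instance, following the usual encoding where each variable x is split into +x and -x, and m[i][j] bounds v_i - v_j):

```python
INF = float('inf')

def close(m):
    """Floyd-Warshall closure of a difference-bound matrix: after closure,
    every implied bound v_i - v_j <= c is explicit in m[i][j]."""
    n = len(m)
    d = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

# One variable pair x, y; indices: 0 = +x, 1 = -x, 2 = +y, 3 = -y.
# x - y <= 1 gives m[0][2] and m[3][1];  x + y <= 3 gives m[0][3] and m[2][1].
m = [[0,   INF, 1,   3],
     [INF, 0,   INF, INF],
     [INF, 3,   0,   INF],
     [INF, 1,   INF, 0]]
c = close(m)
print(c[0][1])   # bound on (+x) - (-x) = 2x, implied but absent before closure
```

The closure derives 2x <= 4 (hence x <= 2) from x - y <= 1 and x + y <= 3; on integer variables, the internship's question includes how to tighten such bounds (here to x <= 2 after halving) while keeping the propagation sound and efficient.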
Remarques :





SM20742 Avoiding cluster overload with a control-based approach


Description


Work in cooperation with Olivier Richard (Datamove) and Bogdan Robu (Gipsa-lab)
Context
HPC systems are facing more and more variability in their behavior (related, e.g., to performance and power consumption), and because they are less predictable, their administration requires more complex management. This can be done by monitoring information in the systems and analyzing this data in order to activate appropriate system-level or application-level feedback mechanisms (e.g., informing schedulers, down-clocking CPUs).
Such feedback mechanisms rely on extensive monitoring and analysis, and involve decisions and their execution. Such feedback loops, in the domain of computer science, are the object of Autonomic Computing [1], which emerged mainly from distributed and Cloud systems. One approach to designing feedback loops is naturally Control Theory, which is extremely widespread in all domains of engineering, but has only quite recently, and scarcely, been applied to regulation in computing systems [2]. It can bring system designers methodologies to design and implement feedback loops with well-mastered and guaranteed behavior, in order to obtain automated management with goals of resource optimization or the avoidance of crashes and overloads.
Expected work
The research work will consist in proposing feedback loops for the runtime management of resources, targeting the particular computation platform CiGri [3], a lightweight, scalable and fault-tolerant grid system which exploits the unused resources of a set of computing clusters. It interacts with the computing clusters through the Resource and Job Management System OAR [4], a batch scheduler. Help from control engineering scientists from Inria and Gipsa-lab Grenoble will be provided for the design of the control laws.
The design of the feedback loops will build upon existing work of a 3rd-year engineering student from INP and follow our methodology:
- analysis of the system, identification of its dynamics and of the problems to be tackled;
- design of a feedback controller; we will start from an existing P (proportional) regulation, acting through a maximum number of jobs to be sent to the cluster in response to the current number of jobs processed. The current control objective is to maximize cluster utilization while avoiding overload;
- implementation and experimental results in simulation, comparing the new control scheme with the existing solution.
Following this sequence of steps, new characteristics of the platform will be considered, e.g. other overload problems, for example concerning the storage architecture. Controllers going beyond the simple P regulation will be designed, and alternative objectives and control schemes will be explored, e.g. adaptive and discrete control. Implementations and experimental validation will be consolidated.
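The starting point, a P regulation of the number of jobs, can be sketched as a toy closed loop (our own illustration, not the CiGri code; the 20% completion rate and gain are arbitrary):

```python
def simulate(kp=0.5, reference=100, steps=30):
    """Toy closed loop: each step, a P controller submits jobs to keep the
    number of running jobs near `reference`; the cluster finishes 20% of them."""
    running, history = 0.0, []
    for _ in range(steps):
        error = reference - running
        submitted = max(0.0, kp * error)    # P control: submission proportional to error
        running = running + submitted - 0.2 * running
        history.append(running)
    return history

h = simulate()
print(round(h[-1], 1))
```

Even this toy run shows the classic limitation motivating "beyond the simple P regulation": the loop settles around 71.4 jobs, well below the reference of 100 (steady-state error), which an integral term or a different control scheme would remove.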
[1] J. O. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1):41-50, Jan. 2003.
[2] M. Litoiu, M. Shaw, G. Tamura, N. M. Villegas, H. A. Müller, H. Giese, R. Rouvoy and E. Rutten. What can control theory teach us about assurances in self-adaptive software systems? In R. de Lemos, D. Garlan, C. Ghezzi, and H. Giese, editors, Software Engineering for Self-Adaptive Systems, volume 9640 of LNCS, Springer. https://hal.inria.fr/hal01281063, 2016.
[3] http://ciment.ujfgrenoble.fr/CiGri/dokuwiki/doku.php
[4] N. Capit, G. Da Costa, Y. Georgiou, G. Huard, C. Martin, G. Mounié, P. Neyron, and O. Richard. A batch scheduler with high level components. In IEEE Int. Symp. on Cluster Computing and the Grid, CCGrid, pages 776-783, 2005.
URL sujet détaillé :
:
Remarques : Co-advised with Olivier Richard (Datamove, Inria/LIG) and Bogdan Robu (Gipsa-lab).
We plan to continue on this research topic with a PhD.





SM20743 Convergence between automotive and avionics networks


Description


Cyber-physical systems (cars, aircraft, UAVs...) embed hundreds of sensors and dozens of computers, all communicating through a shared network (CAN, Ethernet, AFDX, TSN...). Correct behaviour of the whole system requires that every exchanged message observes a maximum network traversal time; one must therefore be able to prove at design time that this deadline (a bound on the latency) is respected. Network calculus theory [NC01, DNC18] has been used to compute such bounds on the AFDX network cores of several civil aircraft (A380, A350...). Following these successes, several lines of work have emerged: improving the accuracy of the results, reducing computation times, and modelling new kinds of networks (in particular the emerging Time-Sensitive Networking, TSN, technology).
Since our activities range from research on the foundations of the model to its implementation, the internship may, depending on the profile and interests of the student, address:
- the theoretical bases of the model itself, in particular the modelling of packetization in the (min,plus) dioid [ETFA16],
- the modelling of the TSN technology, the real-time Ethernet solution developed by the IEEE [HAL18],
- efficient implementation techniques (algorithms on piecewise-linear functions, graph traversals, heuristics).
[NC01] "Network Calculus", J.-Y. Le Boudec and P. Thiran, Springer, 2001, http://ica1www.epfl.ch/PS_files/NetCal.htm
[DNC18] "Deterministic Network Calculus: From Theory to Practical Implementation", Anne Bouillard, Marc Boyer, Euriell Le Corronc, Oct 2018, Wiley-ISTE
[ETFA16] "Embedding network calculus and event stream theory in a common model", Marc Boyer, Pierre Roux, Proc. of the 21st IEEE Int. Conf. on Emerging Technologies and Factory Automation (ETFA 2016), https://hal.archivesouvertes.fr/hal01311502v1
[HAL18] "Modelling in network calculus a TSN architecture mixing Time-Triggered, Credit Based Shaper and Best-Effort queues", Hugo Daigmorte, Marc Boyer, Luxi Zhao, Technical Report
URL sujet détaillé : http://w3.onera.fr/formationparlarecherche/sites/w3.onera.fr.formationparlarecherche/files/tisdtis201810_m._boyer.pdf
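The (min,plus) machinery behind these latency bounds can be sketched on discretized curves (our own toy example: a token-bucket arrival curve against a rate-latency service curve; the delay bound is the horizontal deviation between the two):

```python
def conv(f, g, horizon):
    """(min,plus) convolution: (f * g)(t) = min over 0 <= s <= t of f(s) + g(t-s)."""
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(horizon + 1)]

def delay_bound(arrival, service):
    """Horizontal deviation between arrival and service curves:
    the worst-case delay bound given by network calculus."""
    worst = 0
    for t, a in enumerate(arrival):
        d = next(d for d in range(len(service) - t) if service[t + d] >= a)
        worst = max(worst, d)
    return worst

H = 10
arrival = [5 + 3 * t for t in range(H + 1)]                 # token bucket: burst 5, rate 3
service = [max(0, 4 * (t - 2)) for t in range(2 * H + 1)]   # rate-latency: rate 4, latency 2
print(delay_bound(arrival, service))                        # one server
print(delay_bound(arrival, conv(service, service, 2 * H)))  # two servers in tandem
```

The tandem case illustrates why the convolution matters: the end-to-end service curve of two servers is the (min,plus) convolution of their curves, and the bound through both (6 steps here) is computed on that single combined curve.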
Remarques : Doing a PhD on the subject will be possible.





SM20744 Automated reasoning


Description


Many applications, notably in the context of verification (for critical systems in transportation, energy, etc.), rely on checking the satisfiability of logic formulas. Satisfiability-modulo-theories (SMT) solvers handle large formulas in expressive languages with built-in and custom operators (e.g. arithmetic and data structure operators). These tools are built from the cooperation of a SAT (propositional satisfiability) solver, which handles the Boolean structure of the formula, and theory reasoners, which tackle the atomic formulas (e.g. x > y + z for the theory of arithmetic). The veriT SMT solver, developed in Nancy, is a state-of-the-art reasoner of this kind.
When it comes to handling pure quantified first-order logic with equality, the superposition calculus gives the best results. This calculus is a complete set of rules extending resolution to first-order logic with equality. It is extremely good at finding intricate proofs, even on large sets of axioms. Superposition-based saturation automated theorem proving has been steadily progressing for decades. One of the best provers of this kind is the E prover, developed in Stuttgart.
The main objective of this subject is to use the best of both SMT and superposition for efficient reasoning on large logic formulas with quantifiers, in the presence of interpreted symbols. There are both theoretical and practical aspects related to this question.
The work will be conducted under the supervision of Stephan Schulz (DHBW, Stuttgart), Jasmin Blanchette (VU Amsterdam) and Pascal Fontaine (Loria).
We already have first experiments both integrating the superposition calculus in SMT and making use of SAT and SMT techniques in the context of superposition. We suggest that, as an introductory step, the student become familiar with these prototypes and evaluate their strengths and weaknesses on known benchmark sets. Another direction is to get acquainted with the Avatar technique successfully used in the Vampire prover.
As a second step, we envision incrementally generalizing the approach in the prototypes towards a tighter integration of both reasoning techniques. In parallel, the student will design an abstract calculus to study completeness issues, and termination for decidable fragments (notably the Bernays-Schönfinkel-Ramsey fragment). Depending on the interest of the student in formally verified algorithms, this calculus might be the object of formal verification using a proof assistant.
Our ultimate goal is to propose powerful reasoners combining SMT and superposition techniques to discharge proof obligations from proof assistants (interactive theorem provers). It is thus crucial to produce proofs that can be understood and replayed by external tools. The student will describe, provide the theoretical foundations for, and implement proof reconstruction of formulas delegated to the hybrid superposition-SMT solver.
The subject can and will be adjusted according to the interests of the student. URL sujet détaillé : https://team.inria.fr/veridis/files/2018/10/internshipAR.pdf
Remarques : Joint internship with Jasmin Blanchette (VU Amsterdam) and Stephan Schulz (DHBW Stuttgart). Geographic mobility between the sites is open to discussion. The subject naturally extends into a PhD subject, which will best be defined in detail with the student. Do not hesitate to contact us for informal inquiries.





SM20745 Monitoring application exchanges with an IoT system


Description


To ensure communication between its applications and/or with the outside world, Berger-Levrault implements data exchanges within flexible, scalable and loosely coupled architectures, such as service-oriented and event-driven architectures (SOA, EDA). These architectures rely on several communication means to route data. In this work we focus on application protocols used in the IoT (Internet of Things) world. The IoT is considered one of the main pillars of future smart-technology developments. The use of connected objects is spreading very quickly, capturing information in various domains (environment, industry, day-to-day management of homes and cities...), making the IoT an indispensable means of control and prevention. As part of a set of application systems deployed to monitor indicators related to building management and predictive maintenance, Berger-Levrault exploits the measurements collected from connected objects using interfaces and connectors that implement messaging mechanisms. These make it possible to send messages from a source publisher to one or more recipients using specific protocols such as MQTT and CoAP. The multiplicity of these data exchanges creates complexity and brings out control needs that can be addressed by setting up a monitoring and analysis system.
We propose to analyse the infrastructures used to implement these exchanges, namely the MQTT broker or the CoAP server, and to collect meta-information in order to:  trace the exchanged messages;  simplify the visualization of the exchanges;  improve security: ensure the absence of violations, interceptions or data leaks;  strengthen maintainability: detection of exceptions (e.g. a message-transfer problem), identification of the context and origin of the problem, alerts and notifications.
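A minimal Python sketch of the kind of meta-information collection described above (message tracing plus simple security and maintainability alerts). The `Message` record and `ExchangeMonitor` class are hypothetical illustrations; a real deployment would attach to the MQTT broker or CoAP server rather than receive records directly.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Message:
    topic: str        # e.g. "building/3/temperature" (illustrative)
    sender: str
    timestamp: float
    size: int         # payload size in bytes

class ExchangeMonitor:
    """Collect meta-information about published messages and raise
    simple alerts (unknown sender, oversized payload)."""

    def __init__(self, allowed_senders, max_size=1024):
        self.allowed_senders = set(allowed_senders)
        self.max_size = max_size
        self.per_topic = Counter()   # message trace: count per topic
        self.alerts = []

    def observe(self, msg):
        self.per_topic[msg.topic] += 1
        if msg.sender not in self.allowed_senders:
            # possible intrusion or data leak: sender not whitelisted
            self.alerts.append(("unknown-sender", msg.sender, msg.topic))
        if msg.size > self.max_size:
            # maintainability: flag anomalous payloads with their context
            self.alerts.append(("oversized", msg.sender, msg.topic))
```

The counters feed visualization, while the alert list feeds notifications; richer rules (latency, message loss) would follow the same pattern.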
The internship is based at the DISP laboratory of the University Lyon 2, in partnership with the company Berger-Levrault (a software publisher in the fields of education, health, social care and local government). You will be surrounded by a team of researchers and will work closely with the company on a practical case. You will be supervised in planning all the expected results over the duration of the internship. URL sujet détaillé : https://displab.fr/sites/default/files/2019__disp_m2__monitoring_des_echanges_dun_systeme_iot.pdf
Remarques : Internship based at the DISP laboratory of the University of Lyon, in partnership with the ICUBE CSIP team, INSA Strasbourg, which can lead to a PhD thesis.





SM20746 Monitoring and evaluation of exchanges in a distributed information system


Description


To ensure interoperability between its communicating applications and/or with the outside world, Berger-Levrault implements data exchanges within flexible, scalable and loosely coupled architectures, such as service-oriented and event-driven architectures (SOA, EDA). These architectures rely on several communication means to route data, such as message exchange. These exchanges are operationalized through a message broker that sends messages from a source to one or more recipients using specific routings. The multiplicity of these data exchanges creates complexity and brings out control needs that can be addressed by setting up monitoring and analysis means. We propose to analyse these exchanges by supervising the infrastructure used to implement them, namely the RabbitMQ broker. On the basis of a structured data collection, the intern is asked to:  study and evolve the data model reflecting the established messaging architecture and the exchanges carried out on the message broker;  propose indicators and acceptable thresholds based on predefined requirements on the operation of the exchanges (activity of senders/receivers, absence of violations, message rate per type...);  identify the causes of exchange malfunctions and define their impacts;  define an analysis sequence to check potential causes of malfunction;  evaluate the observed situations and indicators and propose, accordingly, the corrective actions to be taken. This will improve the decision-making process.
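The indicator-and-threshold step can be sketched as follows; the indicator names and acceptance ranges below are invented for illustration, not taken from the actual Berger-Levrault requirements.

```python
def evaluate_indicators(measured, thresholds):
    """Compare measured indicator values against (low, high) acceptance
    ranges and return the list of violated indicators with a verdict.

    measured:   {indicator_name: value}
    thresholds: {indicator_name: (low, high)}; missing names are unconstrained.
    """
    findings = []
    for name, value in measured.items():
        low, high = thresholds.get(name, (float("-inf"), float("inf")))
        if value < low:
            findings.append((name, value, "below-threshold"))
        elif value > high:
            findings.append((name, value, "above-threshold"))
    return findings
```

The returned findings would then feed the analysis sequence (cause identification, corrective actions) mentioned above.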
The internship is based at the DISP laboratory of the University Lyon 2, in partnership with the company Berger-Levrault (a software publisher in the fields of education, health, social care and local government). You will be surrounded by a team of researchers and will work closely with the company on a practical case. You will be supervised in planning all the expected results over the duration of the internship. URL sujet détaillé : https://displab.fr/sites/default/files/2019__disp_m2__suivi_et_evaluation_des_echanges_dans_un_systeme_dinformation_distribue.pdf
Remarques : Internship based at the DISP laboratory of the University of Lyon





SM20747 Microscopy by sequencing


Description


Recent advances in DNA next-generation sequencing (NGS) technologies have been extremely fast. Second- and third-generation machines can routinely read millions of individual strands and produce gigabytes of sequencing data, enough for example to encode full YouTube videos. A number of recent studies have looked at possible applications for such massive extraction of molecular information. In this project, we want to explore a new possibility, outside of biology: extracting the shape of an unknown microscopic object through the combination of DNA gels and NGS. The idea is to create a gel around an object of interest using randomly barcoded DNA monomers, and then to extract the connectivity of the individual molecular subunits forming the gel by ligation followed by sequencing. This data will then inform a shape-reconstruction algorithm. URL sujet détaillé : http://perso.enslyon.fr/nicolas.schabanel/Sujet%20de%20stage%20microscopy%20by%20sequencing.pdf
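A minimal sketch of the first computational step, assuming sequencing yields pairs of ligated barcodes: build the monomer connectivity graph and flag low-degree monomers as candidates for the gel/object interface. Both functions and the grid example are illustrative assumptions, not the project's actual reconstruction algorithm.

```python
from collections import defaultdict

def connectivity_graph(ligation_pairs):
    """Build the monomer adjacency graph from sequenced ligation events,
    each event being a pair of random barcodes found on one read."""
    adj = defaultdict(set)
    for a, b in ligation_pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def boundary_monomers(adj, bulk_degree):
    """Monomers with fewer neighbours than the bulk coordination number
    are candidates for the gel/object interface, i.e. the object's shape."""
    return {m for m, nbrs in adj.items() if len(nbrs) < bulk_degree}
```

On a toy 3x3 square-lattice gel (bulk coordination number 4), every monomer except the central one is correctly flagged as boundary.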
Remarques : Co-advisor: Yannick RONDELEZ (Gulliver, ESPCI, Paris) http://www.yannickrondelez.com
Internship split between Lyon and Paris
Main location either at ESPCI in Paris or at ENS de Lyon, depending on the intern's preference





SM20748 Towards a Parameterized Distributed Component Model


Description


Context:
The internship is placed in the context of the design of abstractions for defining parameterized topologies of software components for distributed computing in general, and Cloud/Fog computing in particular.
More precisely, we want to extend the Madeus component model with collective operations and with the configuration of systems of variable size (decided at deployment time at the earliest).
Madeus accurately describes the life cycle of each component by a Petri net structure, and is able to finely express the dependencies between components. The overall dependency graph it produces is then used to reduce deployment time by parallelizing deployment actions.
For the moment, Madeus cannot naturally describe a system whose size is chosen at deployment time. Communication patterns in such variable-size systems are impossible or very difficult to express.
Objectives:
The objective of this internship is to study the adaptation and possible extensions of the Madeus model to better take into account large component architectures, focusing particularly on architectures of parametric size. More specifically, the objective is to design constructs for parametric communication and topologies, either expressible in Madeus or expressed as a minimal extension of the model. The first extension to be considered is the ability to trigger not only two-sided communications but also communications toward a set of components. The duality between the transmission of control tokens, characteristic of the Madeus system, and the data-transmission aspect will have to be taken into account. In particular, the interplay between data and communication is crucial in the context of collections of components, because the size of the system should be a parameter of the system, or could even be transmitted as data during configuration. This influences the communication patterns, at least concerning the number of targets of collective communications.
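A toy sketch (not the Madeus API) of a one-to-many control-token transfer whose fan-out is fixed only at deployment time, illustrating why the system size becomes a parameter of the communication pattern; all class and method names here are hypothetical.

```python
class Component:
    """Minimal stand-in for a deployed component receiving control tokens."""
    def __init__(self, name):
        self.name = name
        self.tokens = []

    def receive(self, token):
        self.tokens.append(token)

class CollectivePort:
    """One-to-many connection whose fan-out is fixed only at deployment
    time: the number of targets is a parameter of the assembly."""
    def __init__(self, targets):
        self.targets = list(targets)   # size chosen at deployment time

    def broadcast(self, token):
        # transmit the control token to every target component
        for t in self.targets:
            t.receive(token)
        # the fan-out itself may be transmitted as data during configuration
        return len(self.targets)
```

In a real extension, the token transfer would also move data and interact with each component's Petri-net life cycle.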
The objectives of the internship are at the same time to define sound extensions of the model and to implement the corresponding extension in the Madeus framework (https://mad.readthedocs.io/). The internship can thus focus either on theoretical aspects, or on the design and implementation aspects. URL sujet détaillé : https://lhenrio.github.io/Docs/m2.pdf
Remarques : Co-advisors: Christian Perez (Avalon project, LIP, ENS Lyon) and Ludovic Henrio (Cash project, LIP, ENS Lyon)





SM20749 Analysis of a surface reconstruction algorithm


Description


The topic of the internship is surface reconstruction, where one has to turn a point cloud into a "meaningful" meshed surface. One of the advisors co-authored an algorithm that performs well in practice (and is used in industry). The goal of the internship is to study the theoretical guarantees offered by that algorithm, both in terms of quality of reconstruction (geometry and topology) and in terms of computational complexity. URL sujet détaillé : https://members.loria.fr/XGoaoc/sujetstagereconstruction.pdf
Remarques : The internship is co-advised by Xavier Goaoc:
xavier.goaoc.fr
https://members.loria.fr/XGoaoc





SM20750 Delaunay triangulation on surfaces


Description


Context. Delaunay triangulations in the Euclidean space Rd have been extensively studied [1]. Their mathematical properties are well understood, and many algorithms to construct them have been proposed and analyzed. The following simple incremental algorithm [2] allows for very efficient implementations. For each new point p:  find the simplices in conflict with p, i.e. the simplices whose circumscribing ball contains p;  remove these simplices from the triangulation, which creates a hole;  fill the conflict hole by ``starring'' it from p. Periodic structures arise in many fields, ranging from (nano)materials to astrophysics. Modeling such periodic structures can be achieved through Delaunay triangulations of the flat torus, which is homeomorphic to the quotient of Rd under the action of a group of translations. An algorithm was proposed for general closed flat manifolds, i.e. compact quotient spaces of the Euclidean space under the action of a discrete group of isometries [3]. The algorithm is based on the above-mentioned incremental algorithm, which assumes that the conflict hole is a topological disk for any new point p. This is a priori not always the case on a surface. This has led to the necessity of ensuring that the Delaunay triangulation is maintained as a simplicial complex throughout the incremental construction, i.e., its edges do not form cycles of length 1 (loops, i.e., edges whose two vertices are equal) or 2 (two edges sharing the same two vertices) [3]. Packages [4,5] of CGAL, the Computational Geometry Algorithms Library, implement this incremental algorithm [3] in the special case of the 2D (resp. 3D) flat torus with square (resp. cubic) domain, homeomorphic to R2/Z2 (resp. R3/Z3). The 3D package has already found users in various fields including particle physics and material engineering. This work also led to a fruitful collaboration with astrophysicists [6].
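The conflict test at the heart of the incremental algorithm is the in-circumcircle predicate, which in 2D reduces to the sign of a 3x3 determinant. A plain Python sketch is below; a robust implementation (as in CGAL) would use exact arithmetic rather than floating point.

```python
def in_conflict(tri, p):
    """Return True iff point p lies strictly inside the circumscribed
    circle of triangle tri = (a, b, c), given in counter-clockwise order,
    i.e. the triangle is 'in conflict' with the newly inserted point p."""
    (ax, ay), (bx, by), (cx, cy) = tri
    px, py = p
    # Rows of the classical in-circle matrix, translated so p is the origin.
    m = [
        [ax - px, ay - py, (ax - px) ** 2 + (ay - py) ** 2],
        [bx - px, by - py, (bx - px) ** 2 + (by - py) ** 2],
        [cx - px, cy - py, (cx - px) ** 2 + (cy - py) ** 2],
    ]
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return det > 0   # positive for a CCW triangle iff p is inside
```

On the torus this same predicate is applied to representatives of the periodic copies of the points, which is where loops and double edges complicate matters.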
Topic. A variant [7,1] of the incremental algorithm works as follows in R2. For each new point p:  split the triangle containing p into three triangles;  flip non-Delaunay edges;  propagate flips until there are no more non-Delaunay edges. The algorithm can be extended to higher dimensions [8].
The question to study is whether this variant extends to the flat torus in 2D and 3D and still works when there are loops or double edges in the triangulation. The question will be studied theoretically and may be completed by a prototype implementation.
The work may extend to a PhD on a related topic.
References. [1] Mark de Berg, Otfried Cheong, Marc van Kreveld, and Marc Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag. [2] Adrian Bowyer. Computing Dirichlet tessellations. The Computer Journal, 24:162-166, 1981. [3] Manuel Caroli and Monique Teillaud. Delaunay triangulations of closed Euclidean d-orbifolds. Discrete and Computational Geometry, 55(4):827-853, 2016. [4] Nico Kruithof. 2D periodic triangulations. [5] Manuel Caroli, Aymeric Pellé, Mael Rouxel-Labbé, and Monique Teillaud. 3D periodic triangulations. [6] Johan Hidding, Rien van de Weygaert, Gert Vegter, Bernard J.T. Jones, and Monique Teillaud. Video: The Sticky Geometry of the Cosmic Web. Proceedings of the 28th Annual Symposium on Computational Geometry, pages 421-422, 2012. [7] Charles L. Lawson. Software for C1 Surface Interpolation. Technical report, Jet Propulsion Laboratory, 1977. [8] Herbert Edelsbrunner and N.H. Shah. Incremental topological flipping works for regular triangulations. Algorithmica, 15:223-241, 1996. URL sujet détaillé : https://members.loria.fr/Monique.Teillaud/positions/masterDTsurfaces.html
Remarques : Co-advised with Vincent Despré. Statutory internship stipend ("gratification légale").





SM20751 Incremental topological sort for a reactive programming language


Description


The Interactive Informatics team at ENAC is developing a reactive programming framework named djnn [1], dedicated to the design of interactive systems. This framework combines an execution engine with a new programming language named Smala [2]. It proposes a new way to unify dataflow primitives with state machines, based on the concept of process activation. In a way similar to reactive synchronous languages such as Lustre [3], the execution of a djnn program rests on the ordering of the nodes of a graph of interactive components. However, in the present implementation, the graph is completely reordered at each addition or removal of a component, which can strongly affect the execution time and the reactivity of the interactive system. Thus, the aim of this internship is to define and implement an incremental topological sort for the execution graph. The first step will be a literature survey, followed by the definition and implementation of the algorithm. Finally, the student will have to verify the benefit of this algorithm over the previous one.
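As a starting point, incremental topological-order maintenance in the spirit of Pearce and Kelly's dynamic topological sort reorders only the nodes between the endpoints of the new edge instead of re-sorting the whole graph. A simplified Python sketch (the literature survey will surface more refined variants, including edge deletion):

```python
class IncrementalTopoOrder:
    """Maintain a topological order of a DAG under edge insertions,
    reordering only an affected window of nodes (in the spirit of
    Pearce & Kelly's dynamic topological sort)."""

    def __init__(self, nodes):
        self.succ = {v: set() for v in nodes}
        self.pred = {v: set() for v in nodes}
        self.ord = {v: i for i, v in enumerate(nodes)}  # current order

    def add_edge(self, u, v):
        self.succ[u].add(v)
        self.pred[v].add(u)
        if self.ord[u] < self.ord[v]:
            return  # order still valid: nothing to do
        lb, ub = self.ord[v], self.ord[u]
        # Forward reachability from v, restricted to the window [lb, ub].
        fwd = self._dfs(v, self.succ, lambda w: self.ord[w] <= ub)
        if u in fwd:
            raise ValueError("edge (%r, %r) would create a cycle" % (u, v))
        # Backward reachability from u, restricted to the same window.
        bwd = self._dfs(u, self.pred, lambda w: self.ord[w] >= lb)
        # Reassign the freed positions: back-set first, then forward-set,
        # each kept in its previous relative order.
        slots = sorted(self.ord[w] for w in bwd | fwd)
        for w, pos in zip(sorted(bwd, key=self.ord.get) +
                          sorted(fwd, key=self.ord.get), slots):
            self.ord[w] = pos

    def _dfs(self, start, nbrs, keep):
        seen, stack = set(), [start]
        while stack:
            w = stack.pop()
            if w in seen or not keep(w):
                continue
            seen.add(w)
            stack.extend(nbrs[w])
        return seen
```

Edges that respect the current order cost O(1); only order-violating insertions trigger the local reordering, which matches the internship's goal of avoiding a full re-sort at each component addition.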
This internship may be pursued by a PhD about the optimisation and/or certification of a reactive language dedicated to HumanMachine Interaction.
[1] M. Magnaudet, S. Chatty, S. Conversy, S. Leriche, C. Picard, and D. Prun. Djnn/Smala: A Conceptual Framework and a Language for Interaction-Oriented Programming. Proc. ACM Hum.-Comput. Interact. 2, EICS, Article 12 (June 2018). [2] http://smala.io [3] N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous dataflow programming language Lustre. Proceedings of the IEEE, pages 1305-1320, 1991. URL sujet détaillé :
Remarques :





SM20752 Fast Solvers for High Frequency Aeroacoustics


Description


The industrial context (set by Airbus Central R&T) is the simulation of aeroacoustic wave propagation in heterogeneous media. This leads to solving mixed dense/sparse linear systems of very large size. We intend to develop high-performance parallel solvers for this kind of problem. In particular, we have developed a hierarchical-matrices solver for dense linear systems, using task-based programming in C++, and we want to extend it to sparse and mixed systems and interface it with other existing solvers. The subject is very broad (see the Google Doc description) and its exact content can be discussed. A continuation as a PhD is possible. URL sujet détaillé : https://docs.google.com/document/d/1TXWBJxGOX3EiDihiSQKF76hGL17eVg47NfHpRLnEAQ/edit?usp=sharing
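The core idea of hierarchical matrices is to replace admissible off-diagonal blocks by low-rank approximations. A rank-1 Python sketch of the resulting matrix-vector cost reduction (illustrative only; the actual solver is task-based C++ with adaptive ranks):

```python
def dense_matvec(A, x):
    """O(n*m) product of a dense block A with a vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def rank1_matvec(u, v, x):
    """Product of the rank-1 block u v^T with x in O(n + m) instead of
    O(n*m): first the scalar s = <v, x>, then scale u by s. Hierarchical-
    matrix solvers apply this idea recursively, with rank k blocks stored
    as U V^T, to admissible off-diagonal blocks."""
    s = sum(v_j * x_j for v_j, x_j in zip(v, x))
    return [u_i * s for u_i in u]
```

For a block that is exactly u v^T, both routines return the same vector, but the low-rank form needs O(n + m) storage and work instead of O(n*m).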
Remarques : Co-supervised by Emmanuel AGULLO (emmanuel.agullo.fr). Paid internship.
A continuation into a PhD thesis is desired.





SM20753 Symbolic dynamics and growth in groups


Description


The aim of this internship is to investigate the possible growths of recursively presented groups. Techniques will be inspired by symbolic dynamics and computability theory. URL sujet détaillé : https://www.lacl.fr/pvanier/rech/M2_croissance.pdf
Remarques : Remuneration possible; continuation into a PhD possible.





SM20754 Test of Embedded Multi-Agent Systems (EMAS)


Description


1. Description:
Embedded systems are increasingly used in many areas of everyday life: transport, smart homes, medical applications... For deploying such systems, Multi-Agent Systems (MAS) provide a large-scale solution, involving agents that work locally to achieve a common global task. Recent uses of such applications integrate varying mobile devices such as mobile phones, tablets, etc. Consequently, multi-agent based applications are more and more complex and difficult to validate. Indeed, many constraints must be considered:  huge quantities of generated data and their heterogeneity;  a multitude of possible scenarios, making the reproduction of an error scenario difficult or even unfeasible;  cooperation and interactions between agents may introduce errors, even though the agents are themselves valid. Existing works mainly rely on general approaches for verifying and validating MAS. This makes it difficult to take into account the specificities of the various hardware and/or software platforms on which the agents can execute. Furthermore, the geographical dimension introduced by mobile devices adds more complexity to MAS runtime environments. A preliminary work, supported by the TRUST Chair (industrial chair of Grenoble INP on dependability and security of systems), obtained the following results:  a state of the art of existing works on MAS design and testing;  general guidelines related to the design cycle of EMAS in order to take agent testing into account, especially through the addition of testing agents. This work has been published ("Toward an Embedded Multi-Agent System Methodology and Positioning on Testing", Camille Barnier, Oum-El-Kheir Aktouf, Annabelle Mercier and Jean-Paul Jamont) at the IWSF workshop held in Toulouse in October 2017, co-located with ISSRE 2017. The proposed strategy for testing multi-agent systems consists of three phases: agent testing, acceptance testing and society testing.
2. Objectives:
The initial model has been implemented in the MASH platform, allowing a testing multi-agent system to observe the MAS and the agents under test. More precisely, two types of testing agents have been introduced: a local test agent that measures the behavior of an agent, and a society test agent that aims at observing the collective functionalities of the MAS. The study focuses on the WMAC model, a multi-agent model developed at the LCIS lab. The proposed project can focus on the consolidation of the model for the detection of disjoint groups in the self-organization phase of the system. The model and its current implementation can be revised to make them more generic and to integrate them into the Diamond MAS design methodology and the MASH platform. An update of the bibliographic study on EMAS and MAS testing is also expected. URL sujet détaillé : http://lcis.grenobleinp.fr/themes/securitedessystemesembarquesetdistribuescritiques/offresdethesepostdocstage/
Remarques :  Co-advising: A. Mercier and J.-P. Jamont  Internship benefits (indemnités de stage)
 This project is part of a collaboration with the Fraunhofer IESE research institute in Kaiserslautern (Germany)





SM20755 Test Coverage analysis for Mobile and ContextAware Applications


Description


Project description:
In recent years, more and more mobile applications (mobile apps) have been developed to support different applications in social, news, tourism, health, business, and other domains. Hundreds of new apps are released and downloaded daily. By the end of 2020, the global revenue from the mobile app market is expected to reach 79 billion USD, with almost 378 billion mobile app downloads performed [1]. Testing mobile apps is a hot and challenging research topic, as testing a mobile app requires validating it in many different contexts, including different hardware platforms, operating systems, web browsers, many geographical locations for location-dependent apps, etc. Crowdsourcing-based testing is a recent approach where testing is carried out by volunteer users through the cloud. This approach is particularly suited to mobile applications, since various users operating in various contexts can be involved. The TMACS project (Testing of Mobile Apps using CrowdSourcing) at the LCIS lab aims at studying crowdsourcing-based testing for mobile apps. A web-based testing platform is under development. This platform, called TMACS, provides important features for crowdsourcing testing of mobile apps by means of the following functionalities:  it allows mobile app providers to register and upload mobile apps for testing;  it allows volunteering Internet users to register and test uploaded mobile apps. The expected behavior is that uploaded mobile apps are tested by many different Internet users in order to cover different runtime platforms and meaningful geographical locations. An issue related to outsourced testing such as crowdsourcing testing concerns the evaluation of test coverage to assess the application's validation. This first implies determining test-coverage criteria suitable for such applications.
Project objectives:
There are many open questions in mobile application testing, as shown in [2]. More generally, there are no standard testing approaches for such context-aware applications [3]. An interesting idea that has been explored for similar situations [2][4] is to identify the program variables that implement the context-related data of the mobile application and to adopt data-flow-based code coverage criteria [5] to assess the thoroughness of a test suite with respect to the mobility context. In this project, a literature survey focusing on test-coverage evaluation for mobile and context-aware applications is first required. It is expected that at least one coverage assessment approach will be proposed and then precisely defined. Finally, the proposed approach will be implemented and integrated into the TMACS platform. This integration would make an experimental evaluation of the approach possible.
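At its simplest, the data-flow idea of [2][4][5] reduces to measuring which required (definition, use) pairs of context variables a test suite actually exercised. A hypothetical Python sketch, where pairs are tuples of (variable, definition site, use site):

```python
def def_use_coverage(required_pairs, executed_pairs):
    """All-uses-style coverage for context variables: the fraction of
    required (variable, def-site, use-site) pairs exercised by the test
    suite, plus the set of missed pairs to guide further testing."""
    required = set(required_pairs)
    covered = required & set(executed_pairs)
    missed = required - covered
    ratio = len(covered) / len(required) if required else 1.0
    return ratio, missed
```

The required pairs would come from static analysis of the app's context variables; the executed pairs from instrumented runs collected by the TMACS platform.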
Prerequisites:
Java and UML technologies are important basics for the project.
References
1. https://dazeinfo.com/2016/04/20/globalmobileappdownloadrevenuemarket20162020 report/ 2. T. Zhang, J. Gao, O. Aktouf, T. Uehara. Test Model and Coverage Analysis for Location-based Mobile Services. The 27th International Conference on Software Engineering and Knowledge Engineering (SEKE 2015), Pittsburgh, USA, July 6-8, 2015. 3. S. Matalonga, F. Rodrigues, G.H. Travassos. Characterizing testing methods for context-aware software systems: Results from a quasi-systematic literature review. The Journal of Systems and Software, 131:1-21, 2017. 4. H. Lu, W. K. Chan, T. H. Tse. Testing context-aware middleware-centric programs: A data flow approach and an RFID-based experimentation. Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT '06/FSE-14). 5. P. G. Frankl, E. J. Weyuker. An Applicable Family of Data Flow Testing Criteria. IEEE Transactions on Software Engineering, 14(10):1483-1498, October 1988. URL sujet détaillé : http://lcis.grenobleinp.fr/themes/securitedessystemesembarquesetdistribuescritiques/offresdethesepostdocstage/
Remarques :  Co-advising: I. Parissis  Internship benefits (indemnités de stage)
 This project is part of a collaboration with Danang University (Vietnam)





SM20756 An Algorithmic Approach to Animal Behavior


Description


An internship is available under the guidance of Amos Korman, supported by an ERC Consolidator Grant. The internship will be held at IRIF (CNRS and Univ Paris Diderot), a leading center for theoretical computer science in France. The internship will include a long visit (or a couple of visits) to Israel, as part of a collaboration with a bat laboratory at Tel-Aviv University (see http://www.yossiyovel.com/).
The topic concerns the study of algorithmic game-theoretical principles with a special focus on animal contexts, and particularly the behavior of bats. Of main interest are competition issues related to tasks such as search (foraging), food consumption, and territoriality. The research will include a large, purely theoretical component but will also strive to connect to experiments in biology.
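For a flavor of the theory involved, the replicator dynamics central to the evolutionary game theory survey cited below can be simulated in a few lines. Here is a Python sketch of the classical hawk-dove game converging to the mixed evolutionarily stable strategy x* = V/C (a textbook illustration, not one of the project's actual models):

```python
def hawk_dove_replicator(x0, v, c, dt=0.01, steps=20000):
    """Simulate replicator dynamics of the hawk-dove game, where x is
    the frequency of hawks, v the value of the contested resource and
    c > v the cost of fighting. The dynamics dx/dt = x(1-x)(f_H - f_D)
    converge to the mixed ESS x* = v/c."""
    x = x0
    for _ in range(steps):
        f_hawk = x * (v - c) / 2 + (1 - x) * v   # expected hawk payoff
        f_dove = (1 - x) * v / 2                 # expected dove payoff
        x += dt * x * (1 - x) * (f_hawk - f_dove)
    return x
```

With v = 2 and c = 4, the hawk frequency converges to 0.5 from any interior starting point; the project's models of foraging competition are richer but build on the same kind of payoff analysis.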
This unconventional approach is highly demanding, and we are seeking only highly gifted candidates. The candidates should have a solid background in the following subjects, ordered by importance:
 Probability theory  Programming skills  Interest in game theory  Love of animals  Ability to connect theory with experiments
Applications should include a CV, hopefully together with 1-2 letters of recommendation, and should be sent to amos.korman.com
Some references:
 A very nice survey on evolutionary game theory and potential directions for further research: http://rsif.royalsocietypublishing.org/content/10/88/20130544
 A recent article of mine, studying the impact of competition on the dispersal of foraging players: https://arxiv.org/pdf/1805.01319 URL sujet détaillé :
Remarques : The internship is intended to hopefully continue into a Ph.D.
It is possible to do the internship at Tel-Aviv University, hosted by Yossi Yovel. In either case, the main guiding advisor will be Amos Korman.





SM20757 Interference analysis for the new Kalray MPPA3 manycore


Description


In automotive and avionics, some programs are subject to critical timing constraints. In these real-time critical systems, the timing properties must be guaranteed. Among them is the worst-case execution time (WCET), which guarantees an upper bound on the execution time of programs. On a multicore platform, this worst-case execution time is highly influenced by all concurrent programs. In our work, we studied the Kalray MPPA2 platform (a set of multicores communicating through a network-on-chip, comprising 256 compute cores, designed by a company located in Montbonnot, near Grenoble) and designed an algorithm able to tightly bound the interference, taking into account both the specificities of the software application and the low-level details of the hardware architecture. Our interference analysis models the interference on the memory bus and the communications through the network-on-chip. In the new generation of the chip, the MPPA3 (announced in 2017), considerable changes have been made to the architecture (fewer clusters but faster communications and memory accesses). The generic parts of our algorithm remain valid, but the mathematical model of the hardware architecture must be redesigned. The goal of the internship is to analyze how far our previous work can be adapted to this new platform and to propose a modified or new interference analysis.
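To give a flavor of what such an analysis computes, here is a deliberately coarse Python sketch of an interference bound for a round-robin memory bus. The arbitration model and numbers are illustrative assumptions, not the actual MPPA2/MPPA3 model, which is far more detailed.

```python
def bus_interference_bound(own, others, latency):
    """Coarse upper bound on the bus time of one task.

    own:     number of bus accesses issued by the task under analysis
    others:  list of access counts, one per concurrent core
    latency: duration of a single bus access

    Under fair round-robin arbitration, each of the task's own accesses
    waits for at most one access per other core, and a core cannot
    interfere more often than its own total number of accesses."""
    interference = sum(min(own, o) for o in others) * latency
    return own * latency + interference
```

Tightening such bounds by exploiting the application structure (e.g. which tasks actually overlap in time) is precisely what the existing analysis does and what must be re-derived for the MPPA3.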
URL sujet détaillé : http://www.enslyon.fr/LIP/CASH/wpcontent/uploads/2018/10/m2wcetmppa3.pdf
Remarques : Co-supervised by Matthieu Moy (LIP) and Claire Maiza (Verimag). The internship can be physically hosted either in Grenoble or in Lyon.





SM20758 Definition and basis of the verification of a compiler for a new reactive language


Description


The Interactive Informatics (II) team of ENAC's research laboratory develops an environment called djnn [1]. It allows the development and execution of software supporting human-machine interaction. This environment contains an original language dedicated to the description of such software (component structuring, control mechanisms, data flow, input and output management...): smala [2]. Compiled and linked to specific libraries, an application described in smala can then be executed.
The smala language is reactive: it provides a way to describe applications that respond to external events (user inputs from the interface, timers, etc.) instantaneously. Unlike other languages, smala unifies the concepts of event processing, state machines and data-flow management. We are particularly interested in the description of critical systems for air transportation (cockpits, control positions, etc.) that require certification.
The groundwork on the formalization of the semantics of the smala language and the definition of an associated compiler was laid during an internship in the summer of 2018. In particular, we defined the minimal elements of the language. This internship is the continuation of that work along two axes:
 definition of the compiler using a functional language  verification of this compiler using the Coq tool [3], building on the work done on CompCert [4], a certified C compiler, and on the verification of a compiler for Lustre [5]
This work could be followed by a PhD thesis, either in direct continuity or around the verification of interactive properties of components.

[1] M. Magnaudet, S. Chatty, S. Conversy, S. Leriche, C. Picard, and D. Prun. Djnn/Smala: A Conceptual Framework and a Language for Interaction-Oriented Programming. Proc. ACM Hum.-Comput. Interact. 2, EICS, Article 12 (June 2018). [2] http://smala.io [3] https://coq.inria.fr/ [4] X. Leroy. Formal verification of a realistic compiler. Communications of the ACM, 52(7):107-115, 2009. [5] T. Bourke, L. Brun, P.-É. Dagand, X. Leroy, M. Pouzet, and L. Rieg. A formally verified compiler for Lustre. ACM SIGPLAN Notices, 52(6):586-601, June 2017. URL sujet détaillé :
Remarques :





SM20759 Blockchain Incentives


Description


Blockchains promise to disrupt various industries with the ability to simplify transactions between individuals by limiting the intermediaries, hence increasing speed and lowering costs. This momentum has produced an increase in the number of blockchain proposals coming from industry, sometimes claiming both performance and security.
Unfortunately, there is comparatively very little work in the research literature that can support these claims. One example is the well-known impossibility of agreeing on a unique block at a given index when a third of the participants misbehave. While a few companies have claimed to have successfully integrated Byzantine fault-tolerant solutions without apparent bugs, the performance of their solutions does not scale with the number of participants, making them of limited interest to blockchains.
The University of Sydney and CSIRO have developed the Red Belly Blockchain, a secure blockchain that scales to a large number of participants without compromising security. It does not solve the classic consensus problem but a variant of it [1] that allows performance to scale to a large number of participants. It implements a community blockchain [2] that neither relies on a predetermined set of participants, like consortium blockchains, nor incentivises all participants to create the same blocks, as in public blockchains, to avoid wasting resources. The topic of this research project is to improve the accountability of the Red Belly Blockchain by designing algorithms that incentivise participants to contribute in the interest of the system.
[1] DBFT: Efficient Leaderless Byzantine Consensus and its Applications to Blockchains. Tyler Crain, Vincent Gramoli, Mikel Larrea, Michel Raynal. Proceedings of the 17th IEEE International Symposium on Network Computing and Applications. IEEE NCA 2018.
[2] ComChain: Bridging the Gap Between Public and Consortium Blockchains. Guillaume Vizier, Vincent Gramoli. IEEE Blockchain 2018. URL sujet détaillé :
Remarques : living allowance per month





SM20760 Graph compression


Description


Graph compression is gaining popularity as a valuable tool for identifying both structure and meaning in massive data. The goal of this master project is to study selected graph properties in order to identify those able to yield simple reduced representations of graphs. The obtained summaries are intended to be used, without decompression, for complex graph operations such as learning, classification and querying. URL sujet détaillé : https://liris.cnrs.fr/~hseba/Master/Stage_Master2_Recherche_Graphes_et_Algorithmes.pdf
Remarques : This internship can be followed by a PhD thesis at LIRIS. Co-supervised with Mohammed Haddad, LIRIS, Lyon 1.





SM20761 Solving equations with Chebyshev approximations


Description


The problem of solving polynomial equations of the form p(x)=0 appears in robotics, biology and many other engineering applications. State-of-the-art software can already solve polynomial equations of degree 10,000 in several minutes.
On the other hand, some results in approximation theory show that if n is the degree of p, then there exists a polynomial q of degree O(sqrt(n k)) such that |q(x) - p(x)| < 10^{-k} for all x in the interval [-1, 1].
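This degree-reduction phenomenon is easy to observe numerically. The sketch below is purely illustrative (it is not the internship's method): it interpolates a degree-12 polynomial at Chebyshev nodes and evaluates the resulting expansion with the Clenshaw recurrence, showing that a much lower-degree approximant is already accurate to a few digits on [-1, 1].

```python
import math

def cheb_coeffs(f, n):
    """Chebyshev interpolation coefficients c_0..c_n of f on [-1, 1],
    computed from samples of f at the n+1 Chebyshev nodes."""
    N = n + 1
    theta = [math.pi * (j + 0.5) / N for j in range(N)]
    fx = [f(math.cos(t)) for t in theta]
    return [2.0 / N * sum(fx[j] * math.cos(k * theta[j]) for j in range(N))
            for k in range(N)]

def cheb_eval(c, x):
    """Evaluate c_0/2 + sum_{k>=1} c_k T_k(x) by the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for k in range(len(c) - 1, 0, -1):
        b1, b2 = c[k] + 2 * x * b1 - b2, b1
    return c[0] / 2 + x * b1 - b2

p = lambda x: x ** 12          # a degree-12 polynomial
full = cheb_coeffs(p, 12)      # degree-12 interpolant: exact up to rounding
low = cheb_coeffs(p, 8)        # degree-8 approximant: already ~1e-2 accurate
grid = [i / 500.0 for i in range(-500, 501)]
err_full = max(abs(cheb_eval(full, x) - p(x)) for x in grid)
err_low = max(abs(cheb_eval(low, x) - p(x)) for x in grid)
print(err_full, err_low)
```

The degree-8 approximant of a degree-12 polynomial is accurate to about 1e-2 on the whole interval, in line with the O(sqrt(n k)) degree bound quoted above.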
The goal of this internship is to use approximation methods to speed up the existing root isolation methods for generic polynomials. URL sujet détaillé : https://members.loria.fr/GMoroz/internships/isolation_chebyshev.pdf
Remarques :





SM20762 Scalability of the interference analysis for a multicore platform


Description


In automotive and avionics, some programs are subject to critical timing constraints. In these real-time critical systems, the timing properties must be guaranteed. Among them is the worst-case execution time (WCET), which guarantees an upper bound on the execution time of programs. On a multicore platform, this worst-case execution time is highly influenced by all concurrent programs. In our work, we studied the Kalray MPPA2 platform (a set of multicores communicating through a network-on-chip, comprising 256 compute cores, designed by a company located in Montbonnot, near Grenoble) and designed an algorithm able to tightly bound the interference. Our interference analysis models the interference on the memory bus and the communications through the network-on-chip. This analysis showed very good results in terms of precision, but is limited in terms of scalability (the overall algorithm is O(n^4) in the number of tasks). We already have ideas on how to reduce the complexity and improve the scalability (by changing the way we iterate to find a solution), and on how to further improve the precision (through a better study of the communication, memory accesses and scheduling). The aim of this internship is to explore one of these improvement axes and implement it in our analysis tool.
URL sujet détaillé : http://www.enslyon.fr/LIP/CASH/wpcontent/uploads/2018/10/m2wcetscalability.pdf
Remarques : Co-supervised by Matthieu Moy (LIP) and Claire Maiza (Verimag). The internship can be physically hosted either in Grenoble or in Lyon.





SM20763 Programming models for optimised compilation


Description


The goal of this internship is to explore, through examples, optimization and parallelization opportunities.
The expected outcomes of the internship are:
- Identify primitives needed for efficient parallelism (e.g. efficient FIFO communication), and implement them as a library.
- Identify patterns in the application that map to parallel constructs, and prototype the transformation algorithms to these constructs by applying them manually.
- Generalize the concepts above as a programming language.
The internship could follow the steps below (it will most likely not treat them all, but rather focus on a few of them):
1. Understand a small HPC application (e.g. a kernel of a few tens of lines).
2. Apply optimizations by hand. The optimizations of interest here are mainly those using dataflow between parts of the program. For instance, a program containing two consecutive loops may in some cases be transformed into one process per loop, with a limited-size buffer between processes.
3. Design high-level programming constructs, for example inspired by actor-based models, that would allow splitting the program into multiple subprograms and help optimization. This would make explicit the composition and interactions of subprograms.
4. Rework the optimization phase (2) using these constructs.
5. Specify formally the semantics of the newly proposed constructs.
6. Specify formally the optimizations and prove their correctness with respect to the semantics.
7. Generalize: use the new model on other programs.
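The loop-to-process transformation mentioned in step 2 can be sketched on a toy example: two consecutive loops become a producer and a consumer communicating through a limited-size buffer. This is a hypothetical Python illustration (the internship targets HPC code, not Python); the bounded queue plays the role of the limited-size buffer.

```python
import threading, queue

def pipeline(xs, buffer_size=4):
    """Two consecutive loops (compute all y[i] = f(x[i]), then all
    z[i] = g(y[i])) rewritten as two processes linked by a bounded FIFO."""
    q = queue.Queue(maxsize=buffer_size)   # limited-size buffer
    out = []

    def producer():                        # first loop: y[i] = x[i]^2
        for x in xs:
            q.put(x * x)
        q.put(None)                        # end-of-stream marker

    def consumer():                        # second loop: z[i] = y[i] + 1
        while (y := q.get()) is not None:
            out.append(y + 1)

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start(); t1.join(); t2.join()
    return out

print(pipeline([1, 2, 3]))  # same result as running the two loops in sequence
```

The transformed version needs only `buffer_size` intermediate values in flight instead of a full intermediate array, which is precisely what makes the dataflow formulation attractive.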
This proposal is very flexible, and may evolve into:
- a more theoretical study, which would focus on points 5 and 6 and on the design of the programming model (but would still start with an example program and its optimization);
- or a more applied study, applied to several programs and focused on the realizability of the optimizations (implementation of a simple optimizing compiler). URL sujet détaillé : http://www.enslyon.fr/LIP/CASH/wpcontent/uploads/2018/10/m2intermediatemodel.pdf
Remarques : Co-advisors: Matthieu Moy and Christophe Alias (Cash team, LIP)





SM20764 Extracting Information from Moments


Description


Certification and validation of computational results is a major issue for modern sciences, raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of certification and information extraction. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or, more recently, the collision of Google's self-driving car in 2016.
These issues give rise to many mathematical problems. Polynomial optimization (which consists in computing the infimum of a polynomial function under algebraic constraints) is one of the most important, difficult and challenging ones. The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross-fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering. The backbone of this powerful methodology is the "moment-SOS" approach, also known as the "Lasserre hierarchy".
This research project aims at designing polynomial optimization algorithms to extract useful information from moment data.
In the context of global optimization, we are concerned with the design of solution extraction procedures, providing, for instance, the minimizer(s) of a given polynomial. For a finite set of minimizers, this corresponds to the support of a discrete measure. To improve upon existing extraction procedures, we intend to rely on the Christoffel polynomial associated with the matrix whose entries are the moments of such a measure. Previous research focused on approximating the set of minimizers for linear functionals coming from positive measures (e.g. convex combinations of Dirac measures). To find the set of minimizers of general nonnegative polynomials, we intend to study the Christoffel function associated with linear functionals coming from signed measures.
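The Christoffel function is concrete enough to compute exactly on a toy example. The sketch below is illustrative only (degree d = 3, uniform measure on [-1, 1], monomial basis): it builds the Hankel moment matrix M and evaluates Λ_d(x) = 1 / (v(x)^T M^{-1} v(x)), whose values decay quickly outside the support of the measure, which is the behaviour extraction procedures exploit.

```python
from fractions import Fraction

def moment_matrix(d):
    """Hankel moment matrix M[i][j] = ∫_{-1}^{1} x^{i+j} dx / 2
    for the uniform probability measure on [-1, 1]."""
    m = lambda k: Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)
    return [[m(i + j) for j in range(d + 1)] for i in range(d + 1)]

def solve(M, b):
    """Exact Gauss-Jordan elimination over the rationals: solve M z = b."""
    n = len(M)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                A[r] = [a - A[r][c] * ac for a, ac in zip(A[r], A[c])]
    return [A[r][n] for r in range(n)]

def christoffel(x, d=3):
    """Christoffel function Λ_d(x) = 1 / (v(x)^T M^{-1} v(x)),
    with v(x) the monomial basis (1, x, ..., x^d)."""
    M = moment_matrix(d)
    v = [Fraction(x) ** k for k in range(d + 1)]
    z = solve(M, v)                     # z = M^{-1} v(x)
    return 1 / sum(vi * zi for vi, zi in zip(v, z))

print(christoffel(0), christoffel(1), christoffel(2))
```

Inside the support the value stays moderate (Λ_3(0) = 4/9), while it collapses outside of it, so thresholding Λ_d localizes the support of the underlying measure.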
Several applications lead to approximating functions corresponding to measures supported on indicator functions or trajectories. However, it is typically hard to approximate them with polynomials of increasing degrees so as to obtain a pointwise convergence behavior. Such non-uniform convergence behaviors are related to an effect known as the Gibbs phenomenon. Another goal of this project is to explain and attenuate this phenomenon. The starting point will be to investigate classical optimization problems with available solutions. URL sujet détaillé : http://wwwverimag.imag.fr/~magron/sujets/extract.pdf
Remarques : Co-advised by Jean-Bernard Lasserre and Victor Magron.
The Master student will be hosted by the Mac team in the LAAS laboratory, located at Toulouse.
Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, signal processing, control, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.
Regular internship salary.





SM20765 Exact Certificates for Noncommutative Polynomial Optimization


Description


Certification and validation of computational results is a major issue for modern sciences, raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of exact certification. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or, more recently, the collision of Google's self-driving car in 2016. Since these systems often involve nonlinear functions, such as polynomials, it is highly desirable to design exact polynomial optimization methods. In 2001, Lasserre introduced a hierarchy of convex relaxations for approximating polynomial infima. Each relaxation is solved with a semidefinite programming (SDP) solver, implemented in finite precision, thus providing only approximate certificates. The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross-fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering.
This project intends to design hybrid symbolic/numeric algorithms to output exact certificates for optimization problems with polynomial data, involving noncommutative (NC) variables, e.g. matrices. NC polynomials make it possible to handle certain control systems (e.g. orbit problems or H-infinity control) and quantum computation problems (e.g. the violation level of Bell inequalities or bosonic energy computations). That is, one relies on expressions obtained by adding and multiplying matrices. A wide class of engineering problems has motivated major developments in this recently emerging area. Connections between linear control systems may produce highly complicated systems involving polynomial matrix coefficients. These hard systems can be analyzed thanks to an NC variant of Lasserre's hierarchy. This hierarchy shares the same interesting properties as in the commutative case, but also comes with certification and scalability issues.
The challenging goal of this internship is to design algorithms to compute exact certificates for noncommutative polynomial optimization problems, while controlling the bit complexity of the algorithmic procedures. Preliminary work will consist of studying the existing algorithms to obtain exact decompositions of nonnegative polynomials. In the commutative case, univariate polynomials have been recently studied by means of classical techniques from symbolic computation (root isolation, square-free decomposition). An extension to multivariate polynomials has been derived thanks to a perturbation/compensation algorithm.
A promising strategy is to adapt suitably this algorithm to the NC case. After designing the certification framework, further efforts should lead to provide the related bit complexity estimates, both on runtime and output size. Practical experiments shall be performed through implementing a tool within the Maple library RealCertify. One expected goal is to compare the performance of the tool with existing frameworks. URL sujet détaillé : http://wwwverimag.imag.fr/~magron/sujets/multivsohs.pdf
Remarques : Co-advised by Jean-Bernard Lasserre and Victor Magron.
The Master student will be hosted by the Mac team in the LAAS laboratory, located at Toulouse.
Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, signal processing, control, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.
Regular internship salary.





SM20766 Exact Semidefinite Approximations of Reachable Sets


Description


Certification and validation of computational results is a major issue for modern sciences, raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need of exact certification. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or, more recently, the collision of Google's self-driving car in 2016.
These issues give rise to many mathematical problems. Polynomial optimization (which consists in computing the infimum of a polynomial function under algebraic constraints) is one of the most important, difficult and challenging ones. The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross-fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering.
Consider for instance the problem of minimizing 4x^4 + 4x^3y - 7x^2y^2 - 2xy^3 + 10y^4 over R^2. One way to certify that its minimum is 0 is to decompose this polynomial as a sum of squares (SOS), which is the core subject of study in real algebra. Here the decomposition is (2xy + y^2)^2 + (2x^2 + xy - 3y^2)^2 >= 0. In general, one can compute such SOS decompositions by solving a semidefinite program (SDP), which is a standard tool in applied mathematics and convex optimization. In SDP, one optimizes a linear function under the constraint that a given matrix is positive semidefinite, i.e. has only nonnegative eigenvalues. One particular issue arising when relying on SDP solvers is that they are often implemented using floating-point arithmetic, and thus output only approximate certificates.
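Such an SOS identity can be verified mechanically. Since two bivariate polynomials of degree 4 that agree on an 11 × 11 integer grid are identical, the short plain-Python check below is a complete (if low-tech) verification of the certificate:

```python
from itertools import product

def p(x, y):
    """The polynomial whose nonnegativity we want to certify."""
    return 4*x**4 + 4*x**3*y - 7*x**2*y**2 - 2*x*y**3 + 10*y**4

def sos(x, y):
    """Candidate sum-of-squares certificate."""
    return (2*x*y + y**2)**2 + (2*x**2 + x*y - 3*y**2)**2

# Agreement on an 11 x 11 grid forces equality of the two degree-4 forms,
# and a sum of squares is trivially nonnegative, so p >= 0 everywhere.
assert all(p(x, y) == sos(x, y) for x, y in product(range(-5, 6), repeat=2))
print("SOS certificate verified")
```

This is exactly the kind of exactness an SDP solver cannot deliver on its own: the solver produces a numerical Gram matrix, and turning it into an identity that checks exactly, as above, is the rounding/certification step the project is about.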
Given a dynamical polynomial system described by a continuoustime (differential) equation, the reachable set (RS) is the set of all states that can be reached from a set of initial conditions under general state constraints. This set appears in different fields such as optimal control, hybrid systems or program analysis. In general, computing or even approximating the RS is a challenge.
The ambitious goal of this project is to compute exact enclosures of RS for continuoustime systems.
Preliminary work will consist of studying the existing algorithms to obtain exact SOS decompositions of nonnegative polynomials. In particular, the case of multivariate polynomial optimization has been recently handled, thanks to a perturbation/compensation algorithm.
A promising research track is to design a similar algorithm to compute certified approximations for the reachable set of continuoustime polynomial systems. The first step shall be to derive a characterization of the RS, following the scheme developed for discretetime systems.
Practical experiments shall be performed through implementing a tool within the Maple library RealCertify. A further expected goal is to compare the performance with existing frameworks. URL sujet détaillé : http://wwwverimag.imag.fr/~magron/sujets/reachode.pdf
Remarques : The internship will be co-advised by Victor Magron (CNRS) and Khalil Ghorbal (Hycomes group, INRIA Rennes, France). The Master student will be hosted either at Inria (Rennes, France) in the Hycomes group, or at LAAS CNRS (Toulouse, France) in the MAC team.
Motivated candidates should hold a Bachelor degree and have a solid background in either optimization, computer arithmetic, real algebraic geometry or computer algebra. Good programming skills are also required. Knowledge of French is not a prerequisite.
Regular internship salary.





SM20767 Coinductive equivalences for extended lambda-calculi


Description


Coinductive proof techniques have been developed to reason about various kinds of foundational calculi, notably concurrent process calculi such as CCS and the pi-calculus, the lambda-calculus and some of its extensions.
In such a setting, the (coinductive) bisimulation proof method can be enhanced with various techniques, which make the development of proofs of equivalences between programs easier.
The goal of this internship is to study coinductive proof techniques for non-deterministic extensions of the lambda-calculus. See the PDF file for a short bibliography, which may give an impression of the kind of work expected during this internship. URL sujet détaillé : http://perso.enslyon.fr/daniel.hirschkoff/Stages/dhlambda.pdf
Remarques : The internship will be co-supervised with Davide Sangiorgi (Univ. Bologna and INRIA). The internship can be funded.
A PhD after the internship is among the options.





SM20768 Big Data architecture for agriculture application


Description


The LISITE laboratory of ISEP is working with an industrial partner (Cap2020) on an innovative smart-farming project, "IDEAC". The IDEAC project aims to help farmers make precise and predictive decisions for maximum productivity, greater profits and more sustainability. An open-source architecture has been proposed to address the different challenges (heterogeneous data collection (sensors, server APIs, social networks, ...), big data storage, data analytics, ...). In this context, we are looking for a highly motivated intern willing to join the project. The goal of this internship is to build a use-case scenario to validate the architecture. The intern will have the following responsibilities:
- Installing the architecture components on a virtual machine (documentation is provided).
- Understanding Apache NiFi's architecture and dataflow concepts.
- Ingesting data using provided APIs and identifying how to parse heterogeneous data into a normalized format.
- Creating a script to query HBase.
Keywords: Apache NiFi, Hadoop, HBase, ...
Requirements:
- Master, Bac+5 in computer science / engineering or equivalent
- Familiar with Big Data technologies (Big Data processing and management) and distributed systems: Hadoop, Spark, HBase
- Good scripting and coding skills (Python, Java or C++)
- Autonomous and able to work in a team
URL sujet détaillé :
Remarques : Duration: 4 to 6 months. Start date: as soon as possible.
Grant: yes
Hosting team: LISITE, Institut Supérieur d'Électronique de Paris (ISEP) (https://www.isep.fr/)
Application instructions: To apply, please send an email and CV to raja.chiky.fr





SM20769 Automatic Verification of an Abstract Machine for Synthesized Operating System Schedulers


Description


The goal of this internship is to build a verifier for implementations of schedulers derived by compiling high-level descriptions in a domain-specific language. Schedulers are complex and critical components of all operating systems, and they are very challenging to verify in general. However, this internship will rely on knowledge of the domain-specific language and its runtime in order to identify the key properties to verify and the abstractions that make the verification achievable. An abstract interpreter for programs manipulating complex data structures will be used as a basis to set up abstract domains specific to the scheduling problem. URL sujet détaillé : https://www.di.ens.fr/~rival/interns/stagesched.pdf
Remarques : The internship is taking place as part of a funded ANR project (so that funding is available both for the internship and for a PhD thesis).





SM20770 Separation Logic with Summary Predicates Described by Set Universal Quantification


Description


The goal of this internship is to extend separation logic with a novel summarization technique that involves no recursion and is thus adapted to data structures without an inductive definition. This includes graphs and union-find data structures. Summaries should make it possible to describe memory areas of unbounded size. Furthermore, they should be used to create an abstract domain able to support the automatic verification of programs that manipulate unbounded, non-recursive data structures. The implementation will be based on MemCAD, an existing abstract-interpretation-based static analyzer. URL sujet détaillé : https://www.di.ens.fr/~rival/interns/stagesetext.pdf
Remarques : The internship is taking place as part of a funded ANR project (so that funding is available both for the internship and for a PhD thesis).





SM20771 Reversible Causal Graph Dynamics


Description


Cellular Automata consist of a grid of identical cells, each of which may take a state from a finite set. The state of each cell at time t + 1 is computed by applying a fixed local rule to the state of that cell at time t and that of its neighbours, synchronously and homogeneously across space.
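This local-rule mechanism is compact enough to sketch in a few lines. The following is a purely illustrative 1-D cellular automaton on a ring (not part of the subject itself), using Wolfram's rule-number encoding: the next state of a cell is a bit of the rule number indexed by the (left, self, right) neighbourhood.

```python
def step(cells, rule=90):
    """One synchronous update of a 1-D cellular automaton on a ring.
    Each cell's next state is bit number (4*left + 2*self + right)
    of the Wolfram rule number; rule 90 is 'new = left XOR right'."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4
                      + cells[i] * 2
                      + cells[(i + 1) % n])) & 1
            for i in range(n)]

world = [0] * 5 + [1] + [0] * 5          # a single live cell
for _ in range(4):
    world = step(world)
    print("".join(".#"[c] for c in world))  # a Sierpinski-like pattern grows
```

The update is shift-invariant (the same rule everywhere) and causal (each cell depends only on a radius-1 neighbourhood), the two symmetries that Causal Graph Dynamics lift from fixed grids to time-varying graphs.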
Causal Graph Dynamics [1] extend Cellular Automata from fixed grids to time-varying graphs. That is, the whole graph evolves in discrete time steps, and this global evolution is required to have a number of physics-like symmetries: shift-invariance (it acts everywhere the same) and causality (information has a bounded speed of propagation). One could also demand reversibility (the whole evolution is a bijection over the set of labelled graphs). That way we obtain a Computer-Science-inspired toy model that has features mimicking both quantum theory (reversibility) and general relativity (evolving topology). URL sujet détaillé : https://drive.google.com/file/d/1j07HykXuEo4qrx4qTtC3eG01U_YtaraN/view?usp=sharing
Remarques : Usual remuneration. Co-supervision: see the detailed subject.





SM20772 Integration of V2V components structured by CBL clustering into a vehicular network infrastructure – Evaluation on an ADAS-related case study


Description


In the context of a research project on Advanced Driver-Assistance Systems (ADAS), we are developing a simulation testbed for evaluating various scenarios of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. The goal of this internship is to design and realize the scenarios that will allow evaluating a clustering algorithm, initially proposed for structuring a Vehicular Ad Hoc Network based only on V2V communications, in a context where an infrastructure also allows V2I communications.
After an initiation to network and traffic modeling with tools such as OPNET, MATLAB and SUMO, the research work will start with a study of the clustering scheme Chain-Branch-Leaf (CBL), designed at IFSTTAR in a collaboration with ULCO [1], and of the existing scenarios and applications already modeled in OPNET Riverbed Modeler.
Using real-world road traffic data, the trainee will design and model a deployment of 802.11p Road Side Units (RSUs) representing the infrastructure in the area where 802.11p-equipped vehicles will be moving. They will then propose various strategies for an accurate distribution of the traffic generated by the vehicles between V2V and V2I communications.
Finally, they will apply these strategies to the transmission of the traffic of some ADAS applications, and evaluate their efficiency using the simulation testbed they will have built.
Please note that a PhD subject is proposed in the same team in relation to this internship subject. URL sujet détaillé : https://www.abg.asso.fr/en/candidatOffres/show/id_offre/80946
Remarques : This work will be co-supervised by Patrick Sondi, MCF at ULCO (LISIC)
http://wwwlisic.univlittoral.fr/spip.php?article50&membre...
Payment:
The internship is paid €26.25/day.





SM20773 Arithmetic and casting in Lean


Description


Proof assistants (also called interactive theorem provers) make it possible to develop computer-checked, formal proofs of theorems. Lean is a new proof assistant, based on a dependent type theory similar to Coq's. Constructing formal proofs by hand can be tedious, especially when dealing with simple numeric or arithmetic facts. We often want the system to manage such proofs for us. The goal of this project is to adapt tools for integer arithmetic and casting to Lean. Using Lean's metaprogramming framework, these tools can be implemented in the language of Lean itself. URL sujet détaillé : https://leanforward.github.io/internships/arith_cast.pdf
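As a taste of the goals such tools should discharge automatically, here is a small Lean 3 sketch (assuming mathlib is available; `norm_num` and `norm_cast` are existing mathlib tactics for numeric goals and cast-manipulation, respectively):

```lean
import tactic

-- a concrete arithmetic goal, closed by numeric normalization
example : (2 : ℤ) * 3 + 1 = 7 := by norm_num

-- a goal mixing casts between ℕ and ℤ, closed by cast normalization
example (n : ℕ) : ((n + 1 : ℕ) : ℤ) = (n : ℤ) + 1 := by norm_cast
```

The project is about providing (and porting) this kind of automation, implemented via Lean's metaprogramming framework rather than supplied by hand.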
Remarques : The project will be co-advised by Jasmin Blanchette. In addition to any compensation offered by the intern's home institution, we offer €250 per month and will reimburse travel expenses to Amsterdam.





SM20774 Designing High Performance Parallel Algorithms for Tensor Computations


Description


A tensor is a multidimensional array that generalizes matrices and vectors (i.e., a matrix is a 2D tensor and a vector is a 1D tensor) to express data/objects of high dimensions (e.g., shopping habits on Amazon can be modelled using a 3D tensor with dimensions user × product × time). Tensors have gained immense popularity in the recent past, with fundamental applications in numerous domains including signal processing, quantum chemistry/physics, scientific computing, big data analysis, and machine learning. In parallel with this, there has been a significant effort to provide fast tensor libraries with high-performance parallel implementations of tensor kernels in various settings (shared memory, distributed memory, and GPU parallelism) and even hardware implementations (e.g., Tensor Cores in Nvidia Turing/Volta and the Tensor Processing Unit of Google). High-performance tensor computation is an ongoing research area with enormous impact and many challenges to overcome.
Sparse tensors have been adopted by many applications in data analytics, including recommender systems, anomaly detection, graph analytics and healthcare data analysis, in which data is sparse (meaning most entries in the tensor representing the data are zero). In these applications, datasets typically contain millions to billions of entries. This renders analyzing such datasets computationally very expensive, hence the need for very efficient parallel algorithms and implementations. Sparse tensor algorithms operate only on the nonzero elements of the tensor, and the irregularity of the sparsity pattern introduces many challenges, such as irregular memory accesses, load imbalance and unbalanced communication costs, which make obtaining scalable parallel algorithms difficult.
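As a toy illustration of the "operate only on nonzeros" principle, a sparse tensor can be stored in coordinate (COO) form as a map from index tuples to values; a tensor-times-vector kernel then visits each nonzero exactly once. The names and layout below are illustrative, not any particular library's API:

```python
def ttv(coo, v, mode):
    """Multiply a sparse COO tensor (dict: index tuple -> value) by a vector
    along the given mode; the result is a sparse tensor of one fewer mode.
    Work is proportional to the number of nonzeros, not the dense size."""
    out = {}
    for idx, val in coo.items():
        rest = idx[:mode] + idx[mode + 1:]   # drop the contracted mode
        out[rest] = out.get(rest, 0.0) + val * v[idx[mode]]
    return out

# a 2 x 2 x 3 tensor with 4 nonzeros, e.g. (user, product, time) -> rating
T = {(0, 0, 0): 1.0, (0, 1, 2): 2.0, (1, 0, 1): 3.0, (1, 1, 2): 4.0}
print(ttv(T, [1.0, 10.0, 100.0], mode=2))
```

The irregular access pattern is already visible here: which entries of `v` and `out` are touched depends entirely on the (data-dependent) sparsity structure of `T`, which is what makes load balancing and memory locality hard at scale.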
Dense tensor decompositions have been increasingly used in the solution of linear systems (Ax = b) and eigenvalue problems (Ax = λx) that appear in scientific simulations, providing substantial gains with respect to matrix-based techniques. They enable expressing the problem (the matrix A and the vectors x, b, etc.) in a compressed tensor form, then provide a framework that enables carrying out all matrix and vector operations in this compressed form (matrix-vector and matrix-matrix multiplication, vector inner products, matrix or vector addition and scaling, etc.). Carrying out these operations as fast as possible with modern HPC methods is indispensable for executing large-scale scientific simulations in a timely manner.
In both scenarios, bringing HPC techniques into tensor computations is an utmost need in big data analysis and scientific computing, and this will be the principal focus of this internship. The internship can lead to a PhD subject for good candidates. URL sujet détaillé : https://drive.google.com/file/d/0BxUrvV1GfK0wM3RGM2pzMHg2MTdfaVRxejNla2pMVk54emVV/view?usp=sharing
Remarques : The internship will be co-advised by Prof. Marc Baboulin (https://www.lri.fr/~baboulin/). There is also a possibility of funding for the internship (contact Prof. Baboulin and Kaya).





SM20775 Variations on the cake-cutting theorem


Description


Fair division is an active research area in combinatorics, social choice theory, computer science, etc. One of the main topics in this area is envy-free cake cutting, where the players have different preferences and the goal is to find a division of a cake (identified with the interval [0,1]) into connected pieces so that it is possible to assign the pieces to the players without making any of them jealous.
The existence of such an envy-free division is ensured under mild conditions. Recently, the question of whether these conditions could be simplified even further has arisen. For instance, what can be said if some players dislike the cake? In the traditional setting, they are all "hungry", and always prefer to get something instead of nothing.
Another question is related to a discrete version of the cake-cutting problem. What if the cake can be cut only at some specific positions (e.g., the cake is a necklace and the beads cannot be cut)? Is it possible to find an "approximate" envy-free division?
All these questions also come with algorithmic challenges.
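To get a feel for the discrete (necklace) variant, small instances can be decided by brute force over cut positions and piece assignments. The sketch below is purely illustrative: `values[p][b]` is player p's additive value for bead b, pieces are contiguous runs of beads, and a division is envy-free when every player values their own piece at least as much as every other piece.

```python
from itertools import combinations, permutations

def envy_free_division(values):
    """Brute-force search for an envy-free division of a necklace into
    contiguous pieces, one per player. Returns (cuts, assignment), where
    assignment[p] is the (lo, hi) bead range of player p, or None."""
    k, n = len(values), len(values[0])

    def val(p, piece):
        lo, hi = piece
        return sum(values[p][lo:hi])

    for cuts in combinations(range(1, n), k - 1):   # k - 1 cut positions
        bounds = [0, *cuts, n]
        pieces = list(zip(bounds, bounds[1:]))       # contiguous pieces
        for assign in permutations(pieces):          # who gets which piece
            if all(val(p, assign[p]) >= val(p, q)
                   for p in range(k) for q in pieces):
                return cuts, assign
    return None

print(envy_free_division([[3, 1, 1, 1], [1, 1, 1, 3]]))   # a division exists
print(envy_free_division([[2, 1], [2, 1]]))               # None: no exact one
```

The second instance shows why the discrete question is interesting: with identical valuations concentrated on one bead, no exact envy-free division into connected pieces exists, motivating the "approximate" relaxation mentioned above.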
The objective of this research project is to study possible extensions and variations of the envy-free cake-cutting theorem, starting with very concrete questions such as the ones mentioned above. URL sujet détaillé :
Remarques : Stipend possible.





SM20776 Formalization of polyhedra in the proof assistant Coq


Description


Convex polyhedra are the subsets of R^n defined by systems of linear (affine) inequalities. They play a major role in pure mathematics (algebraic geometry, discrete maths, combinatorics), in applied mathematics (optimization, operations research, control theory) as well as in computer science (computational geometry, software verification, compilation, program optimization, constraint solving). The growing role of mathematical software in the study of open problems in mathematics, as well as the critical nature of the aforementioned applications, provide a strong motivation to formalize the theory of polyhedra in a proof assistant. A first milestone in this direction has been reached in the recent work [1]. This has led to the development of the library CoqPolyhedra (https://github.com/nhojem/CoqPolyhedra).
The internship is motivated by the project of formalizing the theory of polyhedra in Coq and its applications to mathematical problems. We propose two examples of subjects, which can be adapted according to the taste of the candidate. Both naturally lead to a PhD on the topic. The first subject aims at developing the computational abilities of the library CoqPolyhedra, i.e., making it possible to formally prove mathematical properties on large instances of polyhedra. The objective is to limit the slowdown incurred by doing computations within Coq (compared with an informal polyhedral computation library), in order to make an integration of CoqPolyhedra into mathematical software practically possible. The second subject is about the abstract representation of polyhedra in the proof assistant. Indeed, polyhedra admit many possible representations, and the choice of a good representation depends on the context. The goal is to find a formalization allowing to manipulate the different representations of polyhedra as easily as in a pen-and-paper proof.
[1] Xavier Allamigeon and Ricardo D. Katz. A formalization of convex polyhedra based on the simplex method. Journal of Automated Reasoning, Aug 2018. URL sujet détaillé : http://www.cmap.polytechnique.fr/~allamigeon/M2_internship_ENS_Lyon.pdf
Remarques :





SM20777 Local computations for distributed data




SM20778 Formal models for probabilistic distributed algorithms


Description


Introduction, within a formal framework, of probabilistic components for the behavior of autonomous robot swarms. Case study: extension of current solutions for scattering. URL sujet détaillé : https://perso.liris.cnrs.fr/xavier.urbain/ens/M2IF/1819_proba.pdf
Remarques : Co-supervised by S. Tixeuil (LIP6) and P. Courtieu (Cedric)





SM20779 Backward Abstract Interpretation using Over- and Under-Approximations


Description


The goal of the internship is to develop backward value analyses in a static analyzer by abstract interpretation, in a realistic setting, for C-like programs featuring notably functions and pointers. The intern will consider both over-approximations and under-approximations.
The theory of abstract interpretation allows the design of effective and efficient static analyzers able to compute approximations of program semantics. The overwhelming majority of analyzers compute over-approximations:
- A classic forward analysis computes an over-approximation of the states reachable from the program entry.
- A classic backward analysis computes the co-reachable states from some desired states. This set is also over-approximated, so as to infer necessary conditions for the program to reach the desired states.
- Over-approximating forward and backward analyses can be iterated, either to provide automatically some context for the false positives of an imprecise analysis, or to provide interactive abstract debugging.
In theory, an under-approximating backward analysis could instead provide sufficient conditions for a target program state to be reached. Applications include:
- Proving that an alarm is a true error and not a false alarm, by inferring a counterexample execution.
- Inferring procedure contracts: sufficient assertions to insert at the beginning of a procedure to ensure that its execution will never fail.
- Evaluating the uncertainty of an analysis by combining both over-approximations and under-approximations, and quantifying the distance between them. Such analyses may be iterated to improve their precision.
The lack of effective under-approximating infinite-state abstract domains has limited the development of abstract interpretation techniques to solve these problems. Effective under-approximations have been proposed for classic numeric domains (intervals, polyhedra) to infer sufficient conditions for the absence of array bound errors in simple programs, and later found applications in proving liveness properties. These can serve as the basis for the internship, but will require extensions to ensure precision, scalability, and support for non-numeric variables such as pointers.
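As a toy illustration of the backward direction, here is a minimal sketch of a backward interval analysis (all variables assumed real-valued). Given a target interval for x after a straight-line program, we push the interval backwards through each assignment. For the invertible affine operations used here the backward transfer is exact, so the over- and under-approximations coincide; for non-invertible operations they would differ, which is precisely where the internship's questions begin.

```python
# Backward interval transfer functions for two simple assignments.

def back_add(post, c):
    """Precondition of `x := x + c` given a postcondition interval for x."""
    lo, hi = post
    return (lo - c, hi - c)

def back_mul(post, c):
    """Precondition of `x := x * c` for a positive constant c."""
    lo, hi = post
    return (lo / c, hi / c)

# Program: x := x + 3; x := x * 2.   Target: x in [10, 20] at the end.
post = (10, 20)
pre = back_add(back_mul(post, 2), 3)
print(pre)  # (2.0, 7.0): any x in [2, 7] is guaranteed to reach the target.
```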
The intended work will include a theoretical side: developing abstract semantics and proving formally their soundness. It will also include a practical side: implementing the semantics and validating their benefit experimentally.
The host team is developing a static analysis platform, Mopsa, that includes an analysis of C programs using several ready-to-use abstractions, and a framework to easily extend it to new abstractions. A first step will be to add support for backward iterations in Mopsa, and to evaluate classic over-approximating backward analyses on programs featuring numeric and pointer variables. Then, the intern will implement and evaluate the novel over-approximating and under-approximating backward operators developed during the internship. Feedback from experiments throughout the internship will guide the design of new abstractions tailored to concrete problems in realistic settings. URL sujet détaillé : https://wwwapr.lip6.fr/~mine/enseignement/mpri/20182019/stages/ai_back.pdf
Remarques : Internship offered within the ERC Mopsa project.





SM20780 Static Analysis of Ethereum Smart Contracts by Abstract Interpretation


Description


The goal of the internship is to develop, prove correct, and implement novel static analyses by Abstract Interpretation to verify the correctness of smart contracts.
The internship focuses on Ethereum, a cryptocurrency framework that employs blockchain techniques and supports the execution of smart contracts. Smart contracts can be written in a variety of languages, which are compiled into bytecode that is run on the Ethereum Virtual Machine (EVM). The EVM provides a Turing-complete, stack-based machine with a byte-array-based memory and an associative storage (with 256-bit keys and values). The EVM is completely specified formally, giving it an unambiguous semantics. Hence, the internship will focus on analyzing EVM bytecode.
The application of formal methods to smart contracts is a recent area of research, with few results yet. Examples include preliminary analyses using type and effect systems with F*, and using model checking with Spin. We are not aware of works based on Abstract Interpretation.
Several causes of vulnerabilities in Ethereum smart contracts have been uncovered. Given the novelty of the subject, a preliminary study must be performed to determine precisely which vulnerabilities can be efficiently detected by static analysis, and which properties must be inferred to detect them. Nevertheless, one promising direction is to focus on vulnerabilities related to exceptional behaviors (i.e., runtime errors). The intern may focus primarily on the following analyses:
- Checking exceptions. Erroneous operations in the execution of a contract cause exceptions to be raised and the contract to be terminated. Some exceptions, though, are propagated back to the caller contract, while others result in return codes, which may not be checked appropriately, causing vulnerabilities. A static analysis could detect that possible exceptions are correctly handled, even through calls to contracts.
- Call stack depth. The EVM stack depth is limited to 1024, and exceeding this limitation causes an exception that may not be properly handled by a contract and can be the basis of attacks. A static stack depth analysis can mitigate this kind of error.
- Atomicity vulnerabilities. Contracts do not operate atomically: a contract can call another contract, which can access and exploit, for ill effects, the intermediate state the calling contract is in. For instance, a reentrancy bug, where a contract is called again from within its own execution, is the basis of the attack on The DAO, which caused a 60 million US dollar loss in June 2016.
- Transaction fees. Every operation in the EVM comes with a cost, measured in an abstract unit called gas. The cost of a contract is specified upfront, and the execution stops (with an exception) whenever this cost is exceeded. A cost analysis can check statically whether a contract will meet its cost target. Such an analysis can leverage the large body of literature on worst-case execution time of binary programs.
The intern will leverage existing abstractions, such as numeric abstractions, and develop new abstractions if needed. Most static analyses target source-level programs. Analyzing bytecode-level programs poses additional challenges due to the very low level of the instructions, and the tendency of fine-grained abstract transfer functions to accumulate imprecision. Methods including the development of specific relational domains or high-level expression reconstruction may be required to reach the intended level of precision.
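As a flavor of the simplest of these analyses, the sketch below statically bounds the operand-stack height of a straight-line bytecode sequence against the 1024 limit (the call-depth variant mentioned above works on an analogous counter). The (pops, pushes) arities follow the EVM specification for these opcodes, but the instruction sequence itself is purely illustrative.

```python
# Toy static analysis: track the stack height over a straight-line
# instruction sequence and check it never exceeds the EVM limit of 1024.

STACK_LIMIT = 1024
EFFECT = {            # opcode: (items popped, items pushed)
    "PUSH1": (0, 1),
    "DUP1":  (1, 2),  # reads the top item, pushes a copy back
    "ADD":   (2, 1),
    "POP":   (1, 0),
}

def max_stack_depth(code, start=0):
    """Worst-case stack height reached while executing `code`."""
    depth, worst = start, start
    for op in code:
        pops, pushes = EFFECT[op]
        if depth < pops:
            raise ValueError(f"stack underflow at {op}")
        depth += pushes - pops
        worst = max(worst, depth)
    return worst

code = ["PUSH1", "PUSH1", "DUP1", "ADD", "POP", "POP"]
print(max_stack_depth(code))                 # 3
print(max_stack_depth(code) <= STACK_LIMIT)  # True
```

A real analysis must of course handle jumps and calls, where the abstract-interpretation machinery (joins, widening) becomes necessary.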
The intended work will include a theoretical side: developing abstract semantics and formally proving their soundness. It will also include a practical side: implementing the semantics in a static analyzer and validating their benefit experimentally on sample smart contracts. The host team is developing a static analysis platform, Mopsa, that includes several ready-to-use abstractions, and a framework that allows easily extending the analyses both to new abstractions and to new languages. For experimental purposes, the intern will thus write an EVM bytecode frontend for Mopsa, reuse existing Mopsa abstractions, and add the abstractions necessary to carry out the static analysis. URL sujet détaillé : https://wwwapr.lip6.fr/~mine/enseignement/mpri/20182019/stages/ai_contract.pdf
Remarques : Internship offered within the ERC Mopsa project.





SM20781 Dataflow analysis of malicious binary codes: toward a cartography of functionalities and their correlations.


Description


The Carbone team at Loria has developed, with the support of the High Security Lab (LHS), an innovative method called morphological analysis. This method can detect code similarities. It can also detect functionalities embedded in a binary code and detect malware. The objective is to rebuild the dataflow graph in order to map the set of functionalities used inside a malicious code. URL sujet détaillé : https://members.loria.fr/JYMarion/internshipsubjectsujetstagem2/
Remarques : The internship will be paid, and accommodation can be covered.





SM20782 Multi-Flows and Multi-Cuts in Planar Graphs


Description


Given a graph G=(V,E) with edge capacities, and pairs of terminals (s_i, t_i), the multicut problem asks to remove a minimum-capacity set of edges so that each pair of terminals is separated. The maximal integral flow problem asks to send a maximum amount of integral flow between the pairs of terminals, while respecting the capacities. The latter problem includes the edge-disjoint paths problem as a special case.
The two problems are classical problems in theoretical computer science. This project will focus on the special case of planar graphs and aims to design exact, approximation, and fixed-parameter algorithms for the two problems.
Knowledge of graph theory, especially of planar graphs, will be much appreciated for people interested in this project. URL sujet détaillé :
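For a single terminal pair the two problems coincide by the max-flow/min-cut theorem, which makes that special case polynomial. The sketch below is a plain Edmonds-Karp implementation (BFS augmenting paths) on a small hypothetical capacitated graph; the multi-pair versions studied in this project are much harder.

```python
# Single-pair max-flow via Edmonds-Karp (BFS shortest augmenting paths).
from collections import deque

def max_flow(cap, s, t):
    """cap[u][v] = capacity of edge (u, v); returns the maximum s-t flow."""
    n = len(cap)
    residual = [row[:] for row in cap]
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if residual[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:              # no augmenting path left
            return flow
        bottleneck, v = float("inf"), t  # find the bottleneck capacity
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = t                            # augment along the path
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# A small planar example: s = 0, t = 3.
cap = [[0, 3, 2, 0],
       [0, 0, 1, 2],
       [0, 0, 0, 3],
       [0, 0, 0, 0]]
print(max_flow(cap, 0, 3))  # 5
```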
Remarques : Salary is possible





SM20783 Natural Language Processing


Description


According to the 2017 International Migration Report, the number of international migrants worldwide has grown rapidly in recent years, reaching 258 million in 2017, of whom 78 million are in Europe. A key reason for the difficulty of EU leaders in taking a decisive and coherent approach to the refugee crisis has been the high level of public anxiety about immigration and asylum across Europe. There are at least three social factors underlying this attitude (Berri et al., 2015): the increase in the number and visibility of migrants; the economic crisis, which has fed feelings of insecurity; and the role of mass media. The last factor has a major influence on the political attitudes of the general public and the elite. Refugees and migrants tend to be framed negatively as a problem. This translates into a significant increase in hate speech towards migrants and minorities. The Internet seems to be a fertile ground for hate speech (Knobel, 2012).
The goal of this Master internship is to develop a new methodology to automatically detect hate speech, based on machine learning and neural networks. Human detection of this material is infeasible, since the volume of content to be analyzed is huge. In recent years, research has been conducted to develop automatic methods for hate speech detection in the social media domain. These typically employ semantic content analysis techniques built on Natural Language Processing (NLP) and Machine Learning (ML) methods (Schmidt et al., 2017). Although current methods have reported promising results, their evaluations are largely biased towards detecting content that is non-hate, as opposed to detecting and classifying real hateful content (Zhang et al., 2018). Current machine learning methods use only certain task-specific features to model hate speech. We propose to develop an innovative approach that combines these pieces of information into a multi-feature approach, so that the weaknesses of the individual features are compensated by the strengths of other features (explicit hate speech, implicit hate speech, contextual conditions affecting the prevalence of hate speech, etc.).
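The multi-feature idea can be sketched very crudely: two weak hand-crafted features (a hypothetical abuse lexicon and a "shouting" ratio, both invented here for illustration) are combined into one score, so that a text missed by one feature can still be caught by the other. A real system would learn the weights and use far richer features (embeddings, context, implicit signals).

```python
# Toy multi-feature scorer: linear combination of two weak features,
# squashed through a logistic function. Lexicon, weights, and bias are
# illustrative placeholders, not a real model.
import math

ABUSE_LEXICON = {"vermin", "parasites", "invaders"}  # toy placeholder list

def lexicon_score(text):
    words = text.lower().split()
    return sum(w.strip(".,!?") in ABUSE_LEXICON for w in words) / max(len(words), 1)

def shouting_score(text):
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / max(len(letters), 1)

def combined_score(text, w=(6.0, 2.0), bias=-1.5):
    z = w[0] * lexicon_score(text) + w[1] * shouting_score(text) + bias
    return 1 / (1 + math.exp(-z))  # probability-like score in [0, 1]

print(combined_score("They are parasites!") > 0.5)         # True
print(combined_score("The weather is nice today.") > 0.5)  # False
```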
References
Baroni, M., Dinu, G., and Kruszewski, G. (2014). "Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 238-247.
Berri, M., Garcia-Blanco, I., and Moore, K. (2015). Press Coverage of the Refugee and Migrant Crisis in the EU: A Content Analysis of Five European Countries. Report prepared for the United Nations High Commissioner for Refugees, Cardiff School of Journalism, Media and Cultural Studies.
Dai, A. M. and Le, Q. V. (2015). "Semi-supervised sequence learning". In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages 3061-3069. Curran Associates, Inc.
Dong, L., Wei, F., Tan, C., Tang, D., Zhou, M., and Xu, K. (2014). "Adaptive recursive neural network for target-dependent Twitter sentiment classification". In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), Baltimore, MD, USA, Volume 2, pages 49-54.
Iyyer, M., Manjunatha, V., Boyd-Graber, J., and Daumé, H. (2015). "Deep unordered composition rivals syntactic methods for text classification". In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 1681-1691.
Johnson, R. and Zhang, T. (2015). "Effective use of word order for text categorization with convolutional neural networks". In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 103-112.
Kim, Y. (2014). "Convolutional neural networks for sentence classification". In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746-1751.
Knobel, M. (2012). L'Internet de la haine. Racistes, antisémites, néonazis, intégristes, islamistes, terroristes et homophobes à l'assaut du web. Paris: Berg International.
Mikolov, T., Yih, W.-t., and Zweig, G. (2013a). "Linguistic regularities in continuous space word representations". In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 746-751.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). "Distributed representations of words and phrases and their compositionality". In Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc.
Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., and Fürnkranz, J. (2014). "Large-scale multi-label text classification - revisiting neural networks". In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2014), Part 2, Volume 8725, pages 437-452.
Schmidt, A. and Wiegand, M. (2017). "A survey on hate speech detection using natural language processing". In Proceedings of the Workshop on Natural Language Processing for Social Media.
Zhang, Z. and Luo, L. (2018). "Hate speech detection: a solved problem? The challenging case of long tail on Twitter". arxiv.org/pdf/1803.03662 URL sujet détaillé : https://team.inria.fr/multispeech/masterr2internshipinnaturallanguageprocessingonlinehatespeechagainstmigrants/
Remarques : Supervisors: Irina Illina, MdC, Dominique Fohr, CR CNRS
Required skills: background in statistics, natural language processing, and computer programming (Perl, Python). Candidates should email a detailed CV with diplomas.





SM20784 Natural Language Processing


Description


Motivations and context
Semantic and thematic spaces are vector spaces used for the representation of words, sentences or textual documents. The corresponding models and methods have a long history in the field of computational linguistics and natural language processing. Almost all models rely on the statistical semantics hypothesis, which states that statistical patterns of word occurrence (the context of a word) can be used to describe the underlying semantics. The most common method to learn these representations is to predict a word using the context in which it appears [Mikolov et al., 2013b, Pennington et al., 2014], which can be realized with neural networks. These representations have proved their effectiveness for a range of natural language processing tasks [Baroni et al., 2014]. In particular, the Skip-gram and CBOW models of Mikolov et al. [Mikolov et al., 2013b, Mikolov et al., 2013a] have become very popular because of their ability to process large amounts of unstructured text data at reduced computing cost. The efficiency and the semantic properties of these representations motivate us to explore them for our speech recognition system.
Robust automatic speech recognition (ASR) remains a very ambitious goal. Despite constant efforts and some dramatic advances, the ability of a machine to recognize speech is still far from equaling that of human beings. Current ASR systems see their performance decrease significantly when the conditions under which they were trained differ from those in which they are used. The causes of variability may be related to the acoustic environment, the sound capture equipment, a microphone change, etc.
Objectives
The speech recognition (ASR) stage will be supplemented by a semantic analysis to detect words of the processed sentence that could have been misrecognized, and to find words with similar pronunciation that better match the context. For example, the sentence "Silvio Berlusconi, prince de Milan" can be recognized by the speech recognition system as "Silvio Berlusconi, prince de mille ans". A good semantic representation of the sentence context could help find and correct this error.
The Master internship will be devoted to the innovative study of taking semantics into account through predictive representations that capture the semantic features of words and their context. Research will be conducted on combining semantic information with information from denoising to improve speech recognition. As deep neural networks (DNNs) can model complex functions and achieve outstanding performance, they will be used in all our modeling.
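The "prince de Milan" example can be sketched as a re-ranking step: among acoustically similar candidates, keep the one whose vector is closest to the average vector of the context words. The 3-dimensional "embeddings" below are hand-made placeholders; a real system would use vectors learned by Skip-gram/CBOW or similar.

```python
# Toy semantic re-ranking of homophone candidates by cosine similarity
# to the centroid of the context words. All vectors are invented.
import math

EMB = {
    "berlusconi": (0.9, 0.1, 0.4),
    "prince":     (0.3, 0.1, 0.9),
    "milan":      (0.7, 0.0, 0.5),
    "mille":      (0.0, 0.8, 0.1),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def centroid(words):
    vs = [EMB[w] for w in words]
    return tuple(sum(c) / len(vs) for c in zip(*vs))

def rerank(candidates, context):
    """Return the candidate closest in meaning to the context."""
    ctx = centroid(context)
    return max(candidates, key=lambda w: cosine(EMB[w], ctx))

# "prince de Milan" vs. the misrecognized "prince de mille (ans)"
print(rerank(["milan", "mille"], context=["berlusconi", "prince"]))  # milan
```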
References
[Deng, 2014] Deng, L. Deep learning: Methods and applications. Foundations and Trends in Signal Processing, 7(3-4), 197-387, 2014.
[Goodfellow et al., 2016] Goodfellow, I., Bengio, Y., & Courville, A. Deep Learning. MIT Press. http://www.deeplearningbook.org, 2016.
[Mikolov et al., 2013a] Mikolov, T., Chen, K., Corrado, G., and Dean, J. Efficient estimation of word representations in vector space. CoRR, vol. abs/1301.3781, 2013.
[Mikolov et al., 2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, 2013, pp. 3111-3119.
[Pennington et al., 2014] Pennington, J., Socher, R., and Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532-1543.
[Povey et al., 2011] Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., Stemmer, G., and Vesely, K. The Kaldi Speech Recognition Toolkit. Proc. ASRU, 2011.
[Sheikh, 2016] Sheikh, I. Exploitation du contexte sémantique pour améliorer la reconnaissance des noms propres dans les documents audio diachroniques. Thèse de doctorat en informatique, Université de Lorraine, 2016.
[Sheikh et al., 2016] Sheikh, I., Illina, I., Fohr, D., and Linares, G. Learning word importance with the neural bag-of-words model. In Proc. ACL Representation Learning for NLP (Repl4NLP) Workshop, Aug 2016. URL sujet détaillé : https://team.inria.fr/multispeech/masterr2internshipinnaturallanguageprocessingintroductionofsemanticinformationinaspeechrecognitionsystem/
Remarques : Supervisors: Irina Illina, MdC, Dominique Fohr, CR CNRS
Required skills: background in statistics, natural language processing, and computer programming (Perl, Python). Candidates should email a detailed CV with diplomas.





SM20785 Compilation schemes for high-level fault-tolerant distributed protocols


Description


Fault-tolerant distributed data structures are at the core of distributed systems. One of the fundamental problems in the distributed setting is consensus: multiple processes must agree on a value. Consensus is unsolvable in the presence of asynchrony (unbounded message delay) and faults (message loss, process crash). Therefore, algorithms make assumptions on the network in order to solve consensus, leading to extremely intricate, highly error-prone control structures. The development of blockchain technologies (which solve variants of consensus) recalls once more the importance of these systems and how hard it is to get them right.
We will investigate existing programming abstractions that simplify reasoning about consensus algorithms, and extend them if needed to capture solutions to other problems, e.g., state machine replication, k-set agreement, etc. We focus on partially synchronous programming abstractions, which uniformly model asynchrony and faults using an adversarial environment that drops messages, leading to simpler control structures and simpler proof arguments.
We will define a domain-specific language that has a high-level (partially synchronous) semantics amenable to automated verification, and an asynchronous semantics defining the runtime executions. Partial synchrony reduces and simplifies the system's behaviors, allowing the programmer to focus on the algorithmic task rather than on low-level network and timer code. We will define algorithms that derive asynchronous code from the partially synchronous one, while preserving its safety and liveness properties. The domain-specific approach has the advantage of simplifying the control structure, leading to fewer global states and better understandability of the algorithm, making programming less error-prone and also helping the application of automated verification techniques to the domain-specific code. URL sujet détaillé : http://www.di.ens.fr/~cezarad/compilation.pdf
Remarques :





SM20786 Abstract interpretation for distributed systems


Description


Fault-tolerant distributed data structures are at the core of distributed systems; however, there are very few (semi-)automated verification techniques that tackle this class of systems. Consensus, i.e., getting multiple processes to agree on a value, is one of the oldest and most important problems in the distributed setting. However, to date there are no automated verification techniques that can prove these properties beyond toy examples, that is, for algorithms used in practice. Consensus algorithms are still an active research area, for both systems and algorithm designers, and the recent blockchain technologies show, once more, their difficulty.
The internship aims to develop automated verification techniques, based on abstract interpretation, to prove safety and liveness properties of faulttolerant systems, in particular of consensus algorithms.
Abstract interpretation is an analysis technique that can be used for automated verification. For sequential programs, automated verification has made significant progress, either by using abstract interpretation or other techniques: satisfiability solvers, model checking, etc. For distributed systems, automatic verification is still in its early stages, despite the massive usage of cloud technologies and distributed databases. The internship's goal is to define abstract interpretation techniques that enable the automated verification of consensus algorithms. Consensus algorithms ensure coherence between independent processes running over a network whose topology varies (ring, tree, dynamic DAGs, etc.) and where different types of failures may happen (process crash, message loss or corruption). The internship requires basic knowledge of programming languages. Notions of abstract interpretation and/or distributed systems are a plus but not mandatory. The internship can be in French or English. URL sujet détaillé : https://www.di.ens.fr/~cezarad/AutomatedVerif.pdf
Remarques : Remuneration planned for normalien(ne) students (auditeurs/auditrices).





SM20787 Scheduling Replica Requests in KeyValue Stores


Description


Many distributed systems have been proposed to optimize information storage and retrieval, in particular key-value stores such as Apache Cassandra or MongoDB. Even on well-provisioned systems, avoiding latency, that is, minimizing the delay of all data requests, is still challenging because of the unbalanced load among servers. In particular, the 5% slowest requests may experience much larger latencies than the others, which is known as the "tail latency" problem. Data replication is commonly used to overcome this problem, which, in turn, calls for good replica selection algorithms. When data items have heterogeneous sizes, requests scheduled behind requests for large data (and thus with large service times) may also suffer an additional delay: this is the "head-of-line blocking" problem.
Héron is a replica selection algorithm recently proposed to overcome this problem. It recognizes requests for large values using Bloom filters, so as to avoid scheduling other requests behind them. Experiments show that it outperforms state-of-the-art algorithms and reduces both median and tail latencies.
However, we do not know whether such a replica selection algorithm could be further optimized. More precisely, we would like to know how far Héron is from an optimal selection algorithm. A natural way to answer this question is to abstract the problem and model the distributed system by selecting key characteristics (while simplifying others) in order to get a simpler but tractable scheduling problem. In fact, optimal or guaranteed algorithms have been designed for similar problems in the scheduling literature: for homogeneous request sizes with total replication, it is known that the FIFO policy gives optimal results both for average and maximum response time minimization. The data locality induced by the replication policy may be precisely modeled using "restricted availabilities".
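Head-of-line blocking is easy to reproduce in a toy model. The sketch below is a minimal single-server FIFO simulation (an assumption-laden toy, not a model of Cassandra or Héron): one large request inflates the latency of every request queued behind it, which is exactly what size-aware policies such as Héron try to avoid.

```python
# Single-server FIFO queue: response time = completion time - arrival time.

def fifo_response_times(arrivals, sizes):
    """`arrivals` is sorted; returns the response time of each request."""
    t, out = 0, []
    for a, s in zip(arrivals, sizes):
        t = max(t, a) + s  # wait for the server to free up, then serve
        out.append(t - a)
    return out

# Four requests arriving almost together; the second one is large.
arrivals = [0, 1, 2, 3]
sizes    = [1, 10, 1, 1]
rts = fifo_response_times(arrivals, sizes)
print(rts)       # [1, 10, 10, 10]
print(max(rts))  # tail latency dominated by the blocking request
```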
The objective of the internship is both to propose tight lower bounds on the achievable tail latency for the replica selection problem based on scheduling studies, and to identify which characteristics of the workload are crucial to design efficient replication strategies.
We have already identified two noteworthy specificities of the problem whose use in scheduling studies to design better bounds and algorithms would be pioneering:
- Thanks to the dynamic monitoring and scaling of the platform, the average load of each server is kept between 50% and 80%. This excludes extreme cases such as underloaded or overloaded scenarios.
- The popularity of data items can be learned during the execution. Popular items receive a much larger volume of requests, which may be scheduled specifically.
The internship consists of two parts, whose respective weights can be adapted to the student's skills and inclination:
- Theoretical studies to model the problem, explore related scheduling works, design scheduling and replica selection algorithms, and prove algorithm optimality or approximation ratios;
- Experimental studies, based on the statistical characterisation of workloads from actual storage systems and the simulation of replica selection algorithms, to understand the key features of the workload. URL sujet détaillé : http://perso.enslyon.fr/loris.marchal/docs/internshipreplicaen.pdf
Remarques : Coadvised by Sonia Ben Moktar (LIRIS Lyon) and Étienne Rivière (UC Louvain)
Opportunity for the intern to spend one month in UC Louvain at the end of the internship, provided the intern is interested (specific funding may be available for this).





SM20788 Inverse homogenization of implicit functions


Description


The objective of the internship is to perform inverse homogenization of periodic microstructures defined by an implicit function. Specifically, we aim to study implicit functions defined by skeletal structures. Skeletal structures are a powerful tool used in a broad number of scientific fields. Skeletons reduce the dimension of shapes, capture their geometric and topological properties, and help to understand how these structures encode local and global features. The above properties provide tight control over the topology and the geometry, which turns out to be crucial for enforcing fabrication constraints in additive manufacturing. URL sujet détaillé : https://members.loria.fr/Monique.Teillaud/positions/2018microtructuresinternship.pdf
Remarques : Coadvised by Monique Teillaud (monique.teillaud.fr)





SM20789 Computability in topological spaces


Description


Computability theory over natural numbers and finite objects can be extended to infinite objects, represented by infinite symbolic sequences written on the input tape of a Turing machine. The choice of representation is significant and raises the following problem: which computational tasks can be performed by a Turing machine, for a given representation of the input objects? Topology is very useful for understanding this problem. For instance, every computable function is continuous for a suitable topology induced by the representation. Therefore, the natural way to extend computability theory to arbitrary mathematical objects is to work in a topological space.
Computability is well understood on some classes of topological spaces, such as Polish spaces. However, it is less understood in other spaces, especially spaces without a countable basis.
The goal of the internship is to study some particular spaces, such as the space of real polynomials, and to understand computability on those spaces. For instance, which properties of points of the space can be recognized by a Turing machine, for several notions of recognizability? URL sujet détaillé :
Remarques :





SM20790 Complexity of subsets of the plane


Description


Computational complexity theory is the study of classes of problems over finite objects, such as the classes P or NP. It has been used to define and investigate the complexity of infinite objects such as real numbers or subsets of the Euclidean plane. Two important notions of complexity of a subset of the plane have been considered.
In definition 1, a set is polynomial-time computable if there is a program that draws the set in polynomial time; more precisely, given a pixel, the program decides whether the pixel intersects the set within a time bounded by a polynomial in the size of the pixel.
In definition 2, a set is polynomial-time computable if there is a program computing the distance of an arbitrary point to the set in polynomial time.
Definition 2 is known to be stronger than definition 1, and strictly so if P is different from NP.
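As a concrete, hypothetical illustration of the two definitions, here is what both programs look like for the closed unit disk, a set simple enough that each notion can be implemented exactly (names and the pixel encoding are illustrative):

```python
import math

R = 1.0  # toy example: S is the closed unit disk centred at the origin

def clamp0(a, b):
    """Coordinate of the point of the interval [a, b] nearest to 0."""
    return max(a, min(0.0, b))

def pixel_intersects(i, j, n):
    """Definition 1: decide whether the dyadic pixel
    [i/2^n, (i+1)/2^n] x [j/2^n, (j+1)/2^n] meets S.
    For the disk this is exact: the pixel meets S iff its point
    nearest to the origin lies inside the disk."""
    h = 2.0 ** -n
    cx = clamp0(i * h, (i + 1) * h)
    cy = clamp0(j * h, (j + 1) * h)
    return cx * cx + cy * cy <= R * R

def dist_to_set(x, y):
    """Definition 2: the Euclidean distance from (x, y) to S."""
    return max(0.0, math.hypot(x, y) - R)

pixel_intersects(0, 0, 1)  # True: the pixel [0, 0.5] x [0, 0.5] meets the disk
dist_to_set(2.0, 0.0)      # 1.0
```

For the disk both programs are easy; the gap between the two definitions only appears for more complicated sets, where deciding pixel intersection can be much cheaper than computing distances.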
The goal of the internship is to better understand the relationship between these two notions. One direction is to find classes of sets for which the two definitions are equivalent (a candidate is the class of convex sets). Another direction is to understand how far the two notions are from each other. For instance, for a set satisfying definition 1, what is the size of the set of points whose distance to the set is difficult to compute? URL sujet détaillé :
:
Remarques :





SM20791 Sequential complexity of optimization


Description


The problem of "best arm identification" is the following: a finite set of probability distributions is given, and an agent may sample them sequentially as she wishes; her goal is to identify, with as few samples as possible, the distribution with the highest expectation. Since [GK16], the complexity of this problem is known: there is an explicit lower bound on the number of samples required, obtained by information-theoretic arguments. Moreover, a strategy asymptotically reaching this lower bound is given. Some preliminary results are also known for the case where only an *approximately* optimal distribution is to be found. In all cases, the lower bound exhibits an optimal proportion of arm draws that ensures efficiency.
The purpose of this internship is to investigate the possibility of building on these results in order to identify the complexity of optimizing a *continuous* (say, Lipschitz) function observed with noise, that is, the function-dependent number of samples required by *any* strategy (stochastic gradient descent, or any other) to find a point close to the optimum with high probability. Additionally, this should make it possible to find the optimal density of draws over the domain that leads to optimality.
We will proceed by discretization of the continuous problem, leading to a finite set of distributions whose expectations are linked by the Lipschitz assumption. Hence, a first step in this work will be to identify the complexity of best-arm identification when the difference of means of two successive distributions is bounded.
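As a point of comparison with the problem-dependent draw proportions emphasized by the lower bound, here is a minimal sketch of the naive uniform-sampling baseline (all names and parameters are illustrative, not taken from [GK16]):

```python
import random

def best_arm_uniform(means, sigma=0.1, samples=5000, seed=0):
    """Fixed-budget uniform baseline: sample every arm the same number of
    times and return the empirical best.  The [GK16] lower bound shows an
    optimal strategy should instead draw arms in problem-dependent
    proportions, pulling near-optimal arms more often."""
    rng = random.Random(seed)
    est = []
    for m in means:
        # Gaussian arms stand in for the unknown distributions
        est.append(sum(rng.gauss(m, sigma) for _ in range(samples)) / samples)
    return max(range(len(means)), key=lambda a: est[a])

best_arm_uniform([0.2, 0.5, 0.45, 0.1])  # arm 1 has the highest mean
```

Uniform sampling wastes its budget on clearly suboptimal arms; the gap between its sample complexity and the [GK16] lower bound is precisely what adaptive, optimal-proportion strategies close.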
Ref: [GK16] Optimal Best Arm Identification with Fixed Confidence, by Aurélien Garivier and Emilie Kaufmann, in Conference On Learning Theory (COLT) n°29, Jun. 2016, pp. 998-1027 URL sujet détaillé :
:
Remarques : for co-advising, please contact me





SM20792 Coloration distribuée dans les graphes structurés / Distributed coloring in structured graph classes


Description


This internship will be devoted to distributed algorithms for graph coloring. Graph coloring is a fundamental symmetry-breaking problem in distributed algorithms, and improving the complexity of known algorithms is a major challenge in the area. Here we will focus on graph classes with a specific structure (which may arise in practical applications), such as unit disk graphs or graphs coming from partial orders. The study of structured graphs in the context of distributed algorithms is fairly recent and there are many interesting open questions. URL sujet détaillé : http://oc.inpg.fr/sujetM2.pdf
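For intuition, here is a minimal sequential simulation of a synchronous greedy coloring, a hypothetical baseline rather than one of the algorithms targeted by the internship: in each round, every uncoloured node that is a local ID-maximum among uncoloured neighbours commits to a colour.

```python
def distributed_greedy_coloring(adj):
    """Toy synchronous simulation: per round, each uncoloured node whose
    ID beats all its uncoloured neighbours picks the smallest colour
    unused in its neighbourhood.  Terminates in at most n rounds with at
    most Delta+1 colours; real distributed algorithms aim for far fewer
    rounds, especially on structured graph classes."""
    colour = {v: None for v in adj}
    while any(c is None for c in colour.values()):
        deciders = [v for v in adj if colour[v] is None and
                    all(colour[u] is not None or u < v for u in adj[v])]
        for v in deciders:
            used = {colour[u] for u in adj[v] if colour[u] is not None}
            colour[v] = min(c for c in range(len(adj[v]) + 1) if c not in used)
    return colour

# a 5-cycle: every node must get a colour different from its neighbours
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
colours = distributed_greedy_coloring(cycle5)
```

On structured classes such as unit disk graphs, the challenge is precisely to beat this kind of generic round complexity.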
Remarques : remuneration possible





SM20793 Vérification formelle performante pour la logique temporelle du premier ordre / Efficient formal verification for first-order temporal logic


Admin


Encadrant : Julien BRUNEL 
Labo/Organisme : ONERA (DTIS)
URL : https://www.onera.fr/staff/julienbrunel 
Ville : Toulouse 



Description


Alloy (D. Jackson et al., MIT) is a formal specification language that has met with some success thanks to the judicious trade-off it offers between ease of modelling for the user and efficiency of formal analyses. In practice, specification is done in bounded relational logic (a first-order logic with relational terms whose interpretation domains are bounded by the user), and automated verification of properties is performed by translation into a propositional satisfiability problem (taking advantage of the important recent progress of SAT solvers). Alloy is very well suited to analysing models where the verified properties are essentially structural. However, many problems also have a behavioural dimension, which can be addressed in Alloy but with a number of shortcomings.
DTIS, in collaboration with HasLab at the University of Minho (Portugal), recently designed a formal language named Electrum [FSE 2016], which can be seen as a behavioural extension of Alloy. In practice, the language builds on bounded relational logic extended with linear temporal logic with past operators (PLTL). Two verification techniques have been developed: in both cases, the domain of first-order terms is bounded; however, one tool relies on an analysis with a bounded temporal horizon (bounded model checking), whereas the second, designed at ONERA, works over an unbounded horizon.
The objective of this internship is to improve the current analyses along some of the following lines: - One interesting direction is to exploit the symmetries present in Electrum models, which allow a considerable reduction of the state space. At present, we only account for certain symmetries on data, but not for temporal symmetries. More generally, the use of automatic abstraction techniques looks promising. - The current analyses do not take enough advantage of the distinction between the description of the "system" on the one hand, and the specification of the properties the system must satisfy on the other. A recent extension of Electrum [ABZ 2018] with actions precisely allows a clearer distinction between these two elements within the language. New analyses can therefore be devised, in particular inspired by recent algorithms based on an SMT (Satisfiability Modulo Theories) representation of the system and its properties. - In [ATVA 2016] we started a study of the finite model property (FMP) for the logic underlying Electrum (FOLTL). Identifying fragments of the logic that enjoy the FMP opens the way to complete verification (without any bound on first-order terms).
References: [ABZ 2018] Julien Brunel, D. Chemouil, A. Cunha, Th. Hujsa, N. Macedo, J. Tawa. Proposition of an Action Layer for Electrum. In Proc. of 6th Int. Conf. on ASM, Alloy, B, TLA, VDM, Z (ABZ), 2018. [FSE 2016] N. Macedo, J. Brunel, D. Chemouil, A. Cunha, and D. Kuperberg. Lightweight Specification and Analysis of Dynamic Systems with Rich Configurations. In Proc. ACM SIGSOFT Intl Symp. on the Foundations of Software Engineering (FSE), 2016. [ATVA 2016] D. Kuperberg, J. Brunel, and D. Chemouil. On Finite Domains in First-Order Linear Temporal Logic. In 14th Intl Symp. on Automated Technology for Verification and Analysis (ATVA), 2016. URL sujet détaillé : http://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis201944.pdf
Remarques :





SM20794 Vérification formelle de systèmes distribués sur domaine infini / Formal verification of distributed systems over infinite domains


Description


The formalization of distributed systems is an interesting object of study in itself, but also because of the challenge such systems pose to "natural" formal specification and to highly automated formal verification. Ivy [PLDI 2016] makes it possible to prove properties of systems of infinite size with a high degree of automation (compared to approaches based on proof assistants). The description of the system and the specification of the properties to be verified must be expressed in a rather restricted fragment of first-order logic, over which the satisfiability problem is decidable. Ivy has notably been applied to the analysis of distributed systems (e.g., protocols of the Paxos family). Recent work [POPL 2018] proposes to analyse richer temporal properties (liveness) using Ivy. Electrum [FSE 2016], a language developed at ONERA since 2015, has taken a different approach: the specification of the system and its properties benefits from the full expressiveness of first-order linear temporal logic (FOLTL). Verification is fully automatic, but requires bounding the size of the first-order domain. Electrum has also been applied in [FMCAD 2018] to the analysis of distributed systems, through the study of the Chord protocol, which maintains a distributed hash table in a peer-to-peer network. The present internship takes place in this context. Ultimately, the goal is to determine a formalism or techniques allowing distributed systems to be specified "comfortably" and their properties (including liveness) to be verified by means of highly automated techniques.
To this end, we propose the following work programme: - Study the Chord protocol as well as the Electrum model of Chord - Propose a model of Chord (or of a subset of Chord) in Ivy (including liveness) - Study the theoretical foundations underlying Ivy - Derive possible improvements for Ivy and Electrum.
This internship may be continued as a PhD thesis. URL sujet détaillé : http://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis201950.pdf
Remarques :





SM20795 Typechecking rewrite rules in Lambdapi


Description


Lambdapi is a new proof assistant based on a logical framework called the λΠ-calculus modulo rewriting. Thanks to rewriting, Lambdapi allows the formalization of proofs that cannot be carried out in other proof assistants. However, before adding a new rewrite rule l → r to the system, it is necessary to check that it preserves typing. The goal of this internship is to study this problem and implement an algorithm in Lambdapi. URL sujet détaillé : http://rewriting.gforge.inria.fr/divers/sr.pdf
Remarques :





SM20796 Formal Proofs of Security Protocols using Oblivious Transfers


Description


Our group is working on formal security proofs of cryptographic protocols. Such protocols are distributed programs that rely on cryptographic primitives such as encryption, hash function, signatures, etc. They are used in many applications: secure transactions on Internet, mobile phones, smartcards, RFID chips, electronic voting, etc.
Caroline Fontaine, who joined our group recently, has proposed a protocol for privacy-preserving fingerprinted image distribution that aims at achieving several goals, including traitor tracing and buyer- and item-unlinkability [6]. The formal proof of such a protocol is challenging for several reasons, some of which are the starting point of the proposed research project(s). URL sujet détaillé : http://www.lsv.fr/~baelde/OTsecurityinternship.pdf
Remarques : This internship will be jointly supervised by David Baelde, Hubert Comon and Caroline Fontaine. If relevant, we have funding to support this project.





SM20797 Automated verification of cryptographic protocols


Description


Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and, more recently, electronic elections. As these properties need to be ensured even when the protocol is executed over untrusted networks (such as the Internet), such protocols have proven extremely difficult to get right. Formal methods have proven very useful for detecting errors and ensuring correctness.
Equational theories are used to describe algebraic properties of cryptographic primitives. The Tamarin prover [1] is a state-of-the-art protocol verification tool, co-developed between ETH Zürich, Saarbrücken and Inria Nancy. It has for instance been successfully used to verify standards such as TLS 1.3 and 5G AKA, as well as industrial protocols such as OPC UA. Tamarin has recently been extended to allow the specification of more complex cryptographic primitives, by means of more general equational theories (any user-specified equational theory that can be oriented into a convergent rewrite system). However, for some examples, non-termination issues arise in practice, in particular for equational theories used in e-voting protocols.
The objective of this internship is to improve Tamarin's handling of such equational theories by identifying optimizations that avoid the redundant and unnecessary cases causing the non-termination issues. A particular example we aim at is a theory describing re-encryption, which allows one to re-randomize a ciphertext. This requires a theoretical analysis and correctness proof of the proposed optimizations, followed by implementation in the tool and practical evaluation.
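To fix ideas, here is a toy illustration of a convergent rewrite system of the kind Tamarin orients, reduced to the single symmetric-encryption rule dec(enc(m, k), k) → m; the term encoding and function names are illustrative only, not Tamarin's:

```python
def rewrite_step(t):
    """One parallel rewrite step for a toy convergent theory of symmetric
    encryption with the single rule  dec(enc(m, k), k) -> m.
    Terms are nested tuples ('f', arg1, ...) or atoms (strings)."""
    if isinstance(t, tuple):
        t = (t[0],) + tuple(rewrite_step(a) for a in t[1:])
        if (t[0] == 'dec' and isinstance(t[1], tuple)
                and t[1][0] == 'enc' and t[1][2] == t[2]):
            return t[1][1]          # dec(enc(m, k), k) -> m
    return t

def normalise(t):
    """Rewrite to the unique normal form, guaranteed to exist because the
    system is convergent.  (In Tamarin it is the constraint-solving proof
    search over such theories, not normalisation itself, that may fail to
    terminate -- e.g., for re-encryption theories.)"""
    while True:
        s = rewrite_step(t)
        if s == t:
            return t
        t = s

normalise(('dec', ('enc', 'm', 'k'), 'k'))   # reduces to the plaintext 'm'
normalise(('dec', ('enc', 'm', 'k'), 'k2'))  # wrong key: no rule applies
```

Convergence makes every term have one normal form; the difficulty targeted by the internship is the explosion of cases the prover explores when reasoning modulo such theories.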
The internship may also lead to a PhD thesis on similar topics.
Expected qualifications are:
- solid knowledge of logic, proofs and/or formal verification techniques,
- solid programming experience, ideally with functional programming in OCaml or Haskell.
Knowledge in security is not required, but a plus.
[1] http://tamarinprover.github.io/ URL sujet détaillé :
:
Remarques : Co-supervised with Jannik Dreier (jannik.dreier.fr)
A stipend may be possible, but it is often incompatible with the salary of ENS students. Some expenses can nevertheless be covered.





SM20798 Design of an algorithm to compute the fluid limit of a discrete dynamical system with heterogeneous objects.


Description


Discrete dynamical systems of large dimension are often well approximated by their fluid limit, or even more accurately by their second-order size expansion. When the system is made of heterogeneous objects, the existing algorithms computing this expansion are too costly, and the goal of this internship is to design new, effective algorithms to do so. URL sujet détaillé :
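As a minimal illustration of a fluid limit (a hypothetical homogeneous example, far simpler than the heterogeneous setting targeted by the internship), one can integrate the drift ODE of a simple SIS-type population model:

```python
def fluid_limit(x0=0.01, lam=2.0, mu=1.0, dt=1e-3, steps=20000):
    """Euler integration of x' = lam*x*(1-x) - mu*x, the fluid limit of a
    simple SIS-type Markov chain with N identical objects (x is the
    fraction of 'active' objects as N grows large).  With heterogeneous
    objects, one such equation is needed per class of objects, which is
    where the cost of existing algorithms blows up."""
    x = x0
    for _ in range(steps):
        x += dt * (lam * x * (1.0 - x) - mu * x)
    return x

x_inf = fluid_limit()  # approaches the fixed point 1 - mu/lam = 0.5
```

In the homogeneous case a single one-dimensional ODE suffices; heterogeneity multiplies the number of equations (and, for the second-order expansion, of covariance terms), motivating the algorithmic work of the internship.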
:
Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.





SM20799 Average complexity of reinforcement learning.


Description


Reinforcement learning algorithms are useful for optimizing dynamical systems while they are running. Even in simple cases, their average complexity is still unknown. The goal of the internship is to study this question on simple cases (Q-learning, for example). URL sujet détaillé :
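For concreteness, here is a minimal sketch of tabular Q-learning on a hypothetical two-state MDP (all names and parameters are illustrative), the kind of "simple case" whose average complexity the internship would study:

```python
import random

def q_learning(steps=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a toy two-state MDP: in each state, action 1
    yields reward 1 and action 0 yields reward 0; either action moves to
    the other state.  The optimal policy plays action 1 everywhere."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]
    s = 0
    for _ in range(steps):
        if rng.random() < eps:                    # epsilon-greedy exploration
            a = rng.randrange(2)
        else:
            a = max((0, 1), key=lambda x: Q[s][x])
        r = float(a)                              # reward 1 iff action 1
        s2 = 1 - s                                # deterministic transition
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q

Q = q_learning()  # greedy policy: action 1 in both states
```

Even on a system this small, how many updates are needed *on average* before the greedy policy is optimal is exactly the kind of question the internship addresses.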
:
Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.





SM207100 Potential games with partial interactions


Description


Consider a matrix potential game in which the utility function of a player (say, player i) does not depend on the actions of another player (say, player j). In this case, the two players are said to be indifferent to each other. The indifference relation induces a graph among players. The general question that will be addressed during the internship is how to exploit this graph to design efficient algorithms for computing Nash equilibria of the game. URL sujet détaillé :
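As a toy illustration of how the indifference graph can be exploited, consider a graph coordination game, which is a potential game where each player is indifferent to every non-neighbour (the encoding and names are illustrative):

```python
def best_response_dynamics(adj, actions=2):
    """Asynchronous best-response for a graph coordination game: player
    v's utility is the number of neighbours playing the same action, so v
    is indifferent to every non-neighbour.  The game is a potential game
    (potential = number of agreeing edges), so each strict improvement
    raises the potential and the dynamics reaches a pure Nash equilibrium;
    indifference means each player only ever inspects its neighbourhood."""
    play = {v: v % actions for v in adj}
    changed = True
    while changed:
        changed = False
        for v in adj:
            counts = [sum(play[u] == a for u in adj[v]) for a in range(actions)]
            best = max(range(actions), key=lambda a: counts[a])
            if counts[best] > counts[play[v]]:
                play[v] = best
                changed = True
    return play

cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
profile = best_response_dynamics(cycle4)  # a pure Nash equilibrium
```

The locality of best responses is what the indifference graph buys: sparser graphs mean cheaper updates, and the internship asks how far this structure can be pushed algorithmically.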
:
Remarques : For more details (scientific and logistics), please contact Bruno Gaujal.





SM207101 Automated bisimulation checking in DEEPSEC


Description


With the ubiquity of smartphones and other portable, connected devices, the Internet has become the main communication medium. In many cases, e.g., e-commerce, home banking, electronic voting, etc., communications need to be secured. This can be achieved using cryptographic protocols. Cryptographic protocols, such as TLS, are distributed programs which use cryptographic primitives, such as encryption or digital signatures. However, even if the underlying primitives are perfectly secure, protocols may be attacked.
Symbolic, automated verification has been a very successful approach for revealing vulnerabilities or proving the absence of entire families of attacks. DEEPSEC (https://deepsecprover.github.io/) is a recent verification tool for automatically analysing security protocols, able to check security properties (expressed as behavioural equivalences) of cryptographic protocols (specified in the applied pi calculus, the tool's input language). This makes it possible to model a large variety of security properties, including strong definitions of secrecy, anonymity and other privacy-related properties.
The tool currently implements a procedure for checking trace equivalence. The paper describing the theory of DEEPSEC [1] also outlines a "theoretical procedure" for the verification of a stronger bisimulation equivalence, used to establish a complexity result (coNEXP-completeness). However, this theoretical procedure relies on a non-deterministic guess over a large search space, which prevents an efficient implementation. The aim of the internship is to design an effective, i.e. deterministic, procedure. This includes both theoretical aspects (algorithm design, correctness proof) and an implementation part, integrating the procedure into the DEEPSEC tool.
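For intuition about bisimulation checking in a much simpler setting than DEEPSEC's symbolic one, here is a naive partition-refinement sketch for a finite labelled transition system (the encoding is illustrative; DEEPSEC additionally reasons about terms and attacker knowledge):

```python
def bisimulation_classes(states, trans):
    """Naive partition refinement computing strong-bisimilarity classes of
    a finite labelled transition system.  trans maps (state, label) to a
    list of successor states."""
    labels = sorted({l for (_, l) in trans})
    part = {s: 0 for s in states}
    while True:
        # refine: two states stay together iff they are together now and
        # reach the same sets of classes under every label
        sig = {s: (part[s],) + tuple(
                   frozenset(part[t] for t in trans.get((s, l), ()))
                   for l in labels)
               for s in states}
        ids = {}
        new = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        if len(set(new.values())) == len(set(part.values())):
            return part
        part = new

states = ['a', 'b', 'c', 'd', 'e']
trans = {('a', 'x'): ['b'], ('c', 'x'): ['d'], ('e', 'x'): ['e']}
classes = bisimulation_classes(states, trans)
# a ~ c and b ~ d, but e (which can do 'x' forever) is not bisimilar to a
```

The refinement loop is fully deterministic, which is the flavour of procedure the internship seeks; the hard part in DEEPSEC is achieving this without the huge non-deterministic guess of the theoretical procedure.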
The internship may also lead to a PhD thesis on similar topics.
Expected qualifications are:
- solid knowledge of logic, proofs and/or formal verification techniques,
- solid programming experience, ideally with functional programming in OCaml.
Knowledge in security is not required, but a plus.
[1] Vincent Cheval, Steve Kremer, Itsaka Rakotonirina: DEEPSEC: Deciding Equivalence Properties in Security Protocols Theory and Practice. IEEE Symposium on Security and Privacy 2018: 529-546 URL sujet détaillé :
:
Remarques : Co-supervised with Vincent Cheval (vincent.cheval.fr)
A stipend may be possible, but it is often incompatible with the salary of ENS students. Some expenses can nevertheless be covered.





SM207102 Surveillance à l'exécution de propriétés temporelles / Runtime verification of temporal properties


Description


To claim that a given critical system is dependable, it is necessary to certify its correct behaviour, in particular its timing behaviour. As a complement to static verification performed before execution, in this study we consider runtime verification, which consists in monitoring the system while it runs and checking that it conforms to a given specification.
In previous work, we developed a formally proven system that "compiles" a specification described by a timed automaton into a runtime monitor. This solution has the drawback of not scaling to large specifications.
To overcome this limitation and make runtime verification applicable to more complex systems, this study will evaluate the suitability of machine learning for runtime verification.
In a first phase, a deep-learning system will be trained on timed-automaton specifications and on execution traces labelled as correct or incorrect.
In a second phase, this system will be evaluated on other instances of the problem, in order to assess its precision (false positives and false negatives) and its runtime performance.
A comparative study with the already implemented solution will provide a complete assessment of the interest of machine learning for runtime verification of properties.
URL sujet détaillé : /Users/mroy/ownCloud/Stages/TrustMeIA/pagettiroy2018.tex
:
Remarques : Co-supervised with Claire Pagetti (ONERA)





SM207103 Building Robust Learning Systems


Description


Research context. The aim of this internship proposal is to study advanced information-theoretic tools to detect the inadequacy of data during the main phases of training and classification in deep neural networks. The fundamental outcome of this research aims at providing practical learning algorithms with theoretical guarantees ensuring the origin and integrity of data. More specifically, the goal of this research is to provide learning algorithms that jointly learn not only deep representations (as mappings from the data space to representations) and classifiers, but also a "soft discriminator" that provides an indicator with statistical guarantees (e.g., confidence intervals) of the functioning of an Artificial Intelligence (AI) system under the potential influence of new data (e.g., the data itself evolves over time and so does the nature of the classification problem) during both the training and classification phases. Robust learning systems are expected to be able to decide whether new incoming data can be correctly classified (e.g., according to statistical guarantees) or used to update (retrain) the state of the current system.
Scientific approach. An interesting starting point will be to approach the algorithmic design using an information-theoretic paradigm, based on the fundamental idea of Learning as Data Compression. This idea relies on the following insight: any regularity in the data (e.g., features and labels) can be used to compress the labels given the features, i.e., to describe them using fewer symbols than needed to describe the labels literally. The more regularities there are, e.g., when it is easy to predict labels from features, the more the labels can be compressed given the features. Equating "learning" with "finding regularity", we can therefore say that the more we are able to compress the labels knowing the features, the more we have learned about the data. Formalizing this idea leads to a general theory of coding with side information and learning, with several attractive properties. URL sujet détaillé : https://www.dropbox.com/s/l9yoie88dolk0mw/Topic%20%20IBM%20%202018%20%20itership.pdf?dl=0
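The "learning as data compression" idea can be illustrated with an off-the-shelf compressor standing in for a learner; this is a crude, hypothetical proxy, not the information-theoretic tools the internship targets:

```python
import random
import zlib

def cond_bits(features, labels):
    """Crude proxy for 'bits needed to describe the labels once the
    features are known': the extra compressed size when the labels are
    appended to the features."""
    f = bytes(features)
    return 8 * (len(zlib.compress(f + bytes(labels), 9)) - len(zlib.compress(f, 9)))

rng = random.Random(0)
feats = [rng.randrange(256) for _ in range(4000)]
copy_labels = list(feats)                            # fully determined by the features
noise_labels = [rng.randrange(256) for _ in feats]   # independent of the features

# predictable labels cost almost nothing given the features, while the
# independent ones cost roughly their full length
```

A generic compressor only captures the regularities it can express (here, literal repetition); the learned codes the internship studies would exploit far richer feature-label dependencies.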
Remarques : An international and R&D research environment. The intern will be hosted by the Laboratoire des Signaux et Systèmes (L2S, UMR8506) at CentraleSupélec (Université Paris-Saclay), and the internship is motivated by a collaboration with IBM France Lab. Moreover, the L2S leads the CNRS International Associate Laboratory (LIA) on "Information, Control and Learning", of which the Montreal Institute for Learning Algorithms (MILA) at Université de Montréal is a member, carrying out top research in deep learning. Within the framework of this LIA, the candidate may therefore have the opportunity to visit MILA to acquire complementary and transferable expertise, implementing advanced algorithms for training deep neural networks (e.g., by working in an international environment with top researchers in the field).
Background of the candidate. We are looking for serious and motivated candidates with an interest in, and preferably prior training for, research work in the area of machine learning and its applications. Advanced notions in machine learning, computer vision, optimization, information theory or other related research areas, and good familiarity with scientific research, are very welcome.
To apply, candidates must transmit a digital application package including:
- Detailed curriculum vitae
- University transcripts
- Contact information of two references
to:
- Prof. Florence Alberge, Laboratoire des Signaux et Systèmes (CentraleSupélec-CNRS-UPS)
Email: florence.alberge.centralesupelec.fr
- Prof. Pablo Piantanida, Laboratoire des Signaux et Systèmes (CentraleSupélec-CNRS-UPS) and
MILA (Université de Montréal)
Email: pablo.piantanida.com





SM207104 Exploitation de données issues du Smart Grids avec une garantie de confidentialité / Exploiting Smart Grid data with confidentiality guarantees


Description


======================================== 1. Societal, economic and industrial context ======================================== Energy management is a worldwide problem with major applications and implications: high energy demand and environmental concerns. Smart grids proactively address the challenges of electricity demand and supply, improve energy management and secure the communications of the power grid using information and communication technologies. The energy impact and the deployment of smart technologies for automation, information exchange, analysis and dynamic optimization of grid operations and resources are all major challenges for industry in terms of decision-intelligence systems. Like many sectors, the electricity sector is living its data revolution. The transformation of energy networks and the emergence of new services, new actors (consumers) and new models such as self-consumption are changing the operating needs of the grids and the management of increasingly massive data, which would be unusable without AI. Every activity is affected: generation, transmission, distribution and supply; the grids no longer carry only electrons but also data: this is the smart grid. In France, an essential piece of the smart grid is the Linky communicating meter, a project led by Enedis, an autonomous subsidiary of the EDF group. This meter makes it possible to collect measurements down to the low-voltage level of the public distribution network. Information also flows from the consumer to the operator, giving the latter an effective means of monitoring to quickly detect anomalies on the grid. The rise of renewable energy, the development of electric mobility, changing consumption patterns...
the Linky meter supports these changes while guaranteeing the safety of the electrical system, and thus the continuity of supply.
======================================== 2. Smart grids: useful but particularly sensitive data
======================================== Communicating meters ("smart meters") are an integral part of the smart grid and therefore provide valuable, but also sensitive, data, in particular the load curve, which measures the variations of electricity consumption within a single day (typically with a 30-minute time step). For this reason, the French data protection authority (CNIL) has taken care to regulate the conditions of its collection and use. In practice, the load curve is today only exceptionally collected, and its dissemination is strictly regulated and limited by the energy regulator (Commission de Régulation de l'Energie, CRE). The privacy stakes of smart grids, whether real or sometimes fantasized, currently hold back the potential of these data. The objective of our work is to address this challenge, exploring in more detail the space between utility and confidentiality. A good trade-off between utility and confidentiality:
In the context described above, it seems essential to think today about how to share and exploit smart grid data with strong confidentiality guarantees. For sharing, one of the envisaged directions is to study techniques based on statistical processing to address two conflicting objectives: 1. Accurately simulate individual smart grid measurements, typically load curves. 2. Guarantee a level of confidentiality, so that the simulated data do not allow (or hardly allow) re-identification, typically so that a simulated load curve does not "stick" too closely to a real one.
======================================== 3. Planned actions for individual electricity consumption: data simulation, modelling and prediction with confidentiality guarantees ======================================== The objectives of this internship are very broad and correspond to a great potential of learning methods for better managing the confidentiality of power-grid data. Our action plan is structured into three actions detailed below: - Take as input individual load curves described in high dimension (for example one week with a half-hourly step, or one year with a daily step), - Use these individual load curves to build a generator of realistic curves (preserving the main statistical properties of load curves) using a convolutional autoencoder (variational autoencoder, generative adversarial networks, or a generative autoencoder model), - Work on a more or less strong guarantee that this generator does not disclose all or part of the individual curves used to build it. URL sujet détaillé : https://www.dropbox.com/s/nmuc0osgkyo7k27/Stage%20%20EDF%20%20CS.pdf?dl=0
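As a hypothetical baseline for the confidentiality goal above (the internship targets generative models, not this mechanism), perturbing each reading with Laplace noise, the classical Laplace mechanism of differential privacy, is the simplest way to trade utility for privacy; all names and parameters here are illustrative:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample a centred Laplace variate via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_load_curve(curve, epsilon=1.0, sensitivity=1.0, seed=0):
    """Publish a load curve with i.i.d. Laplace noise of scale
    sensitivity/epsilon added to each reading: a smaller epsilon gives
    stronger confidentiality but a less useful (noisier) curve."""
    rng = random.Random(seed)
    b = sensitivity / epsilon
    return [x + laplace_noise(b, rng) for x in curve]

noisy = private_load_curve([1.0] * 48)  # 48 half-hourly readings of one day
```

Such per-reading noise quickly destroys the fine temporal structure of load curves, which is why the internship instead explores generative models with privacy guarantees.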
Remarques : ======================================== An international network and work experience with EDF R&D
========================================
Working in close collaboration between the Laboratoire des Signaux et Systèmes (L2S) at CentraleSupélec (Paris-Saclay) and the relevant EDF R&D teams at Paris-Saclay is a real bonus, guaranteeing access to the relevant data and constraints of the industrial problem, as well as the development of suitable tools. The work of this project could also take place within the CNRS International Associate Laboratory (LIA) "Information Learning and Control - The Ingredients of Intelligent Grids", with several institutions in Quebec (McGill University, Université de Montréal, ...), some of which are strongly related to smart grids (Hydro-Québec Industrial Research Chair at McGill University) and to AI (the MILA machine-learning institute is the largest laboratory in the world in this area). A stay in Montreal could be planned to facilitate interaction between the researchers and the intern, as well as the sharing of relevant expertise.
Academic members:
- Pablo PIANTANIDA (Associate Professor), Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris-Saclay
- Pierre DUHAMEL (DR CNRS), Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris-Saclay
Industrial members:
- Benoît Grossin (EDF R&D), Research Engineer, ICAME Dept.
- Amandine Pierrot (EDF R&D), Data Scientist
Contact: Dr. Pierre Duhamel (pierre.duhamel.centralesupelec.fr)
Prof. Pablo Piantanida (pablo.piantanida.fr)





SM207105 Analyse fonctionnelle du Secure Monitor de OP-TEE / Functional analysis of the OP-TEE Secure Monitor


Description


Description ==========Today, users have a growing demand for security. In addition, systems are increasingly complex, open and connected. A major stake is to be able to guarantee the security of systems and (sensitive) data of embedded systems and other connected objects (IOT ). In this context, trusted execution environments (TEE ) were designed to take advantage of secure architectures such as ARM's TrustZone. These se cure platforms allow application offering different levels of security to coexist on the same architecture. For example on a smartphone, the TEE can be used to secure services such as electronic wallets or biometric authentication. OPTEE ([6]) is an open source project that offers a full implementation of a trusted execution environment. The TEE must therefore be able to guarantee the authenticity of the code that runs in secure mode, the integrity of the execution state (the state of the registers, the memory . . . ) and the confidentiality of the code and the data (see [8]). OPTEE distinguishes two worlds, a nonsecure one, used for common applications and a secure one, used by services that use sensitive data. The intern will study the possible data leak from the secure world to the nonsecure world. One of the key elements of a TEE is the Secure Monitor. It's the pro gram that is called at every switching between the nonsecure and the secure world. Calling a secure service will therefore induce a specific system calls. Following the internship of Maxime Ayrault, we have a model of OPTEE secure monitor [7]. Our study aims at enriching this model (for example with the interruption mechanism) and then perform its functional verification. Formal methods offer the possibility to automatically (or partially) check properties on the analyzed code (for medium size systems). These methods are generally used to verify the behavior of parts of applications. More recently, approaches have emerged to focus on security properties of sys tems [3]. 
As part of this internship, we are interested, on the one hand, in the functional verification of the OP-TEE model with the help of the TEE specification [4, 5]. This specification is developed by GlobalPlatform, a consortium of industrial and academic partners working on the standardization of software interfaces. On the other hand, we would like to check security properties, to guarantee information confidentiality: we want to show that a piece of information cannot be disclosed to unauthorized third parties. The model checker CPAchecker ([1]) is a verification tool that implements a large number of model-checking algorithms [2]. One goal of the internship will also be to evaluate different verification strategies and find the one(s) that best suit our model. Goals ==== The objective of the internship is to model and verify part of a TEE. As a first step, a model of the /Secure Monitor/ and its operational environment will need to be enriched. Then, starting from the specification of the TEE, functional properties will be verified on the model. Model-checking algorithms will then be compared. Finally, the specification can be completed by adding security properties and strategies to verify them.
Work plan ========
- Bibliographic study on OP-TEE and the TEE specification
- Analysis of the code of the /secure monitor/ model of Maxime Ayrault
- Proposal of additional behaviors for the model
- Proposal of functional properties
- Study of the model-checking algorithms implemented in CPAchecker
- Model-checking performance analysis
- Perspectives on the modeling of security properties
URL sujet détaillé : https://wwwsoc.lip6.fr/index.php?id=432
Remarques : Co-supervised with Emmanuelle Encrenaz





SM207106 CNN inference approximation


Description


Deep Learning, and in particular Convolutional Neural Networks (CNNs), is currently one of the most intensively and widely used families of predictive models in the field of machine learning. CNNs have been shown to give very good results on many complex tasks such as object recognition in images/videos, drug discovery, natural language processing, autonomous driving, and even playing complex games.
Despite these benefits, the computational workload involved in CNNs is often out of reach for low-power embedded devices, and/or is still very costly when running on datacenter hardware platforms. Thus, a lot of research effort from both industry and academia has been concentrated on defining/designing custom hardware platforms supporting this type of algorithm, with the goal of improving performance and/or energy efficiency [Chao17, Hsin17, Zhi17].
In recent years, Approximate Computing (AxC) has become a major field of research to improve both speed and energy consumption in embedded and high-performance systems [Mit16]. By relaxing the need for fully precise or completely deterministic operations, approximate computing substantially improves energy efficiency. Various approximate computing techniques augment the design space by providing another set of design knobs for performance-accuracy trade-offs. For instance, the gain in energy between a low-precision 8-bit operation suitable for vision and a 64-bit double-precision floating-point operation necessary for high-precision scientific computation can reach up to 50x when considering storage, transport and computation. The gain in energy efficiency (the number of computations per Joule) is even higher. The challenge is then to find adequate (no more and no less) number representations and word lengths for data/computations that are compatible with application constraints.
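To make the word-length trade-off concrete, here is a minimal sketch (our own illustration, not part of the subject) of a signed fixed-point quantizer with saturation, the basic building block behind such precision exploration; the rounding error inside the representable range is bounded by half a quantization step, i.e. 2^-(frac_bits+1):

```python
def quantize_fixed(x, int_bits, frac_bits):
    """Round x to a signed fixed-point grid with `int_bits` integer bits
    and `frac_bits` fractional bits, saturating on overflow."""
    scale = 2 ** frac_bits
    lo = -(2 ** int_bits)               # most negative representable value
    hi = (2 ** int_bits) - 1.0 / scale  # most positive representable value
    q = round(x * scale) / scale        # round to the nearest grid point
    return min(max(q, lo), hi)

print(quantize_fixed(0.7, 1, 3))  # 0.75 (error 0.05 <= 2**-4)
print(quantize_fixed(5.0, 1, 3))  # 1.875 (saturated)
```

Exploring formats then amounts to sweeping `int_bits`/`frac_bits` per layer and measuring the resulting classification accuracy.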
CNNs show inherent resilience to insignificant errors due to their iterative nature and learning process. They therefore exhibit an intrinsic tolerance to inexact computation, which makes the approximate computing paradigm relevant for improving their power and delay characteristics [Sung15]. Indeed, CNNs lend themselves well to AxC techniques, especially fixed-point arithmetic or low-precision floating-point implementations. They are therefore ideally suited for hardware acceleration using FPGA and/or ASIC implementations, as acknowledged by the large body of work on this topic.
The goal of this project is to explore how approximation techniques can improve the performance of CNNs in deep-learning applications. In particular, the candidate will study how custom floating-point and fixed-point arithmetic, adequate number representations, and even algorithmic-level transformations can improve the performance and energy efficiency of CNN computation while keeping the classification and learning phases at a very high success rate. The aim is to go further than current state-of-the-art research studies, in which only the inputs and outputs of the neural network layers are quantized to low precision.
The candidate will investigate the trade-offs and benefits that different numeric formats and arithmetic operators bring in terms of performance and energy efficiency when doing Deep Neural Network inference. The goal is for the candidate to contribute to the development of a framework for automatic precision exploration and weight quantization for CNN inference suited for hardware acceleration using FPGAs.
One particular direction of study we might pursue is that of numerical format optimization when doing inference in network architectures based on structured block-circulant weight matrices [Ding17]. Such architectures offer the promise of small storage requirements (i.e., linear in the number of neurons of each layer vs. quadratic in classic architectures) and improved performance in training and inference (i.e., quasi-linear vs. quadratic cost in the classic setting), critical elements for a wide adoption of deep learning methods on low-power embedded devices.
References:
[Chao17] W. Chao, L. Gong, Q. Yu, X. Li, Y. Xie, and X. Zhou, "DLAU: A scalable deep learning accelerator unit on FPGA", IEEE Trans. on Computer-Aided Design of Int. Circ. and Syst., vol. 36, no. 3, 2017, pp. 513-517.
[Ding17] CirCNN: accelerating and compressing deep neural networks using block-circulant weight matrices. In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 395-408, 2017.
[Hsin17] C. Yu-Hsin, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep conv. neural networks", IEEE Journal of Solid-State Circuits, vol. 52, no. 1, 2017, pp. 127-138.
[Mit16] S. Mittal, "A Survey of Techniques for Approximate Computing", ACM Computing Surveys (CSUR), vol. 48, no. 4, 2016, pp. 1-33.
[Sung15] W. Sung, S. Shin, and K. Hwang, "Resiliency of Deep Neural Networks under Quantization," CoRR, vol. abs/1511.06488, 2015.
[Zhi17] L. Zhiqiang, Y. Dou, J. Jiang, J. Xu, S. Li, Y. Zhou, and Y. Xu, "Throughput Optimized FPGA Accelerator for Deep Conv. Neural Networks", ACM Trans. on Reconfig. Tech. and Syst., vol. 10, no. 3, 2017, p. 17.
URL sujet détaillé : http://perso.enslyon.fr/silviuioan.filip/files/master_approxDL.pdf
Remarques : The internship will take place at IRISA, in Rennes, co-advised by Olivier Sentieys and Silviu Filip. The usual internship remuneration applies.





SM207107 Lebesgue Measure and Integral in Coq


Description


Finite elements are at the heart of numerous simulation programs used in industry. The verification methods developed in this project will be a breakthrough for the finite element method, and more generally for the reliability of critical software relying on complicated numerical algorithms. The goal is to develop the basis for proving the correctness of the FELiScE library, which implements the finite element method for approximating solutions to partial differential equations.
These proofs are based on measure theory, the Lebesgue integral, distributions and Sobolev spaces. The goal of this work is to formally prove in Coq some well-known mathematical results, such as the construction of the L^2 space and several of its properties, e.g. the completeness of the norm associated with the inner product.
Obviously, this is not only a ``translation'' from mathematical books to a proof assistant. The candidate must also be able to understand the proofs and how to factorize and generalize them.
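As a pointer to the kind of statement involved, the central completeness result can be written informally as follows (standard textbook material, our own illustration, not taken from the subject):

```latex
% Inner product and norm on L^2(\mu) (real-valued functions, modulo a.e. equality):
\langle f, g \rangle = \int f\,g \,\mathrm{d}\mu,
\qquad
\|f\|_{L^2} = \sqrt{\langle f, f \rangle}.
% Riesz--Fischer: L^2(\mu) is complete, i.e. every sequence (f_n) that is Cauchy
% for \|\cdot\|_{L^2} admits a limit f \in L^2(\mu) with \|f_n - f\|_{L^2} \to 0.
```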
Note that there exist references regarding real analysis [BLM15] and the formal proof of the Lax-Milgram theorem [CPP2017]. References:
[CPP2017] Sylvie Boldo, François Clément, Florian Faissole, Vincent Martin and Micaela Mayero. A Coq formal proof of the Lax-Milgram theorem. In Proceedings of the 6th ACM SIGPLAN Conference on Certified Programs and Proofs (CPP 2017), pages 79-89, 2017.
[BLM15] Sylvie Boldo, Catherine Lelay and Guillaume Melquiond. Coquelicot: A User-Friendly Library of Real Analysis for Coq. Mathematics in Computer Science, volume 9, number 1, pages 41-62, 2015.
URL sujet détaillé : https://wwwlipn.univparis13.fr/~mayero/?content=master
Remarques : Co-supervisors: Sylvie Boldo (Inria Saclay);
François Clément, Vincent Martin (Inria Paris).
The intern may need to travel to Inria Paris or Saclay.
Remuneration: yes.





SM207108 Geometrical and topological mesh filtering for fabrication


Description


3D printing has become a hobby for many people at home, thanks to its fast development over the past decades. Today, fabricating an appropriate 3D model using a low-cost 3D printer is nearly as easy as printing a textual document, but creating a 3D model which is actually "appropriate" for printing is definitely *complicated*.
Most of the 3D models flooding the Internet are not meant to be fabricated, and usually exhibit many defects that make them unsuitable for printing, such as open boundaries, self-intersections, zero-thickness walls, incomplete geometry, many high-frequency small details, etc. Which of them can really be printed? How can we repair and filter the unprintable models to adapt them to the fabrication requirements?
We currently lack a geometrical and topological mesh filtering approach that can cope with *any* STL file while preserving all visible surfaces with no or minimal distortions, and managing the material properties and domain-specific information contained in the input model. The goal of this project is to design such a framework. URL sujet détaillé : https://members.loria.fr/DBoltcheva/files/MeshFiltering.pdf
Remarques : The internship will be co-supervised by Cédric Zanni:
https://minesnancy.univlorraine.fr/collaborator/cedriczanni/





SM207109 Theory of isotropic tilings


Description



Tiling theory lies at the border between theoretical computer science (decidability, models of computation), discrete mathematics (combinatorics, statistical physics), and the mathematical theory of dynamical systems.
These communities study tilings with different properties and complexities, which means that some results do not carry over as well as they should. The objective of this internship is to study a newly introduced subclass, isotropic tilings, which includes most models of physical and combinatorial interest and at the same time breaks all known undecidability proofs. We can therefore hope for non-trivial algorithms for problems that are undecidable in general, or for new methods for proving undecidability.
You will revisit the classical tiling literature and adapt it to isotropic tilings, in order to identify potentially interesting questions and settle on the best formalism. This would provide us with a first overview of these objects. Then, I offer various open problems to study, to which you will add your own. Please refer to the detailed description. URL sujet détaillé : https://www.lri.fr/~hellouin/Stage.pdf
Remarques : This internship can be paid (around €500). Depending on funding availability, a follow-up Ph.D. (co-supervised with a senior researcher) can be considered.





SM207110 Building recognition from 3D point clouds


Description


The digitization of real objects is increasingly used in areas such as urban design and architecture. Acquisition tools such as lidar and photogrammetry make it possible to produce digital representations of entire cities in the form of massive 3D point clouds sampling the surfaces of objects in the environment.
Nowadays, the pipeline for creating digital 3D models from point clouds, photogrammetry and land-registry (cadastre) information remains largely manual, and is recognized as time-consuming, tedious, subjective and demanding in terms of skills. The designer's workflow is to first select areas of interest (for example, buildings and houses) from the point cloud, and then to create 3D models starting from simple parallelepipeds and gradually adding details such as roofs, windows, doors, cornices, stairs, etc.
The long-term goal is to develop a tool that assists the designer in creating 3D models in a semi-manual way. During this internship, we will focus on a still-open problem: the detection of roofs in point clouds. We therefore plan to study a particular subproblem, namely the detection of gable roofs (toitures à deux versants), by first injecting priors and then possibly extending the method to a detection based on a catalog of components. URL sujet détaillé : https://members.loria.fr/DBoltcheva/files/rhinocerros.pdf
Remarques : The internship will be co-supervised by Dmitry Sokolov: https://members.loria.fr/DSokolov/
If our first results prove promising, the project may be continued as a CIFRE Ph.D. thesis.





SM207111 Block-structured hexahedral mesh generation


Description


Block-structured hexahedral meshes are considered to be the "holy grail" of the hexahedral remeshing problem. These meshes have a very constrained structure that can be exploited by numerical simulations, but that also makes them really difficult to produce. This specific structure can be represented by the Spatial Twist Continuum (STC): a set of surfaces that are dual to layered sheets of hexahedra. Unfortunately, it is very difficult to produce such surfaces. The objective of this project is to extract the Spatial Twist Continuum from the result of another hexahedral remeshing algorithm based on global parameterization. Indeed, the results of this algorithm often capture the most important features of the volume well, which could be used as a heuristic to define the STC. URL sujet détaillé : https://members.loria.fr/NRay/projets/
Remarques :





SM207112 Frame Field 2.5D


Description


Frame fields are objects that associate, to each point of space, a set of six unit vectors that are pairwise either opposite or orthogonal to each other. Such fields are used to remesh volumes with hexahedra: the field provides the orientation of the hexahedra. They are typically generated by minimizing their spatial variation (rotation), subject to the constraint of being aligned with the domain's boundary. The objective of this internship is to add extra constraints to this process, such that it becomes possible to tile the domain with hexahedra aligned with the resulting fields. The main difficulty comes from the singularities of the field, i.e. curves around which a frame following an elementary cycle has an infinite speed of rotation. URL sujet détaillé : https://members.loria.fr/NRay/projets/
Remarques :





SM207113 Texture atlases with invisible seams


Description


For image synthesis, geometric objects are mostly represented by triangulated surfaces. Their visual aspect can be improved by mapping an image (called a texture) onto the surface. This mapping is defined by a texture atlas, where the frontiers between charts produce rendering artifacts. To overcome this problem, one can generate global grid-preserving parameterisations where seams can be made invisible. The objective here is to implement an algorithm that can generate such texture atlases, and to edit the textures associated with the object in such a way that the rendering no longer produces visual artifacts. The first step will be to implement an existing algorithm ["Invisible Seams", EGSR 2010]; we already have a parameterisation algorithm, from which a texture atlas must be derived, as well as a set of constraints to be applied to textures to make them seamless. The second step will be to propose improvements to this method, to make it more robust. URL sujet détaillé : https://members.loria.fr/NRay/projets/
Remarques :









SM207115 Representation of elastic concepts - Application to search in a digital humanities corpus


Description


The execution of an exact query on a corpus does not necessarily give a satisfying answer to the user, for at least two reasons: (A) a close match might be missed that is nevertheless interesting; (B) an answer that is not close but offers a relevant viewpoint can also be of interest. This motivates the issue of inexact search in a corpus, an ongoing research issue for a few years now, with an application to the Henri Poincaré correspondence. The notion of elastic query is introduced to address this issue: an elastic query is defined by a pair (Q, R) where Q is a classical query and R is a set of query transformation rules associated with costs and explanations. An item x matches (Q, R) with a cost c if there exists a sequence of transformations of Q into Q' by rules r1, ..., rn, with c = cost(r1) + ... + cost(rn), such that x matches Q'.
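The cost definition above is naturally computed by a uniform-cost (Dijkstra-style) search over rule applications. A small sketch of this idea, with a toy query model of our own (a query is a set of terms; an item matches a query containing all of them; the rule names and examples are hypothetical, not the project's actual language):

```python
import heapq

def elastic_match(item_terms, query, rules):
    """Cheapest (cost, explanation) over sequences of rule applications
    turning `query` into a query matched by `item_terms`, or None."""
    frontier = [(0, 0, frozenset(query), [])]  # (cost, tiebreak, query, rule names)
    seen, tick = set(), 1
    while frontier:
        cost, _, q, applied = heapq.heappop(frontier)
        if q in seen:
            continue
        seen.add(q)
        if q <= item_terms:                    # the transformed query matches the item
            return cost, applied
        for name, rule_cost, transform in rules:
            for q2 in transform(q):
                heapq.heappush(frontier, (cost + rule_cost, tick, frozenset(q2), applied + [name]))
                tick += 1
    return None

# Hypothetical rules: a synonym substitution (cost 1), generalisation by term deletion (cost 2).
synonym = ("letter->correspondence", 1,
           lambda q: [q - {"letter"} | {"correspondence"}] if "letter" in q else [])
drop = ("drop a term", 2, lambda q: [q - {t} for t in q])

item = {"correspondence", "Poincaré", "1905"}
print(elastic_match(item, {"letter", "Poincaré"}, [synonym, drop]))
# -> (1, ['letter->correspondence'])
```

The returned rule sequence doubles as the "explanation" attached to the answer, as required by the definition of R.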
Three issues can be considered in this work (or in the Ph.D. thesis that may follow it). The first one is the study of a language of query transformations: is it sufficient to represent users' needs, and how should it be extended? The second one is about how human-machine interaction should be handled in this framework. The last, but not least, issue aims at going further with the notion of elastic set, in particular: how can classical operations on sets (inclusion, intersection, union, etc.) be extended to elastic sets?
Keywords: symbolic artificial intelligence, digital humanities, semantic web technologies
(The detail of this work is given in the pdf file, don't hesitate to contact us for any question!) URL sujet détaillé : http://bruneauolive.free.fr/images/sujet_bruneau_lieber.pdf
Remarques : This work will be co-supervised by Olivier Bruneau (a historian of mathematics at AHP-PReST) and Jean Lieber (a computer scientist at Loria). The intern will receive the standard gratification (about 500 euros per month). A Ph.D. thesis can follow this work.
Please contact Olivier at olivier.bruneaulorraine.fr and Jean at jean.lieber.fr





SM207116 Reducing energy consumption in heterogeneous mobile networks


Description


The objective of the internship is to design solutions that reduce the energy consumption of access points and base stations in heterogeneous mobile networks, while ensuring a given quality of service to the users. The solution will select, online, which access points or base stations to switch off, how to associate users with the active access points / base stations, and which power and physical transmission rate to use, without sacrificing network and user performance. The solution will likely be based on an optimization problem, designed using models that predict performance and energy consumption. These models can be constructive or data-driven. URL sujet détaillé : http://perso.enslyon.fr/isabelle.guerinlassous/stagegreenmobilenetworks.pdf
Remarques : The internship will be co-supervised with Thomas Begin.





SM207117 Undecidability in perturbed dynamical systems


Description


Dynamical systems are a very natural and widely used model to capture systems evolving over time. Generally, the dynamics are computable, in the following informal sense: given a configuration, it is possible to compute the configuration reached after any finite time. In many contexts, however, the properties of true interest concern the long-term evolution. While bounded-time simulations are usually possible, the computational outlook for the "infinite" time-horizon problems is grim: the long-term behavior of many interesting systems is uncomputable.
A notable feature, shared by prior works on computational intractability in dynamical systems [Moore90, BravermanYampolsky2007, AsarinBouajjani2001], is that the non-computability phenomenon is not robust: uncomputability disappears once one introduces even a small amount of noise into the system. Thus, if one believes that natural systems are inherently noisy, one would not be able to observe such non-computability phenomena.
A natural approach is to study the invariant measures of perturbed systems. In [BravermanGrigoRojas2012], the authors show that while simple dynamical systems defined by transformations of the interval can have uncomputable invariant measures, perturbing such systems makes the invariant measures efficiently computable. Thus, noise that makes the short-term behavior of the system harder to predict may make its long-term statistical behavior computationally tractable.
This type of result is not known for dynamical systems on discrete spaces such as cellular automata. In this case, it is possible that noise cannot break uncomputability, since some computation can be implemented in noisy cellular automata [Gacs2001]. The purpose of this internship is to see where the techniques developed in [BravermanGrigoRojas2012] break down, and to try to construct a dynamical system with uncomputable asymptotic behavior even when it is perturbed.
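A toy numerical illustration of the phenomenon (our own sketch, not part of the subject): for the doubling map x -> 2x mod 1, a small additive noise yields orbits whose empirical statistics converge to the uniform (Lebesgue) invariant measure, which is perfectly computable:

```python
import random

def noisy_doubling_histogram(n_steps=100_000, noise=1e-3, bins=10, seed=0):
    """Empirical distribution of the orbit of x -> 2x + xi (mod 1),
    with xi drawn uniformly in [-noise, noise]."""
    rng = random.Random(seed)
    x, counts = rng.random(), [0] * bins
    for _ in range(n_steps):
        x = (2 * x + rng.uniform(-noise, noise)) % 1.0
        counts[int(x * bins) % bins] += 1
    return [c / n_steps for c in counts]

freqs = noisy_doubling_histogram()
# Each bin frequency is close to 1/10: the perturbed orbit equidistributes.
```

Incidentally, the noise also matters numerically: the unperturbed floating-point iteration collapses to 0 after roughly 53 steps, since each doubling shifts one mantissa bit out, while the noise keeps re-randomizing the low-order bits.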
URL sujet détaillé : https://www.math.univtoulouse.fr/~msablik/sujet_M2_SystemeDynamiqueCalculable.pdf
Remarques :





SM207118 Models and planning for humanrobot collaboration.


Description


Collaboration between humans and robots is a current high-stakes research subject with numerous application areas (smart factories, therapeutic robot companions, ...). The internship we propose here is linked to the "Flying Coworker" ANR project (accepted in 2018), whose aim is to build a collaborative flying robot to help human workers. More precisely, the proposed subject focuses on building the behavior (or conditional plan) of an autonomous robot to assist a human worker in fulfilling his tasks in the best possible way (i.e., minimizing a cost to be defined). It is a complex issue since (1) the robot has to estimate the objective of the human worker through partial observations of his activity, (2) the robot has to make decisions based on this partial information, and (3) the human behavior might also depend on the actions undertaken by the robot.
This internship aims at addressing this issue by using models from the "decision making under uncertainty" research field (Markov Decision Processes, Partially Observable Markov Decision Processes [1]) and at investigating several questions:
- how to model the human worker's behavior, his objectives and the task sequence he tries to accomplish (with the help of Markov chains);
- how to use this model to infer distributions on the current objective of the human worker based on his actions (Bayesian inference, HMMs [2]);
- how to decide on actions to gather more information about the human task (with the help of active sensing models and algorithms [3]);
- how to help the human worker while considering uncertainties about his state and his evolving objectives (by using POMDP models).
As a first step, the student will have to study Markov models before proposing a formalization of a situation where an autonomous agent has to help a human attain his current objective, initially unknown to the agent. For instance, the autonomous agent has to decide which tools to bring, among several available, to an isolated human worker by considering the probabilistic workflow of his activity. In a second step, the internship will address the questions cited previously: first by finding a way to model the human objective [4, 5] and his adaptation to the robot's actions [6], and then by proposing algorithms to build the optimal behavior of the agent based on these models [7] and on reasoning about human-robot joint action [8].
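In the simplest (static-goal) case, inferring a distribution over the hidden objective from observed actions reduces to repeated Bayes updates. A minimal sketch with made-up numbers (the goals, actions and likelihoods below are our hypothetical example, not the project's model):

```python
def update_belief(belief, action, action_model):
    """One Bayes step: P(goal | action) is proportional to P(action | goal) * P(goal)."""
    post = {g: p * action_model[g].get(action, 1e-9) for g, p in belief.items()}
    z = sum(post.values())
    return {g: p / z for g, p in post.items()}

# Hypothetical action likelihoods for two possible human objectives.
action_model = {
    "assemble": {"pick_screwdriver": 0.7, "pick_brush": 0.1, "idle": 0.2},
    "paint":    {"pick_screwdriver": 0.1, "pick_brush": 0.7, "idle": 0.2},
}
belief = {"assemble": 0.5, "paint": 0.5}
for obs in ["pick_screwdriver", "idle", "pick_screwdriver"]:
    belief = update_belief(belief, obs, action_model)
print(belief)  # "assemble" now has probability 49/50 = 0.98
```

When the goal can itself change over time, the same update is interleaved with a transition step, yielding the HMM forward algorithm of [2]; acting on the resulting belief is then the POMDP setting of [1].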
During the internship, this work will be mostly done in simulation to investigate algorithms and their efficiencies, but, if time allows it, a simple scenario (to be built during the internship) could be tested with a real robot.
Keywords : artificial intelligence, decision making under uncertainty, Bayesian inference.
References :
[1] Olivier Sigaud, Olivier Buffet. Markov Decision Processes in Artificial Intelligence. (Lavoisier - Hermes Science Publications, 2008).
[2] Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. (Proceedings of the IEEE, 1989).
[3] Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet. A POMDP Extension with Belief-dependent Rewards. (NIPS, 2010).
[4] Arsène Fansi Tchango, Vincent Thomas, Olivier Buffet, Fabien Flacher, Alain Dutech. Simultaneous Tracking and Activity Recognition (STAR) using Advanced Agent-Based Behavioral Simulations. (ECAI, 2014).
[5] Alan Fern, Sriraam Natarajan, Kshitij Judah, Prasad Tadepalli. A Decision-theoretic Model of Assistance. (JAIR, 2014).
[6] Stefanos Nikolaidis. Mathematical Models of Adaptation in Human-Robot Collaboration. (PhD thesis, 2017).
[7] Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell. Cooperative Inverse Reinforcement Learning. (NIPS, 2016).
[8] Séverin Lemaignan, Mathieu Warnier, Emrah Sisbot, Aurélie Clodic, Rachid Alami. Artificial cognition for social human-robot interaction: An implementation. (Artificial Intelligence, Elsevier, 2017).
URL sujet détaillé : https://team.inria.fr/larsen/mastersubjects/
Remarques : - This internship will be co-supervised by François Charpillet and Francis Colas (Inria Nancy Grand Est, LARSEN team).
- This internship is proposed within the ANR project "Flying Coworker", accepted in July 2018, whose objective is to develop a flying drone able to collaborate with humans. The subject may be continued as a Ph.D. thesis within this ANR project.





SM207119 Quotient d'indépendance & nombre chromatique fractionnaire


Description


This internship focuses on the fractional chromatic number, an exciting graph invariant that appears under various guises: as a probability distribution on the independent sets of a graph guaranteeing that an arbitrary vertex belongs, with at least a fixed probability, to a random independent set chosen according to the distribution; as an (a:b)-colouring of the graph such that the ratio a/b is as small as possible; and as the optimal value of a linear program.
The student will study various questions about this invariant and learn, apply and possibly broaden the techniques developed for its study. Depending on the student's preferences, the questions can pertain more to structural graph theory or to extremal graph theory. Due to the different ways to express the fractional chromatic number, these techniques cover a wide range of discrete methods: graph-structural arguments, colouring techniques, discrete probability distributions, probabilistic methods, linear programming.
The internship can also focus more on the independence ratio of a graph, defined as the ratio of the size of a largest independent set to the number of vertices: the reciprocal of the independence ratio is always a lower bound on the fractional chromatic number, and there are graph classes where equality always holds, e.g. that of vertex-transitive graphs.
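A small brute-force check of these notions (our own sketch) on the 5-cycle, the smallest vertex-transitive example where the fractional chromatic number (5/2) lies strictly between the clique number (2) and the chromatic number (3):

```python
from itertools import combinations

def independence_number(n, edges):
    """Size of a largest independent set (brute force over all vertex subsets)."""
    for k in range(n, 0, -1):
        for s in combinations(range(n), k):
            sset = set(s)
            if all(not ({u, v} <= sset) for u, v in edges):
                return k
    return 0

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
alpha = independence_number(5, c5)  # alpha(C5) = 2
chi_f = 5 / alpha                   # = n / alpha = 5/2, since C5 is vertex-transitive
print(alpha, chi_f)                 # 2 2.5
```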
A continuation towards a Ph.D. is possible. Thanks to the background built during this internship, the student, if desired, will be able to move towards a large variety of topics in graph theory: colouring, structure, extremal questions, graph limits and flag algebras. URL sujet détaillé : http://lbgi.fr/~sereni/stageM2_ENSLyon.pdf
Remarques :





SM207120 Immersive Analytics in VR/AR


Description


Our society has entered a data-driven era, in which not only are enormous amounts of data being generated every day, but growing expectations are also placed on their analysis [1]. Analyzing these massive and complex datasets is essential for making new discoveries and creating benefits for people, but it remains a difficult task. In this internship, we will consider immersive devices such as mixed-reality headsets (Oculus Rift, HTC VIVE) or augmented-reality devices (e.g., HoloLens, https://www.microsoft.com/enus/hololens ) to address data exploration issues.
This ambitious research project is at the crossroads of Information Visualization, Visual Analytics, Computer Graphics and Human-Computer Interaction, and it will contribute to research efforts on big data issues. The project will investigate the following challenges: data visualization, manipulation, multiple uses, and algorithm understanding.
This project will extend FiberClay concepts: http://recherche.enac.fr/~hurter/FiberClay.html
This internship is a research internship with a possible PhD thesis extension. URL sujet détaillé : http://recherche.enac.fr/~hurter/FiberClay.html
Remarques : Paid internship (the gratification is mandatory).





SM207121 Automated CTL proofs in proof assistants


Description


Context: Formal methods are increasingly used to ensure the correctness of industrial systems, in particular critical ones. Several techniques can be used, among which model checking, static analysis using abstract interpretation, and mechanized theorem proving (automated theorem provers and proof assistants). For a given system, different techniques can be used to prove the correctness of different subsystems. The question that naturally arises is whether the correctness of the subsystems implies the correctness of the whole. To show this, one needs to relate the different proof techniques to each other.
In recent years, Ji [1] showed how to perform model-checking proofs based on the modal logic CTL in an automated theorem prover, namely iProverModulo, with performance comparable to state-of-the-art model checkers. In addition, iProverModulo is able to produce proofs in Dedukti format. Dedukti (https://deducteam.github.io/) is a universal proof checker, able to check proofs coming from a variety of automated provers and proof assistants. Conversely, Thiré [2] showed how to translate proofs from a subset of Dedukti to various proof assistants, namely Coq, PVS, Lean, Matita and the provers of the HOL family. In these proof assistants, various embeddings of CTL have been developed, but they lack automation.
Subject:
The goal of this internship is to study how to automate the proof search of CTL formulas in proof assistants by using iProverModulo and Dedukti. First, one has to make sure that the proofs produced by iProverModulo correspond to the fragment translatable to the proof assistants. Then, the embedding of CTL in iProverModulo, and its translation via Dedukti, should be aligned with the way CTL has been defined in the various proof assistants. Finally, tactics should be developed in the proof assistants to make the whole process easy to use.
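For background, the CTL connectives that such embeddings encode have a standard fixpoint semantics on finite models; the sketch below computes E[phi U psi] as a least fixpoint, purely as an illustration (it is unrelated to iProverModulo's deduction-modulo encoding):

```python
# EU fixpoint: states satisfying E[phi U psi] on a finite Kripke
# structure, computed as a least fixpoint (textbook algorithm;
# illustrative only, not iProverModulo's deduction-based encoding).

def eu(states, succ, phi, psi):
    sat = set(psi)                       # psi-states satisfy E[phi U psi]
    changed = True
    while changed:
        changed = False
        for s in states:
            # a phi-state with a successor already in sat is added
            if s in phi and s not in sat and succ[s] & sat:
                sat.add(s)
                changed = True
    return sat

states = {0, 1, 2, 3}
succ = {0: {1}, 1: {2}, 2: {3}, 3: {3}}
print(eu(states, succ, phi={0, 1, 2}, psi={3}))  # {0, 1, 2, 3}
```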
References:
[1] Kailiang Ji. CTL model checking in deduction modulo. In Amy P. Felty and Aart Middeldorp, editors, CADE-25, volume 9195 of LNCS, pages 295-310. Springer, 2015.
[2] François Thiré. Sharing a library between proof assistants: Reaching out to the HOL family. In Frédéric Blanqui and Giselle Reis, editors, LFMTP 2018, volume 274 of EPTCS, pages 57-71, 2018. URL sujet détaillé : http://web4.ensiie.fr/~guillaume.burel/stages/Master2_ctlsttfa.pdf
Remarques : The internship will take place in the Inria project-team Deducteam, and will be supervised by Guillaume Burel, assistant professor at the ENSIIE, temporarily assigned to Inria.





SM207122 Verification of security protocols


Description


*Context*. Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, in mobile phones and, more recently, in electronic elections. Formal methods have demonstrated their usefulness when designing and analyzing security protocols. They provide rigorous frameworks and techniques that have led to the discovery of new flaws. In particular, ProVerif [1] is a state-of-the-art tool dedicated to the security analysis of protocols. It has been successfully applied to numerous protocols from the literature, providing security proofs or discovering new attacks. However, ProVerif falls short when it comes to analysing voting protocols. Indeed, voting protocols should achieve vote privacy (no one knows my vote) and verifiability (the result corresponds to the votes cast by voters). The latter requires counting the votes, which is beyond ProVerif's capabilities. Instead, current analyses devise sufficient conditions that imply verifiability and that are easier to prove. However, this means that part of the analysis has to be done by hand (on paper).
*Objectives of the internship*. The goal of the internship is to enhance ProVerif with counting operations. This would make it possible to analyse voting protocols, but also any protocol involving counting operations. This work has a practical impact but first requires developing theoretical results in the area of automated deduction. Namely, ProVerif abstracts protocols by Horn clauses, and its success lies in a very efficient resolution strategy for Horn clauses. Very recently, ProVerif has been enhanced with natural numbers together with a + operation [2]. The first goal of the internship is to develop a resolution procedure for clauses that contain a + operation on integers, especially when the + operator is applied between two variables. This is necessary to deal with verifiability in voting. The new resolution procedure will be integrated into ProVerif and tested on voting protocols from the literature. In the medium term, the goal is to devise decision procedures for operators with arithmetic properties, like exclusive or and modular exponentiation. One direction is to split the properties into those that can be handled directly by ProVerif and those that require a dedicated resolution procedure. One interesting aspect of the internship is that it mixes foundational work with the study of practical protocols (in e-voting).
This internship may lead to a PhD thesis on similar topics.
References: [1] Bruno Blanchet. Modeling and verifying security protocols with the applied pi calculus and ProVerif. Foundations and Trends in Privacy and Security, pages 1-135, 2016. [2] Vincent Cheval, Véronique Cortier, and Mathieu Turuani. A little more conversation, a little less action, a lot more satisfaction: Global states in ProVerif. In Proceedings of the 31st IEEE Computer Security Foundations Symposium (CSF'18), pages 344-358, 2018. URL sujet détaillé : https://members.loria.fr/VCortier/files/sujetvoteProVerif.pdf
Remarques : Interns may receive a stipend.
This work will be co-supervised by Vincent Cheval (PESTO team, Loria, Nancy).





SM207123 Perception and interpretation of human activity


Description


For a long time, industrial robots were secluded in safety cages and autonomous robots were mainly designed and programmed to work without humans or while avoiding them (as obstacles). There is a growing trend towards environment sharing and collaboration between humans and robots, with the objective of having robots assist humans (for manufacturing but also in our daily lives). One of the main issues for human-robot interaction is, for the robot, to be able to adequately perceive and interpret what the human is doing.
This internship is in the context of the ``Flying coworker'' ANR project, which aims at building a collaborative flying robot to help human workers. The objective of this subject is to provide information about the current and future status of the human worker. This can mean the precise motion of the human's arms, hands and head, but also her posture and even the higher-level task she is performing.
There are several aspects to be tackled in the long term, each of which is open for the present internship: human perception by a flying robot, human activity recognition, gesture and intention estimation, and human activity prediction.
A possible work plan would be to start by studying and using state-of-the-art techniques for activity recognition based on probabilistic approaches such as Hidden Markov Models [1]. It would then be possible to develop new activity models that maximize longer-term prediction rates, for instance based on more accurate modeling of each task's duration. A particular challenge is to learn the parameters of such a new model. An additional step could be to leverage short-term motion prediction [2], coupled with the new activity model, to further improve accuracy.
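To make the starting point concrete, a minimal HMM filtering (forward) pass might look as follows; the two activities, the transition/emission matrices and the observation sequence are all invented for the example:

```python
import numpy as np

# Minimal HMM forward pass for activity recognition: given noisy
# observations, compute a posterior over hidden activities.
# All numbers below are made up for illustration.

A = np.array([[0.9, 0.1],      # transition: activity i -> activity j
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],      # emission: P(observation | activity)
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])      # initial activity distribution

def forward(obs):
    """Return P(activity_t | obs_1..t) for each t (filtering)."""
    alpha = pi * B[:, obs[0]]
    alpha /= alpha.sum()
    beliefs = [alpha]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # predict, then weight by evidence
        alpha /= alpha.sum()
        beliefs.append(alpha)
    return np.array(beliefs)

beliefs = forward([0, 0, 1, 1, 1])
print(beliefs[-1])  # posterior over the two activities after 5 steps
```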
The intern is expected to balance theoretical contributions with experimental validation. To this end, various mobile robots (Tiago, Pepper, and several other ground robots) with different sensors are available as test platforms.
URL sujet détaillé : https://team.inria.fr/larsen/sujet2018perceptionhumanactivity/
Remarques : This internship takes place at Inria Nancy in the Larsen team and is funded by the ``Flying coworker'' ANR project. It is co-supervised by Serena Ivaldi, Vincent Thomas and François Charpillet. PhD funding will be available for the continuation of this project.





SM207124 Topological bounds on the chromatic number


Description


Scientific context
The chromatic number is arguably the most widely studied graph invariant, and as a result there is a plethora of conjectures and open problems involving it (see [1]). Because of the difficulty of these conjectures, a fruitful approach has been to prove simpler versions, for instance by replacing the chromatic number by the fractional chromatic number.
An interesting family of lower bounds on the chromatic number (completely independent of the fractional chromatic number) comes from an application of the Borsuk-Ulam theorem: these are known as topological bounds on the chromatic number (see [2]).
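The best-known bound of this kind is Lovász's (see [2]); stated informally, if the neighbourhood complex of G is highly connected, then many colours are needed:

```latex
% Lov\'asz's topological lower bound (see [2]):
% if the neighbourhood complex \mathcal{N}(G) is k-connected, then
\chi(G) \;\ge\; \operatorname{conn}(\mathcal{N}(G)) + 3.
% This is tight for Kneser graphs: \chi(KG_{n,k}) = n - 2k + 2.
```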
Subject of the internship
The aim of this internship is to consider weaker versions of conjectures involving the chromatic number, by replacing the chromatic number by one of the topological bounds. This was already done [3] for the conjectures of Behzad-Vizing, Hedetniemi and Hadwiger, but many interesting conjectures remain: for instance, the Erdős-Faber-Lovász conjecture, Reed's conjecture, and the double-critical graph conjecture.
The internship will be an opportunity to learn about the area of graph colouring, as well as topological methods. Given the large number of conjectures, the student will have ample scope for independent research.
The student is expected to have a basic knowledge of graph theory, as well as an interest in topology.
The internship may lead to a PhD thesis on similar topics.
References: [1] T. R. Jensen and B. Toft, Graph Coloring Problems, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, Inc., New York, 1995. [2] J. Matoušek, Using the Borsuk-Ulam Theorem, Universitext, Springer-Verlag, Berlin, 2003. [3] G. Simonyi and A. Zsbán, On topological relaxations of chromatic conjectures, European J. Combin. 31 (2010) 2110-2119. URL sujet détaillé :
Remarques :





SM207125 Active Search for Information with a Mobile Robot


Description


POMDPs (Partially Observable Markov Decision Processes) allow describing control/planning problems in which the state of the system is only "guessed" (in a Bayesian sense) through indirect observations.
Recent algorithms make it possible to solve POMDPs with large or continuous state, action and observation spaces. The objective of this research work is to adapt these algorithms to rho-POMDPs, an extension of POMDPs that allows defining information-oriented performance criteria, e.g., localizing a robot (rather than getting it to a certain location) or a victim in a disaster (assuming that this is a prior step before any other intervention). The resulting solution(s) shall be evaluated on a simple mobile robot equipped with proximity sensors.
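For background, the Bayesian "guess" of the state is the POMDP belief, updated after each action/observation pair; a rho-POMDP then defines rewards directly on this belief (e.g., its negative entropy, to reward information gathering). A minimal discrete sketch, with all numbers invented:

```python
import numpy as np

# Minimal POMDP belief update (discrete, illustrative numbers):
# b'(s') is proportional to O(o | s') * sum_s T(s' | s, a) * b(s).

def belief_update(b, T, O, a, o):
    """Bayesian filter step: prior belief b, transition T[a][s][s'],
    observation model O[s'][o], action a, observation o."""
    predicted = b @ T[a]             # push the belief through the dynamics
    posterior = predicted * O[:, o]  # weight by the observation likelihood
    return posterior / posterior.sum()

# Two states, one action, two observations (all numbers made up).
T = np.array([[[0.8, 0.2], [0.3, 0.7]]])
O = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([0.5, 0.5])
b = belief_update(b, T, O, a=0, o=0)
print(b)  # belief sharpened towards state 0
```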
The work program consists in: (1) becoming familiar with prior work on rho-POMDPs and continuous POMDPs, probably implementing some related algorithms; (2) becoming familiar with the robot and related software; (3) proposing an algorithm for continuous rho-POMDPs (possibly with asymptotic convergence properties); and (4) evaluating it empirically on scenarios involving the robot. URL sujet détaillé : https://team.inria.fr/larsen/sujet2018rhopomdprobot/
Remarques : Co-advised with Vincent THOMAS.
This work requires good programming and math skills (including basic probability theory), and willingness to work on both theoretical and practical questions.





SM207126 Static Selective Instrumentation for Parallel Programs Verification


Description


High Performance Computing (HPC) plays an important role in many fields such as health, materials science, security and the environment. Current supercomputer hardware trends lead to more complex HPC applications (heterogeneous hardware and combinations of parallel programming models) that pose programmability challenges. As indicated by a recent US DOE report, progress towards Exascale strengthens the need for convenient and scalable debugging methods to help developers.
The PARallel COntrol flow Anomaly CHecker (PARCOACH) aims at helping developers in their debugging phase. It combines static and dynamic analyses to detect misuse of collectives (i.e., mismatched collective operations in MPI) in parallel applications. At compile time, an interprocedural static analysis studies the control and data flow of a program to find potential deadlocks. During this step, a program is tagged as statically correct (no possible deadlock) or not verifiable (a deadlock may occur at runtime). For statically unverifiable programs, all collectives and exit statements are instrumented with Check Collective (CC) functions, starting from the first collectives that may deadlock in the program. The first "safe" collectives are not instrumented. At execution time, the CC functions verify which collectives will be called at the different steps of the execution. In case of an actual deadlock situation, the program is stopped before the deadlock occurs.
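The runtime check can be pictured as follows: before each collective, processes compare what they are about to call, and any disagreement signals an imminent deadlock. The sketch below is a toy sequential simulation, not PARCOACH's actual MPI instrumentation:

```python
# Toy version of a Check Collective (CC) verification: before each
# collective, every process announces which collective it will call;
# a mismatch means the program would deadlock. (Sequential simulation,
# not real MPI; names are illustrative.)

def check_collectives(traces):
    """traces[p] = sequence of collective names process p executes.
    Return the index of the first mismatch, or None if consistent."""
    length = max(len(t) for t in traces)
    for i in range(length):
        calls = {t[i] if i < len(t) else None for t in traces}
        if len(calls) > 1:
            return i  # processes disagree: stop before the deadlock
    return None

ok = [["MPI_Barrier", "MPI_Allreduce"]] * 4
bad = [["MPI_Barrier", "MPI_Allreduce"],
       ["MPI_Barrier", "MPI_Reduce"]]
print(check_collectives(ok))   # None
print(check_collectives(bad))  # 1
```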
The goal of this internship is to improve PARCOACH's selective instrumentation. The student will develop an analysis that builds an automaton of the program to compute the language of remaining collectives at different points of the program. The analysis will combine PARCOACH's static information with the automaton to reduce the number of runtime checks.
The internship will first focus on MPI applications and will then extend to other parallel programming models. The work will be implemented as an LLVM pass and fully integrated into the PARCOACH tool. URL sujet détaillé : http://emmanuellesaillard.fr/resources/M2_Intru.pdf
Remarques : Knowledge of compilation and parallel programming is desirable but not essential.





SM207127 Combining SOG and Partial-Order Reduction for Efficient LTL Verification


Description


The goal of the internship is to combine the symbolic observation graph (SOG) and the partial-order reduction technique in the context of LTL model checking. URL sujet détaillé : https://wwwlipn.univparis13.fr/~klai/stages/SOG_PO
Remarques : This internship will be co-supervised by Sami Evangelista.





SM207128 Development of a framework for the automated testing of Android applications.


Admin


Encadrant : Fabrizio PASTORE 
Labo/Organisme : University of Luxembourg / SnT Centre for Security, Reliability and Trust 
URL :
Ville : Luxembourg (LUXEMBOURG)



Description


This internship project fits into a larger research program concerning the automated testing of Android applications. More precisely, the internship concerns the extension of a framework for the automated testing of Android systems through their GUI.
The framework is capable of automatically exercising the features of an Android system, thus mimicking the behavior of a human operator performing manual testing. The framework is characterized by an extensible architecture that enables engineers and researchers to specify the exploration strategies to use when performing testing, and to define the automated oracles used to automatically identify failures.
This internship concerns the extension of the framework to enable the automated detection of regression failures caused by changes in the Android OS. To this end, the internship will consist of two major activities: (Activity 1) development of exploration strategies for changes in the Android OS; (Activity 2) development of automated oracles for interaction protocols.
The first activity concerns the development of exploration strategies that facilitate the identification of regression failures that emerge after updates to the Android OS. The developed strategy may leverage different types of information, such as code coverage (e.g., to determine which Android APIs are used), system models (to determine dependencies not visible at the code level), input-generation strategies (e.g., random, dictionary-based, or based on user-collected data), and requirements and software specifications in natural language (e.g., to determine new features and behavioural changes).
The second activity concerns the identification of oracles that spot regression failures by identifying anomalies in the behavior of the software interfaces exercised during testing. Anomalies at the interface level (e.g., return values never produced by previous versions of an API function) often lead to regression failures that are difficult to identify at testing time because they may not immediately affect the results visualized in the GUI. For example, failures caused by erroneous interactions with the Android backup service API can be observed in the GUI only when the Android system is restored from the backup, while anomalous behaviors of the API service (e.g., unexpected return values) might be observed at every invocation of the backup service. To develop oracles for interaction protocols and effectively identify these kinds of anomalies, the research program will investigate the use of different machine-learning solutions to determine nominal software behaviors, spot anomalies, and identify the anomalies that indicate the presence of failures.
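Reduced to its simplest form, such an interface-level oracle learns the nominal return values of each API from passing runs and flags values never seen before; the class and data below are invented for illustration (the actual framework would use richer ML models):

```python
# Toy interaction-protocol oracle: learn nominal return values per API
# from passing runs, then flag never-seen values as anomalies
# (illustrative sketch; names and data are made up).

from collections import defaultdict

class ReturnValueOracle:
    def __init__(self):
        self.nominal = defaultdict(set)   # api -> values seen in training

    def train(self, run):                 # run: iterable of (api, value)
        for api, value in run:
            self.nominal[api].add(value)

    def anomalies(self, run):
        """Return the (api, value) pairs never observed during training."""
        return [(api, v) for api, v in run if v not in self.nominal[api]]

oracle = ReturnValueOracle()
oracle.train([("backup", 0), ("backup", 0), ("restore", 0)])
print(oracle.anomalies([("backup", 0), ("backup", -1)]))  # [('backup', -1)]
```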
The student doing the internship will study the state of the art of the techniques required to complete the internship: (1) automated testing of GUI systems; (2) model-inference techniques; (3) anomaly detection; (4) natural language processing.
URL sujet détaillé :
Remarques : The student doing the internship will receive a stipend of 800 EUR/month.
The student is entitled to university accommodation (costing 400 EUR/month).
The student will work in the group of Prof. Lionel Briand under the supervision of Dr. Fabrizio Pastore.
Students showing interest and appropriate research attitude will be offered a position for enrollment under the PhD program at the University of Luxembourg/SnT.





SM207129 Generalised updating modes of Boolean networks


Admin


Encadrant : Sylvain SENÉ 
Labo/Organisme : The candidate/student will integrate the CANA (calcul naturel / natural computation) team of the LIS (Laboratoire d'informatique et systèmes / Laboratory of Computer Science and Systems) and will be cosupervised by Laurent Tichit through a strong collaboration with the MABIOS (Mathématiques et algorithmique pour la biologie des systèmes / Mathematics and algorithms for systems biology) team of the I2M (Institut de mathématiques de Marseille / Institute of Mathematics of Marseille). 
URL : https://pageperso.lislab.fr/sylvain.sene 
Ville : Marseille 



Description


Abstract: Since the end of the sixties, the Boolean networks introduced by McCulloch and Pitts have become central mathematical models of genetic regulation networks, for which many theoretical and applied results have been obtained. However, one question that remains open concerns the influence of the way Boolean entities update their states (1 or 0, expressed or not) over time. Two standpoints have been the most studied, according to more or less reasonable biological arguments: (i) block-sequential updates, which consist in separating the entities into disjoint blocks so that updates are parallel inside a block but sequential between blocks; (ii) asynchronous updates, which consist in considering all the possible sequential updates. Biologically speaking, these two approaches do not succeed in reflecting the reality of biological clocks at the level of genetic regulation. In this project, we propose to study more complex updates in order to reach a better understanding of the richness and the intricacies of biological regulation, and to highlight them in real biological networks. Keywords: discrete dynamical systems theory and combinatorics, Boolean networks, updating modes, biological regulation networks.
Objectives: The main objectives are to consider a definition of updating modes complying with biological heterogeneities (e.g., authorizing repetitions and avoiding non-fair updates, i.e., updates enabling some entities to change state infinitely more often than others), to characterise them from an enumerative-combinatorics point of view, and to discover which kinds of networks are robust/sensitive to them (creation/suppression of attractors, even fixed points). Depending on the background of the candidate, one objective could also be to find and develop new tools adapted to studying their influence in real biological networks.
Proposed approach: The proposed approach is mainly theoretical and lies at the frontier of discrete mathematics and theoretical computer science: it deals with dynamical-system and graph theories, enumerative combinatorics, and the complexity and computability of Boolean networks, each of these domains being used to analyse deeply and subtly the behaviours of biological networks. More precisely, on the basis of general periodic updating modes that can be viewed as finite sequences of finite subsets of entities (and subfamilies of these that include, for instance, block-sequential and asynchronous updating modes), the first task will consist in finding explicit formulae for the number of such updating modes and relating them to already proven results. The second part will be dedicated to the study of specific features that allow creating/removing fixed points (we know that creation is possible; we know nothing about removal). A possible third part would be to tackle the adaptation of useful tools (like update graphs) to this framework.
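To fix ideas, a single block-sequential step (parallel within a block, sequential between blocks) can be sketched as follows, on a made-up two-entity network:

```python
# Block-sequential update of a Boolean network (illustrative sketch):
# within a block, entities update in parallel; blocks update in sequence.

def step(state, funcs, blocks):
    """state: tuple of 0/1; funcs[i](state) gives entity i's next value;
    blocks: ordered partition of the entity indices."""
    state = list(state)
    for block in blocks:
        snapshot = tuple(state)        # parallel inside the block
        for i in block:
            state[i] = funcs[i](snapshot)
    return tuple(state)

# Toy 2-entity network: x0' = x1, x1' = x0 (made-up example).
funcs = [lambda s: s[1], lambda s: s[0]]
print(step((1, 0), funcs, [[0, 1]]))    # fully parallel: (0, 1)
print(step((1, 0), funcs, [[0], [1]]))  # sequential, 0 then 1: (0, 0)
```

The same two functions give different trajectories under the two modes, which is precisely the sensitivity to updating modes the project studies.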
Interdisciplinarity: Whilst essentially anchored in mathematics and computer science, this project is driven by strong biological motivations. First, studying the impact of updating modes is a way to reach a better understanding of the chromatin dynamics that gives rise to the genetic expression clock. Second, previous studies showed that classical updating modes are not powerful enough to model biological systems exhibiting certain kinds of synchronising behaviours, like Zeitgebers (i.e., timers), which has been shown to be possible with the update modes mentioned above (and notably highlighted in plant growth and physiology). Moreover, analysing the robustness/sensitivity of the asymptotic dynamics (creation/removal of fixed points, for instance) of biological networks can lead to an adaptable and plastic framework for cell reprogramming and for modelling the emergence of new (even potentially pathological) phenotypes. URL sujet détaillé : https://pageperso.lislab.fr/sylvain.sene/rech/M2_updates.pdf
Remarques :
Internship duration: From 5 to 6 months.
Internship location: LIS, Luminy, Marseille, France.
Compensation: Legal internship compensation implemented at AixMarseille University.





SM207130 Scaling up the method of


Description


The methodological and software components of high-performance data analytics (HPDA) serve as building blocks for algorithmic developments towards simulation applications or the analysis of massive data. A crucial activity on the path to convergence between high-performance computing (HPC) and HPDA is thus the extension of high-performance computing methods to the processing of data coming from biology.
The objective of the internship is to explore the opportunities offered by emerging "data sparse" numerical techniques in linear and multilinear algebra, and their potential for efficient implementation as task-based algorithms (using the StarPU/NewMadeleine runtime systems developed at Inria Bordeaux), in order to parallelise the MDS algorithm. URL sujet détaillé : https://team.inria.fr/hiepacs/files/2018/11/stage2019hpda.pdf
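For reference, the dense classical MDS baseline that the data-sparse, task-based approach aims to scale can be sketched via the standard double-centering formulation (illustrative only, not the internship's target implementation):

```python
import numpy as np

# Classical MDS via double centering: the dense baseline that
# data-sparse, task-based methods aim to scale up.

def classical_mds(D, dim=2):
    """D: matrix of pairwise distances; returns dim-dimensional coordinates."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J          # Gram matrix recovered from distances
    w, V = np.linalg.eigh(B)             # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]      # keep the largest ones
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Three points on a line: recovered coordinates preserve the distances.
D = np.array([[0., 1., 2.],
              [1., 0., 1.],
              [2., 1., 0.]])
X = classical_mds(D, dim=1)
print(np.abs(X[0] - X[2]))  # close to 2.0
```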
Remarques : Supervisors: Emmanuel Agullo (Inria HiePACS), Olivier Coulaud (Inria HiePACS), Alain Franc (INRA / Inria Pleiade). Phone: EA 05 24 57 41 50; OC 05 24 57 40 80; AF 05 35 38 53 53
Emails: emmanuel.agullo.fr; olivier.coulaud.fr; alain.franc.fr
Continuation as a PhD: funding secured





SM207131 Deploying AI Go bot on Mobile Phone


Description


The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence, including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics, while paving the way for these innovations into NAVER products and services.
The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting user privacy. Devices such as mobile phones or SoC platforms are more and more powerful, which allows processing data close to where it is generated.
The goal of this internship is to port an AI bot that plays Go (like ELF OpenGo) to a mobile phone.
In this internship, you will focus on some of the following tasks in collaboration with other team members:
- Select and port the relevant AI algorithm to a mobile device
- Port it and evaluate performance from high-end to low-end devices
- Optimise it using GPU and/or NPU code
Requirements and competencies expected:
- Master's student in Computer Science.
- Development skills: Python, with some experience of mobile platforms (iOS or Android).
- Knowledge of computer vision and machine learning techniques is a plus.
- Knowledge of the game of Go is a plus.
- Highly motivated.
During the internship you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable about Agile development methodologies (e.g., SCRUM, Continuous Integration). URL sujet détaillé :
Remarques :





SM207132 Deploying Computer Vision Algorithms on Embedded Devices


Description


The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence, including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics, while paving the way for these innovations into NAVER products and services.
The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting user privacy. Devices such as mobile phones or SoC platforms are more and more powerful, which allows processing data close to where it is generated.
The goal of this internship is to enable new types of applications by bringing advanced computer vision algorithms to devices with CPU, memory or bandwidth constraints.
In this internship, you will focus on some of the following tasks in collaboration with the other team members:
- Select relevant computer vision algorithms, such as object detection and classification
- Target embedded devices based on their power consumption, from 10-20 W (Nvidia Tegra TX2, Drive PX, AGX Xavier) down to a few watts (Raspberry Pi) or even less (GAP8)
- Port the algorithms and evaluate their performance
- Develop applications demonstrating the performance of some of the embedded platforms
Requirements and competencies expected:
- Master's student in Computer Science.
- Development skills: C/C++/C#, Java, Python, with some experience of mobile platforms.
- Knowledge of computer vision and machine learning techniques is a plus.
- Highly motivated.
During the internship, you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable about Agile development methodologies (e.g., SCRUM, Continuous Integration). URL sujet détaillé :
Remarques :





SM207133 Indoor 3D reconstruction with a HoloLens headset


Description


The mission of NAVER LABS Europe is to advance the state-of-the-art in Ambient Intelligence, including autonomous vehicles, artificial intelligence, 3D mapping, augmented reality and robotics, while paving the way for these innovations into NAVER products and services.
The edge computing group focuses on bringing ambient intelligence to the edge to improve the user experience while respecting user privacy. Devices such as mobile phones or SoC platforms are more and more powerful, which allows processing data close to where it is generated.
The goal of this internship is to design and develop an augmented-reality application running on an AR headset (e.g., HoloLens) to perform real-time 3D reconstruction of the environment with active feedback.
The internship will be divided into several phases in collaboration with the other team members:
- Evaluate the state of the art in augmented reality and 3D reconstruction algorithms
- Design a prototype, design experiments and collect data
- Train models and evaluate their performance
- Optimise to achieve a real-time, pleasant user experience
Requirements and competencies expected:
- Master's student in Computer Science.
- Development skills: C/C++/C#, Java, Python, with some experience of mobile platforms.
- Knowledge of computer vision and machine learning techniques is a plus.
- Highly motivated.
During the internship you will acquire significant knowledge of advanced computer vision and machine learning techniques while working closely with researchers and engineers. Additionally, you will become knowledgeable about Agile development methodologies (e.g., SCRUM, Continuous Integration). URL sujet détaillé :
Remarques :





SM207134 Resilience of numerical linear algebra algorithms for large-scale simulation


Description


To contain the energy budget of "extreme scale" parallel machines, certain technological choices are unavoidable, such as dramatically increasing the number of compute cores and reducing their operating voltage. A direct consequence of these choices is that the fault rate will grow, including hardware faults leading to the loss of cores or nodes (hard faults). Checkpoint/restart approaches will be too costly to scale, and alternative solutions must be considered. These require rethinking the numerical algorithms and their parallel implementation.
In the context of numerical methods in linear algebra, the objective of this internship is to revisit asynchronous fixed-point methods, which allow the use of loosely synchronised techniques when recovering from a fault. URL sujet détaillé : https://team.inria.fr/hiepacs/files/2018/11/2019_resilienceAsynchrone.pdf
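To fix ideas, the sketch below runs a Jacobi fixed-point iteration in which each component may read a randomly stale iterate, emulating asynchrony; for a diagonally dominant system the iteration still converges under bounded staleness, which is what makes such methods attractive for recovery after a fault (the setup is invented for illustration):

```python
import numpy as np

# Jacobi fixed-point iteration x <- D^{-1}(b - (A - D) x), the kind of
# asynchronous-friendly method the internship revisits. Asynchrony is
# emulated by letting each component read a randomly stale iterate
# (illustrative sketch only).

rng = np.random.default_rng(0)
A = np.array([[4., 1., 0.],
              [1., 4., 1.],
              [0., 1., 4.]])   # diagonally dominant, so the iteration contracts
b = np.array([1., 2., 3.])
D = np.diag(A)

history = [np.zeros(3)]
for _ in range(200):
    x_new = history[-1].copy()
    for i in range(3):
        # each component reads an iterate up to 2 steps stale
        stale = history[-1 - rng.integers(0, min(3, len(history)))]
        x_new[i] = (b[i] - A[i] @ stale + D[i] * stale[i]) / D[i]
    history.append(x_new)

print(np.allclose(A @ history[-1], b, atol=1e-6))  # converges despite staleness
```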
Remarques : Supervisors: Emmanuel Agullo, Luc Giraud. Phone: E. Agullo 05 24 57 41 50; L. Giraud 05 61 19 30 25
Emails: emmanuel.agullo.fr; luc.giraud.fr





SM207135 Use of Machine Learning Techniques to Help Find Proofs in the Coq Proof Assistant


Description


In this internship, we propose to build, within the framework of the Coq system, a proof-assistance method which, given a goal, could recommend a tactic to apply. The idea is to extract this recommendation by applying machine-learning algorithms to a large corpus of proofs (typically the standard library of Coq, for example). To do so, we could build on recent work done in the framework of Isabelle/HOL, or on first experiments that were carried out within the framework of Coq. URL sujet détaillé : http://www.lirmm.fr/~delahaye/docs/mlcoq.pdf
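A deliberately naive baseline for such a recommender uses bag-of-symbols goal features and a majority vote over the most similar corpus entries; the corpus, features and function names below are invented for illustration:

```python
# Toy tactic recommender: given (goal features, tactic) pairs mined from
# a proof corpus, suggest the most frequent tactic among the most similar
# goals (a bag-of-symbols nearest-neighbour baseline; purely illustrative).

from collections import Counter

corpus = [({"forall", "=", "nat"}, "induction"),
          ({"=", "nat"}, "reflexivity"),
          ({"forall", "->"}, "intros"),
          ({"forall", "=", "list"}, "induction")]

def recommend(goal_symbols, k=3):
    # rank corpus entries by symbol overlap with the goal
    scored = sorted(corpus, key=lambda e: -len(goal_symbols & e[0]))
    # majority vote among the k nearest entries
    return Counter(t for _, t in scored[:k]).most_common(1)[0][0]

print(recommend({"forall", "=", "nat"}))  # 'induction'
```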
Remarques : The internship will take place in the MaREL team at LIRMM. The internship is paid. It will be supervised by David Delahaye (Université de Montpellier, LIRMM, David.Delahaye.fr).





SM207136 Deductive Approaches for Process Verification


Description


Process algebras, such as Milner's CCS, can be used to model and verify communicating parallel systems. Here we consider algebras such as CCS or LOTOS, and exclude mobile algebras like the π-calculus. Milner defined a set of equivalence relations (strong equivalence ~, observational equivalence ≈, observational congruence =). These relations are defined on the operational semantics of the given processes, on labeled transition systems (LTS). In practice, these relations are also verified on LTS. The objective of this internship is to build an automated deduction tool to verify these relations without translating processes into transition systems. URL sujet détaillé : http://www.lirmm.fr/~delahaye/docs/proccoq.pdf
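To make the target relation concrete, here is a purely illustrative sketch (plain Python, hand-built finite LTS) of strong bisimilarity computed by naive partition refinement. The internship's deductive approach would precisely avoid building the LTS; this only shows the semantics being verified.

```python
# Naive strong-bisimilarity check on a finite labelled transition system
# by iterated partition refinement until a fixed point is reached.
def bisimilar(states, trans, p, q):
    # trans: dict mapping each state to a set of (label, successor) pairs
    order = sorted(states)
    block = {s: 0 for s in states}          # start with one big class
    while True:
        ids, new = {}, {}
        for s in order:
            # a state's signature: its class plus the classes it reaches, per label
            key = (block[s], frozenset((a, block[t]) for (a, t) in trans[s]))
            ids.setdefault(key, len(ids))   # canonical numbering by first occurrence
            new[s] = ids[key]
        if new == block:                    # partition stable: done
            return block[p] == block[q]
        block = new

# a.0 + a.0 versus a.0: strongly bisimilar
states = {"p", "p1", "q", "q1", "q2"}
trans = {"p": {("a", "p1")}, "p1": set(),
         "q": {("a", "q1"), ("a", "q2")}, "q1": set(), "q2": set()}
print(bisimilar(states, trans, "p", "q"))  # True
```

The fixpoint characterization above is exactly what an automated deduction tool would establish symbolically, without enumerating states.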
Remarques : The internship will take place in the MaREL team at LIRMM (in Montpellier), in collaboration with the LGI2P at IMT Mines Alès. It will be supervised by David Delahaye (Université de Montpellier, LIRMM, David.Delahaye.fr) and Thomas Lambolais (IMT Mines Alès, LGI2P, Thomas.Lambolaisales.fr).





SM207137 Familial monads and structural operational semantics


Description


Structural operational semantics is a method for specifying the dynamics of programming languages by induction on their syntax. A crucial issue is to find sufficient conditions for the produced language to behave well. E.g., is program equivalence a congruence? Such questions have led to quite a few 'formats', whose diversity has been mostly unified by 'functorial operational semantics', an abstract, categorical framework. However, functorial operational semantics does not scale well to languages with variable binding.
In recent work (to appear at POPL '19), we define a new categorical framework based on 'familial monads', which covers languages with variable binding, and in which we smoothly prove two important, standard properties of the obtained semantics: 'congruence of bisimilarity' and soundness of 'bisimulation up to context'. The goal of the internship/thesis is to contribute to the development of this framework, either by looking into formats, or by extending it beyond congruence of bisimilarity and soundness of bisimulation up to context to related techniques such as Howe's method, weak bisimulation, environmental bisimulation, or finding solutions to process equations. In the longer run, we could investigate new formats based on familial monad theory. URL sujet détaillé : https://www.lama.univsavoie.fr/pagesmembres/hirschowitz/stages/familialsosen.pdf
Remarques : Compensation possible.





SM207138 Honest Product Reviews


Description


An increasing amount of valuable information about products and places is stored in collections of user reviews. This information is messy, often contradictory or redundant, but taken together it hides troves of insights that users without the time to read it all are interested in.
In this internship you will explore recent summarization techniques and apply them to the problem of creating a unified review which reflects this wisdom of the crowd.
At Naver Labs Europe you will interact with researchers working on the latest developments in machine learning and natural language processing, and apply them to our data using our GPU cluster. We regularly publish in top conferences, release datasets and otherwise interact with our academic collaborators.
Requirements ● PhD or Master (research-oriented) level ● Knowledge of deep learning as applied to NLP ● Good coding skills, including at least one of the major deep learning toolkits (preferably PyTorch)
Duration: 5-6 months. Application instructions: to apply, please send a mail with CV and cover letter to matthias.galle.com
URL sujet détaillé : http://www.europe.naverlabs.com/NAVERLABSEurope/Internships/HonestProductReviews
Remarques :





SM207139 Scheduling for sell-order management in a blockchain technology context.


Description


Study the dynamics of compute tasks in a distributed computing infrastructure in order to produce an algorithm governing the issuance of sell orders by the scheduler.
Blockchain is an innovative technology that promises to revolutionize many domains such as finance, logistics, the management of connected objects, insurance, video games, etc. Beyond cryptocurrencies such as Bitcoin, the development of smart contracts, introduced by Ethereum, makes it possible to imagine blockchain applications in many public or private contexts. The distributed, Byzantine-fault-tolerant nature of such a structure makes it a perfect tool for organizing decentralized infrastructures. iExec uses this technology to organize a decentralized cloud where anyone can make computing resources available and be paid for doing so. At the heart of this infrastructure, workers are organized into worker pools managed by ad hoc entities called schedulers.
Beyond task-scheduling issues proper, a scheduler also has the mission of aggregating worker resources in order to publish them on the marketplace. This mechanism requires an evaluation of the computing capacity of distributed, unreliable workers, as well as an understanding of the replication dynamics put in place by iExec to guarantee the reliability of results.
The objective of this internship is to study the dynamics of compute tasks in a distributed computing infrastructure in order to produce an algorithm governing the issuance of sell orders by the scheduler. This algorithm will have to take into account the unreliability of the workers as well as all the certification mechanisms that would penalize the scheduler if it failed to produce a consensus.
iExec is a Lyon-based company that develops distributed cloud solutions built on blockchain. This cloud targets both traditional users of cloud resources and smart contracts requiring computing power that the blockchain itself does not have.
Avalon (http://avalon.enslyon.fr) is one of the Inria teams hosted at LIP (ENS Lyon). Avalon's long-term objective is to contribute to the design of programming models supporting a large number of architectures, to implement them while mastering the underlying algorithmic concepts, and to study their impact on application-level algorithms. To reach these objectives, the team contributes at several levels, such as distributed algorithms, programming models, service discovery and deployment, service organization and composition, management of large data volumes, cloud computing, etc. Our theoretical results are validated on software prototypes targeting various scientific fields such as bioinformatics, physics, astrophysics, etc. The start-up iExec was born within the Avalon team.
Following this internship, it will be possible to continue this work as a PhD. URL sujet détaillé :
:
Remarques : This internship is co-supervised by iExec and E. Caron of ENS de Lyon.





SM207140 Scheduling and mapping strategies for reducing energy consumption of edge applications and services


Description


Deploying edge processing capabilities close to users is an original approach to reducing the latency of digital services. This internship will address the scenario of applications deployed across cloud, edge and fog infrastructures. Such services and applications can generate a large environmental footprint, with enormous electrical consumption and non-optimized usage of resources. Several models, heuristics and approaches will be explored during this internship in order to reduce the electrical footprint of such large applications. The impact and flexibility offered by new-generation networks (SDN, NFV) will be included in this exploration.
This internship will take place in the Avalon team of the LIP laboratory (ENS Lyon), where several members work on Green IT and on improving the energy efficiency of large digital systems (datacenters, clouds, networks). Interested students should contact Laurent Lefevre (laurent.lefevrelyon.fr) for more details on the subject. URL sujet détaillé :
:
Remarques : This internship will be co-advised by Laurent Lefevre (laurent.lefevrelyon.fr) and Olivier Gluck (Olivier.Glucklyon1.fr). This internship can lead to a PhD continuation.





SM207141 Training data for a certifiable Artificial Neural Network based algorithm


Description


As part of the certification process, AIRBUS has to demonstrate that a safety-critical software product is compliant with its intended use in its operational context and that any residual problem does not jeopardize the safety requirements. Machine learning, and specifically Artificial Neural Network (ANN) technology, will be widely used in software design in the near future. The main objective of this internship will be to investigate the properties of the training data set, of the ANN-based algorithm and of the related training process that could lead to a measurable and acceptable result. URL sujet détaillé : https://company.airbus.com/careers/jobsandapplications/searchforvacancies~jobid=001A4B0A914A1EE8B79F42E431B2C500~.html
Remarques : Dates: first half of 2019. Paid internship. Possible continuation as a PhD.





SM207142 Stable sets and coloring via convex relaxations


Description


This internship will focus on algorithms for finding stable sets and coloring graphs with few colors using approaches from convex programming. In general, these problems are hard to approximate to within any nontrivial factor, so these approaches cannot give good guarantees in general. However, in many instances it has been demonstrated that assuming some structure on a graph leads to algorithms with guarantees that are out of reach for general graphs. For example, it has been conjectured that a graph without a fixed forbidden induced subgraph and with no large clique has a large stable set. Another example is the problem of coloring 3-colorable graphs. While the best known algorithms yield proper colorings with n^{epsilon} colors for some fixed constant epsilon < 1, forbidding a P_k for k at most 7 allows this problem to be solved in polynomial time. For the class of perfect graphs, one can use convex relaxations to compute the maximum stable set and the chromatic number in polynomial time. The goal of this project is to study how to apply tools from convex optimization to these problems. For example, a natural approach to finding a stable set in a graph is through a semidefinite relaxation (i.e., a vector program). The program can be strengthened by adding constraints; for example, constraints can be derived using hierarchies. While, in general, the bounds obtained are unlikely to be very informative, we will focus on graph classes whose structure can be used to obtain better bounds and simpler algorithms. URL sujet détaillé :
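For intuition, the quantity that convex relaxations such as the Lovász theta function upper-bound can be computed exactly on tiny instances; here is a brute-force sketch (illustrative only, no SDP solver involved):

```python
from itertools import combinations

# Brute-force maximum stable set: feasible only for tiny graphs, but it
# gives the exact value that convex relaxations (e.g. the Lovász theta
# function) upper-bound. The 5-cycle has stability number 2, while its
# theta value is sqrt(5) ~ 2.236, illustrating the gap.
def max_stable_set(n, edges):
    for size in range(n, 0, -1):              # try large subsets first
        for subset in combinations(range(n), size):
            s = set(subset)
            if all(not (u in s and v in s) for u, v in edges):
                return s                      # no edge inside: stable
    return set()

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(len(max_stable_set(5, c5)))  # 2
```

The internship's question is when graph structure lets relaxations of this kind become tight enough to be algorithmically useful.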
:
Remarques : Compensation possible.





SM207143 GPU strategies for clustering large graphs


Description


The goal of this project is to explore the use of Graphics Processing Units (GPUs) for the clustering of large graphs (e.g., the road network of very large urban areas or small countries).
GPU computing is indeed strongly related to the success of machine learning methods such as deep learning or XGBoost. There is now growing interest in using it for other machine learning strategies (see e.g. NVIDIA RAPIDS). Here, it will be applied to graph clustering, where the information is not regularly organized in memory.
The trainee will first have to understand in depth how GPU computing works, as well as a classic graph clustering algorithm. The heart of his or her work will then be to explore different ways of modelling the graph clustering problem so that it can be solved efficiently on GPUs. URL sujet détaillé : http://laurent.risser.free.fr/TMP_SHARE/StageGraphsAnalysis.pdf
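One candidate starting point, sketched here sequentially and in no way a GPU implementation: label propagation, whose per-node update reads only that node's neighbourhood and therefore maps naturally onto one GPU thread per node.

```python
from collections import Counter

# Synchronous label propagation: every node adopts the most frequent label
# among its neighbours. Each update is independent of the others within an
# iteration, which is why the scheme parallelizes well on GPUs.
def label_propagation(adj, iters=10):
    labels = {v: v for v in adj}
    for _ in range(iters):
        new = {}
        for v, neigh in adj.items():
            counts = Counter(labels[u] for u in neigh)
            top = max(counts.values())
            # deterministic tie-break: smallest most-frequent label
            new[v] = min(l for l in counts if counts[l] == top)
        labels = new
    return labels

# two triangles joined by a single bridge edge (2-3)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [4, 5, 2], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj)
print(labels[0] == labels[1] == labels[2], labels[3] == labels[4] == labels[5])
```

On a GPU the `labels` dictionary would become a flat array indexed by node, and the inner loop a kernel over nodes; the irregular memory accesses into the adjacency structure are exactly the difficulty the project mentions.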
Remarques :





SM207144 Research internship in machine learning theory


Description


I have one Master 2 research internship opportunity in machine learning theory, which can be about one of the following two problems. The first one is about approximation theory for deep feedforward neural networks; the goal is to read three very recent papers and address related open problems. The second possible topic is about an open problem in sequential learning theory; namely about optimal bounds for online linear regression with adversarial data. For more details, please see the attached pdf document. URL sujet détaillé : https://www.math.univtoulouse.fr/~sgerchin/docs/internshiplearningGerchinovitz.pdf
Remarques : If the internship is successful, we will apply for a PhD grant on the same research topic.





SM207145 On the success of ExpectationMaximization


Description


Consider a set of points X in R^d generated by a mixture of k Gaussians. How can we recover the parameters of the distribution (the means and the variances)? This is a fundamental problem that arises in many applications stemming from machine learning and data analysis. A very popular approach for solving it is the expectation-maximization (EM) heuristic (https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm). This heuristic is very good in practice, both in the quality of the solutions it outputs and in its running time. However, from a theoretical standpoint, very little is known about it. How fast is the convergence? How good is the solution? Even for simple cases such as k=3 or d=2, not much is known.
The goal of the internship is to make a step toward understanding the practical success of the EM heuristic by proving that the algorithm outputs the "correct" solution in "reasonable" time, starting with simple scenarios.
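A minimal sketch of the heuristic in one such simple scenario: a two-component, one-dimensional mixture with known unit variances and equal weights, where EM only estimates the two means (illustrative code, not a research artifact).

```python
import math, random

# EM for the mixture 0.5*N(mu0, 1) + 0.5*N(mu1, 1) in one dimension.
# E-step: posterior responsibilities; M-step: responsibility-weighted means.
def em_two_means(xs, mu0, mu1, iters=100):
    for _ in range(iters):
        # E-step: probability that each point came from component 1
        r1 = [1.0 / (1.0 + math.exp(((x - mu1) ** 2 - (x - mu0) ** 2) / 2.0))
              for x in xs]
        # M-step: update each mean as a responsibility-weighted average
        mu1 = sum(r * x for r, x in zip(r1, xs)) / sum(r1)
        mu0 = sum((1 - r) * x for r, x in zip(r1, xs)) / sum(1 - r for r in r1)
    return mu0, mu1

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(300)] + \
     [random.gauss(5.0, 1.0) for _ in range(300)]
mu0, mu1 = em_two_means(xs, -1.0, 1.0)
print(mu0, mu1)  # close to the true means 0 and 5
```

The theoretical questions above are already nontrivial for this setting: how the convergence speed and the quality of the fixed point depend on the separation of the means and on the initialization.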
The internship will be in Sorbonne Université, in Jussieu, Paris, advised by Vincent CohenAddad. Collaborations and research visits to work with Varun Kanade in Oxford and with Aurélien Garivier in ENS Lyon will be possible.
URL sujet détaillé :
:
Remarques :





SM207146 Deep learning for semantic information access


Description


One internship followed by one PhD position at the University of Toulouse, France.
Renault Labs France and the Informatics Research Institute of Toulouse (IRIT) invite applications for a fully funded PhD position on "Deep learning for texts and knowledge bases access". The PhD will be co-supervised by François-Paul Servant (Renault), Prof. Lynda Tamine and Dr. Jose Moreno.
Thesis description: The thesis targets two main objectives: 1) the semantic representation of documents that mention entities from different external resources; 2) the categorization of documents by family of mentioned entities, and the retrieval of documents meeting entity-oriented needs. To achieve these objectives, we plan to move towards an approach based on deep learning to solve both the representation and the access problems (categorization, information retrieval), constrained by the content and structure of multiple external resources (terminologies, thesauri, knowledge graphs, etc.).
From the point of view of representation, we are in line with recent works based on the joint regularization of neural embeddings augmented by resources [Faruqui2014; Yu2014; Wang2014; Yamada2016]. This work is based on the hypothesis that learned representations are interpretable if they are aligned with entities derived from resources, so that representations of entities obtained in the latent space are all the closer as they are associated with semantically related entities in the external resources. These representations, extended to sentences and texts, can be exploited in an information retrieval task [Nguyen2018], in the identification of entity mentions [Moreno2017], or in the categorization of short textual documents [Kim2014]. Although distributional representations exist for words/texts, structured resources and their combinations, no existing work considers the constrained regularization of multiple resources, nor the multi-level structuring of entities in these resources. One of the first works in this direction uses Poincaré geometry to represent the hierarchies in resources [Nickel2017], but completely ignores the representation of relationships between entities. Yet relationships are omnipresent in today's widely used knowledge bases, including those considered at Renault.
The thesis project faces new scientific challenges related to the definition of adequate neural architectures and associated cost functions, capable of learning compositionality (semantic compositionality) both in the local context (text) and global contexts (resources).
The envisioned starting date is September 2018 (starting in early 2019 is also possible).
Requirements: We are looking for one candidate with a strong focus on information retrieval/NLP and machine learning, with the following profile:
+ Good Master's degree in Computer Science, Statistics, Mathematics or related disciplines (essential) + Good programming skills in Python with Keras, TensorFlow or Torch (essential) + Advanced knowledge in algorithms and data structures (optional) + Ability to work independently and be self-motivated (essential) + Excellent communication skills in English (essential; minimum score of 750 in TOEIC)
Applying: To apply, please email your application to: françoispaul.servant.com, lynda.lechani.fr, and jose.moreno.fr.
The application should consist of the following: + a curriculum vitae + transcript of marks for the M1/M2 profile or the last 3 years of engineering school (with an indication of ranking if possible) + a covering letter + letter(s) of recommendation, including at least one letter drawn up by a university referent
Potential candidates will be invited for an interview with the supervisors. The application file should be sent
Conditions of employment: Internship: gratification of 540 euros. Thesis: you will be hired on a fixed-term contract (3-year CIFRE contract) at Renault, a world leader in car manufacturing. Payment according to academic profile.
Working at Toulouse (IRIT/Renault): You will be integrated into two teams with academic and industrial profiles: the IRIS team of IRIT, recognized for its research activities in the field of information retrieval and information synthesis with a focus on the use of deep learning technologies, and the Renault team.
The IRIT lab represents one of the major strengths of French research in computer science, with a workforce of more than 700 members, including 272 researchers and teachers, 204 PhD students, 50 postdocs and researchers under contract, and 32 engineers and administrative employees.
Toulouse is located on the banks of the Garonne River, 150 kilometres from the Mediterranean Sea, 230 km from the Atlantic Ocean and 680 km from Paris. It is the fourthlargest metro area in France, with 1,312,304 inhabitants as of January 2014. Toulouse is the centre of the European aerospace industry, with the headquarters of Airbus, the Galileo positioning system, the SPOT satellite system, ATR and the Aerospace Valley. It also hosts the European headquarters of Intel and CNES's Toulouse Space Centre (CST), the largest space centre in Europe. Thales Alenia Space, and Astrium Satellites also have a significant presence in Toulouse. The University of Toulouse is one of the oldest in Europe (founded in 1229) and, with more than 103,000 students, it is the fourthlargest university campus in France, after the universities of Paris, Lyon and Lille. The city was the capital of the Visigothic Kingdom in the 5th century and the capital of the province of Languedoc in the Late Middle Ages and early modern period, making it the unofficial capital of the cultural region of Occitania (Southern France).
References: [Faruqui2014] Faruqui M., Dodge J., Jauhar S. K., Dyer C., Hovy E., Smith N. A. Retrofitting Word Vectors to Semantic Lexicons, NAACL, 2014. [Moreno2017] Moreno J. G., Besançon R., Beaumont R., D'hondt E., Ligozat A. L., Rosset S., Grau B. Combining word and entity embeddings for entity linking. In Extended Semantic Web Conference (ESWC), pp. 337-352, 2017. [Nickel2017] Nickel M., Kiela D. Poincaré embeddings for learning hierarchical representations. In Advances in Neural Information Processing Systems, pp. 6341-6350, 2017. [Nguyen2018] Nguyen G., Tamine L., Soulier L., Souf N. A Tri-Partite Neural Document Language Model for Semantic Information Retrieval. In Extended Semantic Web Conference (ESWC), 2018. [Yu2014] Yu M., Dredze M. Improving Lexical Embeddings with Semantic Knowledge, ACL, pp. 545-550, 2014. [Wang2014] Wang Z., Zhang J., Feng J., Chen Z. Knowledge Graph and Text Jointly Embedding, EMNLP, pp. 1591-1601, 2014. [Yamada2016] Yamada I., Shindo H., Takeda H., Takefuji Y. Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation, CoNLL, pp. 250-259, 2016. URL sujet détaillé :
:
Remarques :





SM207147 Deep learning methods for sensor-based activity recognition tasks


Description


Evaluation of machine learning methods on a dataset from a smart home (a coworking space), covering activities observed over 20 working days. The dataset is provided by Orange and INRIA (paper "A Dataset of Routine Daily Activities in an Instrumented Home", UCAmI 2017). The idea is to evaluate different deep learning algorithms for different purposes:  Recognition of atomic activities;  Learning scenarios;  Sensor clustering for given activities / scenarios;  Learning similar sequences. The learning process could be complemented with more or less predefined expert knowledge (e.g., labeling, location of sensors).
Depending on the candidate's motivation and results, the work could be followed by, for example, the following tasks:  Identification of abnormal situations;  Prediction of further activities or of future sensor values. URL sujet détaillé :
:
Remarques :





SM207148 Active and semisupervised learning using topological data analysis


Description


With the tremendous growth of data available electronically, the constitution of labeled training sets for learning, which often requires a skilled human agent or a physical experiment, becomes unrealistic. One alternative is to gather a small set of labeled training examples (x_i, y_i)_{1 <= i <= l} ~ D^l and to take advantage of a huge amount of unlabeled observations (x_i)_{l+1 <= i <= l+u} ~ D_X^u, with u >> l, to find a prediction function more efficient than the one found from the labeled training set alone.
However, in some cases the set of labeled training examples is not well constructed, in the sense that it does not reveal the complexity of the data set. This is the case, for example, when the labeled observations were originally drawn at random from a huge set of unlabeled data, without revealing the intrinsic nature of the whole set.
In this project, we propose to use topological data analysis (TDA) to improve the selection of informative labeled examples, in conjunction with semi-supervised algorithms (Amini et al., NIPS 2009). TDA is a growing mathematical field which is starting to be widely applied in many different contexts (see Chazal and Michel, arXiv 2017, for a survey of the subject). It consists in using computable algebraic tools to detect topological features of the data under consideration. TDA has already been used in machine learning algorithms, as introduced in Chazal et al., J. ACM 2013, and Wasserman, Annual Review of Statistics and Its Application 2018, among others.
The aim here is to detect observations that are interesting with respect to the topology of the data, ask for their labels (active learning step), and then use semi-supervised learning algorithms to improve the performance of the prediction function. The resulting algorithm will be tested on real-world applications using data extracted from different information retrieval problems. We have in mind a specific data set coming from a biology application (ongoing collaboration with I2BC). URL sujet détaillé :
:
Remarques : This internship will be co-supervised with Massih-Reza Amini (Laboratoire d'Informatique de Grenoble) and Rémi Molinier (Institut Fourier).





SM207149 Effects in Type Theory


Description


We propose to incorporate effects into the theory of proof assistants. At stake is not only the certification of programs with effects, but also implications in semantics and logic. Type theory is based on the lambda calculus, which is purely functional. Noticing that any realistic program contains effects (state, exceptions, input-output...), many works focus on certified programming with effects: notably Ynot, and more recently F⋆ and Idris, which propose various ways of encapsulating effects and restricting the dependency of types on effectful terms. Effects are either specialised, with monads with Hoare-style pre- and postconditions as found in Ynot or F⋆, or more general, with algebraic effects implemented in Idris. There is, on the other hand, a deficit of logical and semantic investigations. Yet a type theory that integrates effects would have logical, algebraic and computational implications through the Curry-Howard correspondence. The goal is to develop a type theory with effects that accounts both for practical experiments in certified programming and for clues from denotational semantics and logical phenomena, starting from the unifying setting of (linear) call-by-push-value [1, 2]. Several extensions of call-by-push-value with dependent types have been proposed [3, 4]. The difficulty resides in expressing the dependency of types on effectful terms, and severe, artificial restrictions are imposed. The goal is to understand and relax these restrictions through the idea of semantic value uncovered in recent works. Our distinctive approach is to seek a proper extension of type theory that would give foundations to a classical type theory [5], but also give a foundational role to termination-checking in traditional (intuitionistic) proof assistants.
My recent works with Pierre-Marie Pédrot [6, 7] suggest that it is possible to define a monadic translation in type theory by a careful combination of the call-by-name decomposition in call-by-push-value and the use of the Eilenberg-Moore adjunction that decomposes any monad. This can be seen as a monadic translation that works in a dependently typed setting. The only assumption on the monad is that the universe of all types forms an algebra for this monad. The preliminary results are encouraging (a Coq plugin is already available at https://github.com/CoqHott/coqeffects/), although further investigations remain to be done. With this approach, it should be possible to define many effects in CIC, if not all (because the free monad over an algebraic effect is covered by our setting).
Team: The internship will be held in the CoqHoTT ERC project, advised by Nicolas Tabareau, inside the Gallinette Inria team in Nantes.
References: [1] P. B. Levy, Call-By-Push-Value: A Functional/Imperative Synthesis, ser. Semantic Structures in Computation. Springer, 2004, vol. 2. [2] P.-L. Curien, M. Fiore, and G. Munch-Maccagnoni, "A Theory of Effects and Resources: Adjunction Models and Polarised Calculi," in Proc. POPL, 2016. [3] D. Ahman, N. Ghani, and G. D. Plotkin, "Dependent Types and Fibred Computational Effects," in Proc. FoSSaCS, 2015. [4] M. Vákár, "A Framework for Dependent Types and Effects," arXiv preprint arXiv:1512.08009, 2015. [5] H. Herbelin, "A Constructive Proof of Dependent Choice, Compatible with Classical Logic," in LICS 2012, 27th Annual ACM/IEEE Symposium on Logic in Computer Science. Dubrovnik, Croatia: IEEE Computer Society, Jun. 2012, pp. 365-374. [Online]. Available: https://hal.inria.fr/hal00697240 [6] P.-M. Pédrot and N. Tabareau, "An Effectful Way to Eliminate Addiction to Dependence," in 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pp. 1-12. [7] P.-M. Pédrot and N. Tabareau, "Failure is Not an Option: An Exceptional Type Theory," in ESOP 2018, 27th European Symposium on Programming, ser. LNCS, vol. 10801. Thessaloniki, Greece: Springer, Apr. 2018, pp. 245-271. [Online]. Available: https://hal.inria.fr/hal01840643 URL sujet détaillé : http://tabareau.fr/sujet_stage_meven_bertrand.pdf
Remarques :





SM207150 Rényi entropy and concentration of measure


Description


Information theory studies the fundamental limits of data-processing tasks such as data compression and transmission over a noisy channel. The theory rests on the notion of the Shannon entropy of a random variable, which quantifies its degree of randomness and characterizes these tasks: for example, n copies of a random variable X can be compressed into about nH(X) bits, where H(X) denotes the entropy of X.
However, when one looks at the concentration-of-measure questions related to these tasks (for example: if one tries to compress n copies of X into fewer than nH(X) bits, does the error probability tend to 1 exponentially fast?), another notion of entropy comes into play: the Rényi entropy. It indeed makes it possible to prove an "information-theoretic" version of the Chernoff bound that suffices to handle the simplest cases, and some more involved results can also be understood from this angle.
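To make the two entropies concrete, here is a small numerical illustration of the standard definitions: H_α(X) = (1/(1-α)) log2 Σ_i p_i^α, which recovers the Shannon entropy as α → 1 and is non-increasing in α.

```python
import math

# Shannon entropy in bits
def shannon(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Rényi entropy of order alpha (alpha != 1), in bits
def renyi(p, alpha):
    return math.log2(sum(pi ** alpha for pi in p)) / (1.0 - alpha)

p = [0.5, 0.25, 0.125, 0.125]
print(shannon(p))       # 1.75 bits
print(renyi(p, 0.999))  # close to 1.75: H_alpha -> H as alpha -> 1
print(renyi(p, 2.0))    # collision entropy, at most the Shannon entropy
```

In Chernoff-type arguments it is exactly this family, evaluated at α ≠ 1, that controls the exponential decay rates the project is concerned with.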
This approach is nevertheless surprisingly underdeveloped, and some more complex cases, for example those involving dependent random variables, still cannot be treated with this method. The goal of this project is therefore to deepen this line of research; several more specific questions can be addressed according to the intern's interests. URL sujet détaillé : http://members.loria.fr/FDupuis/stagerenyiconcentration.pdf
Remarques :





SM207151 High Performance Deep Reinforcement Learning


Description


Deep learning algorithms need high computational power to deal with increasingly large datasets. The goal of reinforcement learning is to self-learn a task by trying to maximize a reward (winning a game, for instance) while interacting with simulations. Recently, researchers have successfully introduced deep neural networks, enabling more complex problems to be addressed. This is often referred to as Deep Reinforcement Learning (DRL). DRL managed, for instance, to play many ATARI games. Very recently, AlphaGo Zero was a leap forward in AI, as it outperformed the best human players (and itself) after being trained without any data from human games, solely through reinforcement learning. The process requires an advanced infrastructure for the training phase. To speed up the learning process and enable a wide exploration of the parameter space, the neural network interacts in parallel with several instances of the simulation. For instance, AlphaGo Zero trained for more than 70 hours using 64 GPU workers and 19 CPU parameter servers, playing 4.9 million games of generated self-play with 1,600 Monte Carlo Tree Search simulations per move.
The goal of this internship is to work on developing a high-performance infrastructure for deep reinforcement learning. We will first target the game of Go, following the architecture proposed for AlphaGo Zero. URL sujet détaillé : https://team.inria.fr/datamove/joboffers/#DRL
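At a scale far below the systems discussed above, the reward-maximizing loop that such an infrastructure parallelizes can be illustrated by tabular Q-learning on a toy chain environment (purely illustrative, unrelated to the Go architecture targeted by the internship):

```python
import random

# Tabular Q-learning on a 5-state chain: the agent starts in state 0,
# action 1 moves right, action 0 moves left (bounded at 0), and reaching
# state 4 ends the episode with reward 1. The agent learns from simulated
# episodes only, as in self-play reinforcement learning.
random.seed(1)
n, alpha, gamma, eps = 5, 0.5, 0.9, 0.2
Q = [[0.0, 0.0] for _ in range(n)]
for _ in range(2000):
    s = 0
    while s != n - 1:
        if random.random() < eps:                 # epsilon-greedy exploration
            a = random.randint(0, 1)
        else:
            a = 0 if Q[s][0] >= Q[s][1] else 1
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == n - 1 else 0.0
        # temporal-difference update toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
greedy = [0 if Q[s][0] >= Q[s][1] else 1 for s in range(n - 1)]
print(greedy)  # the learned policy moves right, toward the reward
```

The infrastructure problem is then to run many such simulation loops in parallel, feeding their experience to (deep) network updates instead of a table.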
Remarques : Coadvised with Philippe Preux (Sequel team at Lille  https://team.inria.fr/sequel/)





SM207152 Static placement for task scheduling on heterogeneous high-performance platforms


Description


Task scheduling and resource allocation are increasingly essential operations in largescale systems, as the difference in cost (in time, energy) between calculation and communication costs increases with new generations of HPC machines, and that the hierarchy of data access is becoming more complex.
Traditionally, large-scale systems have been programmed using the standard MPI programming interface. This is, however, fairly low-level, and such programming requires a manual description of the communications. Optimizing them, to go beyond the BSP (Bulk-Synchronous Parallelism) paradigm and its serious load-balancing problems, thus carries a significant engineering cost. Various new paradigms have been proposed to abstract programming at a higher level. Some rely on a master-slave paradigm, which limits their scalability; others require rewriting the code in an object paradigm or describing the whole application computation algebraically. The approach proposed here is based on StarPU [1] and consists of preserving the existing "sequential code" aspect of the applications while targeting large-scale systems, by automatically generating a suitable task graph. This approach has already been applied successfully in the case of a single computing node (with multicores and GPUs) in [2,3,4,5], where the relative advantages of static scheduling strategies (computed before launch) and dynamic ones (computed at runtime) were discussed.
The objective of the internship is to study the following problem: how to place the data of an application described as a task graph, in order to minimize communications while guaranteeing good load balancing? The internship will begin with a study of the scientific literature on the subject, then focus on the design and analysis of algorithms for this problem. Extensions can be considered in a second step: the possibility of replicating computations, taking the heterogeneity of the platform into account (presence of GPUs), ...
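As a first intuition for the problem, a naive greedy heuristic can be sketched as follows. This is only an illustrative baseline, not a method from the references; the scoring rule and the `alpha` trade-off parameter are assumptions. Each task, taken in topological order, goes to the processor that maximizes the communication volume saved (predecessors already placed there) minus a penalty on that processor's current load:

```python
def greedy_placement(tasks, edges, costs, n_procs, alpha=1.0):
    """Toy greedy task/data placement on a task graph.
    tasks: task ids in topological order.
    edges: (u, v, weight) dependencies, weight = data volume transferred.
    costs: dict task -> compute cost.
    alpha trades data locality against load balance."""
    load = [0.0] * n_procs
    place = {}
    for t in tasks:
        saved = [0.0] * n_procs  # communication avoided per processor
        for u, v, w in edges:
            if v == t and u in place:
                saved[place[u]] += w
        best = max(range(n_procs), key=lambda p: saved[p] - alpha * load[p])
        place[t] = best
        load[best] += costs[t]
    return place

def cut_cost(place, edges):
    """Total volume communicated between processors for a placement."""
    return sum(w for u, v, w in edges if place[u] != place[v])
```

On a small diamond graph a -> {b, c} -> d with heavy edges out of a, the heuristic keeps a, b and c together and only migrates d once the load penalty outweighs the light incoming edges, which is exactly the communication/balance tension the internship studies.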
Keywords: resource allocation, static scheduling, dynamic scheduling, StarPU, heterogeneity, distributed algorithms
Comments: Thesis funding is acquired for a subject in continuation of this topic of internship for a start of the thesis in September 2019
References :
[1] http://starpu.gforge.inria.fr
[2] Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar. Fast Approximation Algorithms for Task-Based Runtime Systems. Concurrency and Computation: Practice and Experience, Wiley, 2018, 30 (17).
[3] Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar. Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs. IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2017, Orlando, United States.
[4] Olivier Beaumont, Terry Cojean, Lionel Eyraud-Dubois, Abdou Guermouche, Suraj Kumar. Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources. International Conference on High Performance Computing, Data, and Analytics (HiPC 2016), Dec 2016, Hyderabad, India. IEEE, 2016.
[5] Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar. Are Static Schedules so Bad? A Case Study on Cholesky Factorization. IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), May 2016, Chicago, IL, United States. IEEE, 2016.
URL sujet détaillé :
Remarques : Subject co-supervised with Lionel Eyraud-Dubois





SM207153 Characterization of navigation traces


Description


This internship aims at investigating the characteristics of users' navigation traces on an "infotainment" website (www.melty.fr).
In previous work, we observed that some types of sessions are associated with low diversity in the content consumed. Our assumption is that the structure of the site, as well as the nature of the recommendation links, tends to isolate the user within a specific topic.
The main goal of the internship is to test this assumption by exploring navigation traces corresponding to different recommendation techniques. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2018/11/sujet_algodiv.pdf
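To make "content diversity" concrete, one possible session-level metric (an assumption for illustration, not necessarily the measure used in the previous work) is the Shannon entropy of the topics visited within a session:

```python
import math
from collections import Counter

def topic_entropy(session):
    """Shannon entropy (in bits) of the topic distribution of one
    navigation session, given as the list of topics of visited pages.
    0 means the user stayed on a single topic; higher values mean
    more diverse content consumption."""
    counts = Counter(session)
    n = len(session)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A session that stays on one topic scores 0 bits, while mixing topics raises the score; comparing the entropy distributions of sessions produced under different recommendation techniques would directly test the isolation assumption.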
Remarques : co-supervision: Pedro Ramaciotti (LIP6), Christophe Prieur (Télécom ParisTech)





