SM207-1 Modelling of higher-order quantum computation
Admin  
Encadrant : Michele PAGANI
Labo/Organisme : IRIF - Univ. Paris 7
URL : https://www.irif.fr/~michele/
Ville : Paris

Description  

The context of the internship is higher-order quantum programming languages and their denotational semantics. In quantum computation, one considers a particular kind of memory with exotic properties: data is non-duplicable, may be entangled, and the operations on the memory are fundamentally probabilistic. The structure of higher-order computation in this context is still under study.

A denotational semantics for a programming language is a way to characterize this structure. It is a mathematical formalism where types are interpreted as mathematical structures (vector space, module, ...) and programs as morphisms between these structures.

The case of conventional, probabilistic languages is now well understood, and there have been several approaches to extending this to quantum computation. The objective of the internship is to pursue this direction.
URL sujet détaillé : http://www.monoidal.net/stuff/sujet_adequation.pdf

Remarques : Co-supervised with Benoit Valiron (LRI - Univ. Paris Saclay)




SM207-2 Study of the semantics of Isabelle/HOL
Admin  
Encadrant : Benoit VALIRON
Labo/Organisme : LRI - Univ. Paris Sud
URL : http://www.monoidal.net
Ville : Saclay

Description  

One of the strengths of a proof assistant comes from the logic it implements. While this logic may be more or less expressive, one essential required property is consistency. The objective of this project is to study the consistency of the core logic of the proof assistant Isabelle/HOL. The first step will be to formalize this core and validate it by testing the code extracted from the formalization. The second step will be to concentrate on more advanced features, such as the typeclass mechanism, or on the study of the HOL fragment.
URL sujet détaillé : http://www.monoidal.net/stuff/sujet_pure.pdf

Remarques : Co-supervised with Burkhart Wolff, between the Modhel team and the VALS team of LRI.




SM207-3 Oracle compilation for quantum computation
Admin  
Encadrant : Benoit VALIRON
Labo/Organisme : LRI - Univ. Paris Sud
URL : http://www.monoidal.net
Ville : Saclay

Description  

One of the possible uses of quantum computation is to solve classical problems. This requires encoding the structure of the problem onto the quantum memory: this encoding is called the oracle. As quantum memories only support reversible operations, one must be able to compile a classical description into a sequence of reversible operations, usually represented as a reversible circuit. A convenient representation for a classical description is a program in a functional language: the subject of this project is to propose an efficient compilation framework from functional, classical programs onto reversible circuits.
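To fix intuitions, here is a toy Python simulation (ours, not the internship's framework) of a classical function compiled by hand into a reversible circuit: f(x, y, z) = (x AND y) OR z built from Toffoli and CNOT gates, with the ancilla uncomputed so the whole circuit stays reversible.

```python
def toffoli(bits, a, b, t):
    # CCNOT: flip target t iff both controls are 1 -- a reversible AND
    if bits[a] and bits[b]:
        bits[t] ^= 1

def cnot(bits, c, t):
    # flip target t iff control c is 1
    if bits[c]:
        bits[t] ^= 1

def oracle(x, y, z):
    """Reversible circuit computing f(x,y,z) = (x AND y) OR z into a fresh wire."""
    # wires: 0:x  1:y  2:z  3:ancilla  4:output
    bits = [x, y, z, 0, 0]
    toffoli(bits, 0, 1, 3)   # ancilla = x & y
    # OR of ancilla a and z on the output wire: out = a ^ z ^ (a & z)
    cnot(bits, 3, 4)
    cnot(bits, 2, 4)
    toffoli(bits, 2, 3, 4)
    out = bits[4]
    toffoli(bits, 0, 1, 3)   # uncompute the ancilla (restore it to 0)
    return out

assert all(oracle(x, y, z) == ((x & y) | z)
           for x in (0, 1) for y in (0, 1) for z in (0, 1))
```

The compilation challenge of the internship is precisely to automate (and make efficient) what is done here by hand: every non-reversible Boolean operator costs ancilla wires, and uncomputing them is what allows their reuse.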
URL sujet détaillé : http://www.monoidal.net/sujet_reversible.html

Remarques : Co-supervised with Burkhart Wolff, between the Modhel team and the VALS team of LRI.




SM207-4 Development of adversarial classifiers using Bayesian games  
Admin  
Encadrant : Patrick LOISEAU
Labo/Organisme : LIG
URL : http://www.eurecom.fr/~loiseau/
Ville : Grenoble

Description  

General presentation of the topic:

Classification algorithms (a class of learning algorithms) are routinely used in crucial security problems such as detecting attacks (e.g., fraud, spam, theft, malware, etc.). However, standard classification algorithms often perform poorly in such scenarios because an adaptive attacker can shape his attacks in response to the algorithm. Hence the data faced in security problems does not satisfy the standard learning assumption that its distribution is independent of the algorithm. This has led to a recent interest in developing methods for adversarial classification, but existing methods make overly pessimistic assumptions that affect their performance in practice.

Internship program and objectives:

In this internship, we propose to use game theory to model the objectives of the defender (learner) and the attacker (data generator) in order to derive better methods for adversarial classification. We will build on our recent work [1], which proposes a simple game-theoretic model of adversarial classification and derives the optimal classifier for this model, but makes the limiting assumption of complete information (about the attacker). The main objective of the internship will then be to develop and investigate a model with incomplete information, in order to derive more flexible adversarial classification methods that do not require knowledge of the attacker. More specifically, the student will:
- develop an incomplete information model based on a Bayesian game;
- investigate theoretically the game's equilibria; in particular, look for an algorithmic reduction of the defender's strategy space that gives the set of optimal classifiers;
- analyze the solution's performance and robustness using simulations;
- if time permits, extend the study to a dynamic context with sequential learning.
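For intuition, the complete-information setting of [1] can be sketched in miniature (our illustration; the payoff matrix below is made up): detection probabilities define a zero-sum matrix game between defender and attacker, and fictitious play approximates its mixed equilibrium. The internship's Bayesian variant would instead average such payoffs over a distribution of attacker types.

```python
# rows: defender detection rules, columns: attacker strategies;
# entry = probability that the attack is detected (defender maximizes it).
P = [[0.9, 0.2, 0.1],
     [0.6, 0.7, 0.3],
     [0.3, 0.5, 0.8]]

def fictitious_play(P, iters=20000):
    m, n = len(P), len(P[0])
    d_counts = [0] * m          # how often the defender played each row
    a_counts = [0] * n          # how often the attacker played each column
    d = a = 0
    for _ in range(iters):
        d_counts[d] += 1
        a_counts[a] += 1
        # each player best-responds to the opponent's empirical mixed strategy
        d = max(range(m), key=lambda i: sum(P[i][j] * a_counts[j] for j in range(n)))
        a = min(range(n), key=lambda j: sum(P[i][j] * d_counts[i] for i in range(m)))
    return [c / iters for c in d_counts], [c / iters for c in a_counts]

d_mix, a_mix = fictitious_play(P)
value = sum(P[i][j] * d_mix[i] * a_mix[j] for i in range(3) for j in range(3))
print(round(value, 3))   # lies between the pure maximin (0.3) and minimax (0.7)
```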

Expected abilities of the student:

Strong background in probability, knowledge of machine learning, basics of game theory

References:

[1] Lemonia Dritsoula, Patrick Loiseau, and John Musacchio. A game-theoretic analysis of adversarial classification. IEEE Transactions on Information Forensics and Security, 12(12):3094-3109, December 2017.


URL sujet détaillé : http://www.eurecom.fr/~loiseau/Internship-LIG-adversarial-learning.pdf

Remarques :



SM207-5 Fairness in multi-stage decision making  
Admin  
Encadrant : Patrick LOISEAU
Labo/Organisme : LIG
URL : http://www.eurecom.fr/~loiseau/
Ville : Grenoble

Description  

General presentation of the topic:

Machine learning algorithms have become omnipresent in everyday life; they are used in increasingly many areas such as advertising, but also hiring or justice. As learning algorithms become more ubiquitous, there is a growing concern about how they affect humans in possibly inconspicuous ways. In particular, it was recently observed that many algorithms discriminate (even unintentionally) users based on features (such as race or gender) which are deemed unacceptable. This started a very intense debate on the fairness of machine learning and a high interest in designing fair learning algorithms.

Internship program and objectives:

In this internship, we will study the case of multi-stage decision making for classification or candidate selection. In this setting, we have a set of candidates. In a first stage, the algorithm selects a subset of the candidates based on some features. In a second stage, for the preselected candidates, the algorithm gets access to additional features to make the final selection. Such multi-stage decision making is used, for instance, in most hiring processes. The goal of the internship will be to investigate fairness in multi-stage decision making. The student will start by investigating optimal algorithms for multi-stage decision making and their fairness properties according to different definitions (there exist multiple definitions of fairness, see [1-5]). Then, he/she will design modified algorithms that respect fairness constraints and investigate whether, for certain definitions, it is impossible to do so.
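A minimal simulation of the two-stage pipeline (our sketch, with synthetic Gaussian scores and identically distributed groups) shows how to measure per-group selection rates, the quantity compared by demographic-parity-style definitions:

```python
import random

random.seed(1)
N = 10000
# each candidate: (sensitive group, stage-1 feature, stage-2 feature)
candidates = [(random.choice("AB"), random.gauss(0, 1), random.gauss(0, 1))
              for _ in range(N)]

# stage 1: shortlist the top 20% on the stage-1 feature alone;
# stage 2: keep the top 25% of the shortlist on the combined score
shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[: N // 5]
final = sorted(shortlist, key=lambda c: c[1] + c[2], reverse=True)[: len(shortlist) // 4]

def selection_rate(group, pool=final):
    base = sum(1 for c in candidates if c[0] == group)
    return sum(1 for c in pool if c[0] == group) / base

# groups are drawn identically here, so the two rates should be close;
# feeding group-dependent score distributions into the same code exposes
# how disparities can compound across the two stages
print(selection_rate("A"), selection_rate("B"))
```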

Expected abilities of the student:

Strong background in probability and machine learning

References:

[1] M. B. Zafar, I. Valera, M. Gomez-Rodriguez, K. P. Gummadi, A. Weller. From Parity to Preference-based Notions of Fairness in Classification. In NIPS, 2017.
[2] M. Joseph, M. Kearns, J. Morgenstern, A. Roth. Fairness in Learning: Classic and Contextual Bandits. In NIPS, 2016.
[3] M. Hardt, E. Price, N. Srebro. Equality of Opportunity in Supervised Learning. In NIPS, 2016.
[4] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, C. Dwork. Learning Fair Representations. In ICML, 2013.
[5] M. B. Zafar, I. Valera, M. Gomez-Rodriguez, K. P. Gummadi. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment. In WWW, 2017.

URL sujet détaillé : http://www.eurecom.fr/~loiseau/Internship-LIG-multistage-decision-making.pdf

Remarques :



SM207-6 Service Problems within Highly Dynamic Distributed Systems  
Admin  
Encadrant : Swan DUBOIS
Labo/Organisme : LIP6, UPMC Sorbonne Universités
URL : https://wp-systeme.lip6.fr/estate/
Ville : Paris

Description  

The availability of wireless communications has drastically increased in recent years and established new applications. Humans, agents, devices, robots, and applications interact together through more and more heterogeneous infrastructures, such as mobile ad hoc networks (MANET), vehicular networks (VANET), (mobile) sensor and actuator networks (SAN), body area networks (BAN), as well as ever-evolving network infrastructures on the Internet. In such networks, items (users, links, equipment, etc.) may join, leave, or move inside the network at unforeseeable times.

The dynamics of such networks, the heterogeneity of devices, usages, and participants, and often the unprecedented scale to consider, make the design of such infrastructures extremely challenging. For a vast majority of them, the dynamics are also unpredictable. Furthermore, designing applications on top of such networks requires dealing with the lack of infrastructure and numerous topological changes.

Therefore, it becomes necessary to define and develop new accurate models capturing the features of the considered objects: users' mobility, system stability, dynamics of applications, etc. Recently, numerous models for these harsh environments have been gathered in a general framework: Time-Varying Graphs (TVGs). Based on this framework, the REGAL team recently proposed a thorough study of fixed-point problems (such as maximal matching, minimal dominating set, etc.) in highly dynamic systems. In particular, some necessary and sufficient topological conditions are exhibited for these problems.

The main goal of this internship is to initiate a similar study of service problems in highly dynamic systems. We propose to focus on one of the following fundamental problems: Mutual Exclusion, Token Circulation, Propagation of Information with Feedback, etc. All these problems have received great attention in static systems but have barely been considered in the context of highly dynamic systems.

The scientific agenda is mainly threefold:
- First, study service problems in the context of TVGs, with the goal of providing specifications that make sense in highly dynamic systems;
- Second, the main challenge: produce necessary and sufficient conditions for the existence of solutions to these specifications in highly dynamic systems;
- Third, design distributed algorithms that meet these necessary and sufficient conditions, in order to obtain optimal solutions (with respect to impossibility results).
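For illustration, a minimal discrete-time TVG model (our simplification of the framework mentioned above) already exhibits the key phenomenon: a journey must traverse edges at non-decreasing times, so a statically connected footprint need not be temporally connected.

```python
# A Time-Varying Graph as a dict (u, v) -> set of time steps at which the
# directed edge is present; a journey uses edges at non-decreasing times.

def journey_exists(edges, src, dst, horizon):
    earliest = {src: 0}                    # earliest arrival time per node
    changed = True
    while changed:                         # relax until no arrival improves
        changed = False
        for (u, v), times in edges.items():
            if u in earliest:
                nxt = [t + 1 for t in times if earliest[u] <= t < horizon]
                if nxt and min(nxt) < earliest.get(v, float("inf")):
                    earliest[v] = min(nxt)
                    changed = True
    return dst in earliest

# a -> b exists only at t=2, b -> c only at t=1: the static footprint a-b-c
# is connected, yet no journey from a to c exists.
edges = {("a", "b"): {2}, ("b", "c"): {1}}
assert journey_exists(edges, "a", "b", horizon=5)
assert not journey_exists(edges, "a", "c", horizon=5)
```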
URL sujet détaillé : https://pages.lip6.fr/Swan.Dubois/pdf/sujet_stage_ESTATE.pdf

Remarques : Co-adviser : Franck Petit, LIP6, UPMC Sorbonne Universités

Standard internship stipend, funded by the ANR project ESTATE



SM207-7 Visual quality metrics for 3D scenes in Virtual and Mixed Reality  
Admin  
Encadrant : Guillaume LAVOUE
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/guillaume.lavoue/
Ville : Lyon

Description  

The proposed Master/PhD position is in the context of the ANR PISCo project (Perceptual Levels of Detail for Interactive and Immersive Remote Visualization of Complex 3D Scenes), which aims at proposing novel algorithms and tools allowing interactive visualization in immersive contexts (Virtual and Mixed Reality, with local/remote 3D content) with a very high quality of user experience. As 3D scenes are visualized through a certain viewport, the goal of the project is to optimize the display in this viewport by proposing (1) tools for the generation and compression of high-quality levels of detail, (2) visual quality metrics capable of predicting the quality of these levels of detail and driving their generation, and (3) visual attention models capable of predicting where the observer is looking, and thus selecting and filtering the primitives and levels of detail. A distinctive property of the project lies in the fact that we will consider rich 3D data, including not only geometric information but also animation and complex physically based materials represented by texture maps (color, metalness, roughness, normals).

The proposed Master/PhD position concerns item (2) above. The objective is to study the perceptual mechanisms involved in immersive 3D data visualization, where 3D data contain not only geometry but also complex materials (different kinds of texture maps), illumination information and animation data. The final objective is to produce both near-threshold and supra-threshold quality indices, capable of predicting the perceptual impact of modifications applied to the geometry, texture maps and animation of a rich 3D scene. The student will benefit from recent related work by the involved research teams.
URL sujet détaillé : http://liris.cnrs.fr/guillaume.lavoue/Proposal_LIRIS_LS2N_en.pdf

Remarques : Co-supervised with Patrick Le Callet - www.irccyn.ec-nantes.fr/~lecallet - LS2N lab, University of Nantes.





SM207-8 Algorithms and optimization for covering problems using flying drones
Admin  
Encadrant : Christelle CAILLOUET
Labo/Organisme : COATI, Inria/I3S lab
URL : https://team.inria.fr/coati/
Ville : Sophia Antipolis

Description  

Recent advances in technology have led to the development of flying drones that act as wireless base stations to track objects lying on the ground. This kind of robot (also called Unmanned Aerial Vehicle or UAV) can be used in a variety of applications such as vehicle tracking, traffic management, battery harvesting and fire detection.

The goal is to investigate the 3D deployment of multiple UAVs in order to cover a set of mobile targets. Each UAV has limited energy, and its coverage performance depends on its altitude and on the connectivity between the UAVs.
Theoretically, this problem is related to the set covering problem (and its dynamic version) and to the 3D packing problem.

In this internship, we want to extend existing work on efficient and reliable drone placement and scheduling, adjusting the drones' positions so as to ensure the surveillance of all targets (mobile or fixed) at the same time.

More precisely, the goal of the internship is to provide:
- linear models solving the covering problem for mobile targets while ensuring connectivity among the drones (using, for instance, techniques like column generation);
- approximation algorithms to solve large instances efficiently.
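As a baseline for the second item, a greedy set-cover heuristic (our sketch; the candidate positions and coverage footprints below are made up) gives the classical ln(n)-approximation:

```python
# Greedy set cover: candidate drone positions each cover a set of ground
# targets; repeatedly pick the position covering the most uncovered targets.

def greedy_cover(targets, candidates):
    """candidates: dict position -> set of covered targets."""
    uncovered = set(targets)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        if not candidates[best] & uncovered:
            raise ValueError("some targets cannot be covered")
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen

targets = range(6)
candidates = {
    "p1": {0, 1, 2},    # e.g. a high-altitude position with a wide footprint
    "p2": {2, 3},
    "p3": {3, 4, 5},
    "p4": {0, 5},
}
chosen = greedy_cover(targets, candidates)
assert set().union(*(candidates[p] for p in chosen)) == set(targets)
print(chosen)
```

The internship's linear models would additionally encode energy, altitude and inter-drone connectivity as constraints, which this toy ignores.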

URL sujet détaillé :

Remarques :



SM207-9 Programmable listening machines
Admin  
Encadrant : Nicolas OBIN
Labo/Organisme : Ircam-STMS
URL : www.ircam.fr
Ville : Paris

Description  

The objective of this internship is to explore Bayesian programming techniques and to see how they can be applied to the re-construction and extension of the polyphonic listening machine of Antescofo, the real-time score follower developed at Ircam. The main points of the internship will be:
1) Formalizing learning by Bayesian programming in the setting of listening machines, possibly online and weakly supervised.
2) Studying how this listening machine can be embedded in a host language, thus allowing the specification of perception and the specification of reaction to be intimately combined.
3) Possibly, exploring the possibility of performing sound synthesis from the learned models.
URL sujet détaillé : http://anasynth.ircam.fr/home/media/machines-d%E2%80%99e%CC%81coute-programmables

Remarques : Standard laboratory internship stipend + meal vouchers



SM207-10 Data-Aware Scheduling at Scale  
Admin  
Encadrant : Guillaume AUPY
Labo/Organisme : Inria
URL : http://gaupy.org
Ville : Bordeaux

Description  

Context:
--
In the past, scheduling algorithms have mostly been developed around the constraints linked to computing.
While the computing power of supercomputers keeps increasing at an exponential rate, their capacity to manage data movement is reaching its limits. It is expected that this imbalance will be one of the key limitations to the development of future High-Performance Computing (HPC) applications. We propose to rethink how the data created by applications and stored during the computation (also known as I/O, for Input/Output) is managed in supercomputers.

As an example, in 2013, Argonne (a research lab in the US) upgraded its in-house supercomputer, moving from Intrepid (peak performance: 0.56 PFlop/s; peak I/O throughput: 88 GB/s) to Mira (peak performance: 10 PFlop/s; peak I/O throughput: 240 GB/s). While both criteria seem to have improved considerably, the reality is that for a given application, its I/O throughput scales linearly (or worse) with its performance; hence, what should be noticed is a downgrade from 160 GB/PFlop to 24 GB/PFlop!

In future large-scale platforms, the way I/O movements are scheduled will be more and more critical to optimizing performance. In this project we propose to add a layer of data-movement scheduling on top of the usual job scheduling in supercomputers.
More specifically, the novelty of this project is to account for known HPC application behaviors (periodicity, limited number of concurrent applications) to define data scheduling strategies.

Internship program and objectives:
--
During the internship, the student will work on:
- Modeling HPC applications and HPC platforms, looking for structural arguments on the shape of I/O movements and computations;
- Developing and analysing new scheduling algorithms that take those structural arguments into account. Natural extensions of our preliminary work [1] include:
  - the shape of a compute vs. I/O period;
  - the multiplicity of entry points into the job scheduler (multiple I/O nodes).


[1] http://gaupy.org/ressources/pub/reports/RR-9037.pdf
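As a back-of-the-envelope illustration of the I/O bottleneck (our toy model, not [1]; the 240 GB/s figure is Mira's peak throughput from the text, while the per-application numbers are invented): application i repeats a cycle of w_i seconds of compute followed by a transfer of v_i GB, so its steady-state bandwidth demand is v_i / (w_i + v_i/B), and the shared I/O system saturates when the aggregate demand exceeds B.

```python
# Toy I/O pressure check: apps = list of (compute_seconds, transfer_GB),
# bandwidth = shared peak I/O throughput in GB/s.

def io_saturated(apps, bandwidth):
    demand = sum(v / (w + v / bandwidth) for w, v in apps)
    return demand > bandwidth

# 8 apps each doing 10 s of compute then a 600 GB checkpoint (invented numbers)
# against a Mira-like 240 GB/s I/O system: each demands 48 GB/s, 384 in total.
print(io_saturated([(10.0, 600.0)] * 8, 240.0))   # saturated
print(io_saturated([(10.0, 600.0)] * 4, 240.0))   # fits
```

When the check fails, some application must stall on I/O; deciding *which* one, and when, is exactly the data-movement scheduling layer proposed above.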


Prerequisites:
- Good knowledge of algorithmics and analysis of algorithms
- Basic knowledge of probability theory
- Basic knowledge of programming*

List of classes at ENS Lyon related to this internship: CR08, CR12, CR01, CR09.

*Depending on the liking/expertise of the student, there will be a possibility of implementing and experimenting with those algorithms on a top-tier HPC platform.

URL sujet détaillé : http://gaupy.org/ressources/files/dash_anr_proposal.pdf

Remarques : Co-advised by Emmanuel Jeannot. Research will be performed in the Tadaam team (https://team.inria.fr/tadaam/).

We are looking for an intern that wants to continue as a PhD student.
Funding is secured both for the internship and a PhD following the internship via the DASH ANR project. This also includes generous travel money during the PhD thesis.



SM207-11 Multi-scale adaptive management of HPC platforms
Admin  
Encadrant : Georges DA COSTA
Labo/Organisme : IRIT, Université de Toulouse
URL : www.irit.fr/~Georges.Da-Costa
Ville : Toulouse

Description  

Top High-Performance Computing (HPC) infrastructures have millions of cores, and classical management systems do not scale. Monitoring infrastructures, for example, centralize all data at a single (overloaded) point. Scheduling is usually done at the global level, using algorithms that are O(n log n) in the best case and usually O(n^2).

The goal of this internship is to design and evaluate the complexity of next-generation large-scale management systems, in which the middleware will be able to split itself in order to adapt to the infrastructure usage. If some part of an infrastructure is running long-term applications, there is no need to monitor it at a frequency of 1 s. Different parts of the infrastructure will then be managed by different instances with different time granularities.

The internship will be done in the context of the SEPIA team, specialised in large-scale distributed systems (HPC and clouds), particularly in scheduling and multi-objective optimization. The SEPIA team is part of the IRIT lab (750 people) in Toulouse.

URL sujet détaillé :

Remarques :



SM207-12 Robust and Efficient Scheduling in Servers with Unexpected Faults  
Admin  
Encadrant : Jean-marc PIERSON
Labo/Organisme : IRIT, University of Toulouse
URL : www.irit.fr/~Jean-Marc.Pierson
Ville : Toulouse

Description  

In this subject we are interested in exploring the robustness of scheduling algorithms in the context of servers operated in conditions where faults may occur. Due to these faults, the scheduling of the work to be done is impacted; for instance, a planned task must be delayed. The servers can be standalone servers, where the impact is on the scheduling of processes on the CPU, or servers in a cloud environment, where the impact is on the scheduling of virtual machines between servers or within a server at the hypervisor level. The robustness of the scheduling algorithm is defined here as its ability to cope with faults without recomputing the whole planned schedule, allowing only little (or even no) change, in order to keep its efficiency.
The internship will be done in the context of the SEPIA team, specialised in large-scale distributed systems (HPC and clouds), particularly in scheduling and multi-objective optimization. The SEPIA team is part of the IRIT lab (750 people) in Toulouse.


URL sujet détaillé :

Remarques : Paid internship



SM207-13 Homotopy for polynomial system solving
Admin  
Encadrant : Simone NALDI
Labo/Organisme : XLIM - Université de Limoges
URL : http://www.xlim.fr/recherche/pole-mathematiques-informatique-image/mathematique-securite-de-linformation/cf
Ville : LIMOGES

Description  

Solving polynomial systems, that is, finding the common roots of several multivariate polynomials, is a central question in commutative algebra and symbolic computation. It arises in many applied contexts (e.g. biology, chemistry, mechanical engineering), where many problems boil down to computing a sample set of real solutions, or to computing all complex solutions when there are finitely many. Developing efficient algorithms for these tasks remains a challenge in full generality. In most situations, the current best strategy is to exploit the structure of the polynomials defining the system (e.g. sparsity, symmetries). For this, a class of methods of great success and efficiency is that of deformation techniques such as homotopy.

The first goal of the internship is for the student to become familiar with polynomial system solving. Then, the student will get acquainted with the use of homotopy, through classes of examples coming from applications. He or she will study one or several of the most efficient symbolic algorithms, such as the geometric resolution algorithm [1] and the polyhedral homotopy algorithm [2].

One of the current important challenges concerning the class of deformation techniques is to get efficient implementations in the symbolic setting: there is currently a large gap between theoretical complexity estimates and practical efficiency. Thus, if the student is interested in programming, a part of the internship can be dedicated to the design of efficient implementations of the current best algorithms for solving polynomial systems that have finitely many solutions.
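For intuition, here is a purely numeric predictor-corrector homotopy in a few lines of Python (a toy, not the certified symbolic algorithms [1,2]): the roots of an easy start system are continuously deformed, along H(x,t) = (1-t)g(x) + t f(x), into the roots of the target system.

```python
# Track one root of f from a root of the start system g(x) = x^2 - 1,
# stepping t from 0 to 1 and correcting with Newton at each step.

def track(f, df, x, steps=200):
    g = lambda x: x * x - 1            # start system, known roots +1 and -1
    dg = lambda x: 2 * x
    for k in range(steps):
        t = (k + 1) / steps
        H = lambda x: (1 - t) * g(x) + t * f(x)
        dH = lambda x: (1 - t) * dg(x) + t * df(x)
        for _ in range(10):            # Newton corrector at this value of t
            x = x - H(x) / dH(x)
    return x

f = lambda x: x * x - 2                # target system, roots +-sqrt(2)
df = lambda x: 2 * x
roots = [track(f, df, x0) for x0 in (1.0, -1.0)]
print(roots)
```

The symbolic versions studied during the internship replace the floating-point tracking by exact arithmetic on a formal deformation parameter, which is where the certification (and the efficiency challenge) lies.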
URL sujet détaillé : https://www.unilim.fr/pages_perso/vincent.neiger/research/projectM2.pdf

Remarques : Co-supervision with Vincent Neiger.

Please don't hesitate to contact us if you have questions regarding any practical or scientific aspect of the internship.

simone.naldi.fr
vincent.neiger.fr



SM207-14 Certificates with Non-Linear Arithmetic Decision Procedures  
Admin  
Encadrant : Pascal FONTAINE
Labo/Organisme : LORIA
URL : http://www.loria.fr/en/
Ville : Nancy

Description  

Many applications, notably in the context of verification for critical systems, e.g. in the areas of transportation or energy, rely on checking the satisfiability of logic formulas containing non-linear real arithmetic. Existing decision procedures on the arithmetic side have mostly been used in a black-box style. In case of satisfiability of a subset of arithmetic constraints, one or a few sample solutions can be provided, while in case of unsatisfiability the prover is left alone with the negative answer. Additional information would clearly support the robustness of the combination of systems by allowing for efficient verification of the answers offered. It could furthermore help the prover to learn, in the sense of exploiting the additional information during the remaining computation. We propose to study the production and checking of certificates for non-linear arithmetic decision procedures.
URL sujet détaillé : https://members.loria.fr/PFontaine/Certified_NLRA.pdf

Remarques :



SM207-15 Verification of equivalence-based properties  
Admin  
Encadrant : David BAELDE
Labo/Organisme : IRISA or LSV
URL : http://www.lsv.fr/~baelde/
Ville : Rennes or Cachan

Description  

Please see attached PDF.

Keywords: security protocols, privacy, unlinkability, equivalence properties, concurrency theory.
URL sujet détaillé : http://www.lsv.fr/~baelde/sujet_equiv.pdf

Remarques :



SM207-16 Certified Taylor approximations  
Admin  
Encadrant : Xavier THIRIOUX
Labo/Organisme : IRIT/ENSEEIHT
URL : https://www.irit.fr/-Equipe-ACADIE
Ville : TOULOUSE

Description  

Taylor approximations are tools pervasively used to provide polynomial approximations of functions in a variety of situations.

We propose a new way of computing Taylor approximations with the following desirable features: the multivariate case, certified approximations, on-demand computation, and data structures that are efficient but also correct by construction.

The applicant will hold a multi-disciplinary position, with computer science topics such as advanced type system features, proofs-as-programs and efficient data-structures, but also the domains of Taylor series, approximations and errors.
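A univariate toy version of a certified approximation (our sketch; the internship targets the multivariate, correct-by-construction setting): a Taylor polynomial for exp on [-r, r] together with a rigorous Lagrange remainder bound, here |R_n(x)| <= e^r * r^(n+1) / (n+1)!.

```python
import math

def taylor_exp_with_bound(x, n, r):
    """Degree-n Taylor approximation of exp at x, |x| <= r, plus an error bound."""
    assert abs(x) <= r
    approx = sum(x ** k / math.factorial(k) for k in range(n + 1))
    # Lagrange remainder: |exp^{(n+1)}(xi)| <= e^r on [-r, r]
    bound = math.e ** r * r ** (n + 1) / math.factorial(n + 1)
    return approx, bound

approx, bound = taylor_exp_with_bound(0.5, 8, 1.0)
assert abs(math.exp(0.5) - approx) <= bound   # the certificate holds
print(approx, bound)
```

The "on-demand" aspect of the internship would make n grow lazily until the bound meets a requested accuracy, with the bound itself produced by the data structure rather than checked a posteriori as here.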
URL sujet détaillé : http://thirioux.perso.enseeiht.fr/stages/sujet_stage_2017.pdf

Remarques :



SM207-17 Delivering VoD services to vehicles along a highway  
Admin  
Encadrant : Thomas BEGIN
Labo/Organisme : LIP
URL : https://www.inria.fr/equipes/dante
Ville : Lyon

Description  

Vehicular networks, bringing connectivity to vehicles for safety and entertainment purposes, are expected to be deployed in the upcoming years. While different technologies, including LTE and WiMAX, have been considered, IEEE 802.11p (a vehicular variant of WiFi) has attracted much attention.

In this work, we consider the following scenario. Passengers in a vehicle wish to use a VoD (Video on Demand) service in order to watch a movie during the journey to their destination. Several RSUs (Road Side Units) delivering a wireless connection to passing-by vehicles are scattered along the road. We assume that the network operator may adapt the quality of the video (changing its codec) or the speed of the vehicle (assuming driverless cars) so that the passengers watch their movie without any disruption. To achieve this goal, one needs to download enough data and fill a buffer with frames that are kept to be played later, when the vehicle has no connection.

The objective of this work is to design, and validate against simulations, an algorithm that selects the video codec or regulates the vehicle speed so that passengers experience seamless playback of their movie. The algorithm will be based on a predictive model that takes into account the number of RSUs, the expected number of contending vehicles at each RSU, the bandwidth of the RSUs, the distance to travel, the length of the movie, etc.
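The feasibility core of such a model can be sketched in a few lines (our simplification: 1-second steps, a single vehicle, a fixed bandwidth per RSU, and all numbers invented): playback is seamless iff the buffer never empties.

```python
# The road as a sequence of (duration_s, bandwidth_MBps) segments, where
# bandwidth 0 means no RSU coverage; bitrate is the codec's rate in MB/s.

def seamless(segments, bitrate):
    buffered = 0.0                       # MB currently in the playout buffer
    for duration, bw in segments:
        for _ in range(duration):        # 1-second time steps
            buffered += bw               # download while under an RSU
            buffered -= bitrate          # playback consumes the stream
            if buffered < 0:
                return False             # buffer underrun: playback stalls
    return True

# three 8 MB/s RSUs, each followed by a 60 s coverage gap (invented numbers):
road = [(20, 8.0), (60, 0.0), (20, 8.0), (60, 0.0), (20, 8.0), (60, 0.0)]
print(seamless(road, bitrate=1.5))   # a light codec survives the gaps
print(seamless(road, bitrate=4.0))   # a heavy codec stalls in the first gap
```

The internship's algorithm would invert this check: pick the largest bitrate (or the speed profile, which rescales the durations) for which the predicted trace stays feasible.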

Recommended skills: 802.11 protocol, discrete-event simulator, ns-3, performance evaluation.
URL sujet détaillé :

Remarques : Supervisors: Thomas Begin, Isabelle Guérin Lassous and Anthony Busson
E-Mail : thomas.begin-lyon.fr - isabelle.guerin-lassous-lyon.fr - anthony.busson-lyon.fr




SM207-18 Compression of large graphs using their properties and the complementarity of their parameters
Admin  
Encadrant : Hamida SEBA
Labo/Organisme : LIRIS/ Equipe Graph Algorithms and Applications
URL : https://liris.cnrs.fr
Ville : Villeurbanne

Description  

Graph compression is gaining popularity as a valuable tool for identifying both structure and meaning in massive data. The goal of this master project is to study selected graph properties in order to identify those able to yield simple reduced representations of graphs.

URL sujet détaillé : https://liris.cnrs.fr/~hseba/Master/Stage_Master2_Recherche_Graphes_et_Algorithmes.pdf

Remarques : Co-supervised with Mohammed Haddad (LIRIS)
This internship can be followed by a PhD thesis



SM207-19 Coq formalisation of a graphical approach to variable binding
Admin  
Encadrant : Tom HIRSCHOWITZ
Labo/Organisme : LAMA
URL : http://lama.univ-smb.fr/~hirschowitz
Ville : Chambéry

Description  

Despite the wide body of work on the subject, variable binding remains
a serious difficulty when formalising, e.g., logic or programming
language theory in a proof assistant. The present internship proposal
sets out to formalise Wadsworth's lambda-graphs in Coq, as yet another
alternative representation of variable binding. The most salient
aspect of lambda-graphs compared to other approaches is that they are
not an inductive datatype in the first place. Instead, they are
defined as (a kind of) graphs satisfying a certain correctness
criterion. The main drawback immediately arises: we expect some
difficulties in proving the desired induction principles, which are
crucial for reasoning about terms with variable binding. The expected
advantages follow from the rich categorical structure of
lambda-graphs. In particular, alpha-equivalence of lambda-terms, i.e.,
equivalence up to renaming of bound variables, is here modelled by
mere isomorphism. Furthermore, if one works in a univalent setting,
alpha-equivalence thus becomes equality.

Tentative program: (0) introduction to lambda-graphs and the required
category theory, (1) formalisation of (a generalisation of)
lambda-graphs, (2) proof of induction principles, (3) attempts at
proving significant results (e.g., from the POPLmark challenge).

URL sujet détaillé : https://lama.univ-savoie.fr/~hirschowitz/stages/2017/lambda-graphs.pdf

Remarques : The internship is co-advised by Tom Hirschowitz (CNRS, LAMA,
Université Savoie Mont Blanc in Le Bourget-du-Lac) and Nicolas Tabareau
(INRIA, Ecole des Mines de Nantes) and could take place in either
location, depending on the taste of the applicant.




SM207-20 Formal validation of an OS for intermittently-powered systems
Admin  
Encadrant : David CACHERA
Labo/Organisme : IRISA
URL : http://people.irisa.fr/David.Cachera/index_en.html
Ville : Rennes

Description  

The familiar Von Neumann architecture model, based on volatile memory, is currently being questioned by the emergence of non-volatile memory (NVRAM), which retains information even without power supply. Several technologies, including STT-MRAM, PCM and FeRAM, are still competing in this area. Today it is already possible to program systems containing NVRAM, e.g. the TI MSP-EXP430FR5739 micro-controller.
In this internship, we focus on ultra-low-consumption systems, commonly used in the IoT (Internet of Things), which do not depend on a battery supply. The Sytare project, developed by the Socrate team (Citi, Lyon), allows applications to execute on a transiently powered system. It is based on a checkpointing mechanism, extended to handle peripheral devices. In order to gain confidence in the behaviour of these objects, we aim at justifying the validity of these mechanisms on a formal basis.
The purpose of this internship is to set up a framework for formally validating the Sytare mechanisms that save and restore the processor state when an unpredictable power failure occurs. We thus need a semantic model of the system that is both realistic and accurate enough for capturing the wanted correctness criteria. More precisely, we aim at

1. formalizing the correctness of the checkpointing mechanism,
2. formally validating its Sytare implementation,
3. generalizing this work to peripheral devices management.


The internship will be based at IRISA (Rennes, Celtique team), with frequent interactions with the Socrate team (Citi, Lyon). The successful candidate should have strong knowledge in semantics and formal methods and good programming skills, especially in C, together with a strong background in architecture and operating systems.

URL sujet détaillé : http://people.irisa.fr/David.Cachera/Recherche/2017-Master-Celtique-Socrate.pdf

Remarques : Internship co-supervised by David Cachera (ENS Rennes/IRISA), Delphine Demange (Univ. Rennes 1/IRISA) and Tanguy Risset (INSA Lyon/Citi).




SM207-21 Profiling and analysis of concurrent memory accesses  
Admin  
Encadrant : Claude TADONKI
Labo/Organisme : Centre de Recherche en Informatique (CRI) - MINES ParisTech
URL : http://www.cri.ensmp.fr/
Ville : Fontainebleau

Description  

The significant evolution of architectures over the last two decades has been accompanied by a notable increase in the complexity of memory systems. On the one hand, multi-core architectures offer several levels of memory directly or indirectly shared by the different cores, possibly together with individual private memories located as close as possible to the processing units. On the other hand, accelerators have separate memories for which exchanges require explicit data-transfer mechanisms. Since heterogeneous architectures unify these two kinds of architecture, and sometimes more, one can appreciate how notoriously difficult it is to pin down the impact of memory accesses on the overall performance of modern systems. Understanding this mechanism and modelling it with reasonable precision would make it possible to refine predictions of computation times, and also to derive a methodology for optimizing parallel programs, mainly on multi-core machines. It should be noted that, over time, the cost of memory accesses tends to become dominant; in a good number of cases, it is the main cause of the overall inefficiency of everyday programs.

This study will begin with an inventory of the various hierarchical memory systems and their operating principles, followed by substantial profiling work on diverse architectures with judiciously chosen scenarios. From this stage, the candidate will collect information and data allowing them to clarify the mechanisms underlying concurrent data transfers, and then to present a didactic synthesis that will enlighten programmers about the functional realities of memory activity.
URL sujet détaillé : http://www.cri.ensmp.fr/offres_stages.html

Remarques : Expenses covered and/or stipend



SM207-22 Verifiable Secret Sharing on the Red Belly Blockchain  
Admin  
Encadrant : Vincent GRAMOLI
Labo/Organisme : University of Sydney
URL : http://poseidon.it.usyd.edu.au/~gramoli/web/php/
Ville : Sydney, Australia

Description  

Blockchain technologies are promising at leveraging distributed machines to tolerate attacks; however, they rely heavily on sensitive information, like private keys, to identify their users. The Red Belly Blockchain is a new secure blockchain developed at the University of Sydney that tolerates t failures out of n > 3t machines. Unfortunately, it suffers from the same drawback as other blockchains: users lose their assets and can no longer use the system as soon as they forget or lose their private key.

The goal of this project is to leverage the inherently distributed nature of the blockchain to develop a secure and private recovery service for sensitive information, like private keys. A user will rely on Verifiable Secret Sharing (VSS) to chop their sensitive information into encrypted chunks and distribute them among m > t nodes. This distribution, based on erasure codes, guarantees that a coalition of t nodes is insufficient to retrieve any of this information, but allows a user to retrieve it by authenticating to k > t among these m nodes. Using the service, any node will then be able to safely store its sensitive information and retrieve it later without disclosing it to any other node.
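The threshold behaviour described above can be sketched with plain Shamir secret sharing, a standard building block of VSS (a real VSS scheme additionally publishes commitments so that shares can be verified; this toy sketch omits that part, and the prime field and parameters are arbitrary choices, not the Red Belly design):

```python
import random

# Toy Shamir secret sharing over a prime field: any coalition of t shares
# learns nothing, while k > t shares recover the secret by Lagrange
# interpolation at x = 0.
P = 2**127 - 1  # a Mersenne prime, large enough for toy secrets

def share(secret, t, m):
    """Split `secret` into m shares; any t of them reveal nothing."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]  # degree-t poly
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, m + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 from k > t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P  # den^-1 mod P
    return secret

shares = share(42, t=2, m=5)          # tolerate any coalition of 2 nodes
recovered = reconstruct(shares[:3])   # any k = 3 > t shares suffice
```

Any 3 of the 5 shares recover the secret, while 2 shares are statistically independent of it.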

Excellent programming skills are required; cryptographic knowledge would be a plus.
URL sujet détaillé : http://redbellyblockchain.io/

Remarques : Financial support possible.



SM207-23 Smoothed analysis of walking strategies in Delaunay triangulation  
Admin  
Encadrant : Olivier DEVILLERS
Labo/Organisme : INRIA Nancy - Grand Est
URL : https://members.loria.fr/Olivier.Devillers/
Ville : Nancy

Description  

The complexity of traversing a triangulation of a set of n points ranges from sqrt(n) in the random case to linear in the worst case. The aim of the internship is to obtain a trade-off between these two complexities when the data are subject to increasing noise.
URL sujet détaillé : https://members.loria.fr/Olivier.Devillers/stage-smoothed.pdf

Remarques : Statutory internship stipend (gratification légale)



SM207-24 Autonomous object recognition in videos using Deep learning and Developmental learning  
Admin  
Encadrant : Frédéric ARMETTA
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/frederic.armetta/shortCurriculum.html
Ville : Lyon

Description  

This internship aims to contribute to the development of an autonomous object recognition system for videos. In this context, the system is exposed to a visual flow (videos) and has to extract (proto-)objects in order to iteratively refine its internal representation of them. The purpose of this work is to develop a system able to autonomously recognize and differentiate objects by building an internal representation of these objects. Taking inspiration from human perception and following the constructivist learning paradigm, we want to do without a large labeled database, prior knowledge, or sophisticated object detectors, and instead provide autonomous development. The problem and associated objectives differ from the usual way of addressing learning with supervised methods such as deep learning: ideally, no large dataset is available to the system (the extraction of identified visual objects should be part of the result, not provided as input).

This internship will capitalize on promising preliminary results. The current system first extracts global shapes to catch candidate objects from the video, using simple temporal filters (for instance, a Kalman filter) and the spatio-temporal coherence of objects (movement and spatial overlap help to identify instances of objects as similar). It then uses Siamese neural networks to learn a similarity metric from pairs of examples marked as coming from the same or from different classes. This model constructs a manifold that can be used to classify examples of unknown classes. Following these guidelines, preliminary results show that the system is able to classify new instances of objects with good accuracy. Nevertheless, how to maintain and evolve this representation raises many questions that can be explored in the short or long term, depending on the needs identified during the project (catastrophic forgetting, active learning, overfitting, ability to generalize, little data, etc.).
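The pair-based training signal used by Siamese networks can be illustrated with the classic contrastive loss. The sketch below is a minimal pure-Python toy assuming precomputed embedding vectors; the actual system would compute the embeddings with a shared neural network:

```python
import math

# Toy contrastive loss for a Siamese pair, used to learn a similarity
# metric from pairs labelled "same object" / "different object".
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Pull 'same' pairs together; push 'different' pairs beyond `margin`."""
    d = euclidean(emb_a, emb_b)
    if same:
        return d ** 2                     # attract similar pairs
    return max(0.0, margin - d) ** 2      # repel dissimilar pairs up to margin

# a same-class pair that is already close incurs a small loss...
l_same = contrastive_loss([0.1, 0.2], [0.1, 0.25], same=True)
# ...while a different-class pair at the same distance is penalized
l_diff = contrastive_loss([0.1, 0.2], [0.1, 0.25], same=False)
```

Minimizing this loss over many pairs shapes the embedding space so that distance approximates object identity, which is what lets the manifold classify instances of unseen classes.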

A challenging topic that we would like to explore during this internship lies in the possibility of using the internal representation built in this way to facilitate object extraction from the videos. Indeed, without any knowledge of the objects, and because of the relatively simple temporal filtering used to detect candidate objects, the first extraction is coarse and highly sensitive to environmental noise. The internal representation could then be used to validate and outline the candidate objects. In this case, object catching and the internal representation of objects evolve together. The process we want to elaborate is a self-starting one, operating without external input. In other words, the resulting system has to learn how to perceive efficiently in order to be able to learn more, and reciprocally. We face here a chicken-and-egg cognitive problem, also known as a representation bootstrapping problem.
URL sujet détaillé : https://tomuss-fr.univ-lyon1.fr/2017/Public/Stage_Recherche/upload_get_public/1_5/144_0/Stage_Recherche_Sujet_fichier_sujetStage.pdf

Remarques : Co-advisors: Mathieu Lefort (SMA, LIRIS) et Stefan Duffner (Imagine, LIRIS)

Financial reward: ~500 € per month



SM207-25 Using deep learning to pansharpen satellite images  
Admin  
Encadrant : Yuliya TARABALKA
Labo/Organisme : Inria Sophia-Antipolis
URL : https://team.inria.fr/titane/
Ville : Sophia-Antipolis

Description  

Context: The latest generation of satellite-based imaging sensors (Pleiades, Sentinel, etc.) acquires large volumes of images of the Earth with high spatial, spectral and temporal resolution (up to 50 cm/pixel, 50 bands, twice per day, covering the full planet!). These data open the door to a large range of important applications, such as the planning of urban environments and the monitoring of natural disasters, but also present new challenges related to the efficient processing of high volumes of data with large spatial extent.

Subject: Recent works have shown that deep learning methods succeed in producing detailed classification maps from aerial data [Volpi and Tuia 2016]. However, when applying a deep learning classifier to images acquired by recent satellite sensors, the resulting classification maps are not very accurate. This is because satellite images are blurrier than aerial ones due to atmospheric effects. Another reason is that many satellite imaging sensors separately capture a very-high-resolution panchromatic (grey-scale) image and a multispectral (color) image with lower spatial resolution; a so-called fusion, or pansharpening, method must then be applied to fuse these two sources of information. Errors introduced by the pansharpening method induce errors in the classification results.
The goal of this project is to propose a new pansharpening method to get a multispectral image with a very-high spatial resolution from two satellite sources: panchromatic and multispectral data. We will explore deep learning frameworks for this purpose. The main idea is to use the very-high-resolution aerial images for synthesizing a training dataset, consisting of the input panchromatic and multispectral images, and the resulting very-high-resolution multispectral image we want to obtain. The work will then consist in designing and training a deep learning architecture to pansharpen images. We will finally evaluate how the new method improves classification accuracies.
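The training-data synthesis step described above can be sketched as follows. For illustration only, the panchromatic channel is approximated by a grey-scale average and the low-resolution multispectral input by 2x2 block averaging; real sensors have more complex spectral responses and resolution ratios:

```python
# Toy synthesis of a pansharpening training pair from one high-resolution
# RGB "aerial" image: the panchromatic input is a grey-scale average at
# full resolution, the multispectral input is the same image spatially
# downsampled, and the original image itself is the training target.
def to_pan(img):
    """Grey-scale (channel-average) image at full resolution."""
    return [[sum(px) / len(px) for px in row] for row in img]

def downsample2(img):
    """2x2 block averaging, per channel."""
    out = []
    for i in range(0, len(img), 2):
        row = []
        for j in range(0, len(img[0]), 2):
            block = [img[i + di][j + dj] for di in (0, 1) for dj in (0, 1)]
            row.append(tuple(sum(c) / 4 for c in zip(*block)))
        out.append(row)
    return out

hr = [[(1.0, 0.0, 0.0)] * 4 for _ in range(4)]   # 4x4 pure-red "aerial" image
pan = to_pan(hr)           # 4x4 grey-scale network input
ms_lr = downsample2(hr)    # 2x2 multispectral network input
target = hr                # very-high-resolution multispectral target
```

A network trained on many such (pan, ms_lr) → target triples learns the fusion mapping that the project aims to improve.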
URL sujet détaillé : https://team.inria.fr/titane/files/2017/10/2018InternFusionFinal.pdf

Remarques :



SM207-26 Stochastic Games and Semidefinite Programming  
Admin  
Encadrant : Elias TSIGARIDAS
Labo/Organisme : Inria & UPMC (LIP6 - Paris 6) CNRS
URL : http://www-polsys.lip6.fr/~elias/
Ville : Paris

Description  

* General presentation of the topic.

Shapley introduced stochastic games in 1953, and they have been a subject of intensive study ever since. They model dynamic interactions in an environment that changes in response to the behavior of the players. Their applications include industrial organization, resource economics, market games, and communication networks. For many families of stochastic games we can describe the equilibrium strategies and the payoffs of the players using systems of polynomial equations and inequalities [1]. Particular instances of continuous stochastic games rely on optimizing a linear function restricted to a convex set defined by linear matrix inequalities. This problem is known as Semidefinite Programming (SDP). SDP extends linear programming and has become a fundamental ingredient in polynomial and combinatorial optimization, complexity theory, quantum computing, control, signal processing, computational finance, and data mining.



* Internship objectives.

The main solution concept in game theory is the equilibrium, which characterizes the possible outcomes in settings with rational agents who want to maximize their payoffs. For a special class of stochastic games with players having continuous actions, the minimax equilibria and the optimal strategies can be computed using SDP [3], see also [2]. We would like to study the complexity of the algorithms for computing the equilibrium and the optimal strategies of such games. Precise complexity estimates would deepen our understanding and would lead to improved algorithms.
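As a minimal illustration of computing the value of a discounted zero-sum stochastic game, the sketch below runs Shapley's value iteration with 2x2 stage games, using the closed-form value of a 2x2 matrix game in place of the SDP machinery discussed above (a toy, not the algorithms the internship will study):

```python
def game_value(A):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = A
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                     # pure saddle point
        return maximin
    return (a * d - b * c) / (a + d - b - c)   # fully mixed equilibrium

def shapley_iteration(payoff, trans, beta, n_iter=200):
    """Discounted value iteration: v(s) <- val(payoff(s) + beta * E[v])."""
    n = len(payoff)
    v = [0.0] * n
    for _ in range(n_iter):
        v = [game_value([[payoff[s][i][j]
                          + beta * sum(p * v[t]
                                       for t, p in enumerate(trans[s][i][j]))
                          for j in range(2)] for i in range(2)])
             for s in range(n)]
    return v

# One absorbing state whose stage game is matching pennies: the stage value
# is 0, so the discounted game value converges to 0 as well.
stay = [[[1.0], [1.0]], [[1.0], [1.0]]]        # trans[s][i][j] = dist over states
v = shapley_iteration([[[1.0, -1.0], [-1.0, 1.0]]], [stay], beta=0.5)
```

The iteration is a beta-contraction, so it converges geometrically to the unique fixed point; replacing `game_value` with an SDP/LP solver extends the same scheme to larger action spaces.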





* Bibliography

[1] K. A. Hansen, M. Koucky, N. Lauritzen, P. B. Miltersen, and E. P. Tsigaridas. Exact algorithms for solving stochastic games. In Proc. 43rd Annual ACM Symp. Theory of Computing (STOC), 2011.

[2] R. Laraki and J. B. Lasserre. Semidefinite programming for min-max problems and games. Mathematical Programming, 131(1-2):305-332, 2012.

[3] P. Shah and P. A. Parrilo. Polynomial stochastic games via sum of squares optimization. In Proc. of 46th IEEE Conference on Decision and Control (CDC), pages 745-750. IEEE, 2007.

URL sujet détaillé :

Remarques : Possibility of a standard internship stipend funded by the ANR JCJC project GALOP.



SM207-27 Exploring the topology and placement of a distributed shared memory on heterogeneous machines  
Admin  
Encadrant : Loïc CUDENNEC
Labo/Organisme : CEA LIST (Laboratoire d'Intégration des Systèmes et des Technologies)
URL : http://www-list.cea.fr
Ville : PALAISEAU

Description  

The Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA) is a major player in research, development and innovation. This technological research organization operates in three main areas: energy, information and health technologies, and defence. Recognized as an expert in its fields of competence, the CEA is fully integrated into the European research area and has a growing presence at the international level. Located in the southern Île-de-France region (Saclay), the Laboratoire d'Intégration des Systèmes et des Technologies (LIST) has among its missions to contribute to technology transfer and to foster innovation in the field of parallel computing systems.

Controlling energy consumption is pushing manufacturers to increase the degree of heterogeneity of computing machines. One example is the micro-server architecture, in which nodes based on x86, ARM, GP-GPU and FPGA can coexist, which makes the system more complex to use. To ease programming, we propose a distributed shared memory (DSM) system that can deploy specialized user code on different nodes, together with servers in charge of managing the shared data. The links established between the user code and the servers form a topology. This topology, as well as its placement on the nodes, determines the performance of the application (execution time, energy consumption, ...). We propose to develop decision-support tools for choosing the topology and placement best suited to a given application and execution context.

In this context, the internship will address the following directions.

- Operations research: contribute to strategies for exploring topologies and task placements,

- Decision support: analyze the results of an execution or of an analytical model and propose decision-support tools (for example, start from statistics computed during the execution of the shared memory and generate a Pareto front).
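The Pareto-front generation mentioned above can be sketched as follows; the candidate topologies and their (execution time, energy) statistics are invented for illustration:

```python
# Toy Pareto-front extraction over candidate topology/placement
# configurations, each summarised by (execution time, energy consumed).
# Both objectives are minimised.
def pareto_front(points):
    """Keep the configurations not dominated on both objectives."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                       for q in points)]

candidates = {                          # (seconds, joules) - illustrative
    "centralised server":  (12.0, 30.0),
    "one server per node": (8.0, 45.0),
    "tree of servers":     (9.0, 28.0),
    "random placement":    (14.0, 50.0),
}
front = pareto_front(list(candidates.values()))
```

The decision-support tool would present the surviving configurations as the time/energy trade-off curve from which the user picks a placement.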

This proposal is reserved for students at Master's (BAC+5) level. Good knowledge of operations research and data analysis is required. Proficiency in at least one programming language is essential.
URL sujet détaillé :

Remarques :



SM207-28 Computational Optimal Transport for Early Universe Reconstruction  
Admin  
Encadrant : Bruno LEVY
Labo/Organisme : Inria Nancy Grand-Est/Loria
URL : https://members.loria.fr/BLevy/positions/EUR_sujet.pdf
Ville : Nancy

Description  

General presentation of the topic: One of the challenges in cosmology is to test theoretical models of the universe using diverse sources of measurement, such as redshift acquisition surveys, which can produce large-scale maps of the universe, and the acquisition of the cosmic microwave background by the Planck probe, which maps radiation dating back to the early stage of the universe (when it was 380,000 years old). Early Universe Reconstruction (EUR) consists in evolving a map of the universe reconstructed from a redshift acquisition survey back in time, trying to evaluate the fluctuations at the early state, and comparing them with the microwave background to see whether the model is realistic.

Objectives / Workplan: Recent progress in computational optimal transport has decreased computation times from weeks to less than an hour for multi-million-galaxy clusters, and is currently being studied in exchanges starting between the numerical geometry group ALICE in Nancy, the MOKAPLAN group in Paris working on optimal transport, and the Paris Institute of Astrophysics. The goal of the training period is to further develop the existing algorithm and software prototype, with the aim of pushing even further the size of the datasets that can be processed (from millions to billions). This requires both a mathematical analysis of a new hierarchical algorithm and a computer implementation.
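To give a flavour of the assignment problem involved, the sketch below uses the well-known fact that in one dimension, with a convex cost, optimal transport between equal-size point sets reduces to the monotone (sorted) matching. EUR works in 3D at the scale of millions of galaxies, where sorting no longer suffices and semi-discrete, hierarchical solvers are needed:

```python
# 1-D optimal transport toy: sort both point sets and pair them in order;
# this monotone matching minimises the total quadratic cost.
def ot_1d(source, target):
    """Optimal matching and its quadratic cost for 1-D point sets."""
    pairs = list(zip(sorted(source), sorted(target)))
    cost = sum((x - y) ** 2 for x, y in pairs)
    return pairs, cost

# "present-day positions" matched back to "initial positions" (made up)
pairs, cost = ot_1d([3.0, 1.0, 2.0], [1.5, 3.5, 2.5])
```

Each pair tells which initial position a present-day point should be evolved back to; in EUR the analogous 3D matching is what reconstructs the early-universe fluctuations.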


URL sujet détaillé : https://members.loria.fr/BLevy/positions/EUR_sujet.pdf

Remarques :



SM207-29 Topological optimization for 3D printing using Voronoi diagrams  
Admin  
Encadrant : Bruno LEVY
Labo/Organisme : Inria Nancy Grand-Est/Loria
URL : https://members.loria.fr/BLevy/
Ville : Nancy

Description  

General presentation of the topic: Recent developments in additive manufacturing technology (3D printing) make it possible for the first time to create shapes with internal structures or intricate geometries that were otherwise completely impossible to manufacture in traditional ways. This makes it possible to directly use the result of mathematical algorithms that compute the shape best adapted to a certain task, for instance resisting mechanical loads or conducting heat in a prescribed manner.

Objectives / Workplan: The objective of this internship is to start experimenting with a new algorithm for topological optimization, in which the shape to be optimized is represented as a Voronoi diagram, a well-known structure in computational geometry. We think that the representation as a Voronoi diagram will make it possible to adaptively capture the fine-scale geometric details of the optimal geometry. The first step will consist in expressing the functional of the optimization problem (compliance) in terms of the Voronoi diagram, and computing its derivative with respect to the parameters that define the Voronoi diagram (a set of points called generators, and possibly some prescribed cell volumes). The second step will consist in developing an optimization algorithm and starting to experiment with an implementation.
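The underlying partition can be sketched with a rasterised Voronoi diagram: each grid sample is assigned to its nearest generator and per-cell "volumes" are accumulated. This toy only shows the partition; the internship would work with exact cells and differentiate the compliance functional with respect to the generators:

```python
# Discretised 2-D Voronoi diagram on the unit square: label each of the
# n x n grid samples with its nearest generator and count cell areas.
def voronoi_grid(generators, n=32):
    cells = [[None] * n for _ in range(n)]
    volume = [0] * len(generators)
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) / n, (j + 0.5) / n   # sample centre
            k = min(range(len(generators)),
                    key=lambda g: (x - generators[g][0]) ** 2
                                + (y - generators[g][1]) ** 2)
            cells[i][j] = k
            volume[k] += 1
    return cells, volume

# two symmetric generators split the square into two equal cells
cells, volume = voronoi_grid([(0.25, 0.5), (0.75, 0.5)])
```

Moving a generator moves the cell boundaries, which is exactly the degree of freedom the optimization algorithm would exploit.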

URL sujet détaillé : https://members.loria.fr/BLevy/positions/TOPOPT_sujet.pdf

Remarques :



SM207-30 Inverse homogenization of implicit functions  
Admin  
Encadrant : Monique TEILLAUD
Labo/Organisme : Inria/Loria
URL : https://members.loria.fr/Monique.Teillaud/
Ville : Nancy

Description  

The objective of the internship is to perform inverse homogenization of periodic microstructures defined by an implicit function.

A first step will involve running the 3D periodic mesh generator of the CGAL library (www.cgal.org) on an implicit function.
A second step will require analytically relating the inverse homogenization gradients (computed with FEM from the tetrahedral mesh) to the parameters of the function.
Once the above pipeline is completed, we will use it to optimize novel types of implicit functions based on recent research on skeletal microstructures, as well as to explore which types of physical properties it can optimize for.
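As an illustration of a periodic microstructure defined by an implicit function, the sketch below samples the classic gyroid approximation on a grid and estimates the solid volume fraction of a thickened level set. The skeletal microstructures targeted by the internship would replace this particular function:

```python
import math

# Classic trigonometric approximation of the gyroid, a triply periodic
# implicit surface; points with |f| <= t form a "solid" shell of thickness
# controlled by t.
def gyroid(x, y, z):
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))

def volume_fraction(f, t, n=24):
    """Fraction of one periodic cell occupied by the thickened level set."""
    period = 2 * math.pi
    solid = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = [(u + 0.5) * period / n for u in (i, j, k)]
                if abs(f(*p)) <= t:
                    solid += 1
    return solid / n ** 3

frac = volume_fraction(gyroid, t=0.5)
```

The threshold t plays the role of a design parameter: homogenization relates such parameters to effective physical properties, and the inverse problem runs that relation backwards.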

URL sujet détaillé : https://members.loria.fr/Monique.Teillaud/positions/2017-microtructures-internship.pdf

Remarques : Co-supervised with Jonàs Martínez
(jonas.martinez-bayona.fr)
https://sites.google.com/site/jonasmartinezbayona/



SM207-31 Characterizing the Performance and Energy Efficiency of In- memory Storage Systems  
Admin  
Encadrant : Gabriel ANTONIU
Labo/Organisme : Inria
URL : https://team.inria.fr/kerdata/
Ville : Rennes

Description  

Most large popular web applications, like Facebook and Twitter [1-2], have been relying on large amounts of in-memory storage to cache data and offer a low response time. As the main memory capacity of clusters and clouds increases, it becomes possible to keep most of the data in main memory. This motivated the introduction of in-memory storage systems.

While prior work has mainly focused on how to exploit the low latency of in-memory access at scale, there is very little visibility into the energy efficiency of in-memory storage systems, even though it is known that main memory is a fundamental energy bottleneck in computing systems [3-4] (e.g., DRAM consumes up to 40% of a server's power).

The goal of this internship is first to identify relevant existing in-memory storage systems and clearly describe their goals and features. Potential target systems are Redis, Memcached, and Riak-KV. Then, an experimental evaluation is envisioned in order to identify trade-offs between performance and energy efficiency. Last, we expect to model these trade-offs or to propose a new metric that would make it possible to better quantify the energy efficiency of in-memory storage systems.
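One simple metric of the kind the internship could start from is performance per watt, i.e., requests served per joule. The sketch below computes it from invented placeholder figures, not from measurements of Redis, Memcached or Riak-KV:

```python
# Toy performance/energy trade-off comparison: derive requests per joule
# from measured throughput (ops/s) and average power draw (W).
def requests_per_joule(throughput_ops_s, power_w):
    return throughput_ops_s / power_w

measurements = {                      # (ops/s, watts) - illustrative only
    "system A": (120_000, 300.0),
    "system B": (150_000, 450.0),
}
efficiency = {name: requests_per_joule(*m) for name, m in measurements.items()}
best = max(efficiency, key=efficiency.get)
```

Note that the raw-throughput winner ("system B") is not the efficiency winner, which is precisely the trade-off the evaluation aims to expose.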

[1] R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee, H. C. Li, R. McElroy, M. Paleczny, D. Peek, P. Saab, D. Stafford, T. Tung, and V. Venkataramani, "Scaling Memcache at Facebook," in Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), Lombard, IL, 2013, pp. 385-398.
[2] M. Iravani, "How Twitter uses Redis to scale - 105TB RAM, 39MM QPS, 10,000+ instances," 2015. [Online]. Available: https://www.linkedin.com/pulse/how-twitter-uses-redis-scale-105tb-ram-39mm-qps-10000-iravani
[3] [Online]. Available: http://www.nrdc.org/energy/data-center-efficiencyassessment.asp
[4] A. N. Udipi, N. Muralimanohar, N. Chatterjee, R. Balasubramonian, A. Davis, and N. P. Jouppi, "Rethinking DRAM design and organization for energy-constrained multi-cores," in Proceedings of the 37th Annual International Symposium on Computer Architecture, ser. ISCA '10, 2010, pp. 175-186.
[5] Y. Taleb, S. Ibrahim, G. Antoniu, and T. Cortes, "Characterizing Performance and Energy-Efficiency of the RAMCloud Storage System," 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, 2017, pp. 1488-1498.
URL sujet détaillé : https://team.inria.fr/kerdata/master-internship-positions/

Remarques : Co-advisor: Yacine Taleb; yacine.taleb.fr



SM207-32 Audit the transparency mechanisms provided by social media advertising platforms  
Admin  
Encadrant : Oana GOGA
Labo/Organisme : LIG
URL : https://people.mpi-sws.org/~ogoga/
Ville : Grenoble

Description  

In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is being used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2].

To enhance transparency, Twitter recently introduced a feature (called "why am I seeing this") that provides users with an explanation of why they have been targeted with a particular ad. While this is a positive step, it is important to check that these transparency mechanisms are not actually deceiving users, leading to more harm than good. The goal of the project is to verify whether the explanations provided by Twitter satisfy basic properties such as completeness, correctness and consistency. The student will need to develop a browser extension or a mobile application that is able to collect the ad explanations received by real-world users on Twitter, and to conduct controlled ad campaigns so that we can collect the corresponding ad explanations.
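Two of the properties above can be sketched as simple set comparisons between the attributes an explanation cites and those actually used by the controlled campaign (the attribute names below are hypothetical, and real explanations need subtler checks):

```python
# Toy property checks for ad explanations: an explanation is *correct* if
# every attribute it cites was actually used in the campaign's targeting,
# and *complete* if it cites all of them.
def correct(explanation, targeting):
    return set(explanation) <= set(targeting)

def complete(explanation, targeting):
    return set(explanation) >= set(targeting)

targeting = {"age 25-34", "interest: cycling"}   # what the campaign used
shown = {"age 25-34"}                             # what the platform tells users
props = (correct(shown, targeting), complete(shown, targeting))
```

Here the explanation is correct but incomplete: it hides the interest-based criterion, which is exactly the kind of deception the audit aims to detect.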

Throughout the project, the student will become familiar with online targeted advertising ecosystems, learn to conduct online experiments and measure their impact, and reflect conceptually on what makes a good explanation.
URL sujet détaillé : https://people.mpi-sws.org/~ogoga/offers/Internship_2017_Twitter_explanations.pdf

Remarques :



SM207-33 Build a platform for increasing the transparency of social media advertising  
Admin  
Encadrant : Oana GOGA
Labo/Organisme : LIG
URL : https://people.mpi-sws.org/~ogoga/
Ville : Grenoble

Description  

In recent years, targeted advertising has been the source of a growing number of privacy complaints from Internet users [1]. At the heart of the problem lies the opacity of targeted advertising mechanisms: users do not understand what data advertisers have about them and how this data is being used to select the ads they are shown. This lack of transparency has begun to catch the attention of policy makers and government regulators, who are increasingly introducing laws requiring transparency [2].
The project consists in building a tool that provides explanations of why a user has been targeted with a particular ad, without needing the collaboration of the advertising platform. The key idea is to crowdsource the transparency task to users: build a collaborative tool where users donate, in a privacy-preserving manner, data about the ads they receive. The platform needs to aggregate data from multiple users (using machine learning) to infer (statistically) why a user received a particular ad. Intuitively, the tool will group together all users who received the same ad and look at the most common demographics and interests of the users in the group. The key challenge is to identify the limits of what we can statistically infer from such a platform.
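The aggregation step described above can be sketched with a simple frequency count over the recipients of one ad; the user data below is invented for illustration:

```python
from collections import Counter

# Toy crowdsourced inference: group the users who received the same ad and
# keep the attributes shared by a sufficient fraction of them.
user_attrs = {
    "u1": {"age 25-34", "paris", "cycling"},
    "u2": {"age 25-34", "lyon", "cycling"},
    "u3": {"age 45-54", "paris", "gardening"},
}
ad_recipients = {"ad42": ["u1", "u2"]}

def likely_targeting(ad, min_support=1.0):
    """Attributes shared by at least `min_support` of the ad's recipients."""
    users = ad_recipients[ad]
    counts = Counter(a for u in users for a in user_attrs[u])
    return {a for a, c in counts.items() if c / len(users) >= min_support}

guess = likely_targeting("ad42")
```

With full support the inferred targeting is {"age 25-34", "cycling"}; lowering `min_support` widens the guess and illustrates the statistical limits the project must characterize.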
Throughout the project, the student will become familiar with online targeted advertising ecosystems, learn to collect data from online services, and apply machine learning and statistics concepts to real-world data.
URL sujet détaillé : https://people.mpi-sws.org/~ogoga/offers/Internship_2017_crowdsouring.pdf

Remarques :



SM207-34 Assessing the trustworthiness of online identities  
Admin  
Encadrant : Oana GOGA
Labo/Organisme : LIG
URL : https://people.mpi-sws.org/~ogoga/
Ville : Grenoble

Description  

The lack of strong identities, i.e., secure and verifiable identities backed by a certificate from a trusted authority (e.g., a passport or social security number), has been a long-standing problem on the Web. While strong identities could provide better security for Web services, they have failed to achieve mass adoption because they significantly raise the sign-on barrier for new users and raise privacy concerns (users cannot be anonymous). Consequently, most Web services today provide weak identities that are not backed by any certificate. Unfortunately, weak identities are heavily exploited by malefactors to create multiple fake identities with ease. Such fake identities are traditionally known as Sybils [1] and are typically used to inundate services with spam, spread fake news, or post fake reviews. Not only do these problems impact our daily activities, but they also deeply affect the economies and political life of our countries. The goal of this project is to investigate methods to assess the trustworthiness of online identities and content (e.g., check whether a particular restaurant review is fake or real, or whether an individual is interacting with a legitimate user or an attacker).

More precisely, the project consists in investigating whether we can quantify the trustworthiness of online identities in monetary terms using the price of identities on black markets [2]. Popular black-market services like Freelancer and Taobao [6] allow job postings promising the creation of fake identities with different levels of grooming. With the availability of such job-posting data, the idea is very simple: given that the advertised black-market price for an identity groomed to a certain level is X, and that the expected utility that can be derived from the activities of this identity is Y, if X > Y we can expect that a rational attacker will have no incentive to use such an identity for malicious activities. The second step is to measure the extent to which we can increase the accountability of weak identities if we link the (potentially anonymous) identities of users across multiple systems.

Throughout the project, the student will become familiar with the different ways online systems can be exploited by attackers and with possible countermeasures, and will learn to collect data from online services and to apply and analyze this data.
URL sujet détaillé : https://people.mpi-sws.org/~ogoga/offers/Internship_2017_trust.pdf

Remarques :



SM207-35 Optimal design of parallelized light signals for optically controlled gene networks in single cells  
Admin  
Encadrant : Jakob RUESS
Labo/Organisme : InBio Unit - Institut Pasteur
URL : https://research.pasteur.fr/en/team/experimental-and-computational-methods-for-modeling-cellular-processes/
Ville : Paris

Description  

1. Context
With the exponential decrease of DNA synthesis costs, and the development of standardized cloning methods and liquid-handling robots, the automation of biological experiments will have a decisive impact on tomorrow's research, notably in the health and biotechnology domains. In fact, it has already started, and "cloud biology" companies offer access to "a fully automated cell and molecular biology laboratory, all from the comfort of web browsers" (Transcriptic). We are therefore entering an era in which novel experiments can be specified entirely on the computer, as control algorithms that give instructions to the experimental platform.

2. Problem and Goals
In collaboration with Calin Guet at IST Austria, we have recently constructed a computer-controlled experimental platform in which hundreds of single bacterial cells, growing in a microfluidic device (the mother machine), can be observed (by microscopy) and stimulated (through an optically inducible gene expression system) at the same time. Importantly, the observations are automatically processed, and the computer can make real-time decisions on how to continue stimulating each individual cell. This allows us to perform experiments that could provide much more information about the functioning of biochemical processes at the level of single cells than conventional experiments. However, given that this experimental platform is the first of its kind, it is unclear which experiments would be the most useful and how the resulting data should be analyzed. The goal of this project is to develop and implement mathematical methods that can be used to design experiments providing the maximal amount of information about the underlying gene expression dynamics. Our long-term goal is to develop push-button tools for the biology lab of the future that can be used to automatically characterize and model gene networks with minimal experimental effort.
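A minimal flavour of experiment design for information gain: for a one-parameter toy model in which a light-induced response decays as y(t) = exp(-k t) with Gaussian noise, the Fisher information contributed by a measurement at time t is (dy/dk)^2 = t^2 exp(-2kt), and a design picks the most informative measurement times. Both the model and the numbers are illustrative assumptions, not the platform's actual dynamics:

```python
import math

# Toy D-optimal-style design for the one-parameter decay model
# y(t) = exp(-k t): rank candidate measurement times by their Fisher
# information about k and keep the `budget` best ones.
def fisher_info(t, k):
    """Per-measurement information (dy/dk)^2 = t^2 * exp(-2 k t)."""
    return t ** 2 * math.exp(-2 * k * t)

def best_design(candidate_times, k, budget):
    """Greedily pick `budget` measurement times with the most information."""
    ranked = sorted(candidate_times, key=lambda t: fisher_info(t, k),
                    reverse=True)
    return sorted(ranked[:budget])

times = best_design([0.5, 1.0, 2.0, 4.0, 8.0], k=1.0, budget=2)
```

The information peaks at t = 1/k, so very early and very late measurements are wasted; the internship's methods generalize this idea to stochastic gene-expression models and closed-loop stimulation.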



URL sujet détaillé : https://lifeware.inria.fr/~batt/InternOptimalDesignOptosystem.pdf

Remarques : The work will be jointly supervised by Jakob Ruess and Gregory Batt at Institut Pasteur (Paris) in the newly created Inria/Pasteur InBio group. The project is at the core of the group's long-term research goals and, if successful, could serve as the basis of a PhD project in which the methods are developed further and applied in our wet lab.



SM207-36 calcul multiparties sécurisé (Secure Multiparty Computation) et blockchains  
Admin  
Encadrant : Daniel AUGOT
Labo/Organisme : INRIA Saclay--Île-de-France et LIX
URL : http://pages.saclay.inria.fr/daniel.augot/index-eng.html
Ville : Palaiseau

Description  

In 2015, the authors of the publication [5] and of the white paper [6] introduced a blockchain model, named Enigma, proposing a set of techniques for secure distributed computation. The project does not seem to have come to fruition as such: while the name Enigma remains (https://blog.enigma.co), it is not clear that the scientific ideas of the white paper were ever developed, and the surviving company seems far removed from them. An important cryptographic technique proposed by the authors is secure multiparty computation (MPC). These are protocols that allow n participants, each holding a secret, to compute a function (of their choice) of their secrets without revealing anything about those secrets. The problem is well illustrated by the socialist billionaires' problem, in which billionaires want to know who is the richest without revealing anything about their wealth [4]. The variant proposed here is secure in the information-theoretic sense (unconditionally secure) and relies essentially on secret sharing; see the recent book [3]. Secret sharing is in fact very close to algebraic coding theory. In the vision of the Enigma white paper, these protocols were to enable a platform for distributed computation over encrypted data.

On the other hand, secure multiparty computation suffers from the fairness problem: a dishonest participant who has obtained the result at some stage of the protocol can decide to withdraw before the end of the execution, depriving the other participants of the result. The authors of [1] (see also the ACM communication [2]) propose to use financial penalty mechanisms to incentivize participants to run the protocol to completion. These mechanisms can in particular be realized cryptographically with Bitcoin. This line of work mixing MPC and blockchains seems to have remained a dead letter; the objective of the internship is to take up these works and their ideas, understand their limitations, and see what is achievable today, for example in Ethereum, which at the time was not as popular as it is now.
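Secret sharing, identified above as the technical core of information-theoretic MPC, can be illustrated with a minimal sketch of Shamir's scheme (an illustration only, not the exact protocols of [1] or [3]):

```python
import random

P = 2 ** 61 - 1  # a large prime; all arithmetic is in the field Z/PZ

def share(secret, n, t):
    """Split `secret` into n shares, any t of which reconstruct it:
    evaluate a random degree t-1 polynomial with constant term `secret`."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # pow(den, P-2, P) is the modular inverse of den (Fermat).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

shares = share(123456789, n=5, t=3)
print(reconstruct(shares[:3]))  # any 3 of the 5 shares recover 123456789
```

Fewer than t shares reveal nothing about the secret, which is what makes the scheme unconditionally secure rather than merely computationally secure.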

References

[1] Marcin Andrychowicz et al. "Secure Multiparty Computations on Bitcoin". In: 2014 IEEE Symposium on Security and Privacy, SP 2014, Berkeley, CA, USA, May 18-21, 2014. IEEE Computer Society, 2014, pp. 443–458. ISBN: 978-1-4799-4686-0. https://doi.org/10.1109/SP.2014.35

[2] Marcin Andrychowicz et al. "Secure Multiparty Computations on Bitcoin". In: Commun. ACM 59.4 (Mar. 2016), pp. 76–84. ISSN: 0001-0782. http://doi.acm.org/10.1145/2896386

[3] Ronald Cramer, Ivan Damgård and Jesper Buus Nielsen. Secure Multiparty Computation and Secret Sharing. Cambridge University Press, 2015. ISBN: 9781107043053.
https://www.cwi.nl/news/2015/first-book-quantum-secure-multi-party-computation-published

[4] Andrew Chi-Chih Yao. "Protocols for Secure Computations (Extended Abstract)". In: 23rd Annual Symposium on Foundations of Computer Science, Chicago, Illinois, USA, 3-5 November 1982. IEEE Computer Society, 1982, pp. 160–164. https://doi.org/10.1109/SFCS.1982.38

[5] Guy Zyskind, Oz Nathan and Alex Pentland. "Decentralizing Privacy: Using Blockchain to Protect Personal Data". In: 2015 IEEE Symposium on Security and Privacy Workshops, SPW 2015, San Jose, CA, USA, May 21-22, 2015. IEEE Computer Society, 2015, pp. 180–184. ISBN: 978-1-4799-9933-0. https://doi.org/10.1109/SPW.2015.27

[6] Guy Zyskind, Oz Nathan and Alex Pentland. "Enigma: Decentralized Computation Platform with Guaranteed Privacy". In: CoRR abs/1506.03471 (2015). http://arxiv.org/abs/1506.03471

URL sujet détaillé :

Remarques : Matthieu Rambaud (https://perso.telecom-paristech.fr/rambaud/) will be co-supervisor



SM207-37 Low Latency Storage for Stream Data  
Admin  
Encadrant : Alexandru COSTAN
Labo/Organisme :
URL : https://team.inria.fr/kerdata/
Ville : Rennes

Description  

An increasing number of Big Data applications deal with small data (on the order of bytes or kilobytes). This trend is easily observed in domains like finance, weather forecasting, IoT, insurance or social networks. In such contexts, small items are continuously collected from the stream sources or are received from other stream processing computations. Even if the stream engines running the applications process such stream data on the fly, by passing it through the topology of stream operators, there is an increasing need to be able to store such items efficiently. Unlike traditional storage, the main challenge of storing stream data is the large number of small items (arriving at rates easily reaching tens of millions of items per second).

There is a clear need for a dedicated solution for low-latency stream storage. Such a solution should provide on the one hand traditional storage functionality and on the other hand stream-like performance (i.e., low-latency I/O access to items or ranges of items). The goal of this internship is to explore plausible paths towards such a dedicated storage solution: the main requirements and challenges, the design choices (e.g., a standalone component vs. an extension of an existing Big Data solution like HDFS), an architectural overview, and experimental validation on the Grid5000 platform as well as on public clouds (Microsoft Azure, Amazon WS).

To achieve this we plan to use several design principles: data partitioning schemes and the ability to deal with billions of small items, inspired by the Kafka approach to streaming; techniques for managing data in distributed caches that enable nanosecond-scale access times across large collections of items, as introduced in RamCloud and DXRam; dynamic metadata partitioning for increased scalability and support for high concurrency, as used in Ceph, Giraffa and CalvinFS; enhanced I/O through microbatches for reading/writing from/to HDFS, inspired by the Spark approach to data processing.
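As an illustration of the first and last design principles above (hash partitioning of small items plus microbatched writes), here is a toy in-memory sketch; the class name, API and parameters are invented for the example and are far simpler than any real stream store:

```python
import hashlib
from collections import defaultdict

class StreamStore:
    """Toy sketch: small items are routed to partitions by key hash, and
    appends are buffered into microbatches before being committed."""
    def __init__(self, n_partitions=8, batch_size=4):
        self.n = n_partitions
        self.batch_size = batch_size
        self.partitions = defaultdict(list)  # partition id -> committed items
        self.buffers = defaultdict(list)     # partition id -> pending microbatch

    def _partition(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % self.n

    def put(self, key, value):
        p = self._partition(key)
        self.buffers[p].append((key, value))
        if len(self.buffers[p]) >= self.batch_size:  # flush a full microbatch
            self.partitions[p].extend(self.buffers[p])
            self.buffers[p].clear()

    def flush(self):
        for p, buf in self.buffers.items():
            self.partitions[p].extend(buf)
            buf.clear()

    def get(self, key):
        p = self._partition(key)
        # Scan the partition backwards so the latest write wins.
        for k, v in reversed(self.partitions[p] + self.buffers[p]):
            if k == key:
                return v
        return None

store = StreamStore()
for i in range(100):
    store.put(f"sensor-{i % 10}", i)
store.flush()
print(store.get("sensor-3"))  # 93, the latest value written for that key
```

Real systems replace the linear scan with per-partition indexes and persist the committed microbatches, but the partition/buffer split is the same idea.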

References:
[1] Kreps Jay, Narkhede Neha, and Rao Jun. Kafka: A distributed messaging system for log processing. In Proceedings of 6th International Workshop on Networking Meets Databases, NetDB'11, 2011.
[2] RamCloud http://web.stanford.edu/~ouster/cgi-bin/projects.php
[3] Klein Florian and Schottner Michael. Dxram: A persistent in-memory storage for billions of small objects, PDCAT, IEEE, December 2013.
[4] Ceph http://ceph.com
[5] Giraffa https://github.com/GiraffaFS/giraffa/wiki
[6] Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, Kun Ren, Philip Shao, and Daniel J. Abadi. 2012. Calvin: fast distributed transactions for partitioned database systems. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD '12).
[7] Apache Spark http://spark.apache.org
[8] Irina Botan, Gustavo Alonso, Peter M. Fischer, Donald Kossmann, and Nesime Tatbul. Flexible and scalable storage management for data-intensive stream processing. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT '09, pages 934–945
[9] Fangjin Yang, Eric Tschetter, Xavier Léauté, Nelson Ray, Gian Merlino, and Deep Ganguli. Druid: A real-time analytical data store. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD '14, pages 157–168
[10] Grid5000 www.grid.5000.fr
URL sujet détaillé : https://team.inria.fr/kerdata/master-internship-positions/

Remarques : Main advisors:
Ovidiu Cristian Marcu (Inria) ovidiu-cristian.marcu.fr Alexandru Costan (IRISA) alexandru.costan.fr

Other advisors:
Gabriel Antoniu (Inria) gabriel.antoniu.fr
Radu Tudoran (Huawei Research) radu.tudoran.com
Bogdan Nicolae (Argonne National Laboratory) bogdan.nicolae.org

Required skills:
- Strong knowledge of computer networks and distributed systems
- Knowledge on storage and (distributed) file systems
- Strong programming skills (e.g. C/C++, Java)
- Working experience in the areas of Big Data management, Cloud computing, or HPC is an advantage
- Very good communication skills in oral and written English.

To apply, please email a CV to Dr. Gabriel Antoniu and Dr. Alexandru Costan.



SM207-38 Multimodal merging for social robots  
Admin  
Encadrant : Mathieu LEFORT
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/mathieu.lefort/
Ville : Lyon

Description  

To have a consistent perception of its surrounding world, an agent has to continuously merge the data flows coming from its various sensors. Numerous studies in animals and humans have investigated this process at the neuronal and behavioral levels. In particular, humans appear to be able to merge data in a statistically optimal fashion.
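The "statistically optimal fashion" referred to above is usually modeled as minimum-variance fusion of independent Gaussian cues, i.e. inverse-variance weighting; a minimal sketch (the numbers are invented for illustration):

```python
import numpy as np

def fuse(estimates, variances):
    """Minimum-variance fusion of independent Gaussian cues:
    weight each cue by the inverse of its variance."""
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances
    w /= w.sum()
    fused_mean = float(np.dot(w, estimates))
    fused_var = 1.0 / float(np.sum(1.0 / variances))  # lower than any single cue
    return fused_mean, fused_var

# Example: a precise visual cue and a noisy haptic cue about object size.
mean, var = fuse([10.0, 14.0], [1.0, 4.0])
print(mean, var)  # 10.8 0.8 -- the estimate leans toward the reliable cue
```

The fused variance (0.8) is smaller than either cue's variance, which is exactly the behavioral signature reported in human cue-combination experiments.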
Through this internship, the selected candidate will participate in the AMPLIFIER project, which aims to study how an artificial agent can autonomously learn to merge multiple data sources, and especially how it can attribute the right relevance to each source with respect to the precision of the sensors, the current stimulus, the task, etc. Our hypothesis is that active perception (i.e. active sampling of data in the environment, e.g. by moving the eyes to sample visual data) plays a key role in this process. We want to test it in psychophysics experiments and explore how it can improve human-robot interactions.
More precisely, the intern will study how to integrate the relevance of modalities in the neural field paradigm that was previously used for multimodal merging. On one hand, this work will be used for modeling psychophysical results (performed by other members of the project). On the other hand, it will be coupled with previous work on active perception and anticipation learning. This raises research questions on multimodal attention that will be studied during this internship.
The AMPLIFIER project (2018-2022) includes members from Lyon 1 University (LIRIS, CRNL), Univ. Grenoble Alpes (LJK, Gipsa-lab, LPNC) and Hoomano, a start-up located in Lyon. The intern will thus interact with multiple collaborators (academic and industry researchers, engineers) from multiple domains (computer science, psychophysics, statistics, computational neuroscience, robotics).
URL sujet détaillé : http://liris.cnrs.fr/mathieu.lefort/jobs/subject_M2_amplifier.pdf

Remarques : Co-advising: Jean-Charles Quinton (associate professor at SVH team, LJK laboratory, Grenoble)



SM207-39 Liens entre exigences et spécifications formelles  
Admin  
Encadrant : Jeanine MME SOUQUIÈRES
Labo/Organisme : LORIA (CNRS, INRIA, Université de Lorraine)
URL : http://dedale.loria.fr
Ville : Nancy

Description  

Improving the quality of a system begins with requirements elicitation. Our goal is to take validation into account from the initial understanding of the requirements and throughout the development of their Event-B specification. Our challenge is to bridge the gap between the requirements, those of the client, and the specification, that of the computer scientist. We make explicit the interactions between the requirements and the specification under construction.

This subject consists in analyzing the requirements of a case study concerning the control of a hemodialysis machine. It consists of:
- a detailed understanding of the case study,
- the development of these requirements into an Event-B specification,
- using the Rodin platform tools: the ProR plug-in, which allows us to manage the traceability between the requirements and their specification, and different verification and validation tools (prover, ProB, animator, model-checker).



URL sujet détaillé : http://dedale.loria.fr/ (sujet-lyon-2018.pdf)

Remarques : Remuneration possible



SM207-40 Algorithmique pour les flots de liens  
Admin  
Encadrant : Clemence MAGNIEN
Labo/Organisme : LIP6 (CNRS, UPMC)
URL : http://www.complexnetworks.fr/
Ville : Paris

Description  

Networks occur in many different contexts: social networks, web graphs, interactions between proteins, word co-occurrence networks, emails, etc. The state of the art has begun to provide solutions for analysing the structure of these networks, in order to answer questions such as: which are the most important nodes? Can the network be decomposed into relevant groups?

When the networks are dynamic, the usual approach consists in aggregating the information in the form of a graph. However, in many contexts (network traffic, financial transactions, communications between people, etc.), the data comes in the form of link streams: a series of triples (A, B, t) meaning that A interacted with B at time t (for instance, computer A sent a packet to computer B, or bank account A sent money to bank account B, etc.).
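A minimal sketch of the link-stream representation and of two quantities one can compute on it (the functions are illustrative, not the fundamental notions the internship will define):

```python
from collections import defaultdict

# A link stream is a series of triples (A, B, t): A interacted with B at time t.
stream = [("a", "b", 1), ("b", "c", 2), ("a", "b", 3), ("a", "c", 4), ("b", "c", 6)]

def degrees(stream):
    """Number of distinct neighbours of each node over the whole stream,
    i.e. the degree in the aggregated graph."""
    neigh = defaultdict(set)
    for a, b, t in stream:
        neigh[a].add(b)
        neigh[b].add(a)
    return {v: len(s) for v, s in neigh.items()}

def links_in_window(stream, t0, t1):
    """Restriction of the stream to a time window -- information that
    the aggregated graph loses."""
    return [(a, b, t) for a, b, t in stream if t0 <= t <= t1]

print(degrees(stream))               # {'a': 2, 'b': 2, 'c': 2}
print(links_in_window(stream, 2, 4)) # [('b', 'c', 2), ('a', 'b', 3), ('a', 'c', 4)]
```

The contrast between the two functions is the point: aggregation collapses all timing information, whereas the stream itself supports time-aware notions.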

The goal of this internship is to participate in the definition and computation of fundamental notions for describing link streams.
URL sujet détaillé : http://www.complexnetworks.fr/wp-content/uploads/2016/10/algo.pdf

Remarques :



SM207-41 Composition d'ordonnancements parallèles  
Admin  
Encadrant : Matthieu MOY
Labo/Organisme : Laboratoire d'Informatique du Parallélisme (LIP)
URL : https://matthieu-moy.fr/spip/?Master-2-Recherche-Composition-d-ordonnancements-paralleles
Ville : Lyon

Description  

Since the early 2000s, the limits of transistor miniaturization have forced supercomputer designers to multiply computing units (processors, specialized processors) to improve performance. Many obstacles remain, such as efficiently distributing an application across the computing units. To do so, the application must be divided into units to be executed in parallel, and the communications between units must be tuned. Algorithms exist that can automatically parallelize compute-intensive kernels such as matrix operations.

The newly created CASH team works on innovative approaches for extracting parallelism into an intermediate representation, with the goal of producing efficient code for hardware accelerators such as FPGAs. Our tools are so-called "source-to-source" compilers, which read sequential C code, extract its parallelism, and then generate code that makes the parallelism explicit.

The objective of this internship is to experiment with scaling strategies for a parallelization algorithm. We assume the kernel is divided into several sub-kernels for which parallelization is possible. The goal of the internship is to find a way to compose the parallelizations of the sub-kernels so as to produce a global parallelization that is both correct and efficient.
URL sujet détaillé : https://matthieu-moy.fr/spip/IMG/pdf/tiling-composition.pdf

Remarques : Co-supervised by Christophe Alias and Matthieu Moy.



SM207-42 Elastic Deployment of a Replicated Public Key Infrastructure (PKI)  
Admin  
Encadrant : Pierpaolo CINCILLA
Labo/Organisme : IRT-SystemX
URL : http://www.irt-systemx.fr/en/project/sca/
Ville : Paris-Saclay

Description  

IRT SYSTEMX CONTEXT
Located within a global scientific campus, the Research Institute of Technology SystemX is dedicated to the digital engineering of future systems. Working closely with the best French research organizations, it brings together academic and industrial research teams. SystemX has the ambition to generate major technologies and solutions through digital engineering breakthroughs and to spread these skills across all economic sectors.
You will take part in a collaborative research project called Secure Cooperative Autonomous systems (SCA). The SCA project aims to provide operational solutions to the new technological and economic challenges of the automotive environment. The project partners are: Renault, PSA Peugeot Citroën, Valeo, Transdev, IDnomic, Trialog, Yogoko, Oppida, and Telecom ParisTech.

INTERNSHIP CONTEXT
You will be part of a project team composed of three interns working in collaboration within the SCA project.
One of the objectives of the SCA project is to implement the security management infrastructure of Cooperative Intelligent Transport Systems (C-ITS). The stakes are high because these ITS systems must be able to securely process thousands of messages exchanged per second and provide strong guarantees for the protection of personal data in accordance with national legislation and EU directives. The reliable and secure infrastructure (PKI) developed in the ISE project will therefore need to address large-scale design issues in order to distribute billions of digital identities to the embedded ITS stations. Moreover, the PKI should be able to adapt to the incoming load: it should increase the allocated resources when the load increases and release them when the load decreases.
The internship focuses on two main topics: (i) a research axis on adaptive PKI dimensioning and (ii) a practical axis on PKI deployment to the cloud (in particular Kubernetes). The selected candidate will define the objectives and requirements of dynamic PKI replication, draw up a dynamic dimensioning strategy, and measure its impact on the cost/performance ratio. The internship involves research work on distributed systems, dynamic replication and cooperative ITS. This research work will be submitted for publication to an international conference or workshop.
YOUR MISSIONS

- Definition of dynamic dimensioning objectives/requirements
- Definition of the replication strategy
  o Parameters to measure (CPU, memory, network) in order to decide when to increase/decrease the resources
  o Strategy for increasing/decreasing resources
- Measurement campaign
- Results analysis and benchmarking
- Submission of the results to an international conference or workshop
The ideal candidate should meet the following criteria: Master's degree (ongoing), experience in software development and distributed systems, and an interest in research. The internship lasts six months and is located at the IRT-SystemX headquarters on the Paris-Saclay campus (France).
YOUR SKILLS
- Good knowledge of software engineering and deployment (Java, Jenkins)
- Knowledge of Docker / Kubernetes is an advantage
URL sujet détaillé : https://pages.lip6.fr/Pierpaolo.Cincilla/downloads/Projet_CREE_2018_SCApi_03_03_en.pdf

Remarques : You will be part of a project team composed by 3 interns working in collaboration within the SCA project at IRT-SystemX.



SM207-43 Algorithms for exact sum of squares decompositions  
Admin  
Encadrant : Victor MAGRON
Labo/Organisme : LIP6 Equipe-projet INRIA/CNRS/UPMC PolSys
URL : http://www-verimag.imag.fr/~magron
Ville : Paris

Description  

Certification and validation of computational results is a major issue for modern sciences, raising challenging problems at the interplay of mathematics and computational aspects of computer science. One can emphasize in this context several applications arising in the design of modern cyber-physical systems with a crucial need for exact certification. In particular, one tries to avoid incidents such as the Patriot missile crash in 1991, the FDIV Pentium bug in 1994 or, more recently, the collision of Google's self-driving car in 2016. These issues give rise to many mathematical problems. Polynomial optimization (which consists in computing the infimum of a polynomial function under algebraic constraints) is one of the most important and challenging ones. The emergence of this exciting new field goes back to the last decade and has led to striking developments from a cross-fertilization between (real) algebraic geometry, applied mathematics, theoretical computer science and engineering.

Consider for instance the problem of minimizing 4x^4 + 4x^3y - 7x^2y^2 - 2xy^3 + 10y^4 over R^2. One way to certify that its minimum is 0 is to decompose this polynomial as a sum of squares (SOS), which is the core subject of study in real algebra. Here the decomposition is (2xy + y^2)^2 + (2x^2 + xy - 3y^2)^2. In general, one can compute such SOS decompositions by solving a semidefinite program (SDP), which is a standard tool in applied mathematics and convex optimization. In SDP, one optimizes a linear function under the constraint that a given matrix is positive semidefinite, i.e. has only nonnegative eigenvalues. One particular issue arising when relying on SDP solvers is that they are often implemented using floating-point arithmetic, and thus output only approximate certificates.

The challenging goal of this internship is to design algorithms to compute exact certificates while controlling the bit complexity of the algorithmic procedures.

Preliminary work will consist in studying the existing algorithms to obtain exact SOS decompositions of nonnegative polynomials. In particular, the case of univariate polynomials has recently been handled by means of classical techniques from symbolic computation (root isolation, square-free decomposition). A promising research track would be to apply the certification algorithms to a multivariate polynomial through a reduction to the univariate case. The idea is to characterize the set of minimizers of this polynomial by exploiting the information given by the Jacobian. After designing the certification framework, further efforts should provide the related bit complexity estimates, both on runtime and output size. Practical experiments shall be performed by implementing a tool within a computer algebra system such as Maple. One expected goal is to compare the performance of the tool with existing frameworks.
URL sujet détaillé : http://www-verimag.imag.fr/~magron/sujets/multisos.pdf

Remarques : Co-supervised with Mohab Safey El Din (https://www.lip6.fr/Mohab.Safey)

Remuneration provided



SM207-44 Liens entre génération aléatoire et génération exhaustive  
Admin  
Encadrant : Yann STROZECKI
Labo/Organisme : Université Pierre et Marie Curie (LIP6)
URL : http://www.prism.uvsq.fr/~ystr/index.html
Ville : Paris

Description  

Usually, we measure the time an algorithm takes to provide a solution. Enumeration algorithms provide all solutions satisfying some constraint, and their number can be huge. Therefore, the usual notion of complexity is replaced by the delay between two consecutive solutions. In this internship we will try to further connect the random generation of solutions and polynomial-delay exact algorithms.
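A minimal example of an enumeration algorithm with polynomial delay (a standard backtracking scheme, given only as an illustration of the complexity notion):

```python
def enum_no_11(n, prefix=""):
    """All binary strings of length n with no two consecutive 1s.
    Every recursive branch leads to at least one solution, so at most
    O(n) work separates two consecutive outputs: polynomial delay."""
    if len(prefix) == n:
        yield prefix
        return
    yield from enum_no_11(n, prefix + "0")
    if not prefix.endswith("1"):  # prune branches that would create "11"
        yield from enum_no_11(n, prefix + "1")

sols = list(enum_no_11(4))
print(len(sols))  # 8 solutions for n = 4 (a Fibonacci count)
```

Note that the total number of solutions grows exponentially in n, yet each one is produced after only polynomial work, which is what makes the delay the right complexity measure.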

More details in the pdf at http://www.prism.uvsq.fr/~ystr/enum_random.pdf.
URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/enum_random.pdf

Remarques :



SM207-45 Étude des jeux stochastiques simples et d'une famille de méthodes pour les résoudre  
Admin  
Encadrant : Yann STROZECKI
Labo/Organisme : Université Pierre et Marie Curie (LIP6)
URL : http://www.prism.uvsq.fr/~ystr/index.html
Ville : Paris

Description  

We are interested in a two-player game with randomness played on a graph, called a simple stochastic game (SSG). Finding optimal strategies is known to be in NP ∩ coNP, but no polynomial-time algorithm is known. The aim of the internship is to study a new method for solving SSGs which captures several existing algorithms for simpler subclasses.
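For intuition, here is value iteration on a toy SSG with MAX, MIN and average (coin-flip) vertices. It is a classical method, not the new one to be studied: it converges to the optimal values but is not known to run in polynomial time, which is precisely the gap the subject addresses.

```python
# Toy SSG: each vertex is MAX, MIN, AVG (fair coin flip) or a sink with a
# fixed payoff. MAX tries to reach "win", MIN tries to avoid it.
graph = {
    "max":  ("MAX",  ["min", "avg"]),
    "min":  ("MIN",  ["avg", "lose"]),
    "avg":  ("AVG",  ["win", "lose"]),
    "win":  ("SINK", 1.0),
    "lose": ("SINK", 0.0),
}

def value_iteration(graph, iters=200):
    # Start from 0 everywhere except the sinks, then apply the local
    # optimality equations until the values stabilize.
    v = {u: (d if kind == "SINK" else 0.0) for u, (kind, d) in graph.items()}
    for _ in range(iters):
        for u, (kind, succ) in graph.items():
            if kind == "MAX":
                v[u] = max(v[s] for s in succ)
            elif kind == "MIN":
                v[u] = min(v[s] for s in succ)
            elif kind == "AVG":
                v[u] = sum(v[s] for s in succ) / len(succ)
    return v

vals = value_iteration(graph)
print(vals["max"])  # 0.5: MAX's best move is to go to the coin flip
```

Reading off an optimal strategy is then immediate: at each MAX (resp. MIN) vertex, move to a successor realizing the max (resp. min).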
URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/ssg_internship.pdf

Remarques : Co-supervised with David Auger and Pierre Coucheney of UVSQ.
The intern may be located either at Jussieu (Paris 6) or at Versailles (UVSQ).



SM207-46 Randomness and Continuous Time Dynamical Systems. Kolmogorov Complexity and Analog Models of Computation  
Admin  
Encadrant : Olivier BOURNEZ
Labo/Organisme : LIX (Laboratoire d'Informatique de l'Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~bournez/
Ville : Palaiseau

Description  

General Introduction

In practice, the generation of randomness often relies on purely continuous (time and space) processes. This includes the most emblematic example of randomness generation: rolling a dice. While the dynamics of a dice are given by an easy-to-establish, purely deterministic law of evolution, its output, in the form of a roll in {1, 2, . . . , 6}, is considered by everybody to be random.

From a mathematical point of view, its evolution is given by some polynomial ordinary differential equation y′ = p(y) starting from some initial data y(0) = y0. Mathematical theories and theorems have been devised to model this phenomenon and explain how a purely deterministic process can generate randomness.
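A minimal sketch of simulating such a polynomial ODE y′ = p(y) numerically; the one-dimensional logistic equation stands in here for the (much more complex, chaotic) dice dynamics:

```python
# Explicit Euler simulation of a polynomial ODE y' = p(y), here the logistic
# equation y' = y(1 - y): a purely deterministic law of evolution, like the
# dice, but one-dimensional and non-chaotic for readability.
def simulate(p, y0, t_end, dt=1e-3):
    y, t = y0, 0.0
    while t < t_end:
        y += dt * p(y)  # one Euler step
        t += dt
    return y

y_final = simulate(lambda y: y * (1 - y), y0=0.1, t_end=10.0)
print(y_final)  # converges toward the equilibrium y = 1
```

For the dice, the same deterministic scheme applied to the full rigid-body equations produces outcomes that are extremely sensitive to the initial condition y0, which is where the apparent randomness comes from.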

This includes ergodic theory, which studies dynamical systems with an invariant measure and related problems, initially motivated by problems of statistical physics. One of its main concerns is the behavior of dynamical systems when they are allowed to run for a long time, under the restriction to systems preserving some measure. In that theory, the randomness of a dice is explained by the uncertainty present in the initial condition y0, modeled as random, and its dynamics is seen as a process preserving this initial randomness.

Computer Science Issues

Computable versions of several theorems of ergodic theory have been devised, in particular in the framework of Computable Analysis [4].

However, another approach, probably less common in 2017 (though it was the norm at the time of the first computers, which were analog), is very natural: a dice can be considered as a process computing a value in {1, 2, . . . , 6}. More generally, every system modeled by an ordinary differential equation can be considered as performing a computation in a similar sense. This was the approach in the era of analog computers (namely up to the 70's): given a continuous system to be simulated, one built an analog computer, that is, a mechanical and later on electronic process, whose evolution corresponded to the system to be modeled. The term analog in analog computation comes historically from the idea of computing by analogy, even if in recent years it is mostly understood as the opposite of digital: indeed, these machines also worked on continuous quantities such as angles or voltages, and not on bits as in today's computers.

Recent results have established that these models are naturally captured by polynomial ordinary differential equations, and that polynomial ordinary differential equations have many similarities with Turing machines. In particular, they can simulate Turing machines (and conversely). The class of solutions of polynomial ordinary differential equations has many closure properties rather similar to those of the notion of computable function. A recent breakthrough, obtained in the PhD thesis of Amaury Pouly, was to establish that time corresponds, in a deep sense, to the length of solutions. One interest of such results is that they provide ways to present concepts from computability or complexity about continuous functions in an implicit way (or in an internal way, if you prefer): one can explain what a computable function over the reals is without any reference to digital concepts such as Turing machines, but only with concepts very natural in analysis (namely length, distance, etc.). The PhD thesis of Amaury Pouly [2], co-supervised by Olivier Bournez and Daniel Graça, was awarded the selective Ackermann PhD Award in August 2017.

The purpose of this work is to understand whether a nice theory of randomness based on Kolmogorov complexity can be built for these models. This is related to the project RACAF funded by the ANR (French National Research Agency).

Description of the work

Coming back to our example, one sees that it is possible to generate a random number in {1, 2, . . . , 6} using a polynomial ordinary differential equation: namely, the one corresponding to the dice dynamics. Can one generate any function? We answered this question positively through the result published in ICALP'17 [1], solving an open question posed by Rubel in 1981. This has been noticed, and a post in the well-known blog Gödel's Lost Letter and P=NP has been devoted to our result [3].

These results establish the first steps towards possibly defining concepts similar to Kolmogorov complexity for continuous functions. Of course, Kolmogorov complexity could probably be defined by transferring the questions back to the digital world, but we believe these concepts can possibly be defined in an implicit/internal way, without any reference to concepts such as Turing machines. For example, by replacing the concept of "simplest program" by the simplest ordinary differential equation/initial condition, one may expect theorems similar to those known in the classical setting. The most fundamental definitions and statements in the classical setting rely indeed on the fact that there exist universal Turing machines. The results established in [1] are clearly a first substantial step towards this, as they provide a way to talk about universal (polynomial) ordinary differential equations.


References

[1] Olivier Bournez and Amaury Pouly. A universal ordinary differential equation. In International Colloquium on Automata, Languages, and Programming, ICALP 2017, 2017.
[2] Amaury Pouly. Continuous models of computation: from computability to complexity. PhD thesis, École Polytechnique and Universidade do Algarve, defended on July 6, 2015. https://pastel.archives-ouvertes.fr/tel-01223284. Prix de Thèse de l'École Polytechnique 2016, Ackermann Award 2017.
[3] R. J. Lipton and K. W. Regan. Modeling reality. Post on the blog "Gödel's Lost Letter and P=NP". https://rjlipton.wordpress.com/2017/08/09/modeling-reality/
[4] K. Weihrauch. Computable Analysis: an Introduction. Springer, 2000.

URL sujet détaillé : http://www.lix.polytechnique.fr/~bournez/load/sujet-2017-18-randomness-ens-lyon.pdf

Remarques :
The true topic of the work is related to computability theory. It requires only common and basic knowledge of ordinary differential equations. Most of the intuitions behind our current constructions come from classical computability.

There is no specific prerequisite for this internship, except some knowledge about computability theory. This subject can be extended to a PhD. Possibilities of funding according to the administrative situation of candidates.



SM207-47 Turing Neural Computer, Differential Neural Computer, Neural Stack Machine, Neural Network Pushdown Automata, etc. Some Recent Models from Deep Learning from a Computability and Complexity Theory Perspective  
Admin  
Encadrant : Olivier BOURNEZ
Labo/Organisme : LIX (Laboratoire d'Informatique de l'Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~bournez/
Ville : Palaiseau

Description  


General Introduction

The successes of deep learning models and methods in impressive applications across many fields of artificial intelligence, such as speech recognition, image recognition and natural language processing, are currently giving rise to a clear renewal of interest in analog computations, that is to say, in models working over continuous quantities instead of bits. To cite just one application, the first machine able to beat the best professional Go players is based on deep learning technology.

Current deep learning models may have a rather complex architecture built from various modules, but most of these modules are essentially artificial neural network models. An artificial neural network (NN) consists of many simple, connected processors called neurons. The state of each neuron is given by some real number called its activation value. Input neurons get their activation value from the environment. Every other neuron evolves by applying the composition of a certain one-variable function σ (usually a sigmoid) with an affine combination of the activations of the neurons to which it is connected. Some of the neurons, considered as output neurons, may trigger actions on the environment: see [1] for an overview.
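The neuron update rule just described can be sketched as follows; the network topology, weights and neuron names are invented for the example:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def step(activations, weights, biases, inputs):
    """One synchronous update of a recurrent NN: each non-input neuron
    applies sigma to an affine combination of the activations of the
    neurons it is connected to."""
    new = dict(inputs)  # input neurons take their value from the environment
    for j in weights:
        z = biases[j] + sum(w * activations[i] for i, w in weights[j].items())
        new[j] = sigmoid(z)
    return new

# Tiny example: one input neuron 'x', a recurrent neuron 'h', an output 'out'.
weights = {"h": {"x": 2.0, "h": -1.0}, "out": {"h": 3.0}}
biases = {"h": 0.0, "out": -1.5}
state = {"x": 1.0, "h": 0.0, "out": 0.0}
for _ in range(5):
    state = step(state, weights, biases, {"x": 1.0})
print(state["out"])  # a real activation value in (0, 1)
```

Seen this way, the network is a discrete-time dynamical system over real-valued states, which is exactly the computation-theoretic viewpoint taken below.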

From a computation theory point of view, these models are discrete time models of computation working over some continuous space.

Most of the work on (deep) artificial neural networks is nowadays devoted to finding architectures and weights that make a NN exhibit a desired behavior in a given context of application. A very popular and successful technique to determine suitable weights for solving a given problem is back-propagation, a gradient-descent method applied to an error function to be minimized over the training set: see e.g. [1].

Applying any gradient-descent method requires an error function that is differentiable with respect to all involved parameters. The very numerous applications of deep learning techniques have recently given birth to various other analog models of computation. All these models have in common that they are end-to-end differentiable versions of models inspired by classical computability. This includes the popular Long Short-Term Memory (LSTM) architecture [5], the so-called Differentiable Neural Computers [3], and the Neural Turing Machines [2]. The underlying principle of these constructions is to extend an artificial neural network by coupling it to an external memory resource. This external resource can also be a stack, as in the so-called Neural Network Pushdown Automata [7] and Neural Stack Machines [4].

Description of the work

The computational power of these models is, from our point of view, not really understood. Most of them have not really been considered from a computability or complexity point of view.

The computational theory of neural network models has been rather well studied since the 1990s: see e.g. [6]. However, all these models have new aspects (such as an external memory) which may make them differ substantially from classical neural network models. Hence, we believe that these models are not clearly covered by existing results.

The purpose of this work is to compare these analog models to classical models. We started to investigate this through the L3 internship of Florent Guépin (from ENS Lyon). We believe that many rather nice statements and theorems can be established to pursue this preliminary study. The purpose is to pave the way toward an understanding of these new models from a computability and complexity theory point of view.

Références

[1] Christopher M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[2] Alex Graves, Greg Wayne, and Ivo Danihelka. Neural Turing machines. arXiv preprint arXiv:1410.5401, 2014.
[3] Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, et al. Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626):471–476, 2016.
[4] Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, and Phil Blunsom. Learning to transduce with unbounded memory. In Advances in Neural Information Processing Systems, pages 1828–1836, 2015.
[5] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[6] Hava T. Siegelmann. Neural Networks and Analog Computation: Beyond the Turing Limit. Birkhäuser, 1999.
[7] G. Z. Sun, C. L. Giles, H. H. Chen, and Y. C. Lee. The neural network pushdown automaton: model, stack and learning simulations. 1993.
URL sujet détaillé : http://www.lix.polytechnique.fr/~bournez/load/sujet-2017-18-models-deep-learning-ens-lyon.pdf

Remarques : The true topic of the work is related to computability and complexity theory.

There is no specific prerequisite for this internship, except some very basic knowledge of computation theory. This subject can be extended to a PhD. Funding possibilities depend on the administrative situation of the candidate.



SM207-48 Polyhedral Abstract Interpretation  
Admin  
Encadrant : Laure GONNORD
Labo/Organisme : LIP
URL : laure.gonnord.org/pro
Ville : LYON

Description  

Context : program analysis for compilers

The polyhedral model is a collection of techniques developed around a common intermediate representation of programs : integer polyhedra.
However it suffers from a lack of expressivity, since non "regular" loops cannot be represented nor analysed within this framework.

The objective of the internship is to reformulate the concepts and algorithms developed within the polyhedral model in a more general framework, namely abstract interpretation.
The expected result is a way to represent and compute "dependences" for general programs as well as a prototype that demonstrates the feasibility of the approach.
URL sujet détaillé : http://laure.gonnord.org/pro/research/2018-IApoly.pdf

Remarques : I have funding for this subject, as well as funding for a PhD on a closely related subject, beginning September 2018.



SM207-49 Characterization of NP or PSPACE with polynomial ODEs. Continuous-time analog machines and models of computation.  
Admin  
Encadrant : Olivier BOURNEZ
Labo/Organisme : LIX (Laboratoire d'Informatique de l'Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~bournez/
Ville : Palaiseau

Description  

General Presentation

Today's theoretical computer science, and in particular classical computability and complexity theory, mostly considers computations over a discrete space with a discrete time. This aims at modeling today's computers, which are digital computers working over bits. It covers today's machines and today's classical models, such as Turing machines working over words over a finite alphabet with discrete time.

But machines where time is continuous can be considered, and can indeed be built. Such machines are analog: they work over continuous quantities such as voltages. Notice that the first programmable computers ever built were actually analog machines. This includes, for example, the differential analysers, which were first mechanical machines working over quantities encoded by angles of shafts, and later electronic machines working over continuous voltages. Such machines were typically used to solve ordinary differential equations. Building an analog computer today poses no major difficulty: see e.g. [5] for a book about the history of computing that, unlike most such books, does not forget analog computing.

It turns out that the corresponding computability and complexity theory has not received much attention: even if models of computation where the space is continuous and time remains discrete have been considered (see e.g. [1] or [6]), these models are still discrete-time.

Notice that there is no hope of getting a unification of the Church-Turing-thesis type for all such models. However, there is indeed something similar in many respects for continuous-time and continuous-space models.

The purpose of this internship is to focus on these latter models, i.e. truly analog (continuous time and space) models of computation such as differential analysers or electronic analog computers. In this context, for many reasons, the equivalent of a Turing machine can be considered to be a polynomial ordinary differential equation (pODE). Indeed, the (projections of) solutions of pODEs, which can be considered as the analog of computable functions, enjoy many stability properties similar to those of computable functions. All common analytic functions are in this class, an observation similar to the fact that all common functions in mathematics are computable. This class is stable under most operations (addition, multiplication, subtraction, division, inverse, composition, ODE solving, etc.). Some analytic computable functions are known not to be in this class. However, if a modern definition of computability is considered for pODEs, then the functions computable by Turing machines and by pODEs coincide.
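As an illustration of the class discussed above, sine and cosine arise as (projections of) the solution of the pODE y1' = y2, y2' = -y1. A minimal forward-Euler sketch, with step size and initial values chosen purely for illustration:

```python
import math

def simulate_pode(y1, y2, t_end, h=1e-4):
    """Forward Euler for the pODE  y1' = y2,  y2' = -y1."""
    t = 0.0
    while t < t_end:
        y1, y2 = y1 + h * y2, y2 - h * y1
        t += h
    return y1, y2

# With y1(0) = 1, y2(0) = 0 the exact solution is y1(t) = cos(t), y2(t) = -sin(t),
# so at t = pi/2 we expect roughly (0, -1).
c, s = simulate_pode(1.0, 0.0, math.pi / 2)
```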

Context of the work

We recently proved that there is a very natural and well-founded way of measuring the computation time of such functions: time corresponds to the length of the trajectory.

This surprising result was awarded the ICALP 2016 best paper award (Track B), and led to the (European PhD) Ackermann Award 2017 for Amaury Pouly, co-supervised by Daniel Graça in Portugal and myself in Palaiseau, France.

On the one hand, this demonstrates that analog models of computation (in particular, the oldest and first-ever considered models of machines) are equivalent to modern models both at the computability level (we established this fact about 10 years ago [3]) and at the complexity level (this was an open problem, and part of the awarded results [4, 2]).

More importantly, on the other hand, this sheds new light on classical computability and complexity: indeed, it proves that polynomial time (class P or FP, for decision problems and functions respectively) can be defined very easily using only very simple concepts from analysis: polynomial ordinary differential equations and lengths of curves. For example, this yields the following rather unexpected fact: a function over the reals is computable in polynomial time iff it can be uniformly approximated by solutions of a polynomial ordinary differential equation of polynomial length.
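To illustrate the "time = length of the trajectory" idea on a toy case, the sketch below numerically accumulates the trajectory length of y' = y with y(0) = 1; the choice of pODE and step size are ours, for illustration only. In one dimension the length up to time T is the integral of |y'(t)| dt, here e^T - 1, so this exponential trajectory is "long" in the sense of the cost model:

```python
import math

def trajectory_length(t_end, h=1e-5):
    """Accumulate the length of the trajectory of y' = y, y(0) = 1 (forward Euler)."""
    y, length, t = 1.0, 0.0, 0.0
    while t < t_end:
        dy = h * y          # Euler step of y' = y
        length += abs(dy)   # arc length contribution
        y += dy
        t += h
    return length

print(trajectory_length(1.0))  # ≈ e - 1 ≈ 1.718
```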

Description of the work

The purpose of this internship is to understand whether classes such as NP or PSPACE can also be characterized in this way. One of the purposes is, for example, to formulate the question P = NP? in the framework of (mathematical) analysis.
We have some preliminary statements in this direction, but the goal is to reach a nicer and more elegant characterization of NP and PSPACE.


Références

[1] L. Blum, M. Shub, and S. Smale. On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines. Bulletin of the American Mathematical Society, 21(1):1–46, July 1989.
[2] O. Bournez, D. S. Graça, and A. Pouly. Polynomial Time corresponds to Solutions of Polynomial Ordinary Differential Equations of Polynomial Length. Journal of the ACM, 2017. Accepted for publication.
[3] Olivier Bournez, Manuel L. Campagnolo, Daniel S. Graça, and Emmanuel Hainry. Polynomial differential equations compute all real computable functions on computable compact intervals. Journal of Complexity, 23(3):317–335, June 2007.
[4] Amaury Pouly. Continuous models of computation: from computability to complexity. PhD thesis, École Polytechnique and Universidade do Algarve, defended on July 6, 2015. https://pastel.archives-ouvertes.fr/tel-01223284. Prix de Thèse de l'École Polytechnique 2016, Ackermann Award 2017.
[5] Bernd Ulmann. Analog Computing. Walter de Gruyter, 2013.
[6] K. Weihrauch. Computable Analysis: An Introduction. Springer, 2000.
URL sujet détaillé : http://www.lix.polytechnique.fr/~bournez/load/sujet-2017-18-classes-complexite-ens-lyon.pdf

Remarques : The true topic of the work is related to computability and complexity theory. It requires only common and basic knowledge of ordinary differential equations. Most of the intuitions behind our current constructions come from classical computability and complexity.
There is no specific prerequisite for this internship. This subject can be extended to a PhD. Funding possibilities depend on the administrative situation of the candidate.

The supervisor is also open to variations on these questions according to the preferences of the candidate (focusing on algorithmic aspects "how to derive an efficient algorithm using continuous methods", logical aspects "what is the associated proof theory", etc.).



SM207-50 Proof of the correctness of a set library in JavaScript  
Admin  
Encadrant : Jean-pierre JACQUOT
Labo/Organisme : LORIA
URL : http://www.loria.fr
Ville : Nancy

Description  

Correct software is more and more required as its presence becomes ever more ubiquitous.
Formal methods allow us to prove software correctness, but they are not sufficiently used.
One of the causes is the absence of tools to validate the formal models, i.e. to check that
models are consistent with the actual users' needs.

During his PhD, F. Yang developed JeB, a tool which allows us to build simulators of specifications
written in Event-B [1]. A simulator is an executable version of a specification, produced by inserting
"hand-written" pieces of code where the automatic translation failed. It is thus possible to execute a
specification at all levels of abstraction. JeB uses JavaScript as its target language. Its design and
implementation are described in [2,3,4]. Examples of simulators are accessible through http://dedale.loria.fr/.

The scientific challenge of our approach is simply stated: how to prove that the executable model (in JavaScript)
is correct w.r.t. the formal model (in Event-B)?

In practice, the translation is designed as a syntactic transformation of the Event-B operators into function
calls to a set library. The correctness of the translation thus mostly depends on the correctness of the library.
To assert this, we intend to use the K framework (http://www.kframework.org), in which a full semantics of JavaScript
has been written.

The proposed subject aims at studying and defining a framework to conduct the proof of the set library.
Work can be organized around three themes:
- evaluating the adequacy of the K framework toward our goal,
- proving a few functions of the library, and
- setting up a framework to facilitate the proof of the whole library.
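As a toy illustration of what "correctness of the library" means, here is a hypothetical union function in the style of such a set library, together with the set-theoretic property one would want to prove. The sketch is in Python rather than JavaScript (the actual target language), and the function name and encoding are our own:

```python
def bunion(s, t):
    """Hypothetical set-library union over duplicate-free lists.

    The real library is written in JavaScript; this sketch only
    illustrates the shape of the correctness obligation.
    """
    out = list(s)
    for x in t:
        if x not in out:
            out.append(x)
    return out

# Proof obligation (here merely tested): x in bunion(s, t)  <=>  x in s or x in t
s, t = [1, 2], [2, 3]
u = bunion(s, t)
assert all((x in u) == (x in s or x in t) for x in range(5))
```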

[1] J.-R. Abrial. Modeling in Event-B: System and Software Engineering.
Cambridge University Press, 2010.
[2] F. Yang. A Simulation Framework for the Validation of Event-B Specifications.
PhD thesis, Université de Lorraine, November 2013.
[3] F. Yang, J.-P. Jacquot, J. Souquières. Proving the Fidelity of Simulations of
Event-B Models. In 15th HASE proceedings, 2014.
[4] F. Yang, J.-P. Jacquot, J. Souquières. The Case of Using Simulation to Validate
Event-B Specifications. In 19th APSEC proceedings, 2012.

URL sujet détaillé : http://dedale.loria.fr

Remarques : Statutory internship allowance possible



SM207-51 Nombre achromatique des graphes signés  
Admin  
Encadrant : Eric SOPENA
Labo/Organisme : LaBRI - Université de Bordeaux
URL : http://www.labri.fr/perso/sopena/
Ville : Bordeaux

Description  

A signed graph is a graph all of whose edges are assigned a positive or negative sign. If u is a vertex of a signed graph G, resigning at u consists in changing the sign of all edges incident with u. Two signed graphs are equivalent if one can be obtained from the other by a sequence of resignings.
A homomorphism from a signed graph G to a signed graph H is an edge and sign preserving mapping from V(G) to V(H). The chromatic number of G is then defined as the smallest order of a complete signed graph K such that there exists a signed graph G', equivalent to G, that admits a homomorphism to K.
The goal of this project is to initiate the study of the achromatic number of signed graphs, the achromatic number of a signed graph G being defined as the largest order of a complete signed graph K such that there exists a signed graph G', equivalent to G, that admits an edge-surjective homomorphism to K.
The achromatic number of (ordinary) undirected graphs has been extensively studied in the literature, and the work programme will mainly consist in determining to what extent the known results in this area can be extended to signed graphs.
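The resigning operation defined above can be sketched directly; the edge encoding below (a dict from unordered vertex pairs to signs) is our own illustrative choice:

```python
def resign(edges, u):
    """Resign at vertex u: flip the sign of every edge incident with u.

    Illustrative encoding: a signed graph as a dict mapping an unordered
    pair of vertices (a frozenset) to its sign, +1 or -1.
    """
    return {e: (-s if u in e else s) for e, s in edges.items()}

# Triangle on {1, 2, 3} with one negative edge.
g = {frozenset({1, 2}): 1, frozenset({2, 3}): -1, frozenset({1, 3}): 1}
h = resign(g, 2)                  # both edges at vertex 2 flip
assert h[frozenset({1, 2})] == -1
assert h[frozenset({2, 3})] == 1
assert h[frozenset({1, 3})] == 1  # edge not incident with 2 is unchanged
```

Note that a cycle meets the edges flipped at u an even number of times, so resigning preserves the product of signs along every cycle; this invariant underlies the equivalence classes of signed graphs.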

Keywords. Signed graph, Homomorphism, Achromatic number, Complete colouring.

URL sujet détaillé : http://www.labri.fr/perso/sopena/pmwiki/uploads/Proposals/Hocquard_Sopena_Nombre_achromatique_des_graphes_signes.pdf

Remarques : Internship co-supervised by Hervé Hocquard (http://www.labri.fr/perso/hocquard/)



SM207-52 Prédire les interactions protéine-protéine  
Admin  
Encadrant : Florian RICHOUX
Labo/Organisme : LS2N, Université de Nantes
URL : http://richoux.fr
Ville : Nantes

Description  

The objective of this internship is to propose a data format as well as a deep learning model for predicting protein-protein interactions.

This research project comprises several steps:

1. First, take into account only the polypeptide chain (that is, the chain of amino acids) composing each protein. The data format will here simply be the two polypeptide chains of the two proteins to be tested,
in order to predict whether or not they interact with each other. A deep learning model learning these interactions will have to be proposed and evaluated experimentally.

2. With the same data, the second step consists in refining the prediction so as to determine
not only whether two proteins interact, but at which position(s) in their respective polypeptide chains the interaction takes place, if there is one.

3. Finally, the ultimate objective of this research project is to go down to the atomic level by taking into account the 3D representation of the atomic structure of each protein, going well beyond the mere polypeptide chain. A real challenge then arises: finding a suitable data format as well as a deep learning model able to predict where the interaction occurs at the atomic level. Such a prediction method would be unprecedented in the field, which underlines the ambitious nature of this research project.
URL sujet détaillé : http://richoux.fr/stages/deep_protein.pdf

Remarques : The internship will be paid approximately €550 per month (depending on the number of working days).



SM207-53 Composing Code Rewriting Directives  
Admin  
Encadrant : Sébastien MOSSER
Labo/Organisme : I3S (UMR 7271 CNRS / UNS)
URL : http://www.i3s.unice.fr/~mosser/
Ville : Sophia Antipolis

Description  

Code rewriting tools (e.g. Spoon), which perform source-to-source transformations of a given program, are used everywhere, from code optimisation to automatic repair and anti-pattern resolution. However, all these tools face the same kind of problem: how to deal with conflicting rewrites?

There are many instances of this problem: antipattern resolution, peephole optimization in LLVM, critical pair analysis in graph transformation, ...

In this internship, we propose to formalize the notion of code rewriting in the specific context of program refactoring. The idea is to (i) implement program rewriting rules to support refactoring directives, (ii) formally analyze these rule definitions according to different methods, and (iii) define an empirical benchmark that measures the accuracy of the conflict detection mechanisms when applied to real-life programs.
URL sujet détaillé : http://laure.gonnord.org/pro/research/2018-rewritingcode.pdf

Remarques : This work will be supervised mostly by Sebastien Mosser in Sophia Antipolis and remotely by Laure Gonnord (Laure.Gonnord-lyon.fr) in Lyon.



SM207-54 Optimisation sous incertitude dans StarCraft 2  
Admin  
Encadrant : Florian RICHOUX
Labo/Organisme : LS2N, Université de Nantes
URL : http://richoux.fr
Ville : Nantes

Description  

This internship consists in modelling the optimization part of a problem using a modified
Stochastic CSP formalism, modelling the uncertainty related to the environment with a statistical learning model, implementing these models and validating them experimentally. In more detail, the work will be to:

- Think about the modifications to bring to the Stochastic CSP formalism in order to model decision-making problems under uncertainty while getting rid of arbitrary parameters, and derive properties of the formalism such as its complexity.

- Propose a statistical learning model to capture the uncertainty related to the environment and to the context of the studied problem.

- Design and implement a local search algorithm to natively solve the optimization problems modelled in the proposed formalism.

- Model optimization problems under uncertainty in StarCraft, and experimentally evaluate the performance of the implemented local search solver.
URL sujet détaillé : http://richoux.fr/stages/optimisation_starcraft2.pdf

Remarques : The internship will be paid approximately €550 per month (depending on the number of working days).



SM207-55 Vérification automatisée de structures de données réparties pour des processeurs multicœurs  
Admin  
Encadrant : Loïg JEZEQUEL
Labo/Organisme : LS2N
URL : http://www.irccyn.ec-nantes.fr/~jezequel/
Ville : Nantes

Description  

English title: Automated verification of distributed data-structures for multicore processors.

Building data structures for multicore processors while ensuring properties such as fairness, data consistency, or deadlock freeness is a difficult task. Usually, the correctness of implementations of such data structures is proven by hand. Most proofs are based on case enumeration. This may be reasonable for simple structures (such as lists) but becomes much less tractable for more complex structures (such as priority queues or trees).

The goal of this internship will be to use model checking (a verification technique) to track bugs in implementations of data structures for multicore processors. As a first step, in order to become familiar with model checking tools, the student could model simple data structure implementations on a given multicore processor, artificially introduce bugs in them, and automatically find these bugs. Then, we will look at more complex data structure implementations (references can be found in the detailed subject), selected because they very likely have bugs although this has not been proved yet, and will try to exhibit bugs in them.
URL sujet détaillé : http://www.irccyn.ec-nantes.fr/~jezequel/verif-data-struct.pdf

Remarques : Co-supervised with Corentin Travers (ENSEIRB-MATMECA, Bordeaux).

Funding possible through the ANR project PACS: https://lipn.univ-paris13.fr/PACS/



SM207-56 Caractérisation de traces de navigation  
Admin  
Encadrant : Lionel TABOURIER
Labo/Organisme : LIP6 - Université Pierre et Marie Curie
URL : http://www.complexnetworks.fr/
Ville : Paris

Description  

The internship is part of the ANR project Algodiv.


The purpose of the internship is to analyze navigation traces on webpages and describe the users' trajectories.

We are trying to answer the following questions:
- How can navigation traces be characterized based on their length, the type of webpages visited, or the time spent on each page?
- Is it then possible to classify users' behaviours?
- Do we observe changes of behaviour over long periods?
- Is it possible to identify causal relationships between the observed trajectories and the proposed recommendations?
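A first pass at the characterization questions above could compute simple per-user features from raw traces. The log format (user, url, timestamp) and the feature set below are illustrative assumptions, not the actual schema of the Algodiv datasets:

```python
from collections import defaultdict

# Hypothetical log format: (user, url, timestamp in seconds).
log = [
    ("u1", "/home", 0), ("u1", "/news", 40), ("u1", "/sports", 100),
    ("u2", "/home", 10), ("u2", "/shop", 400),
]

def trace_features(log):
    """Per-user trace features: number of pages visited and mean dwell time."""
    traces = defaultdict(list)
    for user, url, ts in sorted(log, key=lambda e: e[2]):
        traces[user].append((url, ts))
    feats = {}
    for user, visits in traces.items():
        dwell = [t2 - t1 for (_, t1), (_, t2) in zip(visits, visits[1:])]
        feats[user] = {
            "length": len(visits),
            "mean_dwell": sum(dwell) / len(dwell) if dwell else None,
        }
    return feats

print(trace_features(log)["u1"])  # 3 pages, mean dwell (40 + 60) / 2 = 50.0
```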

In the context of the Algodiv project, several datasets (on media as well as commercial websites) should be available to support this study.


URL sujet détaillé : http://www.complexnetworks.fr/wp-content/uploads/2017/09/sujet_algodiv.pdf

Remarques : ANR project webpage : http://algodiv.huma-num.fr/



SM207-57 Analysing Software Containers from a Compilation point of view  
Admin  
Encadrant : Sébastien MOSSER
Labo/Organisme : I3S (UMR 7271 CNRS / UNS)
URL : http://www.i3s.unice.fr/~mosser/
Ville : Sophia Antipolis

Description  

The adoption of the container approach is increasing tremendously. Containers ensure that
a given microservice will run the same regardless of its environment, easing the repeatability of build,
test, deployment and runtime executions. Containers are faster at runtime and boot time, lighter than virtual machines, and scale better. In the container field, the Docker engine quickly became the reference platform for builds and deployments.

Building a non-trivial container image is a difficult process. It leads to erroneous deployments, detected at runtime. Moreover, the technologies supporting microservice deployment evolve constantly, to become more efficient or scalable.

In this internship, we propose to address the challenges of deployment descriptor definition from a compilation point of view. We propose to first build an empirical dataset of available Dockerfiles (e.g., by crawling GitHub) and to identify in these Dockerfiles classical flaws encountered by developers (for example by using classification mechanisms). Then, we will work at the descriptor-language level to (i) provide a sound grammar to safely define and extend descriptors, (ii) define formal detection rules to identify the identified design flaws in existing descriptors, and (iii) provide a bidirectional transformation from our representation to deployable descriptors.

URL sujet détaillé : http://laure.gonnord.org/pro/research/2018-dockercompil.pdf

Remarques : This work will be supervised mostly by Sebastien Mosser in Sophia Antipolis and remotely by Laure Gonnord (Laure.Gonnord-lyon.fr) in Lyon.



SM207-58 Temporalité dans la recommandation de spectacles  
Admin  
Encadrant : Lionel TABOURIER
Labo/Organisme : LIP6 - Université Pierre et Marie Curie
URL : http://www.complexnetworks.fr/
Ville : Paris

Description  

The purpose of this internship is to build a recommendation method, based on an original formalism, which benefits from the temporal information in the data.

The internship aims at industrial applications. It is a joint project with Delight, a SaaS platform addressed to live performance producers, aiming to improve their knowledge of their audience and increase the attendance and revenue of shows. This kind of recommendation has several difficulties of its own. One of them is that a live performance is often a one-time event, so it is not possible to collect information about its spectators before recommending the show.

We investigate a specific method to circumvent this problem, which consists in gathering shows into groups depending on their features (genre, artists, etc.). Then, we attempt to predict purchases between spectators and groups of shows. In this way, recommendation can be understood as a problem of link prediction in a link stream, that is to say, inferring which interactions will happen and when.

URL sujet détaillé : http://www.complexnetworks.fr/wp-content/uploads/2017/10/sujet_tres.pdf

Remarques : Internship in collaboration with Delight
http://delight-data.com/fr/



SM207-59 Algorithms for subset sum and number balancing  
Admin  
Encadrant : Alantha NEWMAN
Labo/Organisme : G-SCOP
URL : http://pagesperso.g-scop.grenoble-inp.fr/~newmana/
Ville : Grenoble

Description  

We propose to study problems of the following type: Given a set of positive integers, (i) find a subset with a specified sum, or (ii) find two (non-equal) subsets whose sums are very close.

For example, with respect to the latter problem, it is known that there exist two subsets (of a set of n numbers) whose sums differ by at most 1/(n^(log n)), but it is unknown how to find such subsets. If the sum of the given n numbers is at most 2^n - 2, then there exist two distinct subsets with equal sums (by the pigeonhole principle), but it is unknown how to find these subsets efficiently. There are also several related problems for which there has been recent progress.

The goal of this project is to obtain a good understanding of the literature and to design algorithms for some of these problems.

URL sujet détaillé :

Remarques :



SM207-60 Well Structured FIFO Automata  
Admin  
Encadrant : Alain FINKEL
Labo/Organisme : LSV (ENS PARIS SACLAY)
URL : http://www.lsv.fr/~finkel/
Ville : CACHAN

Description  

General Context
The theory of Well Structured Transition Systems (WSTS) allows the
automatic verification of infinite-state systems that can be finitely
represented and tested [5, 8, 6]. Termination, boundedness and coverability (the
coverability problem is a variant of the reachability problem: it asks
whether there exists a reachable state in the upward closure of a given state) are decidable for WSTS.
For complete WSTS [6], the Karp and Miller algorithm [11, 6] computes the finite set of maximal elements of the downward closure of the reachability set. This algorithm represents a state-space exploration of the reachability set as a finite tree, allowing one to decide safety and liveness problems. FIFO automata are a powerful model, since the FIFO mechanism allows the simulation of the tape of Turing machines. Various decidable classes of FIFO
automata have been studied. However, there exists today no satisfying characterization of well structured FIFO automata. We propose a systematic study of well structured FIFO automata.

Objectives
1. Survey the literature about FIFO automata: monogeneous [5], linear [7], word linear [10], half-duplex [2], cyclic [1] and lossy FIFO automata [3, 9, 4].

2. Find and analyse the different classes of well structured FIFO automata associated with different orders on the set of states, such as the prefix ordering, the subword ordering, ...
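For instance, for lossy FIFO automata the natural order on channel contents is the subword (scattered-subsequence) ordering, a well-quasi-ordering by Higman's lemma; a minimal membership test, sketched here for illustration:

```python
def is_subword(u, v):
    """Higman's (scattered) subword order: can u be obtained from v by deleting letters?"""
    it = iter(v)
    # 'c in it' advances the iterator past the first occurrence of c, so the
    # matched positions are strictly increasing: u embeds into v in order.
    return all(c in it for c in u)

assert is_subword("ab", "acbd")       # delete 'c' and 'd'
assert not is_subword("ba", "acbd")   # letter order must be preserved
```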

Location
This internship will be supervised at the Ecole Normale Supérieure de Cachan.
URL sujet détaillé : http://www.lsv.fr/~finkel/2016-2017/wsts-fifo.pdf

Remarques :



SM207-61 Dessiner des graphes comme polygones convexes  
Admin  
Encadrant : Kolja KNAUER
Labo/Organisme : LIF/LIS Aix-Marseille Université
URL : http://www.lif.univ-mrs.fr/
Ville : Marseille

Description  

This internship is about straight-line drawings of graphs in the plane such that the vertices lie in convex position and/or the midpoints of the edges lie in convex position. It is known that a graph that can be drawn satisfying this type of condition can have only a bounded number of edges. The goal of the internship is to study such drawings and improve the known bounds.
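Whether a given point set (vertices or edge midpoints of a drawing) lies in convex position can be tested via a convex hull computation; a standard monotone-chain sketch, assuming strict convexity (collinear triples do not count as convex position):

```python
def convex_hull(points):
    """Andrew's monotone chain; returns the strict hull vertices in CCW order."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        h = []
        for p in seq:
            # <= 0 discards clockwise turns and collinear points (strict hull)
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    lower, upper = half(pts), half(list(reversed(pts)))
    return lower[:-1] + upper[:-1]

def in_convex_position(points):
    # Points are in (strictly) convex position iff every point is a hull vertex.
    return len(convex_hull(points)) == len(set(points))

assert in_convex_position([(0, 0), (1, 0), (1, 1), (0, 1)])              # square
assert not in_convex_position([(0, 0), (2, 0), (1, 1), (0, 2), (2, 2)])  # (1,1) interior
```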

These problems can be reformulated in terms of large convex sets in Minkowski sums of planar point sets and have applications in Algebraic Complexity. However, no preliminary knowledge of these topics is required.
URL sujet détaillé : http://pageperso.lif.univ-mrs.fr/~kolja.knauer/minkonvex2.pdf

Remarques : Co-supervision: Ignacio García-Marco (ignacio.garcia-marco.univ-mrs.fr)



SM207-62 Certification of Distributed Self-Stabilizing Algorithms Using Coq  
Admin  
Encadrant : Karine ALTISEN
Labo/Organisme : Verimag
URL : http://www-verimag.imag.fr/~altisen
Ville : Grenoble

Description  

http://www-verimag.imag.fr/~altisen/padec_internship_2018.pdf
URL sujet détaillé : http://www-verimag.imag.fr/~altisen/padec_internship_2018.pdf

Remarques :



SM207-63 Models of Distributed Algorithms  
Admin  
Encadrant : Karine ALTISEN
Labo/Organisme : Verimag
URL : http://www-verimag.imag.fr/~altisen
Ville : Grenoble

Description  

http://www-verimag.imag.fr/~altisen/distancek_internship_2018.pdf
URL sujet détaillé : http://www-verimag.imag.fr/~altisen/distancek_internship_2018.pdf

Remarques :



SM207-64 Ensuring Correctness of (Data) Graph Transformations  
Admin  
Encadrant : Rachid ECHAHED
Labo/Organisme : LIG (Laboratoire d'Informatique de Grenoble)
URL : http://lig-membres.imag.fr/echahed/
Ville : Grenoble

Description  

The objective of this project is to tackle the problem of correctness
of systems processing graph-like structures annotated with attributes.
Graph-like structures are used to describe a wide range of situations
in a precise yet intuitive way. Different kinds of graphs are used in
modeling techniques depending on the investigated fields, which
include computer science, chemistry, biology, quantum-computing, etc.
When system states are represented by graphs, it is natural to use
rules that transform graphs to describe the system evolution.

The analysis of the considered complex computational systems involves
the verification of different kinds of properties, like structural
properties of states, properties of computations, probabilities of
reaching specific system states, etc. Correctness is thus achieved by
using different analysis techniques and tools to verify each of the
desired properties. To suitably express these properties, a variety of
logics may be necessary. In this project, we investigate the analysis
of complex systems specified using graph transformations, a framework
that is very well suited to the specification of highly parallel
systems, based on the notion of graphs to describe states and of
rules to describe the evolution steps of the system.

In a recent work [1,2], a rule-based framework, which handles graph
structures, has been introduced together with some decidable logics to
capture some of the effects of the considered systems. The methods of
[1,2] consider graph shapes lacking any attributes. This focus on
graph structures ignoring the attributes was prudent as an initial
approach, since the analysis of pointer structures is in itself very
challenging already.

The goal of this project is to develop a framework extending that of
[1,2] which supports some attributes and operations on data as well as
a formalism for analyzing the considered systems. A Hoare-like
calculus will be developed for the considered framework based on some
specific classical or dynamic logics.

For more details please contact the supervisor at: echahed.fr
This project could be followed by a 3 years PhD thesis in the
considered area. Other projects related to graph transformations are
available from the supervisor.


Work plan.

1. Get familiar with graph transformation techniques as well as some
of the intended proof techniques (small dedicated bibliography).

2. Extend procedures in [2] in order to take into account
some attributes, starting with decidable theories (e.g.
Presburger arithmetic, strings).

3. Design and implement a first prototype, if time permits.

4. Write the Master's thesis.

[1] J. Brenas, R. Echahed and M. Strecker: Proving Correctness of
Logically Decorated Graph Rewriting Systems. In Proceedings of the 1st
International Conference on Formal Structures for Computation and
Deduction, (FSCD) 2016, pp: 14:1-14:15

[2] J. Brenas, R. Echahed and M. Strecker: Ensuring Correctness of
Model Transformations while Remaining Decidable. In Proceedings of the
13th International Colloquium on Theoretical Aspects of Computing
(ICTAC) 2016, pp 315-332

URL sujet détaillé :

Remarques : Co-encadrant : Jon Hael Brenas, Univ. Tennessee (USA)

Possibilité de rémunération du stage.

Possibilité de poursuite en thèse.



SM207-65 Taxonomy of thresholded boolean automaton networks inferred from logical constraints.  
Admin  
Encadrant : Nicolas GLADE
Labo/Organisme : TIMC-IMAG, Université Grenoble Alpes, Domaine la merci
URL : http://www-timc.imag.fr/
Ville : La Tronche

Description  

Taxonomy of thresholded boolean automaton networks inferred from logical constraints.

Nicolas Glade (TIMC-IMAG), E-mail. nicolas.glade-grenoble-alpes.fr, 04 56 52 00 42
Laurent Trilling (TIMC-IMAG), E-mail : laurent.trilling-grenoble-alpes.fr, 04 56 52 00 97

TIMC-IMAG, UGA - CNRS UMR5525, Faculté de Médecine de Grenoble, 38710 La Tronche Cedex

This internship is a continuation of several works realized at the TIMC-IMAG laboratory [1-6]. Genetic regulatory networks can be modeled in a Boolean formalism. We showed that their structure, properties and dynamics can be found (precisely, inferred) by using logic constraint-based programming (in Prolog or in ASP, for Answer Set Programming), from all the available knowledge. This knowledge, notably the interactions between the network's nodes (e.g. genes) and their activity, is encoded as combinations of constraints that are themselves translated into a logical form and sent to a satisfiability solver (SAT-like). Constraints can be imposed on the dynamics of several nodes of a network (which could describe a function) and, at the same time, on some parts of its structure (graph, parameters of the network). The solver then gives back the complete set of valid models (those which satisfy the logical constraints), known as the satisfiable set. This method is not limited to the validation of individual models but furnishes a collection of models, all different but all satisfiable.
In a previous work [3], we expressed the formalism of neural networks (thresholded Boolean automaton networks) in ASP, and particularly focused on methods to obtain the complete sets of non-redundant networks of fixed size N (number of nodes) realizing particular 'complex' functions (like encoding a binary sequence made of a repeated motif). This work also showed that, within such satisfiable sets, some networks look similar (with few parameters changing) or at least have common parts, while others present very different structures. We also found that networks of a given size N realizing a function can serve as modules realizing one part of a more complex function. We finally started to understand how such N-sized networks can be expanded to larger networks by 'unfolding' (decomposing) the logical functions attached to specific nodes.
We now want to study the taxonomy of networks within a satisfiable set (horizontal study) and the relations between networks of different sizes, i.e. either by inclusion of submodules or by folding/unfolding of logical functions (vertical study). The challenge is now to decide which criterion to use to compare networks: comparisons on the whole network dynamics, i.e. on the basins of attraction; comparisons on the logical functions that compose the networks; comparisons between network structures, i.e. on the weight matrices. Our motivation is to contribute to the study of the functioning of biological regulation networks, by analyzing how they can be constructed from previous ones, or on the contrary how new, strongly entangled networks related to particular functions can emerge through evolutionary processes. Once defined, a distance between networks will indeed help to classify networks and understand their relationships. This will give information on the different strategies for encoding a particular function, as is the case in living systems.
The student will learn how to express thresholded Boolean automaton networks in the logical language Answer Set Programming, based on a non-monotonic logic. The student will encode increasingly complex binary sequences in networks of size N, increasing N progressively, and will systematically study a taxonomy of the whole set of satisfiable models. The student will then work on identifying good criteria to compare networks and start a systematic taxonomy study at a large scale, i.e. including the numerous networks that can compose a satisfiable set.
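The dynamics of a thresholded Boolean automaton network can be sketched in a few lines (toy weights and thresholds invented for illustration; the internship itself works with ASP, not Python): node i fires at the next step iff the weighted sum of its inputs reaches its threshold.

```python
# Sketch of a thresholded Boolean automaton network (Hopfield-like):
# node i fires iff the weighted sum of its inputs reaches its threshold.

def step(state, weights, thresholds):
    """One synchronous update of all nodes."""
    n = len(state)
    return tuple(
        1 if sum(weights[i][j] * state[j] for j in range(n)) >= thresholds[i]
        else 0
        for i in range(n)
    )

def trajectory(state, weights, thresholds, steps):
    out = [state]
    for _ in range(steps):
        state = step(state, weights, thresholds)
        out.append(state)
    return out

# Toy 2-node network: each node excites the other and inhibits itself;
# with these (invented) weights and thresholds it encodes the repeated
# motif 01 / 10, the kind of binary sequence mentioned above.
W = [[-1, 1],
     [1, -1]]
theta = [1, 1]
traj = trajectory((0, 1), W, theta, 4)
```

Inferring, for a target sequence, *all* weight matrices W and thresholds theta realizing it is precisely what the ASP encoding delegates to the solver.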

Bibliographie.

1. F. Corblin, E. Fanchon, L. Trilling, Applications of a formal approach to decipher discrete genetic networks, BMC Bioinformatics (2010) 11:385
2. H. Ben Amor, F. Corblin, E. Fanchon, A. Elena, L. Trilling, J. Demongeot, N. Glade, Formal methods for Hopfield-like networks, Acta Biotheoretica (2013) 61, 21-39
3. Q.-T. Vuong, R. Chauvin, S. Ivanov, N. Glade, L. Trilling, A logical constraint-based approach to infer and explore diversity and composition in thresholded Boolean automaton networks, Proceedings of the Complex Networks 2017 conference (2017), Lyon, in press
4. Fabien Corblin, Conception et mise en œuvre d'un outil déclaratif pour l'analyse des réseaux génétiques discrets, 2008
5. Adrien Elena, Robustesse des réseaux d'automates booléens à seuil aux modes d'itération. Application à la modélisation des réseaux de régulation génétique, 2009
6. Hedi ben Amor, Méthodes numériques et formelles pour l'ingénierie des réseaux biologiques : Traitement de l'information par des populations d'oscillateurs, Approches par contraintes et Taxonomie des réseaux biologiques , 2012
URL sujet détaillé :

Remarques : co-encadrement (Laurent Trilling, TIMC-IMAG)

Possibilité de poursuite en thèse.



SM207-66 La Hiérarchie de Wagner et ses Extensions -- The Wagner Hierarchy and its Extensions  
Admin  
Encadrant : Olivier FINKEL
Labo/Organisme : Equipe de Logique Mathématique -- Institut de Mathématiques de Jussieu - Paris Rive Gauche
URL : http://www.logique.jussieu.fr/~finkel/
Ville : Paris

Description  

The Wagner hierarchy is a fine hierarchy classifying the rational omega-languages with regard to their topological complexity. The location of a rational omega-language in this hierarchy can be determined in an effective manner from the structure of a Muller automaton accepting this language. A first part of the internship will consist in studying the Wagner hierarchy as determined by Wagner in [Wag79]. Then two directions may be considered:

1) The study of the extension of the Wagner hierarchy to one-counter automata without zero test [Fin01]. One could investigate the complexity of the following problem: "determine the position of a given one-counter omega-language in this hierarchy". Further questions would concern possible effective extensions of this hierarchy.

2) The omega-powers of rational languages form a strict subclass of the class of rational languages of infinite words. The topological complexity of these omega-powers and their locations in the Wagner hierarchy are not yet completely known. This subject might first be investigated starting from the papers [Wag79, LitTim87].
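The effectivity mentioned above rests on the fact that Muller acceptance is decidable on ultimately periodic words u·v^ω: one runs u, then iterates v until a v-block start state repeats, and checks whether the set of states visited in the repeating cycle belongs to the acceptance table. A simplified sketch (the toy automaton below is invented for illustration):

```python
# Sketch: does a deterministic Muller automaton accept u . v^omega?
# delta maps (state, letter) -> state; acc is the acceptance table:
# the word is accepted iff the set of states visited infinitely often
# (the inf-set) belongs to acc.

def run(delta, q, word):
    seen = []
    for a in word:
        q = delta[(q, a)]
        seen.append(q)
    return q, seen

def accepts(delta, q0, acc, u, v):
    q, _ = run(delta, q0, u)
    starts = {}                 # state at the start of each v-block
    blocks = []
    while q not in starts:
        starts[q] = len(blocks)
        q, seen = run(delta, q, v)
        blocks.append(seen)
    inf_states = set()
    for block in blocks[starts[q]:]:   # the repeating cycle of v-blocks
        inf_states.update(block)
    return inf_states in acc

# Toy automaton over {a, b}, states {0, 1}: any letter 'a' moves to 0,
# any 'b' moves to 1. Acceptance table: both states visited infinitely
# often. Then (ab)^omega is accepted while b.a^omega is not.
delta = {(q, 'a'): 0 for q in (0, 1)}
delta.update({(q, 'b'): 1 for q in (0, 1)})
acc = [{0, 1}]
```

Wagner's results refine this picture: from the loop structure of the Muller automaton one reads off the exact level in the hierarchy, not just acceptance of single words.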

This work might be pursued with a PhD thesis; there are quite a few open questions in the field of automata on infinite objects.

References:

[Fin01] O. Finkel, An Effective Extension of the Wagner Hierarchy to Blind Counter Automata,
Lecture Notes in Computer Science, Volume 2142, Springer, 2001, p. 369-383.

[LitTim87] I. Litovsky and E. Timmerman, On Generators of Rational omega-Powers Languages,
Theoretical Computer Science, Volume 53 (2-3), 1987, p. 187-200.

[Sta97] L. Staiger, omega-Languages,
in Handbook of Formal Languages, Volume 3, Springer, 1997, p. 339-387.

[Wag79] K. Wagner, On omega-Regular Sets,
Information and Control, Volume 43 (2), 1979, p. 123-177.
URL sujet détaillé : http://www.logique.jussieu.fr/~finkel/M2LMFI/Sujet-stage-M2-2017.pdf

Remarques :



SM207-67 ANALYSIS OF REACHABILITY PROBLEM IN BIOLOGICAL REGULATORY NETWORKS  
Admin  
Encadrant : Olivier ROUX
Labo/Organisme : LS2N UMR CNRS 6004 - Nantes
URL : https://ls2n.fr/equipe/meforbio/
Ville : Nantes

Description  

Context and Objectives
Computational biology is of great interest in biological research, as the computational capacity of supercomputers is growing drastically. After decades of research, many modeling frameworks suited to different biological problems have become promising. Among these problems, reachability is very classical and still important, as many other problems can be reduced to it. Basically, the reachability problem belongs to the field of model checking [1] (e.g., checking a goal state against AF or EF properties). Brute-force search often runs out of memory or time, as the state space is exponential and its traversal is costly.
We developed a concurrent modeling framework, Automata Networks [3], to describe biological state transitions in detail, together with a goal-oriented static structure, the Local Causality Graph [5], to decide by recursive reasoning whether some desired state is reachable from the initial state. This approach avoids a global search and thus leads to an acceptable cost for the solution. The implementation is a command-line tool called Pint. Although Pint is able to solve most reachability problems efficiently and effectively, some instances remain complicated, either timing out or remaining undecided. The reason is that Pint does not perform a real search in the state space, and thus does not guarantee the traversal of all possible solutions.
Our current direction is to apply logic programming to refine the complicated cases or to find a less costly equivalent/quasi-equivalent form. Logic programming is a scheme that solves a problem from a priori knowledge and constraints, without explicit algorithms. This characteristic makes it strong in puzzle-solving, e.g. Sudoku. Among the schemes of logic programming, SAT solvers [4, 2] were originally designed for SAT problems only, but have high scalability, as many problems are transformable to SAT problems.
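For intuition, the explicit-state search that Pint's static analysis avoids can be sketched as a BFS over global states of a tiny automata network (a hypothetical encoding, not Pint's [5] local-causality reasoning):

```python
# Sketch of explicit-state reachability in a small automata network:
# automata are indexed 0..n-1, a global state is a tuple of local states.
# A transition (aut, src, dst, guard) changes automaton `aut` from `src`
# to `dst` when every (other_aut, required_state) pair in `guard` holds.

from collections import deque

def reachable(init, transitions, goal):
    """BFS: is some global state matching `goal` reachable from `init`?
    `goal` maps automaton index -> required local state."""
    seen = {init}
    todo = deque([init])
    while todo:
        state = todo.popleft()
        if all(state[a] == s for a, s in goal.items()):
            return True
        for aut, src, dst, guard in transitions:
            if state[aut] == src and all(state[a] == s for a, s in guard):
                succ = state[:aut] + (dst,) + state[aut + 1:]
                if succ not in seen:
                    seen.add(succ)
                    todo.append(succ)
    return False

# Toy network: automaton 1 can rise only once automaton 0 has risen.
trs = [
    (0, 0, 1, []),        # a0: 0 -> 1, unconditionally
    (1, 0, 1, [(0, 1)]),  # a1: 0 -> 1, requires a0 = 1
]
```

The `seen` set here grows with the full (exponential) state space, which is exactly the cost that goal-oriented static analysis sidesteps.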

Candidates
No biological background is needed. Knowledge/experience in model checking, logic programming will be helpful.

References
[1] Edmund M. Clarke and Qinsi Wang. 25 years of model checking. In International Andrei Ershov Memorial Conference on Perspectives of System Informatics, pages 26-40. Springer, 2014.
[2] Niklas Eén and Niklas Sörensson. An extensible SAT-solver. In International Conference on Theory and Applications of Satisfiability Testing, pages 502-518. Springer, 2003.
[3] Maxime Folschette, Loïc Paulevé, Morgan Magnin, and Olivier Roux. Sufficient conditions for reachability in automata networks with priorities. Theoretical Computer Science, 608:66-83, 2015.
[4] Fangzhen Lin and Yuting Zhao. ASSAT: Computing answer sets of a logic program by SAT solvers. Artificial Intelligence, 157(1-2):115-137, 2004.
[5] Loïc Paulevé, Morgan Magnin, and Olivier Roux. Static analysis of biological regulatory networks dynamics using abstract interpretation. Mathematical Structures in Computer Science, 22(04):651-685, 2012.


URL sujet détaillé :

Remarques : Supervisors:
Xinwei CHAI 3rd year PhD student xinwei.chai.fr
Morgan MAGNIN Professor morgan.magnin.fr
Olivier ROUX Professor (Team leader) olivier.roux.fr



SM207-68 Modélisation interactive multi-matériaux à base voxels sur GPU  
Admin  
Encadrant : David COEURJOLLY
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/david.coeurjolly/
Ville : Villeurbanne

Description  

Context

In computer graphics, whether for the modelling of geometric objects or for realistic image rendering, many approaches rely on voxel-based geometric representations. Indeed, this representation makes certain geometry-processing operators trivial (intersection/union (constructive solid geometry), collision tests, ...) [HvDM+14]. Moreover, the grid structure makes it possible to represent multi-material or multi-interface objects efficiently while guaranteeing certain topological properties [ju2002dual, schaefer2007manifold]. Beyond modelling, the discrete approach also enables the efficient simulation of many complex natural, environmental or meteorological phenomena [WCMT07, JFBB10, MGG+10], as well as global-illumination rendering [CNLE09, THGM11, VSA15].
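As a minimal illustration of why CSG is trivial on voxels (a naive dense sketch; real systems use sparse hierarchical structures such as OpenVDB): with an object stored as a set of occupied cells, each CSG operator is a single set operation.

```python
# Sketch: with a voxel representation (a set of occupied grid cells),
# the CSG operators are plain set operations.

def ball(cx, cy, cz, r):
    """Voxelise a ball on the integer grid (naive, for illustration)."""
    return {(x, y, z)
            for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            for z in range(cz - r, cz + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r}

a = ball(0, 0, 0, 2)
b = ball(2, 0, 0, 2)

union        = a | b   # CSG union
intersection = a & b   # CSG intersection
difference   = a - b   # CSG difference (matter removal)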


To represent large, high-resolution scenes or geometries, many efficient data structures have been proposed for representing large voxel worlds [WCMT07, JFBB10, MGG+10] (for example, complex geometric objects on a k^3 grid encoded in 575 MB), with a number of technological solutions such as OpenVDB [MLJ+13] or GVDB on graphics cards. These data structures generally exploit the sparseness of discrete objects, and their geometric redundancy when building a hierarchical structure.


Project


In recent work, we proposed a tool for regularising voxel surfaces in a multi-material context. It allows us to obtain, very efficiently, smooth surfaces that respect all the interfaces between regions and the associated topology. The approach consists in modelling an energy defined on the structure, whose minimum corresponds to the target piecewise-smooth surface (see [CGL17] for the single-label case). This formulation requires a coherent vector field at every point of the surface, currently evaluated on the CPU. A first prototype of the surface regularisation (energy minimisation) does, however, already exist on the GPU.

The objective of the internship is to build an interactive GPU environment allowing:
- interactive modelling (removal/addition of matter) of multi-material objects for large voxel volumes on the GPU (hence via efficient data structures);
- fast GPU estimation of a coherent normal field; preliminary GPU work lets us approach this step with confidence [PLC+16];
- full GPU regularisation in real time.


The ultimate goal is thus to interact with matter represented as voxels at interactive rates, while visualising piecewise-smooth regularised surfaces.
URL sujet détaillé : http://liris.cnrs.fr/david.coeurjolly/internships/SujetRegVox.pdf

Remarques :
The internship will take place at the LIRIS laboratory (Nautibus, Lyon 1) under the supervision of David Coeurjolly, in collaboration with other researchers, as part of a cross-team project with the R3AM team, which specialises in rendering. A stipend is provided for this internship. It may lead to a PhD topic.



SM207-69 Static Analyses for Complex data structures scheduling  
Admin  
Encadrant : Laure GONNORD
Labo/Organisme : LIP
URL : laure.gonnord.org/pro
Ville : LYON

Description  

Context : program analysis for compilers - semantics - data structures

The objective is to be able to transform programs that operate on complex data structures such as trees and lists, in order to optimise them.

The internship candidate will:
- extract challenging code from literature benchmarks and demonstrate its optimisation potential;
- design a mini language with a clear semantics, and express program dependences in terms of this semantics;
- formalize code generation and the notion of a valid schedule in this particular context.
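The last point can be illustrated in a hypothetical mini-setting (invented operation names; not the formalisation the internship asks for): a schedule assigns each operation a timestamp, and it is valid iff every producer runs strictly before its consumers.

```python
# Sketch: validity of a schedule with respect to a dependence relation.
# A schedule maps operation -> timestamp; a dependence (p, c) says the
# producer p must execute strictly before the consumer c.

def is_valid_schedule(schedule, dependences):
    return all(schedule[p] < schedule[c] for p, c in dependences)

# Dependences of a tiny list traversal: each visit depends on the
# previous one, and a final fold depends on every visit.
deps = [("visit0", "visit1"), ("visit1", "visit2"),
        ("visit0", "fold"), ("visit1", "fold"), ("visit2", "fold")]

ok  = is_valid_schedule({"visit0": 0, "visit1": 1, "visit2": 2, "fold": 3}, deps)
bad = is_valid_schedule({"visit0": 0, "visit1": 2, "visit2": 1, "fold": 3}, deps)
```

The research difficulty is upstream of this check: for tree- and list-manipulating programs the dependences are not affine index relations, so both extracting them and generating code from a chosen schedule must be reformalised.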
URL sujet détaillé : http://laure.gonnord.org/pro/research/2018-scheduling-ds.pdf

Remarques : Codirection with T. Yuki (Rennes) and L. Morel (INSA Lyon & CEA)

I have funding for this subject, as well as funding for a PhD on a related subject, beginning September 2018.



SM207-70 User profiling  
Admin  
Encadrant : Antoine BOUTET
Labo/Organisme : CITI - Inria Privatics group
URL : https://sites.google.com/site/antoineboutet/
Ville : Lyon

Description  

User Profiling: Comparison of the user profiling capability derived from mobility traces with that resulting from Web search and navigation data.

As profiling has become the norm on the Internet, the personal data of users is massively collected without the consent of the individuals concerned [EN16]. In addition, as mobile devices have been adopted at a large scale in recent years, most interactions between users and online services now go through mobile devices. Consequently, the location data of users is obviously part of the tracking [ABC+13, ASF+15, EAS+17]. Recent works in the literature have demonstrated that mobility is a very rich contextual information, in the sense that it has a strong inferential potential in terms of information that can be revealed about the individuals whose movements are recorded [GKC13, F15, RKC+16].

This work aims to compare the user profiling capability derived from mobility data with that resulting from Web search and navigation data.

To achieve this objective, the intern will build a user profile from mobility data by conducting different state-of-the-art inference attacks, such as the extraction of points of interest and the associated activities, the inference of transportation modes, etc. This profile will then be compared to another profile generated from the search and navigation activities of the same user.

The work will be split into different steps:
- Analyze state-of-the-art user profiling techniques
- Implement some attacks on mobility data in order to build a user profile (manipulation of open APIs, crossing of different data sources, detection of transport mode, inference of sex and age, ...)
- Use existing tools or implement some attacks on web search history and navigation to build a user profile
- Analyze and compare the two profiles
- Depending on progress, it will also be possible to analyze the impact of different protection mechanisms on profiling capabilities, and how these profiles are exploited by mobile and web applications
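The first inference attack above, point-of-interest extraction, can be sketched very simply (hypothetical cell size and visit threshold; real attacks use proper spatio-temporal clustering): snap GPS fixes to a coarse grid and keep the cells visited often enough.

```python
# Sketch of naive point-of-interest extraction from a mobility trace:
# frequent grid cells are reported as POIs (home, work, ...).

from collections import Counter

def points_of_interest(fixes, cell=0.01, min_visits=3):
    """fixes: (lat, lon) pairs; returns the centres of frequent cells."""
    cells = Counter((round(lat / cell), round(lon / cell))
                    for lat, lon in fixes)
    return {(i * cell, j * cell)
            for (i, j), n in cells.items() if n >= min_visits}

# Repeated fixes around one location, plus one stray fix elsewhere.
trace = [(45.1901, 5.7199), (45.1899, 5.7201), (45.1902, 5.7202),
         (48.85, 2.35)]
pois = points_of_interest(trace)
```

Attaching semantics to each POI (home vs. work via time of day, activities via nearby places queried through open APIs) is what turns such clusters into a profile comparable with a search-history profile.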

This internship will integrate the CITI / Inria Privatics research group at Lyon (France). A monthly stipend will be provided to the intern.

Please send a CV and a detailed letter of motivation to antoine.boutet-lyon.fr.

URL sujet détaillé : http://liris.cnrs.fr/~aboutet/intern-profiling.pdf

Remarques :



SM207-71 Type System for complexity analysis of Java programs  
Admin  
Encadrant : Romain PECHOUX
Labo/Organisme : LORIA
URL : https://members.loria.fr/RPechoux/
Ville : Nancy

Description  

The aim of Implicit Complexity is to design criteria (type systems, semantic interpretations) to prove that programs belong to a given complexity class. The goal is to obtain certificates providing upper bounds on the memory and time needed by a program for a correct execution. A new implicit complexity analysis based on a type system for imperative and object oriented languages was proposed in articles [1], [2] and [7]. This analysis is inspired by Data Ramification techniques [3, 4] and by non-interference (control flow analysis) [5]. It ensures that if a program can be typed and terminates, it will run in polynomial time (or in a different context, polynomial space).
The main objectives of the project are the following :
-Increase the number of programs that can be analyzed using program transformation techniques.
-Combine the complexity analysis with tools for showing the termination of imperative and OO programs (for example, [6]).
-Increase the expressivity of the analyzed language (forks, threads, ...).
-Explore the common cases in real-world programs for which the analysis fails, and correct (or extend) the type system to capture them.
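The flavour of such a tier-based discipline can be sketched on a hypothetical mini-language (a simplification inspired by data ramification and non-interference, not the actual type system of [1], [2], [7]): variables carry a tier, information may only flow from higher-or-equal tiers, and loop guards must be tier 1, so loops are controlled by data the program never builds up.

```python
# Sketch of a tier check on a toy imperative language.
# Statements: ('assign', target, [source_vars]) and ('while', guard, body).
# Rule 1: an assignment may not read a variable of lower tier than its
#         target (no flow from tier 0 into tier 1).
# Rule 2: a while guard must be a tier-1 variable.

def well_tiered(stmts, tier):
    for s in stmts:
        if s[0] == 'assign':
            _, target, sources = s
            if any(tier[y] < tier[target] for y in sources):
                return False
        elif s[0] == 'while':
            _, guard, body = s
            if tier[guard] != 1 or not well_tiered(body, tier):
                return False
    return True

tiers = {'n': 1, 'acc': 0}
# Accepted: the loop counter 'n' (tier 1) feeds the accumulator (tier 0).
ok  = well_tiered([('while', 'n', [('assign', 'acc', ['acc', 'n'])])], tiers)
# Rejected: the loop variable is rebuilt from tier-0 data.
bad = well_tiered([('while', 'n', [('assign', 'n', ['acc'])])], tiers)
```

Soundness theorems in this line of work then state that a typable, terminating program runs in polynomial time, which is exactly the kind of certificate the internship aims to extend to larger fragments of Java.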
URL sujet détaillé : https://members.loria.fr/ehainry/tsjacp.pdf

Remarques : sujet co-encadré par Emmanuel Hainry



SM207-72 Approximations intérieures et vérification de propriétés temporelles de systèmes hybrides  
Admin  
Encadrant : Sylvie PUTOT
Labo/Organisme : LIX, École polytechnique
URL : http://www.lix.polytechnique.fr/Labo/Sylvie.Putot/
Ville : Palaiseau

Description  

General presentation: The verification of critical control systems, classically modeled as hybrid systems, implies verifying that all their possible executions satisfy specifications given as temporal properties. These systems involve parameter uncertainties, which requires abstractions in order to verify or refute these properties not only on individual trajectories, but globally on sets of trajectories.
Our objective is to develop the theoretical and practical tools to build an abstract model-checker of temporal numerical properties of hybrid systems, properties expressed in logics such as MTL (Metric Temporal Logic) or STL (Signal Temporal Logic).

Objectives of the internship: The internship aims at combining internal and external approximations of the set of reachable states of uncertain hybrid systems, in order to prove or refute temporal properties of these systems. Computing over-approximations (or external approximations) of reachable states of hybrid systems has become a relatively classical subject. But calculating under-approximations (or internal/inner approximations) is notoriously more difficult. The objectives of the internship are:
- extend to hybrid systems the inner-approximations proposed in [HSCC14, HSCC17] for discrete and continuous systems. A difficulty is the accurate interpretation of discrete transitions.
- define an interpretation, based on the previously defined over- and under-approximations, of temporal properties, possibly quantified over the parameters.
- experiment with these reachability analyses on real drone models (including uncertainties), such as those we have in the team (TurtleBot3, Crazyflie 2.0, etc.), and compare with practical results. Another axis could be to embed some versions of these reachability algorithms, for (very) bounded time horizons, on these low-powered platforms, in order to monitor higher-level functions (trajectory planning, obstacle avoidance, etc.).
Depending on the intern's interest, the emphasis may be more strongly on one or the other of these axes.
Possibility of follow-up in PhD.
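The easy half of the problem, outer approximation, can be sketched with plain interval arithmetic (a toy discrete Euler step on an invented linear system; inner approximation, the hard half discussed above, requires the machinery of [HSCC14, HSCC17]):

```python
# Sketch: outer (over-)approximation of one discretised step of
# x' = -k*x with an uncertain initial interval. Every true trajectory
# starting in x0 lands inside the computed interval.

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def iscale(k, a):
    """Interval scaled by a constant (endpoints swap for k < 0)."""
    return (k * a[0], k * a[1]) if k >= 0 else (k * a[1], k * a[0])

def euler_step_outer(x, h, k):
    """x + h * (-k * x) in interval arithmetic. For this linear map the
    interval image is exact; a nonlinear f would over-approximate."""
    return iadd(x, iscale(h, iscale(-k, x)))

x0 = (0.9, 1.1)                      # uncertain initial state
x1 = euler_step_outer(x0, 0.1, 1.0)  # -> roughly (0.79, 1.01)
```

Proving a property needs the outer set (nothing bad is reachable), while refuting it needs an inner set (something bad is certainly reachable), which is why the internship combines both.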

Bibliography:
[HSCC14] E. Goubault, M. Kieffer, O. Mullier and S. Putot, Inner approximated reachability analysis, Proceedings of Hybrid Systems: Computation and Control, HSCC 2014
[HSCC17] E. Goubault, S. Putot, Forward Inner-Approximated Reachability of Non-Linear Continuous Systems, HSCC 2017

URL sujet détaillé : http://www.lix.polytechnique.fr/Labo/Sylvie.Putot/Stages/stage_sousapprox.pdf

Remarques : co-encadrement par Eric Goubault



SM207-73 Energy sources commitment  
Admin  
Encadrant : Patricia STOLF
Labo/Organisme : IRIT
URL : https://www.irit.fr/datazero/index.php/en/
Ville : Toulouse

Description  

The internship will be done in the SEPIA team, specialised in large-scale distributed systems (HPC and clouds), particularly in scheduling and multi-objective optimization. The SEPIA team is part of the IRIT lab (750 people) in Toulouse.

This subject is in the context of the DATAZERO project funded by the ANR (French National Funding Agency). This project is a collaboration with the FEMTO (Belfort) and LAPLACE (Toulouse) laboratories and with the EATON company (Grenoble), and focuses on the optimization of datacenters using renewable energy sources.

In this subject we are interested in optimizing the energy source commitment (i.e. the scheduling, at different timescales, of the power output of each component in the power system) to deal with server load. The energy sources considered are wind turbines, solar panels, and storage elements (such as batteries and fuel cells). A connection to the traditional grid will be possible for selling purposes in case of over-production.
Different time windows will be considered to address mid- and long-term scheduling. Different demand profiles will be dealt with: predicted energy demands and probabilistic demands.
In this master thesis we aim to study the dynamic adaptation of the energy unit commitment in order to improve energy efficiency.
URL sujet détaillé :

Remarques :



SM207-74 Formal verification of Ocaml's serialisation  
Admin  
Encadrant : Julien TESSON
Labo/Organisme : LACL
URL : http://tesson.julien.free.fr/internship/#serialisation
Ville : Créteil

Description  

# Context
Serialization is the process of transforming a graph of objects
(we use *object* here in the broad sense of a data structure which can
contain references to other *objects*) into a sequence of bytes (or
text), such that an identical object graph can be reconstructed from
this sequence afterwards. There are various reasons why serialization
is needed, such as saving an object's state to persistent storage or
streaming an object across a communication link. More generally, an
object is serialized whenever an equivalent object must be manipulated
in a different program context than its origin. The serialization
process is also referred to as marshalling, pickling, flattening or
persisting. To serialize an object, a basic algorithm recursively
traverses all the object's references to obtain the complete object
representation (the complete object graph, rooted at the serialized
object). During this process, the first occurrence of an object
results in that object being serialized; subsequent references to the
same entity must avoid serializing this object again.
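The basic algorithm just described can be sketched as follows (an illustrative stream format, not OCaml's actual Marshal byte format): a depth-first traversal emits a definition on the first occurrence of a node and a back-reference afterwards, so sharing is preserved.

```python
# Sketch: serialisation with sharing. The first visit of a node emits
# its contents; later visits emit ('ref', id) so shared and cyclic
# structure survives a round trip.

def serialise(node, out=None, ids=None):
    out = [] if out is None else out
    ids = {} if ids is None else ids
    if id(node) in ids:
        out.append(('ref', ids[id(node)]))   # already emitted: share it
        return out
    ids[id(node)] = len(ids)                 # register BEFORE recursing
    if isinstance(node, list):
        out.append(('list', len(node)))
        for child in node:
            serialise(child, out, ids)
    else:
        out.append(('atom', node))
    return out

shared = [1, 2]
graph = [shared, shared]        # the same object referenced twice
stream = serialise(graph)
```

Because a node's identifier is recorded before its children are visited, the traversal also terminates on cyclic graphs; proving precisely this kind of invariant about the real C implementation is the point of the internship.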

# Goal
Most of the existing results on serialization [1,2] deal with the
encoding and decoding process but not with the traversal of the graph
of dependencies between entities in the memory.
In this internship we propose to verify the marshalling operation
performed in Ocaml, and prove properties of this algorithm concerning
both the correct encoding of information and the correct traversal of
the graph.

An abstract model of correct serialisation, written in Isabelle/HOL [3],
will be provided as a starting point for the specification.

Our objective is to reuse the intership results in the study of the
semantics of distributed programming languages and in the proof of
their correct implementation.

# Work Program
In a first time the intern will have to check the c code of Ocaml's
serialisation, to verify that it can be fed to a program
verification framework such as FramaC [4], and, if not, do minor
code modifications in order to fit the language constraints of the
program verification framework.

Then, the necessary parts of the formalisation of correct
serialisation already written in the Isabelle proof assistant will
have to be translated into the verification framework's logic.

Finally, the correctness of the C code with respect to the translated
specification will be established.

The next step, if time allows, will be to use the properties of the serialisation process
in the context of parallel multi-threaded execution (OCaml multicore [5]) in order to assert some correctness properties of the OCaml multicore library.


[1] Alexander Ahern and Nobuko Yoshida. Formalising java rmi with explicit code mobility. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA '05.
[2] Heather Miller, Philipp Haller, Eugene Burmako, and Martin Odersky. Instant pickles: generating object-oriented pickler combinators for fast and extensible serialization. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2013.

[3] Isabelle/HOL. Proof assistant. http://isabelle.in.tum.de/
[4] FramaC. Source-code analysis of C software. http://frama-c.com/
[5] Stephen Dolan , Leo White , Anil Madhavapeddy. Multicore OCaml. In
the OCaml Users and Developers Workshop.
http://ocamllabs.io/doc/multicore.html


URL sujet détaillé :

Remarques : This internship is proposed in the context of an ongoing informal collaboration
between Julien Tesson, assistant professor at Université Paris-Est Créteil, and Ludovic
Henrio, CNRS researcher at I3S, Scale project.

Ludovic Henrio will co-advise the intern, and part of the internship could take place at Sophia-Antipolis if the intern wishes.



SM207-75 Génération de code de fonctions élémentaires  
Admin  
Encadrant : Florent DE DINECHIN
Labo/Organisme : équipe Socrate du laboratoire CITI (INRIA/INSA-Lyon)
URL : metalibm.org
Ville : Lyon, France

Description  

The mathematical library (libm) is a component of the operating system that provides the elementary functions (exponential, logarithm, trigonometric functions and their inverses, hyperbolic functions, ...) and some common special functions (erf, gamma, ...). Performance is critical, since many high-performance computing applications spend a significant fraction of their time evaluating such functions. Each processor evolution requires reassessing the relevance of the algorithmic choices made: teams of engineers at the main processor manufacturers work on reimplementing the libm for each new processor. Moreover, several versions of each function are needed nowadays: different contexts need different specifications, whether in terms of performance (minimising latency, or maximising computation throughput) or in terms of precision (favouring the accuracy and reproducibility of computations, or favouring performance at the expense of accuracy).

Yet, in the face of this complexity, the functions themselves, as mathematical objects, do not change: their implementation is based on a limited collection of well-known recipes, such as polynomial approximation or the tabulation of precomputed values. Developing a libm is therefore currently an art that consists in assembling these recipes as well as possible, to maximise performance under an accuracy constraint. The MetaLibm project aims to formalise this art, with the goal of automating the expertise involved.
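One of those recipe families can be sketched for exp (a toy illustration with Taylor coefficients and loose tolerances, nowhere near a correctly rounded libm): reduce the argument to a small range, evaluate a short polynomial in Horner form, then reconstruct.

```python
# Sketch: exp(x) = 2^k * exp(r) with r = x - k*ln(2), |r| <= ln(2)/2,
# exp(r) approximated by a degree-6 Taylor polynomial in Horner form.

import math

# Taylor coefficients of exp around 0, highest degree first.
COEFFS = [1 / 720, 1 / 120, 1 / 24, 1 / 6, 1 / 2, 1.0, 1.0]

def exp_approx(x):
    k = round(x / math.log(2))      # range reduction
    r = x - k * math.log(2)
    p = 0.0
    for c in COEFFS:                # Horner evaluation
        p = p * r + c
    return math.ldexp(p, k)         # reconstruction: p * 2**k
```

A real libm replaces the Taylor coefficients by a minimax polynomial, controls every rounding error, and proves the final bound, which is precisely the expertise MetaLibm automates.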

This internship will join this project to develop the inverse hyperbolic functions within it. In parallel, the intern will work on generating code optimised for processors of the ARM family (the current prototype produces code for Intel and Kalray).

L'intérêt de ce stage est de couvrir un spectre qui va des mathématiques (analyse et calcul d'erreur fin) jusqu'à l'architecture (utilisation pertinente des ressources des processeurs actuels) en passant par la preuve formelle des codes produits.


Ce stage se déroulera au CITI, sur le campus de la Doua à Lyon.
Metalibm étant écrit en Python et produisant du C, un goût pour ces deux langages est nécessaire.

URL sujet détaillé :

Remarques :



SM207-76 Circuits de transformée de Fourier rapide calculant au plus juste  
Admin  
Encadrant : Florent DE DINECHIN
Labo/Organisme : équipe Socrate du laboratoire CITI (INRIA/INSA-Lyon)
URL : http://flopoco.gforge.inria.fr
Ville : Lyon, France

Description  

The fast Fourier transform (FFT) is at the heart of signal processing. The literature has extensively studied its software implementation and its time/space/working-memory complexity. Most of these articles ignore the question of computational precision, settling for a data type (floating point or fixed point) that is far more precise than necessary. One also regularly sees descriptions of hardware implementations for a given context (for instance, some mobile telephony standard). In this kind of application, the FFT runs on dedicated hardware (VLSI or FPGA), and every computed bit has a cost: it is therefore relevant to ask at what precision each operator of the FFT must compute in order to guarantee a given accuracy of the final result. Currently, articles merely observe, for example, that a 16-bit computation yields a result of sufficient quality for the application. But then, would 15 or 12 bits not do the job? And must all the operators of the FFT work at the same precision?

In this internship, the student will build a generator of FFT architectures that compute "just right". The goal is for the final result to be accurate to the last bit, but also for each adder and each multiplier to be sized so that it never computes a bit that is not useful to the final result. This requires formalizing, in terms of precision, the chaining of computations in the different variants of the FFT, taking into account the mathematical identities on which the FFT is founded. One consequence will be to refine the classical complexity formula (n.log(n) operations on reals) with a parameter p describing the precision, in correct bits, of the output.
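The question can be illustrated with a toy Python experiment (an illustration of the problem, not of the intended generator): a radix-2 FFT whose butterfly outputs are quantized to p fractional bits, compared against the same FFT in double precision.

```python
import cmath

def fft(x, quantize=lambda z: z):
    """Radix-2 decimation-in-time FFT; `quantize` is applied after each
    butterfly, emulating fixed-point operators of limited precision."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], quantize)
    odd = fft(x[1::2], quantize)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)       # twiddle factor
        t = quantize(w * odd[k])
        out[k] = quantize(even[k] + t)
        out[k + n // 2] = quantize(even[k] - t)
    return out

def q_bits(p):
    """Round real and imaginary parts to p fractional bits (fixed point)."""
    s = 2.0 ** p
    return lambda z: complex(round(z.real * s) / s, round(z.imag * s) / s)

# Worst-case output error of a 16-point FFT with 12-bit operators.
sig = [complex((i * 37 % 16) / 16.0) for i in range(16)]
exact = fft(sig)
approx = fft(sig, q_bits(12))
err = max(abs(a - b) for a, b in zip(exact, approx))
```

Sweeping p (or using a different p per stage) and observing `err` is a brute-force version of the error analysis the internship aims to formalize.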

This work will be validated with a prototype in the open-source FloPoCo project. Since FloPoCo is a VHDL generator written in C++, a taste for both languages is required.

References:
On the FFT, see for instance
E. Brigham, Fast Fourier Transform and Its Applications, Prentice Hall.
On FloPoCo, see http://flopoco.gforge.inria.fr/bib/flopoco.html
and in particular references [27] and [33] on that page.


URL sujet détaillé :

Remarques :



SM207-77 Simulation parallèle  
Admin  
Encadrant : Florence PERRONNIN
Labo/Organisme : LIG
URL : https://team.inria.fr/polaris/
Ville : Grenoble

Description  

Large infrastructures (wireless networks, high-performance computing, cloud, smart grids, ...), which are delicate to model because of their large number of parameters and their stochastic nature, are optimized through preliminary simulations before any full-scale deployment. These simulations are costly in computing time, so it is natural to parallelize them.
The term parallel simulation nowadays covers several different paradigms: time-parallel simulation, the synchronous model (network paradigm), optimistic simulation... some of these approaches still being at an exploratory stage.
These different approaches have been implemented in independent simulation software packages. However, it is currently difficult to know whether parallel simulation really saves time, because these packages are validated separately, on different models and platforms. In this internship, the student will evaluate the performance of these different approaches, both analytically (computing bounds on the achievable speedup) and experimentally.

Expected work:
1. State of the art of parallel simulation
2. Design of a common benchmark: a network model and an application model
3. Analytical comparison of the different approaches on the chosen model
4. Selection and experimental comparison of software tools (e.g. ROSS, PSI).

Bibliography
Chandy, K. M., & Misra, J. (1979). Distributed simulation: A case study in design and verification of distributed programs. IEEE Transactions on software engineering, (5), 440-452.
Jefferson, D., Beckman, B., Wieland, F., Blume, L., & DiLoreto, M. (1987). Time warp operating system (Vol. 21, No. 5, pp. 77-93). ACM.
Mubarak, M., Carothers, C. D., Ross, R., & Carns, P. (2012, November). Modeling a million-node dragonfly network using massively parallel discrete-event simulation. In High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion: (pp. 366-376). IEEE.
Cope, J., Liu, N., Lang, S., Carns, P., Carothers, C., & Ross, R. (2011, June). Codes: Enabling co-design of multilayer exascale storage architectures. In Proceedings of the Workshop on Emerging Supercomputing Technologies (Vol. 2011).
Fujimoto, R. M. (2000). Parallel and distributed simulation systems (Vol. 300). New York: Wiley.
Fujimoto, R. (2015, December). Parallel and distributed simulation. In Proceedings of the 2015 Winter Simulation Conference (pp. 45-59). IEEE Press.
Nicol, D., Greenberg, A. and Lubachevsky, B. (1994). Massively Parallel Algorithms for Trace-Driven Cache Simulations, IEEE Trans. Parallel Distrib. Syst. 5(8), 1994, pp. 849-859.
Thi, T. H. D., Fourneau, J. M., & Quessette, F. (2014, June). Time-Parallel Simulation for Stochastic Automata Networks and Stochastic Process Algebra. In International Conference on Analytical and Stochastic Modeling Techniques and Applications (pp. 140-154). Springer International Publishing.
URL sujet détaillé :

Remarques : This internship will be co-advised by Arnaud Legrand. It will also be carried out in collaboration with Jean-Marc Vincent and Vincent Danjean.



SM207-78 Process scheduling under pipeline constraints  
Admin  
Encadrant : Christophe ALIAS
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/christophe.alias/
Ville : Lyon

Description  

Floating-point arithmetic operators (+, -, *, /, sqrt, ...) used in hardware accelerators are pipelined to guarantee the circuit bandwidth. An operation issued at date i produces its result at date i+k, where k is the number of pipeline stages. If that result is used by another operation, the consumer has to wait for its availability: it must be issued at a date j > i+k (pipeline constraint). Otherwise, the pipeline must be frozen until the data is available. Scheduling under pipeline constraints consists in reorganizing the computations to reduce the total execution time while satisfying the pipeline constraints. This problem is known to be NP-complete even on simple sequential codes without tests and loops (basic blocks). Yet the exact solution using integer programming has a reasonable execution time in practice.

The goal of this internship is to address scheduling under pipeline constraints for regular process networks: a set of processes communicating through channels, each process executing an arithmetic operator with control that is predictable at compile time. More precisely, we seek to:
- Predict/bound the latency assuming an optimal use of pipelines
- Build an algorithm which finds a schedule for each process.

The approach will be validated on the kernels of Polybench/C.
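The pipeline constraint described above can be sketched as follows (a toy greedy list scheduler on a hypothetical five-operation DAG, not the intended ILP formulation):

```python
# Minimal sketch of the pipeline constraint: an operation issued at cycle i
# delivers its result at cycle i + K (K pipeline stages), and a dependent
# operation cannot issue before that date. We schedule a small dependence
# DAG greedily, one issue per cycle, and compare two issue orders.

K = 3  # pipeline depth of the (single) arithmetic operator, an assumption

# DAG: op -> list of predecessors whose results it consumes.
deps = {"a": [], "b": [], "c": [], "d": ["a", "b"], "e": ["c", "d"]}

def schedule(order):
    """Issue ops in `order`; return {op: issue cycle} respecting deps."""
    issue, cycle = {}, 0
    for op in order:
        ready = max([issue[p] + K for p in deps[op]], default=0)
        cycle = max(cycle, ready)   # stall if an input is not available yet
        issue[op] = cycle
        cycle += 1                  # one issue slot per cycle
    return issue

def makespan(issue):
    return max(issue.values()) + K  # last result comes K cycles after issue

# A naive order stalls waiting for 'a' and 'b'; issuing 'c' early hides latency.
bad = makespan(schedule(["a", "b", "d", "c", "e"]))    # 11 cycles
good = makespan(schedule(["a", "b", "c", "d", "e"]))   # 10 cycles
```

Reordering independent operations to fill stall cycles is exactly the degree of freedom the scheduler exploits.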

URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/pipeline-dpn.pdf

Remarques : Co-advised with Matthieu Moy, MCF UCBL (LIP)



SM207-79 Mixed size memory model for the x86 processor/Modèle mémoire accès mélangés pour le processeur x86  
Admin  
Encadrant : Luc MARANGET
Labo/Organisme : Inria, centre de recherche de Paris
URL : http://gallium.inria.fr/~maranget
Ville : Paris

Description  

Modern shared memory multi-processors do not follow the simple model of thread execution interleaving. Instead, they exhibit additional, relaxed, behaviours. Studying these behaviours and more generally relaxed memory models is an active field of research. This internship consists in a realistic and practical extension of the memory model of the Intel x86 family of processors. The intern will design the new model and improve the memory model simulator herd7 so that it can handle the new model.
URL sujet détaillé : http://gallium.inria.fr/~maranget/mixed-x86.pdf

Remarques :



SM207-80 Parallel Surrogate-assisted Blackbox Optimization  
Admin  
Encadrant : Bilel DERBEL
Labo/Organisme : The internship is to be conducted within the International Associated Lab LIA-MODO France / Japan, and the bigMO bilateral France / Hong Kong ANR project. A research visit abroad can be scheduled.
URL : http://www.cristal.univ-lille.fr/~derbel
Ville : Lille

Description  

[See detailed description - attached pdf]
Keywords. Optimization, machine learning, surrogates, evolutionary algorithms, parallel computing.
Abstract. The objective of the internship is the design and analysis of distributed/parallel surrogate-assisted algorithms for expensive blackbox optimization problems. Blackbox optimization refers to the situation where nothing about the objective function to be minimized (or maximized) is available, e.g., no derivates, no explicit analytic formulation, etc. In an expensive setting, where evaluating the function at one point is time consuming, the goal is to find a global minimizer (or maximizer) by evaluating (sampling) the least number of points. In this context, we are interested in using surrogates (meta-models, response surfaces) such as Kriging, RBF, which allow one to provide an estimation (or prediction, or approximation) of the function at every point given only a restricted number of samples. Besides, the goal is to set up a fully distributed approach so that the optimization process can scale up with the number of available compute resources. For this purpose, the main challenge is to enable parallelism at the algorithmic level, for instance, by exploring the use of an ensemble of surrogates, the use of cooperative sampling strategies, the use of decomposition, etc. As such, the general methodology is to investigate the foundations of autonomous decentralized optimization algorithms leveraging techniques at the crossroads of a number of fields such as machine learning, global optimization, evolutionary computing, and exploiting the power offered by modern distributed and massively parallel compute platforms.
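As a minimal illustration of the surrogate-assisted loop (a one-dimensional toy where an exact quadratic fit stands in for the Kriging/RBF models mentioned above; the blackbox function and start points are made up):

```python
# Sketch of surrogate-assisted blackbox optimization, under simplifying
# assumptions: the surrogate is the parabola through the three best samples,
# and each iteration spends one expensive evaluation at the surrogate's
# stationary point (assumed to be a minimum, valid here since the samples
# are convex).

def expensive(x):
    return (x - 2.0) ** 4 + 1.0     # hypothetical expensive blackbox

def quad_vertex(p1, p2, p3):
    """x-coordinate of the vertex of the parabola through three samples."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = y1 * (x2**2 - x3**2) + y2 * (x3**2 - x1**2) + y3 * (x1**2 - x2**2)
    den = 2.0 * (y1 * (x2 - x3) + y2 * (x3 - x1) + y3 * (x1 - x2))
    return None if abs(den) < 1e-9 else num / den   # None: degenerate fit

samples = [(x, expensive(x)) for x in (-1.0, 0.5, 4.0)]
for _ in range(6):
    best3 = sorted(samples, key=lambda s: s[1])[:3]
    x_new = quad_vertex(*best3)
    if x_new is None:               # collinear or duplicated samples: stop
        break
    samples.append((x_new, expensive(x_new)))

x_best = min(samples, key=lambda s: s[1])[0]   # close to the true minimizer 2
```

Parallelizing this loop, e.g. by maintaining an ensemble of surrogates that each propose a sample point, is one of the directions mentioned in the subject.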
URL sujet détaillé : http://www.cristal.univ-lille.fr/~derbel/m2projects/M2-internship-surrogates.pdf

Remarques : The internship is to be conducted within the International Associated Lab LIA-MODO France / Japan, and the bigMO bilateral France / Hong Kong ANR project. A research visit abroad can be scheduled.

The internship can be funded and may be extended into a 3-year funded PhD thesis.



SM207-81 Cell morphing on data structures  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

Recent work (Cell morphing, Monniaux & Gonnord, SAS 2016) has shown how to largely automate proofs of procedures operating over arrays: the approach automatically finds inductive loop invariants for this.

The purpose of the internship is to extend this to more general data structures.

URL sujet détaillé :

Remarques : Co-advised with Laure Gonnord (LIP). The internship can take place at LIP or at VERIMAG.



SM207-82 Factorisation de polynômes et satisfaisabilité d'inéquations polynomiales  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

Exploiting polynomial factorization to decide the satisfiability of systems of polynomial inequalities

CONTEXT

The avionics, automotive and aerospace industries use embedded programs for the automatic control of their vehicles.

To analyze the behavior of these programs and certify their correct operation, tools (static analyzers, SAT & SMT solvers) performing a symbolic analysis of the programs are used. These analyses make it possible to verify that a program satisfies safety properties such as "The speed of the vehicle will never exceed xxx km/h".

At the VERIMAG laboratory we develop such tools (in OCaml, C, C++) and we prove their correctness in Coq.

INTERNSHIP SUBJECT

Programs that compute trajectories use multivariate polynomials, classically in three variables x, y, z, such as P(x,y,z) = x^2 + y^3 + x*z - y*z, in particular to approximate functions such as sin, cos, log, tan, ...

Guaranteeing the correctness of such programs reduces to deciding whether there exists a point (x,y,z) satisfying a system of polynomial equations P(x,y,z)=0 and polynomial inequalities Q(x,y,z)>=0.
This amounts to studying the sign of polynomials over the (x,y,z) space, and to solve this problem it is advantageous to factor the polynomials:

P(x,y,z) = P1(x,y,z) * P2(x,y,z) vanishes where P1 or P2 vanishes;

Q(x,y,z) = Q1(x,y,z) * Q2(x,y,z) is positive where Q1 and Q2 are both positive and where Q1 and Q2 are both negative.

The study then reduces to that of the lower-degree polynomials P1, P2, Q1, Q2.

Previous work with B. Grenet (LIRMM) showed that the affine factors of P and Q, i.e. factors of the form c0 + c1 x + c2 y + c3 z, can be extracted efficiently, and tools are available to easily study the sign of polynomials that have affine factors.

Our study revealed that the usual SMT solvers do not exploit this information.

THE INTERNSHIP consists in improving SMT solvers by exploiting factorization and the rule of signs (plus * plus = plus, minus * minus = plus, ...).
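The rule of signs on a factored polynomial can be sketched as follows (hypothetical example Q(x,y) = x^2 - y^2 with affine factors x-y and x+y):

```python
# Toy illustration of the rule of signs: once Q = Q1 * Q2 is factored, the
# sign of Q at a point is the product of the signs of its (lower-degree,
# here affine) factors, so a solver only needs to reason about the factors.

def sign(v):
    return (v > 0) - (v < 0)       # -1, 0 or +1

Q  = lambda x, y: x * x - y * y    # Q(x,y) = x^2 - y^2
Q1 = lambda x, y: x - y            # affine factor
Q2 = lambda x, y: x + y            # affine factor

# Q >= 0 is thus equivalent to (Q1 >= 0 and Q2 >= 0) or (Q1 <= 0 and Q2 <= 0),
# two linear conditions, which SMT solvers handle far more easily than the
# original quadratic one.
ok = all(sign(Q(x, y)) == sign(Q1(x, y)) * sign(Q2(x, y))
         for x in range(-5, 6) for y in range(-5, 6))
```

The internship's goal is precisely to make SMT solvers apply this kind of reasoning instead of treating Q as an opaque nonlinear term.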

KEYWORDS

Multivariate polynomials, factorization, SMT solvers (Satisfiability Modulo Theories), experimental study


PREREQUISITES

The candidate should:
- be comfortable with the mathematical notions of (univariate) polynomials and factors,
- be comfortable with logical reasoning,
- know how to program in Python,
- be willing to learn how to use an SMT solver and a computer algebra system (Sage, Mathematica),
- be interested in automated proof.

URL sujet détaillé : http://im2ag-pcarre.e.ujf-grenoble.fr/CDExportProjetHtml?idSujet=2249

Remarques : Co-advised with Michaël Périn.



SM207-83 Compilation de Programmes C en Formules Logiques  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

INTERNSHIP CONTEXT
The avionics, automotive and aerospace industries use embedded programs for the automatic control of their vehicles.

To analyze the behavior of these programs and certify their correct operation, tools (static analyzers, SAT & SMT solvers) performing a symbolic analysis of the programs are used. These analyses make it possible to verify that a program satisfies safety properties such as "The speed of the vehicle will never exceed xxx km/h".

At the VERIMAG laboratory we develop such tools (in OCaml, C, C++) and we prove their correctness in Coq.


INTERNSHIP SUBJECT

Compiling C programs to logical formulas, in order to analyze C programs with automated deduction tools.

Each instruction of an imperative program can be translated into a logical implication by the weakest-precondition calculus. A program then becomes a conjunction of logical formulas (Horn clauses) that can be handed to an SMT (Satisfiability Modulo Theories) solver, which can then be queried about the properties the program satisfies, thereby verifying certain correctness properties (no division by 0, no out-of-bounds array access, ...).



Horn clauses are an interesting format in several respects:

- a format independent of the programming language, since the semantics of the language is made explicit during the translation to Horn clauses;

- an input format for a great many logic engines;

- an exchange format between verification tools, simple to parse and to generate.



The objective of the internship is to write a C-to-Horn-clauses compiler relying on the architecture of the CompCert C compiler (the only compiler certified bug-free in Coq).

The first step is to define the transformation in OCaml and to optimize it by eliminating trivial formulas; if time permits, it will then be incorporated into the Coq version of CompCert and proved in Coq.
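As an illustration of the target format (a hand-written sketch for a hypothetical toy program, not output of the intended compiler), here is the Horn-clause encoding of a simple counting loop in SMT-LIB syntax, generated as plain strings:

```python
# Sketch of the translation described above, on a made-up toy program:
#     i = 0; while (i < n) i = i + 1; assert(i == n)   (for n >= 0)
# The loop head gets an unknown predicate Inv; the program becomes three
# Horn clauses that a Horn-capable SMT solver (e.g. Z3) can solve for Inv.
# Here we only *generate* the SMT-LIB text.

clauses = [
    # Initialization: entering the loop with i = 0 establishes Inv.
    "(assert (forall ((n Int)) (=> (>= n 0) (Inv 0 n))))",
    # Preservation: one loop iteration keeps Inv.
    "(assert (forall ((i Int) (n Int)) "
    "(=> (and (Inv i n) (< i n)) (Inv (+ i 1) n))))",
    # Safety: on loop exit, the assertion must hold.
    "(assert (forall ((i Int) (n Int)) "
    "(=> (and (Inv i n) (>= i n)) (= i n))))",
]

smtlib = "\n".join(
    ["(set-logic HORN)", "(declare-fun Inv (Int Int) Bool)"]
    + clauses + ["(check-sat)"])
```

The intended compiler would derive such clauses mechanically from the weakest-precondition calculus, instead of by hand as here.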



KEYWORDS

Compilation, weakest precondition, program proof, Satisfiability Modulo Theories solvers, Coq proof assistant.



PREREQUISITES

The candidate should:

- know the C language and the principles of compilation,

- be willing to learn a new language (OCaml),

- be comfortable with logical reasoning,

- be interested in automated proof.
URL sujet détaillé : http://im2ag-pcarre.e.ujf-grenoble.fr/CDExportProjetHtml?idSujet=2246

Remarques : Co-advised with Michaël Périn.



SM207-84 Recherche de bugs et de vulnérabilités dans des systèmes distribués par exécution concolique  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

Symbolic execution consists in simulating a program on partially symbolic data: for instance, some bytes of the input file are kept symbolic. The aim is to explore all the reachable execution paths of the program, in order to find bugs or security holes.

Keeping many variables symbolic is very costly, and sometimes impossible (for instance when the program calls an external library or the operating system, which cannot be explored symbolically). Some values are then made concrete (if everything is made concrete, we are back to a classical execution on the processor or on a processor simulator). This mixed symbolic+concrete execution is called "concolic".

For example, Microsoft has an internal tool called SAGE to find security holes in Windows, Office, etc. Among publicly available tools, one finds notably KLEE, based on the LLVM compilation framework.

When programming distributed software (message passing between machines), it is very easy and common to introduce bugs, because it is hard to think of all the possible execution orders. One would therefore like to search for them with concolic tools, but KLEE does not support simulating execution interleavings!

The goal of the internship is thus to study how to extend KLEE in this direction (where possible, this will be illustrated with MPI message exchanges).
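The core loop of concolic exploration (run concretely, record branch decisions, negate one, solve for a new input, re-run) can be sketched in a few lines, with a brute-force search standing in for the SMT solver and a hypothetical two-branch program under test:

```python
# Toy sketch of the "flip a branch, solve, re-execute" loop at the heart of
# concolic execution. The program under test is a made-up function with two
# branches; instead of an SMT solver we brute-force a small input domain to
# satisfy the negated path condition.

def program(x, trace):
    # Each branch records (predicate-as-function, outcome) in the trace.
    p1 = lambda v: v > 10
    trace.append((p1, p1(x)))
    if p1(x):
        p2 = lambda v: v % 2 == 0
        trace.append((p2, p2(x)))
        return "big-even" if p2(x) else "big-odd"
    return "small"

def solve(constraints):
    """Stand-in for an SMT solver: find an input meeting all constraints."""
    for v in range(-50, 51):
        if all(pred(v) == wanted for pred, wanted in constraints):
            return v
    return None

# Explore: start from a concrete seed, then flip recorded branch decisions.
seen, worklist = set(), [[]]        # worklist of path-condition prefixes
while worklist:
    prefix = worklist.pop()
    x = solve(prefix) if prefix else 0          # seed input is 0
    if x is None:
        continue
    trace = []
    seen.add(program(x, trace))
    # Negate each decision beyond the known prefix to reach new paths.
    for i in range(len(prefix), len(trace)):
        pred, taken = trace[i]
        worklist.append(list(trace[:i]) + [(pred, not taken)])
```

Extending such a loop to distributed programs means that, besides input bytes, the *message interleaving* itself becomes part of the path condition to flip, which is what KLEE currently lacks.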


URL sujet détaillé :

Remarques : (collaboration with Camille Coti, LIPN, OpenMPI developer)



SM207-85 Domain popularity ranking based on passive DNS traffic  
Admin  
Encadrant : Maciej KORCZYNSKI
Labo/Organisme : Grenoble Informatics Laboratory-Grenoble Alps University
URL : https://www.liglab.fr
Ville : Grenoble

Description  

Context:

The most popular techniques for measuring the popularity of a website depend on search hits or on the deployment of client-side measurement agents, which may make them susceptible to various manipulation methods and may therefore undermine their results. The most prominent example is Alexa Traffic Ranking [1], which is based on aggregated historical traffic from millions of Alexa Toolbar users [2]. It therefore depends on the adoption of the Alexa Browser extension. Another important limitation is that websites with low measured traffic will not be accurately ranked [3].

Internship Program and Objectives:

The objective of this internship is to create an alternative ranking based on the analysis of Domain Name System (DNS) traffic rather than HTTP traffic, which would potentially overcome the above-mentioned limitations. While there have been some recent efforts using various active [4] and passive measurements [5, 6, 7] of DNS traffic, which reflect the popularity of Internet activity, we argue that a ranking based on passively observed DNS traffic coming from thousands of sensors located around the world would be the most accurate.

The project would consist of the analysis of DNS queries and the development of a popularity algorithm that takes into account the number of queries observed in the passive dataset in defined time windows, the Time to Live (TTL) value of each domain, indicating how long the domain "stays" in DNS caching servers, and more.

To validate this work, the intern would be able to compare the obtained results with ground truth data, more specifically with the DNS traffic collected at the authoritative domain name servers of one of the leading Top Level Domain (TLD) registry operators.

The research would include a comparison with the Alexa 1M list and Cisco Umbrella popularity rankings. A list of 1 Million most popular domains along with the algorithm could be available for Internet users free of charge.
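A possible shape for such a score, purely hypothetical and for illustration only: correct the observed query count by the TTL, since a long-TTL domain stays cached at resolvers and generates fewer queries for the same underlying popularity.

```python
import math

# Hypothetical sketch of a TTL-aware popularity score: the raw query count
# of a domain in a time window is corrected by its TTL, because a caching
# resolver asks at most once per TTL and thus hides demand for long-TTL
# domains. The exact weighting below is an assumption, not the algorithm
# the internship is expected to produce.

WINDOW = 3600  # observation window in seconds (illustrative choice)

def popularity(query_count, ttl):
    """TTL-corrected, log-scaled popularity score."""
    # A resolver caching the record for `ttl` seconds emits roughly
    # WINDOW / ttl fewer queries than the real demand when ttl > WINDOW.
    correction = max(1.0, ttl / WINDOW)
    return math.log10(1 + query_count * correction)

# A long-TTL domain with fewer observed queries can outrank a short-TTL one.
observed = {"a.example": (500, 86400),   # (queries seen, TTL in seconds)
            "b.example": (800, 60)}
ranked = sorted(observed, key=lambda d: -popularity(*observed[d]))
```

Calibrating such a correction against the TLD-level ground truth mentioned above would be one way to validate the chosen weighting.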

References
[1] "Alexa Top Sites" https://aws.amazon.com/alexa-top-sites.
[2] "The Alexa Extension" http://www.alexa.com/toolbar.
[3] "How are Alexa's traffic rankings determined" https://support.alexa.com/hc/en-us/articles/200449744-How-are-Alexa-s-traffic-rankings-determined-.
[4] M. A. Rajab, F. Monrose, A. Terzis, and N. Provos, "Peeking through the cloud: DNS-based estimation and its applications" in Proceedings of the 6th International Conference on Applied Cryptography and Network Security, ser. ACNS'08, 2008, pp. 21. [Online]. Available: https://www.cs.jhu.edu/~moheeb/webpage_files/DNS.pdf
[5] L. Deri, S. Mainardi, M. Martinelli, and E. Gregori, "Exploiting DNS traffic to rank internet domains" in 2013 IEEE International Conference on Communications Workshops (ICC), June 2013, pp. 1325-1329. [Online]. Available: http://luca.ntop.org/tricans2013.pdf
[6] A. Mayrhofer, "How popular is this Domain? Yet another (DNS based) approach" http://schd.ws/hosted_files/icann58copenhagen2017/7a/DNS%20Magnitude%20-%20Mayrhofer.pdf.
[7] "Cisco Umbrella 1 Million" https://umbrella.cisco.com/blog/blog/2016/12/14/cisco-umbrella-1-million/.


Prerequisites:
- Good knowledge of algorithms
- Good knowledge of programming (e.g. Python)
- Knowledge of networking would be a plus

URL sujet détaillé : http://mkorczynski.com/Popularity.pdf

Remarques : This internship will be performed in close collaboration with industry partners. It can be followed by a PhD thesis.



SM207-86 Automatic Theorem Proving in Coq  
Admin  
Encadrant : Chantal KELLER
Labo/Organisme : LRI, Univ. Paris-Sud
URL : https://www.lri.fr/~keller/index-en.html
Ville : Orsay

Description  

SMTCoq is a plugin for the Coq interactive theorem prover that offers the possibility to automatically discharge Coq goals to external automatic solvers.

The objective of the internship is to build expressive and reliable Coq automatic tactics out of this interaction with external solvers.

(See detailed subject.)
URL sujet détaillé : https://www.lri.fr/~keller/Documents-recherche/Propositions-stage/smtcoq-en.pdf

Remarques : - Co-advisor : Valentin Blot
- Funding available
- Possibility to continue for a PhD



SM207-87 Graph and Automata Algorithms for Verification  
Admin  
Encadrant : Laurent DOYEN
Labo/Organisme : LSV, ENS Cachan (Laboratoire Spécification et Vérification)
URL : http://www.lsv.fr/~doyen/
Ville : Cachan (Paris)

Description  

Graph and automata theory are widely used in the algorithmic solution of fundamental problems in verification. Verification problems often boil down to solving reachability questions in graphs, optimization problems, and combinatorial problems in automata.

Fundamental algorithmic problems in computer science have been well studied, such as the shortest path problem, the travelling salesman problem, etc.
A similar classical problem in discrete planning is the finite-horizon planning problem, where the input is a directed graph with weights
assigned to every edge and a time horizon T, and the goal is to find a path of length T that maximizes the total utility defined as the sum of the weights of the path. This computational problem for finite-horizon planning has applications in artificial intelligence and robotics, as well as in control theory and game theory.

In this internship (with possible continuation as a phd thesis), we consider relaxations of the finite-horizon problem where the original question with a fixed horizon T is replaced by an expected time horizon, either given through a fixed stopping-time distribution, or through an adversarial distribution where the stopping-time distribution is unknown and decided by an adversary. We are looking for algorithmic solutions and structural properties in the case
of graphs, as well as in more powerful models such as Markov processes, pushdown graphs, and timed systems. Several theoretical questions can be investigated and the solutions and heuristics can possibly lead to prototype implementations.
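For a fixed horizon T, the finite-horizon planning problem described above admits a textbook dynamic program over V[t][v], the best total utility of a path with exactly t edges starting at v; the three-node graph below is a made-up example.

```python
# Dynamic program for fixed-horizon planning:
#   V[t][v] = max over edges (v, u) of  w(v, u) + V[t-1][u],  V[0][v] = 0.
# The relaxations studied in the internship replace the fixed T by an
# expected or adversarial stopping time; this sketch covers only the
# classical fixed-T case.

NEG = float("-inf")

def best_path_value(edges, nodes, start, T):
    """Maximum total weight of a path of exactly T edges from `start`."""
    V = {v: 0.0 for v in nodes}          # horizon 0: empty path, utility 0
    for _ in range(T):
        V = {v: max([w + V[u] for (u, w) in edges.get(v, [])], default=NEG)
             for v in nodes}
    return V[start]

# edges[v] = list of (successor, edge weight); hypothetical toy graph.
edges = {"a": [("b", 1.0), ("c", 5.0)],
         "b": [("a", 1.0)],
         "c": [("a", 0.0)]}
val = best_path_value(edges, ["a", "b", "c"], "a", 4)   # path a-c-a-c-a: 10.0
```

Under a stopping-time distribution, the same recursion would be averaged over horizons, which is where the new algorithmic questions begin.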


URL sujet détaillé : http://www.lsv.fr/Stages/Fichier/m2-18-ld-2.pdf

Remarques :



SM207-88 Algorithms for Expected-Time Reachability  
Admin  
Encadrant : Laurent DOYEN
Labo/Organisme : LSV, ENS Cachan (Laboratoire Spécification et Vérification)
URL : http://www.lsv.fr/~doyen/
Ville : Cachan (Paris)

Description  


Timed automata are a simple natural model of computation for the specification
of timed behaviors. They are finite automata extended with clocks, i.e. real-valued variables that increase with the passage of time. Timed automata have been used in a wide range of applications where real-time aspects of computation is important, and they are supported by a variety of efficient tools and model-checkers. The main algorithmic property of timed automata is that the reachability problem (i.e., to decide if a given state of the automaton is accessible from the initial state) is decidable and PSPACE-complete. Some extensions of timed automata with weights and probabilities, as well as robustness questions are also decidable.
However, extending timed automata with stopwatches (real-valued clock variables that can pause while counting time) leads to undecidability of the reachability problem [HKPV98].

Recently, the time-bounded variant of the reachability problem has been considered for real-time systems, where the final state should be reached within T time units, where T is a given constant. This formulation gives rise to decidability of the reachability problem for timed automata extended with stopwatches.

In this internship (with possible continuation as a phd thesis), we consider generalizations of the time-bounded reachability problem where the original question with a fixed horizon T is replaced by an expected time horizon, either given through a fixed stopping-time distribution, or through an adversarial distribution where the stopping-time distribution is unknown and decided by an adversary. We are looking for algorithmic solutions and structural properties in the case
of timed automata, including robustness analysis. Several theoretical questions can be investigated and the solutions and heuristics can possibly lead to prototype implementations.

URL sujet détaillé : http://www.lsv.fr/Stages/Fichier/m2-18-ld-1.pdf

Remarques :



SM207-89 Formalizing higher-order process calculi in Coq  
Admin  
Encadrant : Sergueï LENGLET
Labo/Organisme : Irisa
URL : https://members.loria.fr/SLenglet/
Ville : Rennes

Description  

Process calculi, like CCS or the pi-calculus, are formal models of distributed and concurrent systems where agents are represented as message-passing processes. Higher-order calculi allow executable processes in messages; they are used to model dynamic reconfiguration in a network for instance.

We want to formalize and provide tools to reason about higher-order process calculi in the Coq proof assistant [1]. A first issue is the representation of a process calculus in Coq, and in particular of its binding structure. In a preliminary work [2], we provide a formalization of the higher-order process calculus HOpi using the locally nameless representation for binders.

The goal of this internship is to revisit this formalization using a different representation of binders, called the nominal representation. The intern will write the syntax and semantics of the calculus using the nominal representation, and then some of the proofs of [2], to compare the pros and cons of the two representations when formalizing higher-order calculi.

[1] https://coq.inria.fr/
[2] https://hal.inria.fr/hal-01614987
URL sujet détaillé :

Remarques : Knowledge in Coq is not mandatory. A taste for formal languages (like lambda calculus) is preferable.



SM207-90 Genome Rearrangements considering Intergenic Sizes  
Admin  
Encadrant : Guillaume FERTIN
Labo/Organisme : LS2N, Université de Nantes
URL : https://ls2n.fr/equipe/combi/
Ville : Nantes

Description  

See PDF file at
http://pagesperso.lina.univ-nantes.fr/~fertin-g/sujetM2.pdf
URL sujet détaillé : http://pagesperso.lina.univ-nantes.fr/~fertin-g/sujetM2.pdf

Remarques :



SM207-91 Continuity analysis to detect forgiving regions in code  
Admin  
Encadrant : Benoit BAUDRY
Labo/Organisme : KTH-ICT
URL : http://people.rennes.inria.fr/Benoit.Baudry/
Ville : Stockholm, Suède

Description  

The goal of this project is to exploit the sound techniques of continuity analysis, in order to build analyzers that can automatically locate forgiving regions in large programs. More specifically, the student is expected to:
- survey the state of the art in continuity analysis and differential privacy
- experiment with existing tools
- study the feasibility of continuity analysis for Java methods
- evaluate static analysis to detect forgiving regions in Java code

URL sujet détaillé : http://people.rennes.inria.fr/Benoit.Baudry/continuity-analysis-to-detect-forgiving-regions-in-code/

Remarques : Housing in Stockholm can be funded.



SM207-92 Even nicer interpolants  
Admin  
Encadrant : David MONNIAUX
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~monniaux/
Ville : Grenoble

Description  

You wish to find inductive invariants (as usual in program verification). From a logical point of view, these can be seen as "Craig interpolants" of certain formulas.

An idea: before trying to find complicated interpolants (say, a disjunction of conjunctions of 25 linear inequalities), first attempt to find simple interpolants (say, ONE linear inequality).

This is the idea behind some recent work (Hiroshi Unno, Tachio Terauchi; Inferring Simple Solutions to Recursion-Free Horn Clauses via Sampling, TACAS 2015), but we would like to improve on that, both on scope and efficiency.
URL sujet détaillé :

Remarques :



SM207-93 Y a-t-il des bugs dans GNU MPFR ?  
Admin  
Encadrant : Paul ZIMMERMANN
Labo/Organisme : Inria/LORIA
URL : https://www.loria.fr
Ville : Nancy

Description  

The goal of the internship is to find bugs in GNU MPFR, via two complementary routes: a direct one, looking for bugs that do not depend on the machine word size, and which can therefore be found by an exhaustive search at a small word size; and an indirect one, attempting to formally prove that no bug exists, where a blocking point in the formal proof guides us toward a potential bug.

On the one hand, we will therefore reimplement, in a new piece of software (mini-mpfr), the algorithms used for the basic arithmetic operations of MPFR (addition, subtraction, multiplication, division, square root). These algorithms are based on elementary operations on arrays of words (limbs) implemented by the GMP library. On modern machines, a limb has 64 bits.
mini-mpfr will make the number w of bits per limb a parameter; with w=2 or w=3, we hope to detect bugs more easily, either through exhaustive tests on a small number of words or through random tests.
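The word-size parameterization can be sketched as follows (a hypothetical illustration of the mini-mpfr idea, not MPFR code): with w = 2 and two limbs, a full addition has only 256 input pairs, so it can be tested exhaustively against Python's exact integers.

```python
# Hypothetical sketch of the mini-mpfr idea: arithmetic on arrays of
# w-bit limbs, with w a parameter, so that exhaustive tests become cheap.

def limbs_from_int(n, w, k):
    """Little-endian array of k limbs of w bits each representing n."""
    return [(n >> (w * i)) & ((1 << w) - 1) for i in range(k)]

def int_from_limbs(limbs, w):
    return sum(l << (w * i) for i, l in enumerate(limbs))

def limb_add(a, b, w):
    """Schoolbook addition of two limb arrays; returns (sum_limbs, carry_out)."""
    mask = (1 << w) - 1
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & mask)
        carry = s >> w
    return out, carry

# Exhaustive test for tiny limbs (w = 2, two limbs => 4-bit integers):
w, k = 2, 2
for m in range(1 << (w * k)):
    for n in range(1 << (w * k)):
        s, carry = limb_add(limbs_from_int(m, w, k), limbs_from_int(n, w, k), w)
        assert int_from_limbs(s, w) + (carry << (w * k)) == m + n
```

The same parameterization applied to MPFR's rounding and normalization code is where word-size-independent bugs would be expected to surface.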

On the other hand, we will try to formally prove the algorithms used by MPFR, both in small precision and in arbitrary precision. We thus hope to find bugs that have gone unnoticed so far, or to detect dead branches, which would make it possible to optimize the MPFR library.

URL sujet détaillé : https://members.loria.fr/PZimmermann/bedop.pdf

Remarques : This internship is proposed as part of a collaboration with Karthik Bhargavan (Prosecco team, Inria Paris).



SM207-94 Cryptanalysis of multivariate schemes  
Admin  
Encadrant : Renaud DUBOIS
Labo/Organisme : Thales (laboratoire Chiffre)
URL : https://www.thalesgroup.com/fr
Ville : Gennevilliers

Description  

Context
=======
In the context of the NIST call for proposals for the standardization of post-quantum cryptography, schemes relying on problems based on multivariate polynomials will be submitted. The difficulty in designing such cryptosystems is to find a central system that can be expressed in terms of multivariate quadratic polynomials and into which an easy-to-invert trapdoor can be inserted. Several constructions have been proposed, which essentially fall into two categories: single-field schemes such as UOV, and mixed-field schemes such as C*, HFE, HMFEv-, and ABC.
Part of the internship will consist in surveying these proposals, collecting the available implementations, and comparing them. We will study the algorithmic tools needed for their implementation, the possible optimizations, and the specific and generic characteristics that could be considered for the development of hardware accelerators. One of the main cryptanalysis tools for these schemes is Gröbner bases; understanding why this tool is more or less effective against the various schemes is one of the research directions of this internship.
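For intuition, the basic public-key object can be sketched in a few lines (a hypothetical toy, unrelated to any actual NIST submission): the public key is a random-looking system of quadratic polynomials over GF(2), and the point of the trapdoor is that, without it, inverting this map is hard.

```python
# Toy sketch (hypothetical, not any submitted scheme): a public key in
# multivariate cryptography is a set of quadratic polynomials over GF(2);
# verifying against the public key just means evaluating this map at x.
import random

def random_mq_system(n, m, seed=0):
    """m random quadratic polynomials in n variables over GF(2)."""
    rng = random.Random(seed)
    # each polynomial: coefficients for x_i*x_j (i <= j), linear terms, constant
    return [{"quad": [[rng.randrange(2) for _ in range(n)] for _ in range(n)],
             "lin": [rng.randrange(2) for _ in range(n)],
             "const": rng.randrange(2)} for _ in range(m)]

def evaluate(system, x):
    """Evaluate the public map at the bit vector x (all arithmetic mod 2)."""
    out = []
    for p in system:
        v = p["const"]
        for i in range(len(x)):
            v ^= p["lin"][i] & x[i]
            for j in range(i, len(x)):
                v ^= p["quad"][i][j] & x[i] & x[j]
        out.append(v)
    return out

pk = random_mq_system(n=6, m=4)
digest = evaluate(pk, [1, 0, 1, 1, 0, 1])   # m-bit output of the public map
```

A cryptanalyst models the equations "evaluate(pk, x) = digest" as a polynomial system and attacks it with, e.g., Gröbner-basis computations; the efficiency gap between schemes comes from the hidden algebraic structure of their central maps.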

Internship description
===================
1. First, a bibliographical study of the schemes submitted to NIST will be carried out, together with a collection of the available source code.
2. Second, the implementations of the schemes will be analyzed, and their relative performance on different architectures summarized.
3. Finally, a small subset of schemes will be selected for an incremental security analysis. Each scheme will first be reduced to a toy example, then a sequence of parameter sizes will be defined on which the efficiency of various cryptanalysis tools will be measured. The goal of the analysis is to build a model of the security level of these schemes and to (in)validate the claims made by their designers.

Duration of the work
================
The planned duration of this internship is 6 months.
URL sujet détaillé : www.di.ens.fr/~prest/Stages/2018_stages_lch_HFE.pdf

Remarques : The internship will start around April 2018 and will run for a period of 6 months at the CRISTAL site of Thales Communications & Security in Gennevilliers, within the LCH cryptography laboratory.
We are looking for a student motivated by working in an R&D team within a large group, with good skills in mathematics complemented by programming skills.
As an indication, the gross monthly pay is about 1250 euros.



SM207-95 Improving a post-quantum signature scheme  
Admin  
Encadrant : Thomas PREST
Labo/Organisme : Thales (laboratoire Chiffre)
URL : http://www.di.ens.fr/~prest/
Ville : Gennevilliers

Description  

Context
=======
Given the growing threat that quantum computers pose to currently deployed public-key cryptography, standardization efforts for so-called "post-quantum" cryptography have been launched, the most notable being the NIST call for proposals.
Together with other partners (IBM, IRISA, etc.), Thales intends to submit a lattice-based signature scheme: Falcon. The goal of this internship is to improve this scheme in a broad sense. In particular, two algorithmic techniques are used in it, related to:
- solving equations of the form fG - gF = 1 mod q, where f, g, q are given as input and f, g, F, G are in Z[x];
- the use of Gaussian samplers over lattices.
The first technique requires very large integers. The second makes intensive use of floating-point arithmetic. In both cases this makes their implementation complex, especially on embedded systems.
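The second point can be illustrated by the textbook rejection sampler below (a hedged sketch, not Falcon's actual sampler): it makes explicit where floating-point exp() enters, which is precisely the dependency the internship aims to eliminate.

```python
# Textbook discrete Gaussian sampler over Z by rejection (hypothetical
# baseline, not Falcon's sampler): it relies on floating-point exp(),
# exactly the dependency one would like to remove for embedded targets.
import math
import random

def sample_z_gaussian(sigma, center=0.0, tail=10, rng=random):
    """Sample k in Z with probability proportional to exp(-(k-center)^2 / (2 sigma^2))."""
    lo = int(math.floor(center - tail * sigma))
    hi = int(math.ceil(center + tail * sigma))
    while True:
        k = rng.randint(lo, hi)                    # uniform candidate
        rho = math.exp(-((k - center) ** 2) / (2.0 * sigma ** 2))
        if rng.random() < rho:                     # accept with probability rho <= 1
            return k
```

Floating-point-free alternatives typically replace exp() by fixed-point approximations or table-based methods, which is one direction goal (b) below could explore.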

Internship description
===================
The internship will proceed as follows:
1. First, the intern will carry out a bibliographical study of Falcon and its "ancestors".
2. Then, their goal will be to develop algorithms that do not suffer from the limitations stated above, namely:
(a) an algorithm solving the equation fG - gF = 1 mod q without using large integers;
(b) a Gaussian sampler that does not use floating-point arithmetic;
(c) more generally, any improvement that would make the scheme more efficient (in time and/or memory), more secure, or simpler to implement will be considered.
We are looking for candidates with solid skills in algebra and algorithmics and a strong spirit of initiative. Sound notions of probability and proficiency in a programming language will also be appreciated. Finally, the candidate should demonstrate good teamwork skills.

Duration of the work
====================
The planned duration of this internship is 6 months.
URL sujet détaillé : www.di.ens.fr/~prest/Stages/2018_stages_lch_LBC.pdf

Remarques : The internship will start around April 2018 and will run for a period of 6 months at the CRISTAL site of Thales Communications & Security in Gennevilliers, within the LCH cryptography laboratory.
We are looking for a student motivated by working in an R&D team within a large group, with good skills in mathematics complemented by programming skills.
As an indication, the gross monthly pay is about 1250 euros.



SM207-96 Private Set Intersection  
Admin  
Encadrant : Emeline HUFSCHMITT
Labo/Organisme : Thales (laboratoire Chiffre)
URL : https://www.thalesgroup.com/fr
Ville : Gennevilliers

Description  

Context
=======
Private Set Intersection (PSI) is a tool allowing two parties to learn the intersection of their respective databases without sharing any information beyond this intersection. In particular, neither party wants to reveal the size of its database or let the other party learn anything about its contents. With immediate applications in social networks and messaging systems, PSI algorithms have to handle amounts of data that most of them cannot support.
To address this problem, many solutions have been proposed in the literature, trying to take into account both efficiency and security constraints. The first solutions used protocols based on homomorphic encryption or other asymmetric encryption techniques. The 2012 article by Huang, Evans and Katz marked a turning point in this line of research: it proposed a solution based on Yao's garbled circuits. Since then, advances in the PSI field have relied on sophisticated cryptographic primitives: multi-party computation, Bloom filters, oblivious transfer, etc.
The objective of this internship is to master the cryptographic techniques underlying the problem in order to make a reliable comparison of the proposals in the literature, and then to produce a working implementation. This implementation may serve as the basis for a software demonstrator. To this end, the internship is carried out in collaboration with the software development team of Thales Communications & Security.
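One of the oldest PSI constructions, based on commutative Diffie-Hellman-style masking, can be sketched in a few lines (an illustration only: the modulus is a toy, and a real protocol would involve actual message exchanges and careful group choices):

```python
# Classic DH-based PSI sketch (simplified illustration, not a production
# protocol): both parties raise hashed elements to secret exponents; since
# exponentiation commutes, H(x)^(ab) matches iff x is in both sets.
import hashlib
import random

P = 2**127 - 1          # toy prime modulus (far too small for real security)

def h(item):
    """Hash a string into the group Z_P^*."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def psi(set_a, set_b, rng=random):
    a = rng.randrange(2, P - 1)              # Alice's secret exponent
    b = rng.randrange(2, P - 1)              # Bob's secret exponent
    # Alice sends H(x)^a; Bob raises them to b and sends back H(x)^(ab),
    # together with his own masked values H(y)^b.
    a_masked = {pow(h(x), a, P): x for x in set_a}
    ab = {pow(v, b, P): x for v, x in a_masked.items()}
    b_masked = [pow(h(y), b, P) for y in set_b]
    ba = {pow(v, a, P) for v in b_masked}    # Alice raises Bob's values to a
    return {x for v, x in ab.items() if v in ba}

common = psi({"alice@x.org", "carol@x.org"}, {"carol@x.org", "dave@x.org"})
```

Modern PSI protocols replace this quadratic-exponentiation pattern with oblivious transfer extensions and hashing tricks to scale to millions of items, which is exactly the trade-off space the internship would survey.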

Internship description
===================
The internship will consist of two parts:
1. a bibliographical study of both PSI protocols and the cryptographic primitives needed to understand them;
2. an implementation of a scheme chosen by the intern in agreement with the supervisors.
The objective is thus to survey the state of the art of PSI methods and study the underlying algorithms (multi-party computation, oblivious transfer, etc.), then to pick one, study its security in depth, and implement it.
The candidate should have good skills in cryptography and cryptographic algorithmics, as well as a strong command of C and C++. They should also demonstrate good teamwork skills and a spirit of initiative, and will be expected to make proposals throughout the internship.
URL sujet détaillé : www.di.ens.fr/~prest/Stages/2018_stages_lch_PSI.pdf

Remarques : The internship will start around April 2018 and will run for a period of 6 months at the CRISTAL site of Thales Communications & Security in Gennevilliers, within the LCH cryptography laboratory.
We are looking for a student motivated by working in an R&D team within a large group, with good skills in mathematics complemented by programming skills.
As an indication, the gross monthly pay is about 1250 euros.



SM207-97 Transfer learning for social robots  
Admin  
Encadrant : Mathieu LEFORT
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/mathieu.lefort/
Ville : Lyon

Description  

Deep learning has produced major machine-learning achievements in recent years and yields state-of-the-art performance in multiple domains (object recognition, natural language processing, go, ...). However, these models need a lot of (possibly manually annotated) training examples. Such large databases are not yet available for social-robotics tasks. Moreover, autonomous robots will have to adapt to unknown situations, so representative databases cannot be constructed a priori and other solutions have to be found. One of the next breakthroughs in artificial intelligence will thus be the ability to incrementally construct knowledge, building on previously learned knowledge, and to generalize to similar situations.
This internship will study the problem of transfer learning for social robots. Previous work (based on deep learning architectures) proposed to transfer knowledge from one modality to another (e.g. visual to audio) by learning the function mapping one representation space to the other.
Another approach is zero-shot learning, which aims to associate labels with examples from unseen categories. These works are based on learning a common embedding space in which useful geometrical properties appear. The subject of this internship is to study these paradigms and see whether they are applicable to social robots. This transfer can be considered at 3 different scales:
- transferring the learning of one modality used to solve a task to the use of another modality for the same task
- transferring the knowledge used to solve one task to another (similar) task
- transferring the knowledge used to solve a task on one robot to other robots (with different sensors and motors)
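The common-embedding idea behind zero-shot learning can be sketched in a few lines (purely illustrative numbers and labels): an input is mapped into a shared space and assigned the nearest label embedding, even for a label never seen at training time.

```python
# Minimal zero-shot sketch (hypothetical): classify an example from an unseen
# class by nearest label embedding in a common space, via cosine similarity.
import math

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Pretend these label embeddings come from e.g. word vectors; "zebra" was
# never seen during training, yet it can still be predicted.
label_emb = {"horse": [0.9, 0.1, 0.0],
             "tiger": [0.1, 0.9, 0.3],
             "zebra": [0.7, 0.5, 0.1]}   # unseen class

def zero_shot_predict(x_emb):
    """Return the label whose embedding is most similar to the input embedding."""
    return max(label_emb, key=lambda l: cos(x_emb, label_emb[l]))

pred = zero_shot_predict([0.72, 0.48, 0.12])   # an image mapped into the space
```

In a real system, the mapping from raw sensor data into the embedding space is itself learned (e.g. by a deep network), which is where the robot-to-robot and modality-to-modality transfer questions arise.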
The intern will be part of the Behaviors.ai project between the LIRIS laboratory and Hoomano and will interact with the members of the project (people and robots).
URL sujet détaillé : http://behaviors.ai/wp-content/uploads/sites/5/2017/10/behaviors_transfer.pdf

Remarques : Co-advising: Amélie Cordier (CSO Hoomano)



SM207-98 Empathy Models in Human-Robot Interaction  
Admin  
Encadrant : Alexandre GALDEANO
Labo/Organisme : LIRIS et Hoomano
URL : behaviors.ai
Ville : Lyon

Description  

Nowadays, so-called "social robots" lack, despite their name, the social intelligence that is a prerequisite for relevant interactions with humans. However, empathy ("the ability to feel and understand others' emotional experiences") emulated on social robots improves the quality and relevance of their interactions with humans.
The goal of this internship is to help position new empathy models with respect to existing ones. To that end, the intern will re-implement some state-of-the-art empathy models and check the reproducibility of their results. Then, the intern will compare these models' ability to improve human-robot interactions, in order to establish a baseline against which future algorithms emulating empathy can be compared.
So, if you want to join us and make robots less dumb, this internship is for you!
URL sujet détaillé : http://behaviors.ai/wp-content/uploads/sites/5/2017/10/intership_empathy_offer_english.pdf

Remarques : Co-advising: Amélie Cordier (CSO Hoomano)



SM207-99 Inferential analysis of maliciously registered domains  
Admin  
Encadrant : Maciej KORCZYNSKI
Labo/Organisme : Grenoble Informatics Laboratory-Grenoble Alps University
URL : https://www.liglab.fr
Ville : Grenoble

Description  

Context:

Cybercriminals register new domain names every day to launch Internet-scale attacks, such as phishing or malware campaigns. For years, there has been anecdotal evidence indicating that miscreants choose to abuse top-level domains (TLDs), such as .com, .org, .top, that offer low domain name registration prices. However, no study has proven this hypothesis or systematically analyzed the preferences of cybercriminals. One miscreant may indeed prefer lower registration prices, but another may choose to abuse a registrar that offers specific payment methods (for example Bitcoin or Western Union). On the other hand, registrars or TLD operators might offer domains for free but, to prevent domain abuse, perform extensive checks to confirm the identity of registrants; as a result, they might not experience many malicious registrations. This project will help TLD registries determine a number of possible measures to preemptively mitigate abusive and malicious activities across TLDs.

Internship Program and Objectives:

In our previous study, we systematically analyzed how different structural and security-related properties of new generic TLD (gTLD) operators influence abuse counts [1]. Our inferential analysis revealed that abuse counts primarily correlate with the strictness of registration requirements: miscreants prefer to register domain names under TLDs that are generally open for public registration (example.com), rather than under TLDs whose registries may impose restrictions on who or which entities can register domains (example.bank).

The goal of the internship is to perform a fine-grained statistical analysis that will take into account the collected registration properties to determine driving factors of domain abuse. In this study, the intern will use domains that were maliciously registered (and blacklisted by reputable organizations such as Anti-Phishing Working Group) and will correlate the concentrations of abused domains with the registration policies (pricing, payment methods, registration strictness, etc.) at the time of domain creation. The student will systematically distil a set of registration features preferred by the attackers using generalized linear models (GLM).


References:
[1] M. Korczynski, M. Wullink, S. Tajalizadehkhoob, G.C.M. Moura, and C. Hesselman, "Statistical Analysis of DNS Abuse in gTLDs", Final Report ICANN DNS Abuse Study, August 2017, https://www.sidnlabs.nl/downloads/papers-reports/sadag-final-09aug17-en.pdf

Prerequisites:
- Good knowledge in statistics
- Good knowledge in scripting (e.g. Python)
URL sujet détaillé : http://mkorczynski.com/Malicious.pdf

Remarques : This internship will be performed in close collaboration with industry partners. It can be followed by a PhD thesis.



SM207-100 Similarity Searches and Efficient Indexes on Graphics Processing Units  
Admin  
Encadrant : Michael GOWANLOCK
Labo/Organisme : Northern Arizona University
URL : https://nau.edu/SICCS/
Ville : Flagstaff

Description  

This project will study efficient data retrieval on GPUs for data-intensive computing applications. Example applications include: clustering, similarity join algorithms, and trajectory similarity searches. These applications all require finding data that is relevant to the query object. Thus, efficient indexes are required for pruning the search space to find relevant data. However, many indexes yield irregular memory access patterns and are not cache-efficient. Therefore, we will develop indexing data structures that are tailored for the architecture of the GPU that aim to maximize parallel efficiency and reduce bottlenecks related to the data distribution. Furthermore, many of the key insights into performance trade-offs on the CPU do not apply to the GPU, thereby increasing the design space for efficient GPU-tailored indexes.
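A CPU-side sketch of one GPU-friendly index (an illustration, not a proposed design): points are binned into a uniform grid, and a range query only scans the 3x3 block of neighboring cells instead of the whole dataset; on a GPU, the per-cell point arrays would be stored contiguously to obtain coalesced memory accesses.

```python
# Uniform grid index sketch (illustrative): correct for queries with
# radius eps <= cell, since then all matches lie in the 3x3 neighborhood.
from collections import defaultdict

def build_grid(points, cell):
    """Bin 2D points into square cells of side `cell`."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // cell), int(p[1] // cell))].append(p)
    return grid

def range_query(grid, cell, q, eps):
    """All points within distance eps of q (requires eps <= cell)."""
    cx, cy = int(q[0] // cell), int(q[1] // cell)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for p in grid.get((cx + dx, cy + dy), []):
                if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= eps * eps:
                    out.append(p)
    return out
```

The trade-offs mentioned above show up even in this sketch: cell size controls the balance between pruning power and the irregularity of per-cell work, and skewed data distributions create exactly the load-imbalance bottlenecks the project targets.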

See my webpage for some of my papers:
http://jan.ucc.nau.edu/mg2745/
URL sujet détaillé :

Remarques : Will work with the candidate to determine visa requirements, housing, and other necessities.



SM207-101 Complexity notions in 2D SFTs  
Admin  
Encadrant : Pierre GUILLON
Labo/Organisme : funding opportunities, in particular at the math-CS interface, through the LabEx Archimède (bringing together the LIF and the I2M).
URL : http://pguillon.perso.math.cnrs.fr
Ville : Marseille

Description  

A two-dimensional subshift of finite type (SFT) is a set of colorings of ℤ² by a finite number of colors, defined by local constraints (forbidding certain patterns).
This is equivalent to the formalism of Wang tilings, and a particular case is given by the space-time diagrams of (1D) cellular automata.
As a model of massively parallel computation, it is tempting to develop a complexity theory for it.

A notion of complexity exists for the evolution of a cellular automaton with an origin, but notions more closely tied to the dynamics (without an origin) may be more natural to study: cylindricity, the speed of convergence as a sofic shift, the growth of the minimal automaton of the language, notions inspired by communication complexity or by entropy, quasi-periodicity functions...

The first part of this internship will be to get to grips with (most of) these notions, and to relate them by inequalities.

Then, we will try to understand whether these quantities are more constrained when the SFT exhibits determinism properties (essentially the case of cellular automata). One could start by studying classical examples, and perhaps eventually reach the self-similar constructions that can recognize any recursively enumerable language.
URL sujet détaillé : http://pguillon.perso.math.cnrs.fr/stage.pdf

Remarques : funding opportunities, in particular at the math-CS interface, through the LabEx Archimède (bringing together the LIF and the I2M).



SM207-102 Deep learning for interactive image segmentation  
Admin  
Encadrant : Santiago VELASCO-FORERO
Labo/Organisme : CMM / Ecole des Mines de Paris
URL : http://cmm.ensmp.fr/~velasco/
Ville : Fontainebleau / Paris

Description  

Segmentation is often an essential step in an image-analysis application. Deep learning methods have brought significant progress in this domain [1]. However, this kind of approach requires annotated databases, and obtaining such annotations in sufficient quantity is often difficult.

The purpose of the internship is the development of deep learning methods for interactive image segmentation. The goal is to give the end user tools for interacting simply with proposed segmentations, so as to improve the segmentation method iteratively. We will test methods based on generative adversarial networks [2] and on adversarial attacks [3]. The first applications will concern the analysis of skin images acquired with recent microscopy techniques, notably multiphoton microscopy.

The work will take place at the Centre de Morphologie Mathématique of MINES ParisTech (http://cmm.ensmp.fr), under the supervision of Santiago Velasco-Forero and Etienne Decencière. The intern will start with a bibliographical study of the subject, then propose solutions to the problem, which will be discussed with the supervisors. They will implement the chosen solution in Python, using open tools (keras, tensorflow, scikit-learn, etc.). The final report will serve as the basis for a scientific publication.

The selected candidate will be hired on a fixed-term contract and paid at the minimum wage.

Interested candidates should send their CV and a cover letter to:
Santiago VELASCO-FORERO (velasco.ensmp.fr)

[1] Ronneberger, Fischer, and Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015.
[2] Goodfellow et al. "Generative adversarial nets." Advances in Neural Information Processing Systems, 2014.
[3] Kurakin, Goodfellow, and Bengio. "Adversarial machine learning at scale." ICLR, 2017.
URL sujet détaillé :

Remarques : a. The internship will last between 4 and 6 months.
b. The selected candidate will be hired on a fixed-term contract and paid at the minimum wage (SMIC).
c. Possibility of a PhD after the internship.




SM207-103 Modelling the dynamics of transposable elements within a population of genomes, and its efficient implementation on parallel machines.  
Admin  
Encadrant : David DEFOUR
Labo/Organisme : Université de Perpignan
URL : http://lgdp.univ-perp.fr/
Ville : Perpignan

Description  

With the advent of population genomics, a tremendous period of fundamental research and of applications to societal problems (erosion of biodiversity, food security in the context of global warming, personalized medicine...) has begun. A fundamental consequence of these developments is that a person is now characterized not only by their genes, but also by the interactions with the rest of their DNA. These 'non-genes', the 'dark matter' of the genome, are mainly Transposable Elements (TEs).

The objective of this internship is to propose a data structure able to represent the TE content of an individual genome and of a whole population, in order to simulate efficiently the operations that can occur on such elements (transposition, deletion, inactivation, reactivation, competitive interactions).
The proposed structure will have to provide fast access to the information, be memory-efficient, and support large simulations as efficiently as possible.
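One candidate representation can be sketched as follows (entirely hypothetical names and structure, offered only as a starting point): TE copies are stored per genome in a position-indexed map, and the events listed above become cheap dictionary operations.

```python
# Hypothetical sketch: TE insertions of one genome stored as a map
# position -> (family, active), supporting the basic simulated events.
import random

class Genome:
    def __init__(self):
        self.tes = {}                                  # position -> (family, active)

    def transpose(self, pos, length, rng=random):
        """An active copy at pos inserts a new active copy at a random locus."""
        family, active = self.tes[pos]
        if active:                                     # only active copies jump
            self.tes[rng.randrange(length)] = (family, True)

    def delete(self, pos):
        self.tes.pop(pos, None)                        # excision of a copy

    def inactivate(self, pos):
        family, _ = self.tes[pos]
        self.tes[pos] = (family, False)                # e.g. silencing by the host

g = Genome()
g.tes[100] = ("LINE1", True)
g.transpose(100, length=10**6)      # may insert a second LINE1 copy
```

For a population-scale, parallel simulation, the per-genome maps would likely be replaced by flat sorted arrays (better memory locality and easy partitioning across threads), which is exactly the design question of the internship.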

URL sujet détaillé : http://perso.univ-perp.fr/david.defour/wp-content/uploads/Sujet-M2-2017.pdf

Remarques : Co-supervision: Sébastien Gourbière.
Pay: about €550/month.
Possibility of continuing with a PhD.



SM207-104 From arithmetic to Boolean circuit lower bounds  
Admin  
Encadrant : Pascal KOIRAN
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/pascal.koiran/
Ville : Lyon

Description  

The goal is to adapt lower bound methods for arithmetic circuits to Boolean circuits, and especially to circuits of small depth with modular gates.
More details in the pdf below.
URL sujet détaillé : http://perso.ens-lyon.fr/pascal.koiran/stageM2_1718.pdf

Remarques :



SM207-105 Formal Modelling and Analysis in the context of Anti-diabetic Drug Discovery  
Admin  
Encadrant : Cédric LHOUSSAINE
Labo/Organisme : CRIStAL
URL : http://www.cristal.univ-lille.fr/~lhoussai/
Ville : Lille

Description  

This internship is proposed as part of a new collaboration between the bioComputing team and biologists from the Arab American University of Jenin, Palestine. The global topic is the *design of drugs treating type 2 diabetes*, a disease due to resistance to insulin action whose worldwide prevalence is rising rapidly. The biologist partners of the project focus on drugs based on plant-extracted molecules. They currently work on the *anti-diabetic effects of basil* (/Ocimum basilicum - OB/), more precisely on the effects that OB has on *GLUT4 translocation*. Their experiments showed that OB extracts increased GLUT4 translocation up to 7 times, and identified a set of 53 chemical compounds present in OB. The OB anti-diabetic effects might be mediated in part through one or more of these identified compounds. Starting work in the bioComputing team (hopefully completed before the training period begins) will apply state-of-the-art methods (e.g. molecular docking and/or coexpression networks) to infer which proteins may be targeted by OB compounds, in order to isolate compounds potentially involved in insulin signaling. The core of the project will then investigate the remaining candidates further.

In this large and ambitious project, the bioComputing team is involved in two main tasks: *modelling* and *analysis*. The internship should focus on either modelling or analysis, according to the candidate's preferences. *The modelling task*, which requires a strong interest in biology, consists in elaborating a *formal computational model of insulin signaling*, including GLUT4 translocation. The bioComputing team uses languages based on *abstract biochemical reactions*, similar to Petri nets and the BioCHAM language, that let one choose the appropriate abstraction level according to biological knowledge, relevance of details and planned analyses. The model must be validated against biological data. Once completed, its characteristics will give hints on the kind of formal analyses that can be conducted. Hence the *analysis task*, for which a strong interest in the *formal semantics of programming languages* is required. One goal of the project is to determine which OB compounds (or sets of compounds) are more likely to increase GLUT4 abundance at the plasma membrane. More generally, we aim at designing a *new formal analysis*, operating on
1) a model (a set of reactions),
2) an output objective, represented by a particular reaction of the model for which one wants to observe an increase or decrease of activity,
3) a minimal set of reactions representing new behaviors that one may add to the model.
The analysis should determine which behaviors are more likely to reach the output objective. The team already develops methods based on *abstract interpretation* for models lacking kinetic information. The analysis will be applied in the project, initially using a very rough formal model of insulin signaling taken from the literature, with GLUT4 insertion in the membrane as the objective.
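The setting can be made concrete with a toy sketch (hypothetical species and reaction names, not the team's actual model or analysis): a model is a set of reactions, and ignoring kinetics one can over-approximate which species can ever be produced, hence whether the target reaction can fire.

```python
# Toy sketch of reachability on an abstract reaction model (hypothetical
# species names): a reaction fires once all of its reactants are reachable.

def reachable_species(reactions, initial):
    """Least fixpoint of the 'can be produced' over-approximation."""
    known = set(initial)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            if set(reactants) <= known and not set(products) <= known:
                known |= set(products)
                changed = True
    return known

model = [({"insulin", "IR"}, {"IR_active"}),
         ({"IR_active"}, {"AKT_active"}),
         ({"AKT_active", "GLUT4_vesicle"}, {"GLUT4_membrane"})]

out = reachable_species(model, {"insulin", "IR", "GLUT4_vesicle"})
# "GLUT4_membrane" is reachable in this abstraction iff the cascade can fire
```

The abstract-interpretation analyses mentioned above refine this kind of boolean over-approximation, e.g. to compare how adding candidate compound reactions changes the likelihood of reaching the output objective.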

References and more biological details are provided in the extended version.

URL sujet détaillé : http://www.cristal.univ-lille.fr//BioComputing/stages/diabete.html

Remarques : co-supervised by Mirabelle Nebut and Cristian Versari.
Monthly gratification of €554.40.
Possibility of continuing with a PhD.



SM207-106 Formal proof in network calculus  
Admin  
Encadrant : Pierre ROUX
Labo/Organisme : ONERA
URL : http://perso.ens-lyon.fr/pierre.roux/
Ville : Toulouse

Description  

Nowadays, aircraft rely on a large embedded network to interconnect the many sensors and actuators distributed throughout them. Since these networks serve a critical function, in particular for flight control, it is important to guarantee properties such as traversal delays or the absence of buffer overflows. Network calculus is a mathematical method for carrying out such proofs [2]. It played a key role in the certification of the AFDX network, derived from Ethernet, which is used on board the most recent aircraft (A380, A350).

Network calculus is based on mathematical results that are relatively simple, yet subtle enough that it is very easy to make mistakes or omissions in paper proofs. Proof assistants are a good tool for mechanically checking this kind of proof and obtaining a very high level of confidence in the results.

We therefore wish to formalize the fundamental properties underlying the theory of network calculus. These results involve relatively basic properties of the real numbers, such as upper bounds and even limits of piecewise-linear functions. We propose to use the Coq proof assistant together with the recent Coquelicot library [1], which extends its standard library of reals.

The long-term objective of this line of work is a network-calculus tool that accompanies its results with elements allowing a formal proof to be carried out, including on "industrial" configurations; during the internship, one may therefore also work on a prototype for simple case studies. This prototype could rely on the traces provided by the industrial tool RtaW Pegase.
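The fundamental bounds in question can be stated concretely (standard network-calculus results, see [2]): for a flow with token-bucket arrival curve alpha(t) = b + r*t crossing a server with rate-latency service curve beta(t) = R*(t - T)+ and r <= R, the worst-case delay is the horizontal deviation T + b/R, and the worst-case backlog is the vertical deviation b + r*T. A minimal numeric sketch:

```python
# Standard network-calculus bounds for a token-bucket flow (burst b, rate r)
# through a rate-latency server (rate R, latency T), assuming r <= R.

def delay_bound(b, r, R, T):
    """Worst-case delay: horizontal deviation h(alpha, beta) = T + b / R."""
    if r > R:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return T + b / R

def backlog_bound(b, r, R, T):
    """Worst-case backlog: vertical deviation v(alpha, beta) = b + r * T."""
    if r > R:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return b + r * T

# 4000-byte burst at 1 Mb/s through a 10 Mb/s server with 1 ms latency:
d = delay_bound(b=4000.0, r=1e6, R=1e7, T=0.001)   # 1 ms + 0.4 ms
```

These are exactly the piecewise-linear arguments (suprema and deviations of alpha and beta) that the internship would formalize in Coq with Coquelicot.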

References

[1] Sylvie Boldo, Catherine Lelay, and Guillaume Melquiond. Coquelicot: A User-Friendly Library of Real Analysis for Coq. Mathematics in Computer Science, 9(1):41-62, March 2015.
[2] Jean-Yves Le Boudec and Patrick Thiran. Network Calculus: A Theory of Deterministic Queuing Systems for the Internet, volume 2050. Springer Science & Business Media, 2001.

URL sujet détaillé : http://w3.onera.fr/stages/sites/w3.onera.fr.stages/files/dtis-2018-046.pdf

Remarques : Possibility of continuing with a PhD (funding provided by the ANR RT-proofs project).



SM207-107 Multiresolution Volumetric Editing  
Admin  
Encadrant : Lionel UNTEREINER
Labo/Organisme : ICube - Mimesis team (Inria/Université de Strasbourg)
URL : http://mimesis.inria.fr
Ville : Strasbourg

Description  

# Context
In computer graphics and animation, subdivision surfaces are widely used to create visually appealing 3D models [3]. When aiming for a plausible physical behavior of those models for 3D animation or games, physically-based simulation comes into play. Then, a volumetric mesh has to be generated from those models. In many cases, it takes several loops of design and simulation to adjust a 3D geometry so that it shows the intended behavior in the simulation. This work targets a tighter integration of 3D modeling and physically-based simulation in computer graphics.

# Goal
The goal of this internship is to set up a multiresolution volumetric editing tool. For this purpose, the main challenge is to enable volumetric editing primitives [2] within a multiresolution framework [1].

# Bibliography
[1] Lionel Untereiner, David Cazier, Dominique Bechmann , n-Dimensional multiresolution representation of subdivision meshes with arbitrary topology, In Graphical Models, 2013
[2] Christian Altenhofen, Felix Schuwirth, André Stork, Dieter Fellner, Volumetric subdivision for consistent implicit mesh generation, In Computers & Graphics, 2017
[3] Denis Zorin, Peter Schröder, and Wim Sweldens, Interactive multiresolution mesh editing. In Proceedings of the 24th annual conference on Computer graphics and interactive techniques (SIGGRAPH '97), 1997
URL sujet détaillé : http://mimesis.inria.fr/job-offers/master-internships/

Remarques : Co-advisor: David Cazier (ICube/Mimesis)



SM207-108 Interactive coupled lattices generation for real time physically-based mesh deformation  
Admin  
Encadrant : Lionel UNTEREINER
Labo/Organisme : ICube - équipe Mimesis (Inria/Université de Strasbourg)
URL : http://mimesis.inria.fr
Ville : Strasbourg

Description  

# Context
Computing the behavior of deformable structures is an intensively studied problem in the fields of computer graphics and computer animation. When it comes to real-time simulation, coarse finite element meshes are used as an embedding domain for the fine surface to be deformed. These coarse meshes obviously do not embed the surface's features correctly and thus do not lead to the intended behavior.

# Goal
The goal of this internship is to set up a new meshing method that generates a hexahedral finite element mesh suitable for real-time simulations. The mesh will evolve interactively following the user's inputs thanks to an innovative mechanical coupling system. The intern will evaluate their meshing algorithm in the context of elastic solid modeling using the SOFA framework.

# Bibliography
[1] Binh Huy Le and Zhigang Deng, Interactive cage generation for mesh deformation. In Proceedings of the 21st ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D '17), 2017
URL sujet détaillé : http://mimesis.inria.fr/job-offers/master-internships/

Remarques : Co-advisor: Stéphane Cotin (ICube/Mimesis)



SM207-109 Computing Tree-decompositions of Graphs  
Admin  
Encadrant : Nicolas NISSE
Labo/Organisme : Inria
URL : http://www-sop.inria.fr/members/Nicolas.Nisse/
Ville : Sophia Antipolis

Description  

See the link below:
http://www-sop.inria.fr/members/Nicolas.Nisse/StageTDGST2017.pdf
URL sujet détaillé : http://www-sop.inria.fr/members/Nicolas.Nisse/StageTDGST2017.pdf

Remarques : Co-advisor: Julien Bensmail
Julien.BENSMAIL.fr



SM207-110 Chirotopes of planar point sets  
Admin  
Encadrant : Xavier GOAOC
Labo/Organisme : LIGM
URL : monge.univ-mlv.fr/~goaoc/
Ville : Marne-la-Vallée

Description  

A chirotope is a combinatorial object associated with a sequence of points in the plane (like a permutation is naturally associated with a sequence of real numbers). In short, chirotopes encode the orientation (clockwise or counterclockwise) of every triangle. This limited information proves sufficient for solving several problems in computational geometry (computation of convex hulls, detection of intersections among segments, enumeration of triangulations, ...).

This internship proposes to study: (1) conditions under which a chirotope can be drawn in a constrained manner, say with its convex hull on a circle, and (2) algorithms to decide whether two point sets not only have the same chirotope, but can be extended into larger point sets with the same chirotope. The two questions are related, and lead to a more general investigation of pattern avoidance in chirotopes.
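For intuition, a chirotope can be computed with the standard orientation predicate; the following is a minimal sketch (the helper names are ours, not from the subject):

```python
def orient(p, q, r):
    """Sign of the signed area of triangle (p, q, r):
    +1 counterclockwise, -1 clockwise, 0 collinear."""
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)

def chirotope(points):
    """Map every index triple (i, j, k), i < j < k, to its orientation."""
    n = len(points)
    return {(i, j, k): orient(points[i], points[j], points[k])
            for i in range(n)
            for j in range(i + 1, n)
            for k in range(j + 1, n)}

# Two point sets with the same chirotope are combinatorially equivalent
# even though their coordinates differ, e.g. a square and a sheared copy.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
sheared = [(x + y, y) for x, y in square]
print(chirotope(square) == chirotope(sheared))  # True
```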
URL sujet détaillé : monge.univ-mlv.fr/~goaoc/internship-chirotopes18.pdf

Remarques : Internship co-advised by Alfredo Hubard (MdC at UPEM).

A "gratification de stage" is planned.



SM207-111 Combinatorial properties of random and pseudo-random graphs.  
Admin  
Encadrant : Andrei ROMASHCHENKO
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~romashchen/
Ville : Montpellier

Description  

The study of random graphs is a large and well-developed branch of graph theory. It deals with the typical properties that hold with high probability for a randomly chosen graph. For example, it is known that the vast majority of graphs of fixed degree satisfy the definition of an expander. Expander graphs possess several interesting properties: high vertex expansion, strong connectivity, fast mixing, etc. Graphs of this type have many important applications in computer science and in coding theory.

Thus, several practical applications need expander graphs. But although almost any randomly chosen graph is an expander, it is rather hard to produce such a graph explicitly, in a deterministic way. On the other hand, generating and storing a large truly random graph is an expensive procedure that consumes many resources.

We propose to study an intermediate approach that combines the usual deterministic and probabilistic paradigms: producing graphs using (classical) pseudo-random number generators. The idea is to compare the properties of pseudo-random graphs with the typical properties of truly random graphs. More technically, we propose to estimate theoretically some specific combinatorial properties of truly random graphs (expansion rate, spectral gap) and then compare them with the properties of pseudo-random graphs by numerical experiments. Both positive and negative results would be interesting: either we obtain efficient constructions that produce pseudo-random graphs with interesting combinatorial properties (useful for applications), or we discover new tests distinguishing truly random sequences from pseudo-random ones.
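A toy version of the proposed experiment might look as follows: sample a near-regular multigraph from a pseudo-random generator (here Python's Mersenne Twister stands in for the PRNG under study) and measure its edge expansion by brute force. Real experiments would use spectral methods and much larger graphs; this is only a sketch.

```python
import random
from itertools import combinations

def random_regular_multigraph(n, d, seed=0):
    """Union of d pseudo-random perfect matchings on n vertices (n even):
    a classical way to sample (multi)graphs that are expanders w.h.p."""
    rng = random.Random(seed)
    edges = []
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        edges += [(perm[2 * i], perm[2 * i + 1]) for i in range(n // 2)]
    return edges

def edge_expansion(n, edges):
    """Brute-force Cheeger constant h(G) = min over |S| <= n/2 of cut(S)/|S|."""
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            S = set(S)
            cut = sum((u in S) != (v in S) for u, v in edges)
            best = min(best, cut / len(S))
    return best

edges = random_regular_multigraph(10, 3, seed=42)
print(edge_expansion(10, edges))  # the sampled graph's edge expansion
```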

URL sujet détaillé : http://www.lirmm.fr/~romashchen/stages/prg-graphes.pdf

Remarques : Co-encadrant: Alexander SHEN (LIRMM)



SM207-112 The optimal secret key agreement and information inequalities.  
Admin  
Encadrant : Andrei ROMASHCHENKO
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~romashchen/
Ville : Montpellier

Description  

Suppose that each of two parties (two persons, two computers, any two electronic devices) called Alice and Bob holds some piece of information, A and B respectively, and these pieces of information are correlated with each other. Alice and Bob can interact via a public communication channel. The aim of Alice and Bob is to agree on a common secret key Z such that the eavesdropper (who does not know A and B but who can intercept the communication between Alice and Bob) gets no information on this key. The question is how large this common secret key can be (how much information it can contain). This problem has an elegant solution: it can be shown that, in some natural setting, the maximal size of the common secret key Z is equal to the value of the mutual information between A and B (defined in terms of Kolmogorov complexity).
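In the Shannon setting (an analogue of the Kolmogorov-complexity bound mentioned above), the mutual information can be computed directly from an empirical joint distribution; a minimal sketch:

```python
from math import log2
from collections import Counter

def mutual_information(pairs):
    """Shannon mutual information I(A;B) of the empirical joint
    distribution of a list of (a, b) samples, in bits."""
    n = len(pairs)
    pab = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum((c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
               for (a, b), c in pab.items())

# Alice and Bob hold perfectly correlated uniform bits: I(A;B) = 1 bit,
# so a one-bit common secret key is achievable in this toy setting.
print(mutual_information([(0, 0), (1, 1)] * 50))  # 1.0
```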

We propose to generalize this problem to the case of more than two parties. Although several partial results on this generalization are known, many natural questions remain open. We suggest focusing on two types of questions:

1. Quantitative questions: compute the maximal size of the common secret key given all the mutual information quantities of the input data.

2. Algorithmic questions: construct efficient (poly-time computable) communication protocols of the secret key agreement for some natural types of correlation between the initial pieces of information given to the parties involved in the protocol.

To attack the first type of question we propose to employ bounds based on information inequalities. For the second type we propose to use techniques from coding theory and various hashing methods.



Prerequisites: The candidate should have some basic knowledge of classical information theory. The knowledge of Kolmogorov complexity and communication complexity is a big plus.


URL sujet détaillé : http://www.lirmm.fr/~romashchen/stages/inf-ineq.pdf

Remarques :



SM207-113 Elliptic curve protocols for constrained devices  
Admin  
Encadrant : Thomas PLANTARD
Labo/Organisme : Institute of Cybersecurity and Cryptology, University of Wollongong
URL : https://eis.uow.edu.au/scit/institute-cybersecurity-cryptology/overview/index.html
Ville : Wollongong, Australia

Description  

The goal of this project is to modify the ECC protocol so that it adapts to a wide range of devices while guaranteeing the same level of security.
Devices will be categorized according to their constraints on memory, energy, and time.
The ECC protocol will be modified at each of its layers: the group layer, the elliptic curve layer, and the finite field layer.
The expected outcome of this project is an ECC prototype that automatically adapts its computation to the environmental constraints it encounters on the device on which it is used.
URL sujet détaillé : https://www.uow.edu.au/~thomaspl/internship.pdf

Remarques : Funding is 900 AUD per month.
Even though this project can be divided into separate subtasks, it is a group project and at least one other student will be involved.
There is the possibility of a PhD with scholarship as a continuation of this internship.



SM207-114 Multiresolution control variates for image rendering  
Admin  
Encadrant : Nicolas BONNEEL
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/~nbonneel/
Ville : Villeurbanne

Description  

Control variates are standard variance reduction techniques for Monte Carlo integration. They consist in finding a function whose integral is (approximately) known and that correlates with the function to be integrated, in order to improve the convergence speed.

For image rendering, raytracing uses Monte-Carlo methods to integrate light paths in order to estimate pixel values. These methods converge slowly, and produce noisy images when stopped prematurely.

The goal of the internship is to exploit a coarse resolution image estimate as a surrogate for the control variate, both theoretically and practically. We will extend this idea to the multi-resolution rendering problem, leading to a fully multi-level Monte-Carlo method for image rendering. The link between this approach and wavelets will also be investigated.
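The basic mechanism can be sketched on a 1-D toy integrand (not a rendering integral; `beta` is the control-variate coefficient, fixed to 1 here for simplicity):

```python
import math
import random

# Toy control variate: estimate I = integral of e^x over [0, 1],
# using g(x) = 1 + x, whose integral over [0, 1] is known to be 1.5.

def cv_estimate(f, g, g_int, beta, n, rng):
    """Control-variate estimator: integrate f - beta*g by Monte Carlo and
    add back beta times the known integral of g. Unbiased for any beta;
    variance drops when f and g are correlated."""
    xs = [rng.random() for _ in range(n)]
    return sum(f(x) - beta * g(x) for x in xs) / n + beta * g_int

rng = random.Random(0)
est = cv_estimate(math.exp, lambda x: 1.0 + x, 1.5, 1.0, 10_000, rng)
print(est)  # close to e - 1, with much lower variance than plain MC
```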

Related articles:
- "Monte Carlo theory, methods and examples" Chap. 8, Art B. Owen, 2013, http://statweb.stanford.edu/~owen/mc/Ch-var-basic.pdf
- "Image-space Control Variates for Rendering", Rousselle et al., 2016, https://cs.dartmouth.edu/~wjarosz/publications/rousselle16image.html
- "Hierarchical Monte Carlo Image Synthesis", Alexander Keller, 2001, https://pdfs.semanticscholar.org/d8d4/648da166a1933bc676ef2423203ed87d340b.pdf
URL sujet détaillé :

Remarques : - The internship can be funded.
- Co-advised between: Nicolas Bonneel, Julie Digne, Jean-Claude Iehl
- Desired skills: Knowledge in proba/statistics, programming (C++), basics of computer graphics




SM207-115 A categorical model for a quantum diagram description language  
Admin  
Encadrant : Simon PERDRIX
Labo/Organisme : LORIA
URL : https://members.loria.fr/SPerdrix/
Ville : Nancy

Description  

Context.
The ZX-calculus is a powerful graphical language for quantum reasoning: it captures fundamental quantum notions like entanglement, causality, and complementarity, and how they interact. One of the main challenges in the development of the ZX-calculus is to use it as an intermediate language between high-level languages, used to describe quantum algorithms, and various backends, depending on the chosen quantum technologies.
The ZX-calculus is equipped with a powerful equational theory for deciding whether two programs are equivalent, and for program optimisation.

Objectives.
To use ZX-diagrams as an intermediate language, the development of a new quantum diagram description language is necessary. The first objective is to extend the categorical model for quantum circuit description language to ZX-diagrams. Then, a particular attention will be paid to the categorical axiomatization of the quantum control of quantum diagrams.
URL sujet détaillé : https://members.loria.fr/SPerdrix/files/doc/sujet-ZX.pdf

Remarques :



SM207-116 Database Learning under Updates  
Admin  
Encadrant : Michael MATHIOUDAKIS
Labo/Organisme : LIRIS CNRS
URL : liris.cnrs.fr
Ville : Lyon

Description  

Database Learning is a new research area that addresses the task of continuously improving query performance via learning from past query answers. Essentially, the approach consists in learning a probabilistic model that predicts the answer of a query, given its similarity to previously observed queries. As a field, it lies at the intersection of Database Systems and Machine Learning, and builds on top of earlier work on Approximate Query Processing (AQP). Existing work addresses the Database Learning task in static data settings. In this internship, the goal is to develop algorithms for dynamic settings. In such settings, the underlying database undergoes updates that modify its content, thus raising the need for updating the learned model.
URL sujet détaillé : https://www.dropbox.com/s/sjtjk1yfsl6a5p3/Database%20Learning%20under%20Updates.pdf?dl=0

Remarques : Advisors:

Angela BONIFATI, CNRS LIRIS, Lyon, France, angela.bonifati-lyon1.fr
Hamamache KHEDDOUCI, CNRS LIRIS, Lyon, France, hamamache.kheddouci-lyon1.fr
Michael MATHIOUDAKIS, CNRS LIRIS, Lyon, France, michael.mathioudakis.cnrs.fr
Mohammad SADOGHI, University of California Davis, USA, msadoghi.edu




SM207-117 Quantitative Languages of Infinite Words  
Admin  
Encadrant : Olivier SERRE
Labo/Organisme : IRIF (CNRS & Univ. Paris 7)
URL : https://www.irif.fr/~serre/
Ville : PARIS

Description  

The goal of the research is to find a robust definition of quantitative languages of infinite words defined by means of finite-state weighted automata.

See extended abstract given by the link below.
URL sujet détaillé : https://www.irif.fr/~serre/Files/sujetM2Lyon.pdf

Remarques : Co-supervised with Nathanaël Fijalkow



SM207-118 Space of metrics on a combinatorial surface  
Admin  
Encadrant : Francis LAZARUS
Labo/Organisme : GIPSA-Lab, UMR CNRS 5216
URL : http://www.gipsa-lab.fr/~francis.lazarus/
Ville : Grenoble

Description  

Let G = (V, E) be a graph cellularly embedded on a surface S. Given a weight assignment E → R+, the length of a walk (closed or not) in G is the sum of the weights of its edges. A weight assignment thus defines a discrete metric, and the goal is to study the space of such metrics. There is a natural equivalence between metrics that divides this space into cells. Among the basic questions, one may ask:
- What is the shape/complexity of a cell in the partition?
- Is there a finite set of parameters (other than a weight function) to describe a cell in the partition?
- Can we detect equivalence efficiently?

URL sujet détaillé : http://www.gipsa-lab.fr/~francis.lazarus/Stages/metrics-M2-2018.pdf

Remarques : Co-supervised with Arnaud de Mesmay
http://www.gipsa-lab.fr/%7Earnaud.demesmay/

Funding possible; possibility of continuing with a PhD.



SM207-119 Discrete systolic inequalities  
Admin  
Encadrant : Arnaud DE MESMAY
Labo/Organisme : GIPSA-Lab, UMR CNRS 5216
URL : http://www.gipsa-lab.fr/~arnaud.demesmay/
Ville : Grenoble

Description  

The systole of a surface is the length of the shortest non-contractible cycle on this surface. The goal of this internship is to obtain improved inequalities controlling the systole in a discrete setting, as well as families of combinatorial surfaces getting as close as possible to this upper bound. In this discrete setting, the surface comes with a cellularly embedded graph, and the systole is the length of the shortest non-contractible walk in this graph. This length (also called the edge-width) appears naturally in topological graph theory as a quantitative measure of the (non-)planarity of the graph: the shorter the systole, the less planar the graph is.


URL sujet détaillé : http://www.gipsa-lab.fr/~francis.lazarus/Stages/systole-M2-2018.pdf

Remarques : Co-supervised with Francis Lazarus
http://www.gipsa-lab.fr/~francis.lazarus/

Funding possible; possibility of continuing with a PhD.



SM207-120 Quantitative verification of population protocols  
Admin  
Encadrant : Markey Nicolas
Labo/Organisme : IRISA
URL : http://www.irisa.fr/sumo/
Ville : Rennes

Description  

Parameterized verification is a powerful technique for reasoning about several instances of similar models, e.g. when the exact number of copies of some component can be arbitrary. Distributed algorithms are a motivating example for this setting, as their correctness should not depend on the exact number of agents.

Population protocols, introduced by Angluin et al. in 2004, are a classical model in this setting:
all agents have the same finite state space, and they change state when "meeting" other agents, following simple rewriting rules and obeying a randomized scheduler. Algorithms have been
proposed recently to check qualitative properties of population protocols, such as almost-surely reaching a consensus configuration, where all components end up in the same state.

The aim of this internship is to tackle quantitative questions in this setting. A first objective is to evaluate the performance of the distributed algorithms these protocols represent (e.g. evaluate the average time before a consensus is reached). A second objective is to use quantities to relax the rigid notions studied in qualitative analysis (e.g. approximate consensus, where most agents agree on a common value).
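As a toy quantitative experiment, one can simulate the simplest one-way "epidemic" protocol under a uniform random scheduler and measure the average number of meetings before all agents agree (a sketch; the rule and function names are ours, and this is far simpler than the protocols the internship targets):

```python
import random

def epidemic_time(n, rng):
    """Number of pairwise meetings under a uniform random scheduler until
    all n agents are informed, starting from one informed agent.
    Interaction rule: (informed, susceptible) -> (informed, informed)."""
    informed = [True] + [False] * (n - 1)
    count, steps = 1, 0
    while count < n:
        i, j = rng.sample(range(n), 2)  # the scheduler picks a random pair
        steps += 1
        if informed[i] != informed[j]:
            informed[i] = informed[j] = True
            count += 1
    return steps

rng = random.Random(1)
avg = sum(epidemic_time(50, rng) for _ in range(20)) / 20
print(avg)  # on the order of n log n meetings
```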

URL sujet détaillé : http://people.irisa.fr/Nicolas.Markey/tmp/param-quant-2017.pdf

Remarques : Internship co-supervised by Nathalie Bertrand. Funding possible.



SM207-121 On kernelization when parameterizing by distance to triviality  
Admin  
Encadrant : Marin BOUGERET
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~bougeret/
Ville : Montpellier

Description  

(The subject is written in English but both of us can work in French or English interchangeably)


There is a whole area of parameterized algorithms and kernelization investigating the complexity ecology, where the objective is to consider a structural parameter measuring how "complex" the input is, rather than the size of the solution. For instance, parameterizing a problem by the treewidth (or the treedepth) of its input graph has been a great success for the development of FPT algorithms. However, the situation is not as good for kernelization, as many problems do not admit polynomial kernels parameterized by treewidth (or treedepth) unless NP ⊆ coNP/poly.

Within structural parameters, of fundamental importance are parameters measuring the so-called "distance from triviality" of the input graphs, like the size of a vertex cover (distance to an independent set) or of a feedback vertex set (distance to a forest). Unlike treewidth, these parameters may lead to both positive and negative results for polynomial kernelization. An elegant way to generalize these parameters is to consider a parameter quantifying the triviality of the resulting instance, measured in terms of its treedepth.

More precisely, for a positive integer c, a c-treedepth modulator of a graph G is a set of vertices X such that the treedepth of G-X is at most c. Note that for c=1, a c-treedepth modulator corresponds to a vertex cover.
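The definition can be checked by brute force on tiny graphs; the following sketch uses the recursive characterization of treedepth (exponential time, for illustration only; the helper names are ours):

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack += [w for w in adj[u] if w in vertices and w not in comp]
        seen |= comp
        comps.append(comp)
    return comps

def treedepth(vertices, adj):
    """td(empty) = 0; td of a disconnected graph is the max over components;
    td of a connected graph is 1 + min over choices of root v of td(G - v)."""
    if not vertices:
        return 0
    comps = components(vertices, adj)
    if len(comps) > 1:
        return max(treedepth(c, adj) for c in comps)
    return 1 + min(treedepth(vertices - {v}, adj) for v in vertices)

def is_c_treedepth_modulator(X, vertices, adj, c):
    return treedepth(vertices - set(X), adj) <= c

# Path 0-1-2-3: {1} is a 2-treedepth modulator (G - {1} has treedepth 2),
# but not a 1-treedepth modulator, since for c = 1 a modulator must hit
# every edge, i.e. be a vertex cover.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_c_treedepth_modulator({1}, set(adj), adj, 2))  # True
```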

The starting point of the presented topic is [1], where the authors proved that a large class of problems admits a polynomial kernel on nowhere dense graphs when parameterized by the size of a c-treedepth modulator. This raised the natural question of determining for which problems this result could be extended to larger graph classes.

In [2] we started to investigate this question. It turned out that (parameterized by the size of a c-treedepth modulator):
- Vertex Cover admits a polynomial kernel on general graphs.
- Dominating Set does not admit a polynomial kernel unless NP ⊆ coNP/poly, even on degenerate graphs.

Very recently,

- It has been proved in [4] that the F-minor-free deletion problem (which generalizes Vertex Cover and Feedback Vertex Set) parameterized by the size of a c-treedepth modulator admits a polynomial kernel on general graphs (generalizing our result of [2]).
- New variants of distance to triviality have been considered for Vertex Cover (for example in [3]).

The objective of this internship is to continue the above work in the following directions:

(1) Understand which measures of triviality (close to a treedepth modulator) allow polynomial kernels for Vertex Cover. Indeed, as treedepth has many different characterizations, we think it should be possible to find other natural measures that are "close" to, and smaller than, treedepth.

(2) Close the gap between the size of our kernel in [2] and the lower bound provided in [3].

(3) See if the huge (new .. and complex!) machinery of [3] could be simplified when considering for example the Feedback Vertex Set problem (which would generalize [2] but be a particular case of [3]).




[1] Kernelization using structural parameters on sparse graph classes.
J. Gajarsky, P. Hlineny, J. Obdrek, S. Ordyniak, F. Reidl, P. Rossmanith, F. S. Villaamil, and S. Sikdar.
In Proc. of the 21st European Symposium on Algorithms (ESA), 2013.

[2] How much does a treedepth modulator help to obtain polynomial kernels beyond sparse graphs?
M. Bougeret, I. Sau.
In Proc. of the 12th International Symposium on Parameterized and Exact Computation (IPEC), 2017.
https://arxiv.org/abs/1609.08095.

[3] Smaller parameters for vertex cover kernelization.
E.-M. Hols and S. Kratsch.
In Proc. of the 12th International Symposium on Parameterized and Exact Computation (IPEC), 2017.

[4] Polynomial Kernels for Hitting Forbidden Minors under Structural Parameterizations.
B. M. P. Jansen, A. Pieterse.
URL sujet détaillé :

Remarques : Co-supervised by Ignasi Sau at LIRMM, Montpellier, http://www.lirmm.fr/~sau/ )



SM207-122 Formal proof and development of complex arithmetic algorithms  
Admin  
Encadrant : Jean-michel MULLER
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/jean-michel.muller/
Ville : Lyon

Description  

Many problems are naturally expressed in terms of complex numbers. Unfortunately, the complex arithmetic offered by some programming languages is frequently inaccurate, may induce "spurious" overflow/underflow (i.e., an intermediate calculation leads to an overflow whereas the final result would have been representable), or is too slow.
Efficient algorithms are possible, yet their proofs are rather involved: providing formal proofs would significantly improve the confidence in the claimed results.
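A tiny illustration of a spurious overflow and the classical scaling fix (a sketch only; the algorithms targeted by the internship are far more subtle):

```python
import math

def naive_abs(a, b):
    """Naive |a+bi|: a*a may overflow to inf even though the true
    modulus is representable."""
    return math.sqrt(a * a + b * b)

def scaled_abs(a, b):
    """Scale by the largest magnitude before squaring, as in hypot-style
    algorithms, so intermediates stay in range."""
    s = max(abs(a), abs(b))
    if s == 0.0:
        return 0.0
    return s * math.sqrt((a / s) ** 2 + (b / s) ** 2)

a = 1e200
print(naive_abs(a, a))   # inf: spurious overflow
print(scaled_abs(a, a))  # about 1.414e200, the representable true value
```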

We have worked for a few years on some aspects of the problem:

- rigorous analysis of classical algorithms for complex arithmetic (or of basic building blocks used to construct complex arithmetic algorithms, e.g., evaluation of sqrt(a^2+b^2) in floating-point arithmetic)
(examples are:
http://perso.ens-lyon.fr/jean-michel.muller/VersionAccepteeNumericalAlgorithmsFeb2016.pdf
http://perso.ens-lyon.fr/jean-michel.muller/mcom3123.pdf)

- proposition of more accurate algorithms (for instance complex square-roots: work in progress)

- design of a framework for the analysis and formal proof of algorithms using floating-point arithmetic
(examples are:
https://link.springer.com/article/10.1007%2Fs11786-014-0181-1
https://www.lri.fr/~melquion/doc/14-msc-expose.pdf)

The goal of the training period is to:
- provide a formal proof of an algorithm (for which we already have a "paper proof") that evaluates expressions of the form ax+by (clearly an important basic building block of complex arithmetic)
- analyse an algorithm due to Kahan for complex square roots (no published error bound exists), and possibly suggest variants

The work clearly can be followed by a PhD.

The co-advisors will be:

- for the Formal proof/Coq part: Guillaume Melquiond, INRIA researcher, LRI (Orsay)
- for the Computer arithmetic part: Jean-Michel Muller, CNRS researcher, LIP (Lyon)

https://www.lri.fr/~melquion/
http://perso.ens-lyon.fr/jean-michel.muller/

URL sujet détaillé :

Remarques : co-advising: Guillaume Melquiond (LRI, Orsay), and Jean-Michel Muller (Lyon).




SM207-123 Model counting and symmetries  
Admin  
Encadrant : Pierre MARQUIS
Labo/Organisme : CRIL et CRIstAL
URL : http://www.cril.univ-artois.fr/~marquis/
Ville : Lens et Lille

Description  

Model counting and symmetries

The problem of counting the models of a propositional formula plays an important role both in theory (as a complete problem for the class #P, in light of Toda's theorem) and in practice, since probability computation in the Bayesian setting reduces to it.

To improve the computational efficiency of model counters, various preprocessing techniques have been proposed so far. The proposed internship aims to study to what extent exploiting symmetries in formulas can provide a useful preprocessing step for counting the models of certain families of instances, as it already does when the target problem is deciding their consistency (the famous SAT problem).

One possible approach is to augment the instance with additional constraints that keep only one representative of each equivalence class, while associating with that representative a weight equal to the cardinality of its class.

The subject has a theoretical side (permutation groups, Lyndon words, bounds on the possible encodings of the constraints to add) and a practical side (upstream use of the graph-automorphism detection programs Nauty and Saucy to determine symmetries, downstream use of the state-of-the-art model counters cachet, c2d, and d4, and experiments on various families of instances).

The internship will be supervised by Florent Capelli (CRIStAL, Lille) and by Jean-Marie Lagniez and Pierre Marquis (CRIL, Lens).
It may be continued with a PhD.
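The representative-per-orbit idea described above can be illustrated on a toy formula by pure brute force (far simpler than the cited tools; the function names are ours):

```python
from itertools import product

# F = (x0 or x1) and x2 is invariant under swapping x0 and x1.

def f(x):
    return (x[0] or x[1]) and x[2]

def count_models(f, n):
    """Plain brute-force model counting over all 2^n assignments."""
    return sum(1 for x in product([0, 1], repeat=n) if f(x))

def count_models_with_symmetry(f, n):
    """Keep one lex-smallest representative per orbit of the swap (x0 x1)
    and weight it by its orbit size, as in the preprocessing above."""
    total = 0
    for x in product([0, 1], repeat=n):
        sym = (x[1], x[0]) + x[2:]          # image under the symmetry
        if x <= sym and f(x):               # lex-leader representative
            total += 1 if sym == x else 2   # orbit size 1 or 2
    return total

print(count_models(f, 3))                # 3
print(count_models_with_symmetry(f, 3))  # 3: same count, fewer models kept
```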

Some references

Fadi A. Aloul, Karem A. Sakallah, Igor L. Markov:
Efficient Symmetry Breaking for Boolean Satisfiability. IEEE Trans. Computers 55(5): 549-558 (2006)

Adnan Darwiche:
Decomposable negation normal form. J. ACM 48(4): 608-647 (2001)

Jean-Marie Lagniez, Pierre Marquis:
An Improved Decision-DNNF Compiler. IJCAI 2017: 667-673 (2017)

Tian Sang, Paul Beame, Henry A. Kautz:
Performing Bayesian Inference by Weighted Model Counting. AAAI 2005: 475-482 (2005)

http://www3.cs.stonybrook.edu/~algorith/implement/nauty/implement.shtml

http://vlsicad.eecs.umich.edu/BK/SAUCY/
URL sujet détaillé :

Remarques : The internship will be supervised by Florent Capelli (CRIStAL, Lille) and by Jean-Marie Lagniez and Pierre Marquis (CRIL, Lens). It comes with a stipend.



SM207-124 Homomorphisms of planar graphs  
Admin  
Encadrant : Pascal OCHEM
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~ochem/
Ville : Montpellier

Description  

Graph homomorphism is a nice generalization of coloring. A famous result of Hell and Nesetril is that deciding whether a graph G has a homomorphism to a fixed graph H is polynomial if H is bipartite and NP-complete otherwise. If the input graph G is required to be planar, this defines the problem "planar H-coloring". This problem is known to be polynomial if H is K_1, K_2, K_4, or the Clebsch graph. It is known to be NP-complete if H is an odd cycle, the dodecahedron, the icosahedron, a graph with girth 5 and maximum degree 3, a circular q-clique with q=4n/(2n-1), ...

The goal of the internship is to determine the complexity of planar H-coloring for as many cores H as possible.
This involves learning a lot about homomorphisms and about reduction techniques for NP-completeness.
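For intuition, H-coloring asks for a vertex map sending every edge of G to an edge of H; a brute-force checker (exponential time, illustration only):

```python
from itertools import product

def has_homomorphism(G_edges, nG, H_edges, nH):
    """Brute-force test for a homomorphism G -> H over all nH^nG maps."""
    H = {frozenset(e) for e in H_edges}
    for phi in product(range(nH), repeat=nG):
        # Every edge of G must land on an edge of H (loops are excluded
        # automatically, since a singleton frozenset is never in H).
        if all(frozenset((phi[u], phi[v])) in H for u, v in G_edges):
            return True
    return False

# H = K_3: a homomorphism to K_3 is exactly a proper 3-coloring.
K3 = [(0, 1), (1, 2), (0, 2)]
C5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(has_homomorphism(C5, 5, K3, 3))        # True: C5 is 3-colorable
print(has_homomorphism(C5, 5, [(0, 1)], 2))  # False: an odd cycle is not bipartite
```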
URL sujet détaillé : http://www.lirmm.fr/~ochem/hom.pdf

Remarques :



SM207-125 Most Critical Elements for Multiobjective Optimization Problems  
Admin  
Encadrant : Cristina BAZGAN
Labo/Organisme : LAMSADE, Université Paris-Dauphine
URL : http://www.lamsade.dauphine.fr/~bazgan/
Ville : Paris

Description  

The goal of this internship is to study the most critical elements of a system for multiobjective combinatorial optimization problems. Such elements are items, nodes or links of a network whose suppression or damage causes the largest degradation of the performance of the system. For the first time in the literature, we consider the case where performance is measured through multiple criteria. This involves a large number of practical contexts where the most sensitive parts of a system must be protected. To address these issues, we aim at studying the complexity, designing, developing, and testing algorithms for the multiobjective version of classical optimization problems: shortest path, minimum spanning tree, a restricted version of knapsack.
URL sujet détaillé : http://www.lamsade.dauphine.fr/~bazgan/Stages/stageMostVitalMulti.pdf

Remarques : Co-advisors: Sonia Toubaline, Daniel Vanderpooten



SM207-126 Optimization in quantum information theory  
Admin  
Encadrant : Frédéric DUPUIS
Labo/Organisme : LORIA
URL : http://members.loria.fr/FDupuis/
Ville : Nancy

Description  

Several problems in information theory, such as channel capacity, can be expressed as geometric programs, a particular type of optimization problem for which efficient algorithms exist. However, when looking at the quantum versions of these problems, we naturally arrive at what might be called "matrix geometric programs": geometric programs in which scalars have been replaced by matrices. The goal of this internship is to investigate basic properties of this type of optimization problem and their applications to quantum information theory.
URL sujet détaillé : http://members.loria.fr/FDupuis/projets/m2-geometric-programs.pdf

Remarques :



SM207-127 Native floats in Coq  
Admin  
Encadrant : Erik MARTIN-DOREL
Labo/Organisme : IRIT / Université Paul Sabatier
URL : https://www.irit.fr/~Erik.Martin-Dorel/
Ville : Toulouse

Description  

The overall context of this internship deals with interactive theorem proving and computer arithmetic.

The objective of the internship is to develop a computationally-efficient theory of floating-point arithmetic for the Coq formal proof assistant.

One such theory could then serve as a building block to develop certified and efficient numerical algorithms in Coq.

(cf. detailed subject)
URL sujet détaillé : https://www.irit.fr/~Erik.Martin-Dorel/sujets/native_floats_in_Coq.pdf

Remarques : Co-advisor: Pierre Roux (ONERA)



SM207-128 Data Anonymization and Security  
Admin  
Encadrant : Cristina BAZGAN
Labo/Organisme : LAMSADE, Université Paris-Dauphine
URL : http://www.lamsade.dauphine.fr/~bazgan/
Ville : Paris

Description  

There are multiple methods of data anonymization. K-anonymity is a commonly used method that defines a degree of anonymity "k": a data sample is considered anonymous if each record is indistinguishable from at least k other records. In the context of degree-based anonymization, a graph is said to be k-anonymous, for an integer k, if for every vertex in the graph there are at least k-1 other vertices with the same degree. The computational complexity of making a given undirected graph k-anonymous, either through at most s vertex deletions or through at most s edge deletions, has been studied.
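Checking the degree-based condition itself is straightforward; a minimal sketch:

```python
from collections import Counter

def is_k_anonymous(degrees, k):
    """Degree-based k-anonymity: every degree value occurring in the
    graph must be shared by at least k vertices."""
    return all(c >= k for c in Counter(degrees).values())

# Star K_{1,3}: degrees (3, 1, 1, 1) -- 1-anonymous but not 2-anonymous,
# since only one vertex has degree 3. A 4-cycle (all degrees 2) is
# 4-anonymous.
print(is_k_anonymous([3, 1, 1, 1], 2))  # False
print(is_k_anonymous([2, 2, 2, 2], 4))  # True
```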
The goal of the internship is to study new models for degree-based anonymization.
URL sujet détaillé : http://www.lamsade.dauphine.fr/~bazgan/Stages/stageAnonymization.pdf

Remarques :



SM207-129 Complexity of optimal list coding  
Admin  
Encadrant : Omar FAWZI
Labo/Organisme : LIP
URL : www.omarfawzi.info
Ville : Lyon

Description  

Information theory studies the fundamental limit of basic information processing tasks such as compression or channel coding over various kinds of channels and constraints. These problems can all be seen as an optimization over encoding and decoding functions with specific properties.

In traditional information theory, one assumes some structure on the resource, typically, a memoryless channel or iid source, and tries to determine an entropic quantity that determines the optimal asymptotic rate.

The objective of this internship would be to look at the optimization problem from an algorithmic and complexity theoretic point of view. How hard is this optimization problem to solve, to approximate? More precisely, the objective is to study the list coding problem over a noisy channel.

References:
- https://arxiv.org/abs/1508.04095
- https://arxiv.org/pdf/1611.06924

Prerequisites: Probability, Information theory, Optimization (linear and semidefinite programming, approximation algorithms).



URL sujet détaillé :

Remarques :



SM207-130 Graphes plongés sur les surfaces  
Admin  
Encadrant : Luca CASTELLI ALEARDI
Labo/Organisme : LIX (Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~amturing/
Ville : Palaiseau

Description  

The main goal of this internship is to study, from both the combinatorial and algorithmic points of view, a class of structural properties characterizing the notion of graph planarity.
We will consider the characterization known as "Schnyder woods" (and the closely related canonical orderings), which can be expressed in terms of edge orientations and colorings: they lead to a number of applications in several domains, such as graph drawing and encoding, random and exhaustive generation, compact representations, plane spanners, ...
In particular we deal with toroidal graphs, which are graphs embedded on the torus. For these graphs, two recent generalizations of Schnyder woods have been proposed, leading to new efficient algorithms for drawing and encoding graphs on surfaces.
One of the goals of the internship will be to extend to the toroidal case the algorithms for efficiently generating and embedding triangulated graphs.

URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages.html

Remarques :



SM207-131 Verification of Distributed Algorithms  
Admin  
Encadrant : Corentin TRAVERS
Labo/Organisme : LaBRI
URL : www.labri.fr
Ville : Bordeaux

Description  

Context:
Distributed applications represent a significant part of our everyday life. Typically, our personal data are stored on remote distributed servers, data management relies on remote applications reachable
via smartphones, data-intensive computations are performed on computer clusters, etc. Since distributed
applications are deployed at increasingly large scale, they have to be reliable and robust, satisfying stringent
correctness criteria.

On the one hand, distributed algorithms are most often tailored for a specific system model. For example, communication
can be synchronous or asynchronous, or a certain number of errors and failures of certain type can happen
during executions. These assumptions are crucial and may lead to different families of algorithms designed for
a specific context. While there exists a large collection of sophisticated techniques for algorithm design and
analysis, most of the existing distributed algorithms, together with their correctness proofs, remain tailored
to one very specific framework and are surprisingly difficult to generalize to different, but apparently similar
frameworks.

On the other hand, formal methods aim at automatically asserting correctness of
distributed algorithms. Indeed, when the algorithm under consideration is designed for a fixed finite number of
processes using only finite-domain variables, software tools such as SPIN [2] are able to show correctness
properties automatically. In general, however, distributed algorithms are designed for a non-fixed number
of processes and may use variables whose domain are not bounded.
Parameterized verification amounts to establishing correctness of an algorithm no matter how many processes actually interact. A widely used proof technique in parameterized verification is the cut-off principle [1], which builds on the following deduction rule: if the algorithm is correct when at most N processes intervene, then it is correct for any number of processes. The correctness of the algorithm then boils down to verifying a finite-state system consisting of N processes.

Goals:
The main goal of this internship is to develop new verification techniques for distributed algorithms.
The work will focus on algorithms for agreement tasks, such as consensus. A good starting point might be trying to remove some of (the many) restrictions in [3] and/or to extend the cutoff bounds to more severe failure models, such as Byzantine failures.

References
[1] E. A. Emerson and K. S. Namjoshi. On reasoning about rings. Int. J. Found. Comput. Sci., 14(4):527-550, 2003.
[2] Gerard J. Holzmann. Software model checking with SPIN. Advances in Computers, 65:78-109, 2005.
[3] O. Marić, C. Sprenger, and D. Basin. Cutoff Bounds for Consensus Algorithms. Computer Aided Verification (CAV'17). Lecture Notes in Computer Science, vol. 10427. Springer, 2017.
URL sujet détaillé : http://www.labri.fr/perso/travers/M2/da.pdf

Remarques : Co-advisors: Anca Muscholl, Igor Walukiewicz

The internship is funded within the framework of the ANR project FREDDA



SM207-132 A Multi Debugger for Hop.js  
Admin  
Encadrant : Alan SCHMITT
Labo/Organisme : Celtique, Inria
URL : http://people.rennes.inria.fr/Alan.Schmitt/
Ville : Rennes

Description  

It is often said that the "S" in "IOT" ("Internet of Things") stands for security. Indeed, security is usually a secondary concern if considered at all. The ANR project CISC aims at certifying IOT applications by leveraging compilation techniques and a formalization of the protocols used. This internship takes place in that setting.

The compilation target for the protocols considered in the CISC project is an extension of the multitier language Hop.js [3, 4]. Hop.js is an extension of JavaScript [2] enabling the development of distributed applications using a single program that contains the code for every node, such as a client and a server. The purpose of this internship is to provide a tool to explore the behavior of Hop.js programs. More precisely, we propose to develop a multi debugger for Hop.js. A double debugger is a step-by-step interpreter that allows one to inspect not only the state of the interpreted program but also the state of the interpreter; a multi debugger is a double debugger for distributed programs, where several interpreters interact. The starting point of the internship is a double debugger for JavaScript [1].

The work program of this internship is as follows.
- Define a semantics for the distributed part of Hop.js.
- Develop an OCaml interpreter for Hop.js.
- Extend the notion of double debuggers to multi debuggers for distributed programs.
- Provide a multi debugger for Hop.js.

References

[1] Arthur Charguéraud, Alan Schmitt, and Thomas Wood. JSExplain: an interactive debugger for the JavaScript specification. 2017. url: http://jsexplain.org/.
[2] ECMA. ECMAScript 2017 Language Specification (ECMA-262, 8th edition). June 2017. url: https://www.ecma-international.org/ecma- 262/8.0/index.html.
[3] Manuel Serrano, Erick Gallesio, and Florian Loitsch. "Hop: a language for programming the web 2.0". In: Proceedings of the First Dynamic Languages Symposium. Portland, OR, USA: ACM, Oct. 2006, pp. 975-985.
[4] Manuel Serrano and Vincent Prunet. "A Glimpse of Hopjs". In: International Conference on Functional Programming (ICFP). ACM. 2016.
URL sujet détaillé : http://people.rennes.inria.fr/Alan.Schmitt/internships/multihopjs2018.pdf

Remarques : The student will be remunerated. There is also an opportunity to continue this work as a PhD student in the setting of the CISC project.



SM207-133 Distributed Delaunay Triangulation  
Admin  
Encadrant : Pooran MEMARI
Labo/Organisme : LIX
URL : https://www.lix.polytechnique.fr/stream/
Ville : Palaiseau

Description  

Motivated by the needs of a scalable out-of-core surface reconstruction algorithm, this project deals with the computation of distributed Delaunay triangulations of massive point sets. The project will be based on a recent algorithm which takes as input a point cloud, already partitioned into so-called tiles, and computes a set of triangulations, that provably provides local views of the Delaunay triangulation of all input points, whose direct construction would in practice be very costly in terms of time and memory requirements.

Extending the classical divide & conquer approaches, the proposed tile & merge approach features an efficient distribution of the input, of the computation, as well as of the output.
This efficiency is due to a decentralized model to represent, manage and locally construct the triangulation corresponding to each tile. By identifying the minimal set of vertices that need to be exchanged between neighboring tiles, our method is guaranteed to reconstruct, within each tile triangulation, the star of its local points, as though it were computed within the Delaunay triangulation of all points. The global consistency between tile triangulations is ensured through a peer-to-peer merging strategy based on a merge graph, computed up-front, between tiles that need to agree on shared simplices.

The goal of the internship would be to analyze and improve the proposed method from different aspects, namely distributed computation and memory efficiency.


URL sujet détaillé : http://www.lix.polytechnique.fr/~memari/Stage-DDT.pdf

Remarques : Co-advisor: Mathieu Brédif, Université Paris Est, LaSTIG MATIS, IGN ENSG



SM207-134 Model Checking for open concurrent systems.  
Admin  
Encadrant : Eric MADELAINE
Labo/Organisme : Inria Sophia-Antipolis Mediterannee
URL : http://www-sop.inria.fr/members/Eric.Madelaine/
Ville : Sophia-Antipolis

Description  

Context: We have developed a new semantic framework for the analysis of composition operators in concurrent systems. Open pNets (parameterized networks of synchronized automata) are new semantic objects used to define in a symbolic way the semantics of composition operators, but also of various types of "open" programs, like parallel and distributed programming skeletons, or generic distributed algorithms. We define the operational semantics of open pNets using "open transitions" and "open automata" that include symbolic hypotheses on the behavior of the pNets' "holes". Data in pNets and in open transitions is handled symbolically, using logical predicates, and reasoning on pNet behaviours is done using an SMT solver dealing with these predicates.
One important tool for the analysis of concurrent systems is Model Checking (MC), that is, checking whether some property expressed as a temporal-logic formula is valid on the states of a (finite) model of the system's behaviour. Building a model checker for pNets (more precisely, for open automata) offers the challenging prospect of a finite algorithm for checking properties of systems involving unbounded data. Typical MC algorithms traverse the model (together with the formula), matching each action in the model transitions with "action predicates" in the formula. For open transitions, matching could be (naively) replaced by checking the satisfiability of a formula, using
a Satisfiability Modulo Theories (SMT) solver. The main difficulties with this approach are that a positive reply from an SMT solver gives one instantiation of the predicate variables validating the formula, not all possible instantiations, and that each instantiation would correspond to a specific unfolding of the automaton for the next step.
Subject: after familiarization with the pNet framework and with classical MC algorithms, the objective of the work is to answer some of the following questions:
- is it possible to define a (bounded) MC algorithm using satisfiability on individual open transitions, through a finite "unfolding" of the open automaton based on the results provided by the SMT solver?
- is it possible, based on the predicates in the transitions, to construct an abstract unfolding (that may be finite in some cases)?
- is it possible to delay the satisfiability checking, simply assembling the predicates along the way, and submit it only at the end? (hint: for this we may need to define some restrictions on the structure of the pNets to get termination) ?
- is it possible to achieve one of the above compositionally, that is, by working with sub-components of an open pNet instead of using the composed open automaton?
and (eventually) to define / implement a prototype.

URL sujet détaillé : http://www-sop.inria.fr/members/Eric.Madelaine/stages2017/MC-pNets.pdf

Remarques : Co-advisor:
Simon Bliudze, INRIA Lille Nord - Europe
http://www.bliudze.me/simon/
Email: simon.bliudze.fr




SM207-135 From an OCaml interpreter to a pretty-big-step inductive semantics in Coq  
Admin  
Encadrant : Alan SCHMITT
Labo/Organisme : Celtique, Inria
URL : http://people.rennes.inria.fr/Alan.Schmitt/
Ville : Rennes

Description  

The JSExplain tool [2] is based on an interpreter written in monadic style in a tiny subset of OCaml. The goal of this internship is to design a systematic method to transform an interpreter written in the strict style of JSExplain into a set of inductive rules in pretty-big-step (PBS) style [1] to obtain a formal semantics in Coq.

More precisely, the student will rely on the front-end of our compiler from the subset of OCaml to other languages as a first step. For each construction of the subset of OCaml, they will design a transformation in PBS style, relying on monadic constructions to decide how to delegate the evaluation of subterms. If necessary, annotations to the source code of the interpreter will be added, in the form of ppx extensions.

References
[1] Arthur Charguéraud. "Pretty-big-step semantics". In: Proceedings of the 22nd European Symposium on Programming (ESOP 2013). Springer, 2013, pp. 41-60.
[2] Arthur Charguéraud, Alan Schmitt, and Thomas Wood. JSExplain: an interactive debugger for the JavaScript specification. 2017. url: https://github.com/jscert/jsexplain.
URL sujet détaillé : http://people.rennes.inria.fr/Alan.Schmitt/internships/ocamltopbs2018.pdf

Remarques : The student will be remunerated. There is also an opportunity to continue this work as a PhD student.



SM207-136 Schéma d'étiquetage de graphe pour le pré-calcul des k-plus courts chemins  
Admin  
Encadrant : David COUDERT
Labo/Organisme : Inria Sophia Antipolis
URL : http://www-sop.inria.fr/members/David.Coudert/
Ville : Sophia Antipolis

Description  

This internship concerns the design of precomputation methods to speed up shortest-path queries in graphs. We will focus in particular on graph labeling schemes and on the computation of the k shortest paths.
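As a point of comparison for the precomputation methods to be studied, k-shortest-path queries can be answered without any precomputation by a lazy Dijkstra-style search. A minimal Python baseline (our own sketch, not part of the subject; it returns the lengths of the k shortest walks, allowing repeated vertices):

```python
import heapq

def k_shortest_path_lengths(graph, source, target, k):
    """Lengths of the k shortest source-target walks in a digraph with
    positive edge weights. Lazy Dijkstra variant: each vertex may be
    settled up to k times; the target's settling distances are the answers."""
    heap = [(0, source)]          # (distance, vertex) priority queue
    settled = {}                  # how many times each vertex was popped
    lengths = []
    while heap and len(lengths) < k:
        dist, u = heapq.heappop(heap)
        settled[u] = settled.get(u, 0) + 1
        if settled[u] > k:
            continue
        if u == target:
            lengths.append(dist)
        for v, w in graph.get(u, []):
            heapq.heappush(heap, (dist + w, v))
    return lengths

g = {'s': [('a', 1), ('b', 2)], 'a': [('t', 1)], 'b': [('t', 1)], 't': []}
print(k_shortest_path_lengths(g, 's', 't', 2))  # [2, 3]
```

Hub-labeling-style precomputation aims to answer such queries far faster than this from-scratch search.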


URL sujet détaillé : https://team.inria.fr/coati/files/2017/11/2018-hub-labeling-k-chemins.pdf

Remarques : Co-advisor: Nicolas Nisse





SM207-137 Verification of Blocking Analyses for Multicore Real-Time Systems  
Admin  
Encadrant : Björn BRANDENBURG
Labo/Organisme : Max Planck Institute for Software Systems (MPI-SWS)
URL : http://prosa.mpi-sws.org
Ville : Kaiserslautern

Description  

Project Context

Real-time systems are computing systems that must react to external stimuli within stringent response-time bounds and can be found at the heart of many safety-critical systems (e.g., think airbags, anti-lock brakes, collision avoidance systems, etc.). To ensure the safety of such systems, schedulability analysis (i.e., real-time scheduling theory) is used to assess and validate their timing correctness a priori (i.e., before deployment, it is checked that all deadlines will always be met). However, such schedulability analyses can be quite complicated and rely on non-obvious reasoning, so how do we even know that the analyses used to validate the safety of real-time systems are themselves correct?

The PROSA project is a systematic and large-scale effort to create a trustworthy foundation for provably sound schedulability analyses, using the Coq proof assistant. A key characteristic of PROSA is that we prioritize readability over all other concerns to ensure that specifications remain accessible to readers without a background in formal proofs.

Objective of the Project

Virtually all multicore real-time systems require some form of synchronization primitive to ensure mutually exclusive access to shared resources (e.g., data structures in shared memory, I/O ports, etc.). The simplest and most widely used synchronization primitive for multicore systems is a spin lock, where tasks that wait to gain access to locked resources busy-wait by executing a delay loop that wastes cycles (i.e., something like while (!try_to_lock()) { /* do nothing */ }). Of course, such spinning causes extra delays that can be dangerous in real-time systems. A subfield of real-time scheduling theory called blocking analysis has developed methods for statically bounding the worst-case delays due to lock contention. The objective of this project is the first mechanization and verification of such a blocking analysis for spin locks.
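The busy-wait pattern sketched above can be mimicked in any language with a non-blocking lock-acquisition primitive. A minimal Python illustration (purely for intuition; the project itself concerns Coq mechanization, and real spin locks live in C on bare metal):

```python
import threading

lock = threading.Lock()
counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        # busy-wait ("spin") until the lock is free, mirroring
        # while (!try_to_lock()) { /* do nothing */ }
        while not lock.acquire(blocking=False):
            pass
        counter += 1        # critical section
        lock.release()

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000
```

A blocking analysis bounds, a priori, how long the inner `while` loop can spin in the worst case given the contention of the other tasks.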

Prerequisites

- Some experience with Coq or other proof assistants (basic knowledge is sufficient). Prior knowledge of real-time systems and real-time scheduling theory is not required.
- Proficiency in English. (Proficiency in German not required; the institute language is English.)
- A taste for formal reasoning: if you like playing with Coq and want to apply it to real problems, this project is for you!

References

F. Cerqueira, F. Stutz, and B. Brandenburg, "PROSA: A Case for Readable Mechanized Schedulability Analysis", Proc. 28th Euromicro Conference on Real-Time Systems (ECRTS 2016), pp. 273-284, July 2016.

A. Wieder and B. Brandenburg, "On Spin Locks in AUTOSAR: Blocking Analysis of FIFO, Unordered, and Priority-Ordered Spin Locks", Proc. 34th IEEE Real-Time Systems Symposium (RTSS 2013), pp. 45-56, December 2013.


URL sujet détaillé :

Remarques : Internships at MPI-SWS are paid. MPI-SWS provides free housing for interns.





SM207-138 Research and development of a gene prediction algorithm  
Admin  
Encadrant : Bianca HABERMANN, Edlira NANO
Labo/Organisme : Bioinformatics Core Facility, IBDM / TAGC, Marseille
URL : http://www.ibdm.univ-mrs.fr/fr/equipe/biologie-computationelle/
Ville : Marseille

Description  

In computational biology gene prediction refers to the process of identifying the regions of DNA that encode genes. Evidence-based gene prediction relies on already available transcriptome data from the organism itself or closely related organisms.

The standard algorithms used in ab initio gene prediction make use of several paradigms, such as machine learning methods (neural networks), probabilistic Hidden Markov Models (HMMs) or, more generally, context-free language parsing.
When transcriptome data is available, standard local alignment techniques are used to map the known transcripts to the genome. Some software tools combine ab initio and transcriptome-based techniques.

We have recently sequenced the genome of a salamander, the Mexican Axolotl (Ambystoma mexicanum). The Axolotl has the capacity to regenerate entire body parts, such as its limbs or its tail; it is therefore a popular model organism for research in regenerative medicine. The Axolotl genome is greatly expanded: it is 10 times larger than the human genome. Its large introns make the correct prediction of genes difficult with standard gene prediction software.

For this M2 internship, the aim is to develop an algorithm for evidence-based gene prediction in large genomes such as that of the Axolotl. Several incremental steps can be achieved depending on time and progress. No biological background is needed.
Step 1: Study the standard algorithmic paradigms used for gene prediction (ab initio and transcriptome-based). Test some standard software, as well as software combining ab initio and transcriptome-based techniques, to see why and how they fail.
Step 2: Identify the causes of failure of these standard gene prediction algorithms on genomes such as that of the Axolotl, which contain grossly extended intron sizes and large repetitive elements.
Step 3: Identify and/or implement potential solutions for improved transcriptome-based gene prediction in such cases.
URL sujet détaillé : http://eda.mutu.net/files/GenePredictionAxolotl.pdf

Remarques : During this internship you will be part of a highly interdisciplinary laboratory. You will be supervised by B. Habermann, the group leader, who has special expertise in bioinformatics, and by E. Nano, the computer-science specialist of the group. No biological background is needed for this internship.



SM207-139 Graph Algorithms Techniques for low resolution model of large protein assemblies  
Admin  
Encadrant : Dorian MAZAURIC
Labo/Organisme : Inria Sophia Antipolis - Méditerranée
URL : https://team.inria.fr/abs
Ville : Sophia Antipolis

Description  

Graph Problems.
Let H be a hypergraph with V(H) its set of vertices and E(H) its set of hyperedges. The Minimum Connectivity Inference (MCI) problem consists in finding a smallest set of edges E satisfying the following constraint: the set of nodes of every hyperedge of H must induce a connected subgraph in G = (V(H), E). This problem is motivated by a problem in bioinformatics; see Context below.
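Checking whether a candidate edge set is feasible follows directly from the definition. A minimal Python sketch (our own illustration, not part of the subject) that verifies the connectivity constraint for every hyperedge:

```python
def induces_connected(edges, hyperedge):
    """Do `edges` connect all vertices of `hyperedge` within the
    subgraph induced by the hyperedge? (simple DFS check)"""
    vs = set(hyperedge)
    adj = {v: set() for v in vs}
    for a, b in edges:
        if a in vs and b in vs:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = set(), [next(iter(vs))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] - seen)
    return seen == vs

def is_mci_feasible(edges, hyperedges):
    """An edge set is MCI-feasible iff every hyperedge induces a
    connected subgraph of G = (V(H), E)."""
    return all(induces_connected(edges, h) for h in hyperedges)

hyperedges = [{1, 2, 3}, {3, 4}]
print(is_mci_feasible([(1, 2), (2, 3), (3, 4)], hyperedges))  # True
print(is_mci_feasible([(1, 2), (3, 4)], hyperedges))          # False
```

MCI then asks for a feasible edge set of minimum cardinality; the variants studied in the internship add further constraints (bounded degree, diameter, ...).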

Research programme.
The aim of this internship is to develop algorithms for some generalized versions of MCI handling combinatorial constraints reflecting biophysical properties (bounded maximum degree, constraints on the diameter, ...). For each variant of the problem, the first aim is to determine the complexity of the problem (polynomial-time solvable, NP-hard).
In the (very likely) case where the problem is NP-hard, the goal is then to develop efficient approximation algorithms or to prove that the problem is hard to approximate (APX-hard). We also plan to develop parameterized algorithms and/or moderately exponential algorithms. The same study for different instance classes is also planned, in order to obtain faster and/or more accurate algorithms for specific problems (e.g. by integrating biophysical assumptions).

Context.
A macromolecular assembly is composed of subunits (e.g. proteins). We assume that the composition, in terms of individual subunits, of selected complexes of the assembly is known. Indeed, a given assembly can be chemically split into complexes by manipulating chemical conditions. A node represents a subunit and there is an edge between two nodes if the two corresponding subunits are in contact in the assembly. The hypergraph represents the different selected complexes. The MCI problem consists in finding a smallest set of contacts satisfying the connectivity constraints on the complexes.

Background. Theoretical computer science and/or bioinformatics and/or applied mathematics.

Misc. Ideally, the MSc will be followed-up by a PhD thesis. Based on the results for
inference problems, we will then develop novel methods to design atomic resolution models of molecular assemblies.
URL sujet détaillé : http://www-sop.inria.fr/teams/abs/positions/master18-theoretical.pdf

Remarques : Co-supervised with Frédéric Havet (I3S (CNRS, Université Côte d'Azur), Inria).



SM207-140 Formal Methods for the Development of Distributed Algorithms  
Admin  
Encadrant : Arnaud SANGNIER
Labo/Organisme : IRIF, Univ Paris Diderot
URL : https://www.irif.fr/~sangnier/main-sangnier.html
Ville : Paris

Description  

The development of correct distributed algorithms is challenging, for different reasons. First, in the presence of failures in asynchronous systems, many problems (e.g., consensus) cannot be solved at all. Furthermore, proving a given distributed algorithm to be correct is an extremely difficult task due to the combinatorial explosion of its state space.

In the research project FREDDA (FoRmal methods for the Development of Distributed Algorithms), we develop formal methods to support the design of distributed algorithms. In particular, these methods shall allow us to prove, automatically, that a given algorithm behaves correctly, or to improve its efficiency or robustness. This project lies at the frontier of two research domains with different communities, namely formal verification and distributed algorithms.

In this internship, we will set the first formal basis for models and analysis techniques that facilitate the development of distributed algorithms. For a start, the focus will be on existing distributed algorithms for renaming problems.
URL sujet détaillé : http://www.lsv.fr/~bollig/internship.pdf

Remarques : Co-supervised by Benedikt Bollig (CNRS, LSV, ENS Paris-Saclay).



SM207-141 Multiplayer games over graphs with imperfect monitoring  
Admin  
Encadrant : Patricia BOUYER-DECITRE
Labo/Organisme : LSV (CNRS, ENS Paris-Saclay)
URL : http://www.lsv.fr/~bouyer/
Ville : Cachan

Description  

The goal of the internship is to investigate multiplayer games on graphs, under a partial information hypothesis. Solution concepts that will be considered are Nash equilibria and alike.

Two research objectives can be followed:
* either assume public signal monitoring and investigate involved solution concepts, like subgame-perfect equilibria;
* or investigate assumptions on the communication graph between processes that allow one to obtain decidability results for the existence of Nash equilibria.
In both cases, the construction of an abstract epistemic game can be investigated, which would convert the search for complex multiplayer solution concepts into the search for winning strategies in a standard two-player setting.
URL sujet détaillé : http://www.lsv.fr/~bouyer/files/sujet1-2018.pdf

Remarques : Internship allowance if the student is not already salaried



SM207-142 Monades pour les fonctions de coût  
Admin  
Encadrant : Thomas Colcombet
Labo/Organisme : Institut de Recherche en Informatique Fondamentale
URL : https://www.irif.fr/~colcombe/
Ville : Paris

Description  

Regular cost functions offer a theory for handling quantitative questions which extends that of regular languages, and in particular offers several decision properties. The results developed in this framework play a key role in some problems related to language theory (star height problem, separation, Mostowski hierarchy, boundedness of fixpoints of mu-calculus formulae) or verification (quantitative version of the Church synthesis problem, resolution of games with promises). See [Colcombet's habilitation, 2013] (in French) for a broad presentation.

Independently, Bojanczyk introduced a categorical approach for uniformly handling the numerous forms of algebras used in language theory, starting with monoids, algebras over trees, and algebras over infinite trees [Recognisable languages over monads, 2015].
So far, Bojanczyk's approach falls short of properly capturing regular cost functions. One reason for this is that it is developed in the category of Sets.
The subject is to provide the proper generalization of Bojanczyk's work in order to capture these quantitative forms of languages. Attacking this problem will involve acquiring some knowledge of non-standard analysis.

URL sujet détaillé :

Remarques :



SM207-143 Modélisation d'un phénomène de tri cellulaire à l'aide d'un automate cellulaire stochastique  
Admin  
Encadrant : Nazim FATÈS
Labo/Organisme : Inria / Loria
URL : https://members.loria.fr/nazim.fates/
Ville : Nancy

Description  

During the formation of embryos of various biological organisms, a surprising cell-sorting phenomenon is observed: if cells are mixed, they spontaneously tend to regroup with cells of the same type, without the intervention of any external agent. A simple hypothesis has been proposed to explain this sorting phenomenon: cells of the same type have a certain affinity for one another; a cell will tend to remain stable if it is surrounded by similar cells, and will tend to move if the cells around it are of a different type.

The objective of the internship is to model this phenomenon with stochastic (or probabilistic) cellular automata. We therefore work in a setting where time, space and cell states are discrete. A simple model has already been proposed by A. Voss-Boehme and A. Deutsch of the University of Dresden in Germany (1). This model, largely inspired by models used in statistical physics, describes the dynamics of the cells by simply considering the probabilities that two neighboring cells on a grid exchange positions. An analytical study predicts the existence of critical phenomena: for certain values of the exchange probabilities, qualitative changes of behavior are observed, for instance between a situation where the cells are sorted and a situation where the cells form small groups, or even "checkerboards".

We will study these phenomena using numerical simulations. The task will be to find a suitable way to simulate these systems and to work around the difficulty posed by the need for long computations. The first goal is to verify the analytical studies by measuring the boundaries of the critical phenomena through simulation. Indeed, the existing results rely on strong hypotheses, and it is quite likely that simulating the system will reveal some deviation from these predictions. Furthermore, time permitting, we will explore other models of pattern formation (structures organized under a given spatial constraint).

(1) The cellular basis of cell sorting kinetics, Anja Voß-Böhme and Andreas Deutsch, Journal of Theoretical Biology, 263, pp. 419-436, DOI: 10.1016/j.jtbi.2009.12.011
URL sujet détaillé : https://members.loria.fr/nazim.fates/doc/stage-tri-cellulaire.pdf

Remarques : If possible, the experiments will be coded in Java using FiatLux: http://fiatlux.loria.fr/



SM207-144 Large Document Text Summarization with Adversarial Reinforce  
Admin  
Encadrant : Christophe CERISARA
Labo/Organisme : LORIA UMR 7503
URL : http://www.loria.fr
Ville : Nancy

Description  

Text summarization is an important task in Natural Language Processing (NLP). Two main approaches exist to solve it: extractive and abstractive methods. The internship will focus on abstractive methods, which are considered more difficult because they generate summaries more flexibly, with words that may not be contained in the original text, in the same way humans do. Recently, with the surge of neural networks, seq2seq models have become the de facto approach for this task. We propose to train such abstractive summarization models within the Generative Adversarial Net (GAN) framework, which may help to relax some common assumptions in the domain. Furthermore, a GAN may be used to estimate the reward function of a deep reinforcement learning (RL) algorithm. In this context, the first objective of the internship is thus to evaluate and adapt two known methods to improve the initialization of seq2seq models and enable their training in the joint GAN/RL framework for abstractive summarization.
URL sujet détaillé : http://synalp.loria.fr/sujetENS.pdf

Remarques : - Internship co-supervision: Hoa T. Le (thien-hoa.le.fr), PhD student at LORIA.
- Statutory internship stipend: about €550/month




SM207-145 Focussed proof systems and the polynomial hierarchy  
Admin  
Encadrant : Anupam DAS
Labo/Organisme : DIKU, University of Copenhagen
URL : http://www.diku.dk/english/
Ville : Copenhague

Description  

How can we use proof theory to talk about the complexity of a logic? Traditionally, proof systems are seen as 'nondeterministic' algorithms for a logic: bounds on proof size yield nondeterministic time bounds for the logic. Later developments in structural proof theory made it possible to obtain deterministic time bounds, space bounds and even co-nondeterministic bounds. Remarkably, 'optimal' proof systems exist for almost all major logics (classical, intuitionistic, modal, linear...); phrased otherwise, it is now clear that proof search provides good algorithms for deciding whether formulae of a logic are valid or not.

Since the 1990s, structural proof theory has undergone a further refinement in the guise of 'focussing'. The insight, due initially to Andreoli, is that the rules of logic can be partitioned into those that, during bottom-up proof search, conduct routine simplifications of a formula ('invertible') and those that constitute genuine choices to be made by the prover ('noninvertible'). This exposes a deep symmetry in not only the rules of logic, but the connectives themselves (conjunction, disjunction, modalities, etc.). The fundamental result for focussed systems is that stages of invertible or noninvertible rules can be organised in proof search in a rather rigorous manner, hence reducing the apparent nondeterminism of proof search.

This project aims to link these new developments in structural proof theory with the traditional desire to obtain optimal bounds for proof search. The main goal is to relate the alternation between invertible and noninvertible stages in a proof to the alternation of nondeterministic and co-nondeterministic computation in proof search. In practice, this means identifying fragments of PSPACE-complete logics (intuitionistic, modal, linear without exponentials) corresponding to the levels of the polynomial hierarchy. These would constitute entirely new theoretical results in themselves, thanks to the technology of focussing, and are closely related to deep open problems in structural proof theory that do not mention complexity at all!

The interested student might want to read section 2 of the following preprint for an introduction to proof-search-as-computation, from a complexity theoretic point of view:

http://www.anupamdas.com/focussed-systems-complexity-bounds.pdf

(Chapter 1 also gives an idea of the methodology and some of the major references for the area.)
URL sujet détaillé : http://www.anupamdas.com/

Remarques : The student can be given some financial support for traveling and moving to Copenhagen.



SM207-146 Monotone complexity  
Admin  
Encadrant : Anupam DAS
Labo/Organisme : DIKU, University of Copenhagen
URL : http://www.diku.dk/english/
Ville : Copenhague

Description  

Monotone functions in computer science are those that preserve the usual product order on finite Boolean strings; sorting and clique detection in graphs are famous examples of such functions. However, given a program, say as a Turing machine or lambda term, can we decide whether it computes a monotone function? For simple classes one can readily show that this is possible only at the lower levels of the Chomsky hierarchy, but the problem remains open for some interesting fragments of the context-free languages.
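Concretely, once a function on {0,1}^n is given extensionally, semantic monotonicity is decidable by brute force (a toy sketch with our own helper names; the interesting question above concerns programs, where the input space is infinite):

```python
from itertools import product

def is_monotone(f, n):
    """Check that f: {0,1}^n -> {0,1} preserves the product order,
    i.e. x <= y pointwise implies f(x) <= f(y)."""
    pts = list(product((0, 1), repeat=n))
    return all(f(x) <= f(y)
               for x in pts for y in pts
               if all(a <= b for a, b in zip(x, y)))

# Majority (a threshold function, like clique detection) is monotone;
# parity is not.
maj = lambda x: int(sum(x) >= 2)
par = lambda x: sum(x) % 2
```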

As well as functions that are 'semantically' monotone, one may consider programs that are 'syntactically' monotone, e.g. ¬-free Boolean circuits, which are complete for the monotone Boolean functions. Here detecting monotonicity is immediate, but at a potential cost in the complexity of the program: a famous theorem of Razborov implies that there are monotone functions whose only ¬-free circuits are exponentially larger than some that use ¬. Nonetheless, such programs can be useful for reasoning in general about monotone computation.

This internship aims to tackle both the subject of deciding monotonicity of a program, in the semantic sense, and also generating classes of syntactically monotone programs complete for certain complexity classes, e.g. the (monotone) polynomial-time functions.
The former will take us into the depths of the theory of languages and automata, using classical results to classify the hardness of deciding monotonicity. The latter will investigate 'function algebras', like that of the primitive recursive functions, in particular developing characterisations of monotone classes by way of simple structural constraints on recursive programs.

The interested student might want to read section 1 of the following proposal on this topic for some more context and some references:

http://www.anupamdas.com/milc-proposal.pdf

This project was recently funded by the European Research Commission.
URL sujet détaillé : http://www.anupamdas.com/

Remarques : The student can be given some financial support for traveling and moving to Copenhagen.



SM207-147 Transmission centrality of temporal networks  
Admin  
Encadrant : Marton KARSAI
Labo/Organisme : ENS Lyon/INRIA, IXXI Rhone-Alpes Complex System Institute, SciencePo Paris
URL : http://www.ixxi.fr
Ville : Lyon

Description  

The importance of nodes and links in a complex network is commonly characterised by centrality measures. Their definitions can rely on local or global structural information, or can be inferred from the outcome of some globally diffusing process. A precise notion of centrality is important, for example, to identify the nodes or links that transmit information in a network, which can provide ways to hinder or postpone ongoing dynamical phenomena such as the spreading of epidemics or the diffusion of information. While such measures have various definitions on static network structures, defining them on temporal networks, where the diffusion of information is limited by the time-varying interactions of nodes, is an ongoing challenge.

In this internship we aim to extend to temporal networks the definition of a new centrality measure called transmission centrality. This measure is based on stochastic diffusion processes and captures the importance of links or nodes by estimating the average number of nodes that receive information through them during a globally spreading diffusion process. Relying on an algorithmic solution already introduced for static networks, we will aim to extend it to temporal structures in order to approximate transmission centrality in large networks at low computational cost. Finally, we will apply transmission centrality to empirical temporal networks to distinguish between intra- and inter-community links, which play markedly different roles in temporal social structures.
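As a rough illustration of the intended quantity (our own Monte Carlo sketch, not the algorithm referred to above; the parameter names and the exact crediting rule are assumptions), one can estimate a link's transmission centrality by simulating an SI process over a time-ordered contact list and crediting each link with the nodes whose infection path traverses it:

```python
import random

def transmission_centrality(contacts, nodes, runs=200, beta=0.5, rng=None):
    """Monte Carlo estimate of link transmission centrality on a temporal
    network: the mean number of nodes whose infection path crosses a link,
    under an SI process run over the time-ordered contact list.
    contacts: list of (t, u, v) triples, assumed sorted by time t."""
    rng = rng or random.Random(0)          # fixed seed: reproducible demo
    nodes = list(nodes)
    credit = {}
    for _ in range(runs):
        seed = rng.choice(nodes)
        parent = {seed: None}              # infection tree: node -> infector
        for t, u, v in contacts:
            for a, b in ((u, v), (v, u)):  # a contact may transmit either way
                if a in parent and b not in parent and rng.random() < beta:
                    parent[b] = a
        for node in parent:                # credit every link on the path
            x = node                       # from this node back to the seed
            while parent[x] is not None:
                link = frozenset((x, parent[x]))
                credit[link] = credit.get(link, 0) + 1
                x = parent[x]
    return {link: c / runs for link, c in credit.items()}
```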
URL sujet détaillé :

Remarques : The internship will employ a single candidate, an M2 student from Computer Science. The candidate should have programming skills and an advanced understanding of graph theory, network science, statistical data analysis and algorithm design. The internship will be carried out at ENS Lyon-LIP in the DANTE Inria team and will take place at the IXXI Rhone-Alpes Complex System Institute under the supervision of Márton Karsai and Eric Fleury.



SM207-148 Impact of university admission on student's egocentric network  
Admin  
Encadrant : Marton KARSAI
Labo/Organisme : ENS Lyon/INRIA, IXXI Rhone-Alpes Complex System Institute, SciencePo Paris
URL : http://www.ixxi.fr
Ville : Lyon

Description  

The larger scope of the internship is in the domain of dynamical social networks. Our aim is to understand how the egocentric network of an individual changes as a function of time and of a varying environment. Special focus will be put on students, to see how their interactions, the maintenance of their social ties, and the structure of their egocentric network vary when entering or graduating from a university. This study will be based on the analysis of a large mobile phone call dataset recording the time-varying interactions and approximate location of university students, together with their estimated socioeconomic status. Our question is how social ties are built or broken when an ego enters a new social environment (school or work), and whether socioeconomic status is a determinant in building homophilic social ties in such situations. We aim to empirically address a theoretical proposition suggesting that similarity of people in terms of socioeconomic status is highly determinant in building homophilic social relationships with others from their own socioeconomic group, even in a neutral, socially equalising environment such as a university.

The dataset used in this internship is fully anonymised and provided by a company (Grandata). Access to this dataset will be conditional on the signature of a non-disclosure agreement by the intern and the company.
URL sujet détaillé :

Remarques : The internship will employ a single candidate, an M2 student from Computer Science, preferably following the Modelling Complex Systems orientation. The candidate should have programming skills and an advanced understanding of network science, statistical data analysis and social modelling. The internship will be carried out at ENS Lyon-LIP in the DANTE Inria team and will take place at the IXXI Rhone-Alpes Complex System Institute under the supervision of Márton Karsai and Eric Fleury.



SM207-149 Calculs locaux pour données distribuées  
Admin  
Encadrant : Xavier URBAIN
Labo/Organisme : LIRIS
URL : https://liris.cnrs.fr/~xurbain/
Ville : Villeurbanne

Description  

Keywords: distributed algorithms, automated deduction, databases.


Context and scientific objectives

No/NewSQL data management systems have now become the standard for managing data in the Big Data setting. The data volumes involved in these systems require the use of several machines, if only for the raw storage of the data.

When using a data management system, one may wish to benefit from a certain number of properties. A first property is data consistency, namely that reads reflect the writes that precede them. A second property is availability, that is, the system answers the requests addressed to it in finite time. A third is that, if one uses a distributed data management system, it be partition tolerant, that is, resilient to the loss of messages between the nodes of the network. A well-known theorem [2] establishes, however, that it is impossible to obtain these three properties at the same time. One is therefore interested in trade-offs, in general obtained by weakening the notion of consistency.

One characteristic of the No/NewSQL ecosystem is the diversity of the available solutions and their ad hoc nature, corresponding to various compromises in terms of consistency [1]. Because of this diversity, it is worthwhile to automate the approaches used to formalise these systems and to prove that they behave well.

We are interested in applying a powerful theoretical and practical tool to the study of distributed data stores: local computations, which enjoy good properties for the understanding, study and analysis of distributed systems. They are intuitive enough to allow the development of programs, and completely formal, thus enabling a certification process. They can be represented by networks of processors that only have access to local resources, namely the states of their neighbours and of the channels leading to them. The behaviour of an algorithm is thus determined by the initial state of the network, its local and global properties, and finally the communication primitives. Local computations are in fact systems of local interactions. Among these models, the labelled graph rewriting systems of Litovsky, Métivier and Mosbah [4, 3] benefit from the high level of abstraction brought by rewriting and have led to important results (for example, a fully asynchronous leader election algorithm requiring only messages of polynomial size). These are the systems we propose to use in this project.
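As a toy example of a graph relabelling system in this spirit (our own, far simpler than the cited systems): a single local rule that retires a leaf labelled N when its unique live neighbour is still N elects exactly one node on any tree, using only the states of a node and its neighbours:

```python
def elect(adj):
    """Leader election on a tree by local relabelling.
    adj: dict node -> set of neighbours, assumed to describe a tree.
    Rule: a leaf labelled 'N' whose unique live neighbour is labelled 'N'
    relabels itself 'F' (finished); the last surviving 'N' becomes 'E'."""
    label = {v: 'N' for v in adj}
    active = {v: set(ns) for v, ns in adj.items()}   # live incident edges
    changed = True
    while changed:
        changed = False
        for v in adj:
            if label[v] == 'N' and len(active[v]) == 1:
                (u,) = active[v]
                if label[u] == 'N':      # local precondition
                    label[v] = 'F'       # local relabelling
                    active[u].discard(v)
                    changed = True
    for v in adj:
        if label[v] == 'N':
            label[v] = 'E'               # elected
    return label
```

Each firing removes one node from the live tree, and a tree with at least two live nodes always has a leaf to prune, so exactly one node ends up labelled E.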


Work

This internship concerns the expression and analysis, in terms of rewriting (of graphs, then of terms), of the notions relevant to distributed data.
For automation purposes, part of these translations will be implemented in a software toolbox for rewriting systems.
The resulting systems will be studied in order to develop automatic proof techniques for properties such as termination.
This work can be continued as a PhD, one possible direction being the formal verification of these properties, and hence the formal validation of eventual-consistency solutions.


Skills

- Automated deduction and foundations of programming,
- Functional programming (OCaml is a plus).


References

[1] Philip A. Bernstein and Sudipto Das. Rethinking eventual consistency. In Kenneth A. Ross, Divesh Srivastava, and Dimitris
Papadias, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013,
New York, NY, USA, June 22-27, 2013, pages 923-928. ACM, 2013.
[2] Seth Gilbert and Nancy A. Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web
services. SIGACT News, 33(2):51-59, 2002.
[3] Igor Litovsky and Yves Métivier. Computing Trees with Graph Rewriting Systems with Priorities. Tree Automata and
Languages, pages 115-139, 1992.
[4] Igor Litovsky, Yves Métivier, and Éric Sopena. Graph Relabelling Systems and Distributed Algorithms. In Hartmut Ehrig,
Hans-Jörg Kreowski, Ugo Montanari, and Grzegorz Rozenberg, editors, Handbook of Graph Grammars and Computing
by Graph Transformation, volume 3, pages 1-56. World Scientific, 1999.
URL sujet détaillé : https://liris.cnrs.fr/~xurbain/17-18_bd.pdf

Remarques : Co-supervision with Emmanuel Coquery, LIRIS



SM207-150 Towards a Time Oriented Programming  
Admin  
Encadrant : Luigi LIQUORI
Labo/Organisme : Inria Sophia Antipolis Mediterranee
URL : http://www-sop.inria.fr/members/Luigi.Liquori
Ville : Sophia Antipolis

Description  

In his seminal paper, "Time, Clocks, and the Ordering of Events in a Distributed System",

http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf

Leslie Lamport introduced the concept of "one event happening before another" and formalized it as a partial order over events. He introduced the timestamp ("Lamport's logical clock") method of totally ordering the events of a concurrent program. He also presented a distributed algorithm for synchronizing a system of logical clocks used to totally order the events. Later, in the 1980s, other proposals arose (vector clocks, by Fidge and Mattern; interval tree clocks, by Almeida et al.) to detect causality violations.
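Lamport's clock rules fit in a few lines (a minimal sketch; the class and method names are ours):

```python
class Process:
    """Minimal Lamport logical clock, following the 1978 rules:
    tick on each local event, stamp outgoing messages, and on receipt
    advance past the message's timestamp."""
    def __init__(self, name):
        self.name, self.clock = name, 0

    def local_event(self):
        self.clock += 1
        return self.clock

    def send(self):
        self.clock += 1                  # sending is itself an event
        return self.clock                # timestamp carried by the message

    def receive(self, msg_ts):
        self.clock = max(self.clock, msg_ts) + 1
        return self.clock
```

A receive is thus always timestamped strictly after the matching send, so the resulting total order (breaking ties by process name) respects Lamport's happened-before relation.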

The internship proposes:
- a deep understanding of logical clocks (Lamport, vector, tree clocks);

- a study of existing constructs offering time facilities, e.g.:
  - Real-Time Maude (RTMaude): a language and tool supporting the formal specification and analysis of real-time and hybrid systems. http://heim.ifi.uio.no/peterol/RealTimeMaude/

  - Erlang: a programming language used to build massively scalable soft real-time systems with requirements on high availability;

- an attempt to formally specify a new programming paradigm that we could call "time oriented programming", featuring single and multiple time inheritance, using as a starting point Pierce's Featherweight Java and Liquori and Spiwack's FeatherTrait Java (operational semantics and type system).





BIB

- Lamport, L. (1978). "Time, clocks, and the ordering of events in a distributed system". Communications of the ACM. 21 (7): 558-565. doi:10.1145/359545.359563.
- Colin J. Fidge (February 1988). "Timestamps in Message-Passing Systems That Preserve the Partial Ordering". In K. Raymond (Ed.), Proc. of the 11th Australian Computer Science Conference (ACSC'88). pp. 56-66.
- Mattern, F. (October 1988), "Virtual Time and Global States of Distributed Systems", in Cosnard, M., Proc. Workshop on Parallel and Distributed Algorithms, Chateau de Bonas, France: Elsevier, pp. 215-226.
- Almeida, Paulo; Baquero, Carlos; Fonte, Victor (2008), "Interval Tree Clocks: A Logical Clock for Dynamic Systems", Lecture Notes in Computer Science, 5401.


B.C. Pierce. "Featherweight Java: A Minimal Core Calculus for Java". TOPLAS 2001. https://www.cis.upenn.edu/~bcpierce/papers/fj-toplas.pdf

L. Liquori and A. Spiwack. "FeatherTrait: A Modest Extension of Featherweight Java". TOPLAS 2008. http://www-sop.inria.fr/members/Luigi.Liquori/PAPERS/toplas-07.pdf


URL sujet détaillé :

Remarques : Remuneration; possibility to continue with a PhD.



SM207-151 jsCoq and SerAPI  
Admin  
Encadrant : Emilio Jesús GALLEGO ARIAS
Labo/Organisme : CRI MINES ParisTech
URL : https://github.com/ejgallego/jscoq
Ville : Fontainebleau, Île-de-France

Description  

The MINES ParisTech Centre de recherche en informatique (CRI) offers an internship to
explore new user interaction possibilities in proof assistants. The focus will be on extending
the User Interface (UI) capabilities of the popular Coq [3] interactive proof assistant, based on
the Calculus of Inductive Constructions, a dependently-typed functional language. Coq comes
with a powerful proof tactic language and has been key to several mechanization milestones
such as the formalization of the Feit-Thompson theorem or the development of the CompCert
verified compiler, among others.

Interactive Theorem Proving (ITP) poses new challenges to UIs. The complexity of proofs
and underlying programming language goes well beyond what regular Interactive Development
Environments (IDE) typically support; literate programming and custom notation and syntax
are pervasive; and good layered display capabilities are essential to an effective understanding
of the tool's output in the frequent "input-response cycle" used during proof design.

The jsCoq project (developed at CRI) is a port of Coq to the modern web browser
platform, allowing web pages to embed interactive proof scripts. The browser platform opens
up new possibilities of interaction, display, distribution, and education. SerAPI is a new
lightweight protocol designed to help machine-based interaction with Coq, such as used
within IDEs and code-analysis tools. SerAPI is based on generic programming techniques
and is developed in close collaboration with upstream tool developers.
This internship is a good opportunity to become familiar with the implementation of
modern proof assistants and challenges of interactive proofs. We are quite open on concrete
goals; we would love to discuss more with interested students.
URL sujet détaillé : https://www.cri.ensmp.fr/people/gallego/docs/jscoq-2018.pdf

Remarques : Co-advised with Pierre Jouvelot.

MINES ParisTech internships may be funded, typically along the lines fixed by law (about 400 euros per month).



SM207-152 Wagner  
Admin  
Encadrant : Emilio Jesús GALLEGO ARIAS
Labo/Organisme : CRI MINES ParisTech
URL : https://github.com/ejgallego/mini-dft-coq
Ville : Fontainebleau, Île-de-France

Description  

The MINES ParisTech Centre de recherche en informatique (CRI) offers an internship to
explore formally verified models of digital signal processing. The starting point is Wagner,
a functional synchronous language tailored to the mechanical verification
and formal study of efficient DSP models. Wagner originated in the ANR FEEVER project
as an operational model of the Faust DSP language. Wagner can also be seen as a more
formal version of recent developments such as the MathWorks Audio System Toolbox.

We have developed a Wagner compiler, written in OCaml and intended for real-world use,
and a formal model of the language using Coq and the Mathematical Components Library,
together with proofs about relevant properties such as linearity (in the Linear Time-Invariant,
or LTI, sense). Internship goals are open and we encourage the students to get in touch
to discuss about the project; we believe that this internship is a good opportunity to get
knowledgeable about synchronous programming language semantics, abstract machines, and
medium-to-advanced mathematics formalization in Coq.

See more details in the PDF.
URL sujet détaillé : https://www.cri.ensmp.fr/people/gallego/docs/wagner-2018.pdf

Remarques : Co-advised with Pierre Jouvelot.

MINES ParisTech internships may be funded, typically along the lines fixed by law (about 400 euros per month).



SM207-153 On kernelization when parameterizing by distance to triviality.  
Admin  
Encadrant : Marin BOUGERET
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~bougeret/
Ville : Montpellier

Description  

(The subject is written in English but both of us can work in French or English interchangeably)

(This internship will be co-supervised by Marin Bougeret and Ignasi Sau at LIRMM, Montpellier, ALGCO Team).

There is a whole area of parameterized algorithms and kernelization investigating the complexity ecology, where the objective is to consider a structural parameter measuring how "complex" the input is, rather than the size of the solution. For instance, parameterizing a problem by the treewidth (or the treedepth) of its input graph has been a great success for the development of FPT algorithms. However, the situation is not as good for kernelization, as many problems do not admit polynomial kernels parameterized by treewidth (or treedepth) unless NP ⊆ coNP/poly.

Within structural parameters, of fundamental importance are parameters measuring the so-called "distance from triviality" of the input graphs, like the size of a vertex cover (distance to an independent set) or a feedback vertex set (distance to a forest). Unlike treewidth, these parameters may lead to both positive and negative results for polynomial kernelization. An elegant way to generalize these parameters is to consider a parameter allowing to quantify the triviality of the resulting instance, measured in terms of its treedepth. More precisely, for a positive integer c, a c-treedepth modulator of a graph G is a set of vertices X such that the treedepth of G-X is at most c. Note that for c=1, a c-treedepth modulator corresponds to a vertex cover.
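These definitions are easy to experiment with at toy scale. Below is a brute-force treedepth computation and a modulator check (our own helper names; exponential, so small instances only), following the standard recurrence: the treedepth of the empty graph is 0, of a disconnected graph the maximum over components, and of a connected graph 1 plus the minimum over choices of a vertex to delete:

```python
def _components(vs, edges):
    """Connected components of the graph (vs, edges), edges within vs."""
    adj = {v: set() for v in vs}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    comps, seen = [], set()
    for s in vs:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        comps.append(frozenset(comp))
    return comps

def treedepth(vs, edge_list):
    """Exact treedepth by brute force (exponential; toy instances only)."""
    vs = frozenset(vs)
    edges = [tuple(e) for e in edge_list if set(e) <= vs]
    if not vs:
        return 0
    comps = _components(vs, edges)
    if len(comps) > 1:
        return max(treedepth(c, edges) for c in comps)
    return 1 + min(treedepth(vs - {v}, edges) for v in vs)

def is_c_treedepth_modulator(vs, edge_list, X, c):
    """X is a c-treedepth modulator of G iff td(G - X) <= c."""
    return treedepth(frozenset(vs) - set(X), edge_list) <= c
```

With c = 1, `is_c_treedepth_modulator` accepts exactly the vertex covers, since a graph has treedepth at most 1 iff it has no edges.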

The starting point of the presented topic is [1], where the authors proved that a large class of problems admits a polynomial kernel on nowhere dense graphs when parameterized by the size of a c-treedepth modulator. This raised the natural question of determining for which problems this result could be extended to larger graph classes.

In [2] we started to investigate this question. It turned out that (parameterized by the size of a c-treedepth modulator):
- Vertex Cover admits a polynomial kernel on general graphs.
- Dominating Set does not admit a polynomial kernel unless NP ⊆ coNP/poly, even on degenerate graphs.

Very recently,

- It has been proved in [4] that the F-minor-free deletion problem (which generalizes Vertex Cover and Feedback Vertex Set), parameterized by the size of a c-treedepth modulator, admits a polynomial kernel on general graphs (generalizing our result of [2]).
- New variants of distance to triviality have been considered for Vertex Cover (for example in [3]).

The objective of this internship is to continue the above work in the following directions:

(1) Understand which measures of triviality (close to a treedepth modulator) allow polynomial kernels for Vertex Cover. Indeed, as treedepth has many different characterizations, we think it should be possible to find other natural measures that are "close" to, and smaller than, treedepth.

(2) Close the gap between the size of our kernel in [2] and the lower bound provided in [3].

(3) See if the huge (new, and complex!) machinery of [3] could be simplified when considering, for example, the Feedback Vertex Set problem (which would generalize [2] but be a particular case of [3]).


References

[1] Kernelization using structural parameters on sparse graph classes.
J. Gajarsky, P. Hlineny, J. Obdržálek, S. Ordyniak, F. Reidl, P. Rossmanith, F. S. Villaamil, and S. Sikdar.
In Proc. of the 21st European Symposium on Algorithms (ESA), 2013.

[2] How much does a treedepth modulator help to obtain polynomial kernels beyond sparse graphs?
M. Bougeret, I. Sau.
In Proc. of the 12th International Symposium on Parameterized and Exact Computation (IPEC), 2017.
https://arxiv.org/abs/1609.08095.

[3] Smaller parameters for vertex cover kernelization.
E.-M. Hols and S. Kratsch.
In Proc. of the 12th International Symposium on Parameterized and Exact Computation (IPEC), 2017.

[4] Polynomial Kernels for Hitting Forbidden Minors
under Structural Parameterizations.
B. M. P. Jansen, A. Pieterse.

URL sujet détaillé :

Remarques : Co-supervised by Ignasi Sau at LIRMM, Montpellier, http://www.lirmm.fr/~sau/ )



SM207-154 Drawing the complexity landscape of the Independent Set problem in H-free graphs  
Admin  
Encadrant : Rémi WATRIGANT
Labo/Organisme : LIP
URL : http://perso.univ-lyon1.fr/remi.watrigant/
Ville : Lyon

Description  

The main problem studied in this internship is the classical Independent Set problem: given a graph G and an integer k, decide whether there exists an independent set of size at least k, that is, at least k vertices which are pairwise non-adjacent. This problem is NP-complete in general, but becomes tractable when restricted to some special graph classes.
Given a fixed graph H, we consider the class F_H of graphs that do not admit H as an induced subgraph.
The complexity of Independent Set in F_H is known for many fixed graphs H. In particular, it remains NP-hard if H is a clique (of fixed size), or more generally whenever H contains a triangle as an induced subgraph. However, very little is known about the parameterized complexity of Independent Set in F_H. The general questions are the following:
For which graphs H does Independent Set in F_H admit a Fixed-Parameter Tractable (FPT) algorithm? Then, for all such H, for which ones does the problem admit a polynomial kernel?
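For experimenting with small instances, the definitions translate directly into brute force (our own helper names; exponential, toy sizes only). Here H is taken to be the triangle, the smallest graph containing a triangle:

```python
from itertools import combinations

def has_independent_set(vertices, edge_list, k):
    """Naive search for an independent set of size k."""
    edges = {frozenset(e) for e in edge_list}
    return any(all(frozenset(p) not in edges for p in combinations(S, 2))
               for S in combinations(vertices, k))

def triangle_free(vertices, edge_list):
    """G is in F_H for H = K3 iff no three vertices are pairwise adjacent."""
    edges = {frozenset(e) for e in edge_list}
    return not any(all(frozenset(p) in edges for p in combinations(T, 2))
                   for T in combinations(vertices, 3))
```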
URL sujet détaillé : perso.univ-lyon1.fr/remi.watrigant/files/sujet.pdf

Remarques : This internship is co-supervised by Marin Bougeret (LIRMM, Montpellier). In particular, the intern is expected to visit the LIRMM several weeks during the internship.



SM207-155 Ordonnancement de tâches et de communications pour les datacenters  
Admin  
Encadrant : Stéphane PERENNES
Labo/Organisme : I3S - INRIA équipe Coati
URL : http://www-sop.inria.fr/members/Stephane.Perennes/index.shtml
Ville : Sophia Antipolis (Nice)

Description  

Certain scheduling and placement problems that have received little or no attention arise when one models the execution of a set of tasks in a data center. The tasks to be performed are organised according to a task-machine graph representing acyclic dependencies (a DAG). The originality comes from taking communications into account.

On the theoretical side, the internship will study the existence of efficient algorithms for (approximately) solving these problems. On the practical side, heuristics may be proposed and evaluated empirically.
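As a starting-point illustration (our own simplified model with a uniform communication delay, not the internship's general setting), a greedy list-scheduling heuristic over a topologically ordered DAG might look as follows:

```python
def list_schedule(tasks, deps, durations, comm, n_machines):
    """Greedy list scheduling of a task DAG with communication delays:
    a task may start on machine m once m is free and every predecessor is
    done, plus `comm` extra time units per predecessor placed elsewhere.
    tasks: topologically ordered list; deps: dict task -> set of preds."""
    finish, place = {}, {}
    machine_free = [0] * n_machines
    for t in tasks:                          # assumed topological order
        best = None
        for m in range(n_machines):
            ready = max([machine_free[m]] +
                        [finish[p] + (comm if place[p] != m else 0)
                         for p in deps.get(t, ())])
            if best is None or ready < best[0]:
                best = (ready, m)            # earliest possible start
        start, m = best
        place[t], finish[t] = m, start + durations[t]
        machine_free[m] = finish[t]
    return finish, place
```

With `comm = 0` this degenerates to classical list scheduling; a large `comm` makes the heuristic keep dependent tasks on the same machine.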
URL sujet détaillé : http://www-sop.inria.fr/members/Stephane.Perennes/scheduling.pdf

Remarques : Possible remuneration according to the INRIA salary scale; co-supervised by Frédéric Giroire and Nathann Cohen, both CNRS researchers.



SM207-156 Causal Graph Dynamics  
Admin  
Encadrant : Pablo ARRIGHI
Labo/Organisme : LIF (Marseille) & IXXI (Lyon)
URL : http://pageperso.lif.univ-mrs.fr/~pablo.arrighi/
Ville : Marseille & Lyon

Description  

Cellular Automata consist of a grid of identical cells, each of which may take a state from a finite set. The state of each cell at time t + 1 is computed by applying a fixed local rule to the state of the cell at time t and that of its neighbours, synchronously and homogeneously across space. Causal Graph Dynamics [1] extend Cellular Automata from fixed grids to time-varying graphs. That is, the whole graph evolves in discrete time steps, and this global evolution is required to have a number of physics-like symmetries: shift-invariance (it acts everywhere the same) and causality (information has a bounded speed of propagation).
One could also introduce any of these three other ingredients:
- Reversibility (the whole evolution is a bijection over the set of labelled graphs). That way we obtain a Computer Science-inspired toy model that has features mimicking both Quantum theory (reversibility) and General Relativity (evolving topology).
- A geometrical interpretation (by making sure that the graphs are dual to simplicial complexes, i.e. triangles glued together). That way we obtain an evolution of discrete surfaces, mimicking General Relativity even more closely.
- Stochasticity (by introducing noise, non-determinism). That way we can seek to model phenomena such as social networks, protein folding and other kinds of reconfigurable networks within the model.
The student will choose between any of these three subtopics according to taste, and will be encouraged to tackle some concrete questions we have begun to address.
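For reference, the fixed-grid model that Causal Graph Dynamics generalises fits in a few lines; here is one synchronous step of an elementary (binary, radius-1) cellular automaton on a cyclic grid, using the standard Wolfram rule numbering:

```python
def ca_step(state, rule):
    """One synchronous step of an elementary cellular automaton on a
    cyclic grid: each cell reads its left neighbour, itself and its right
    neighbour, and looks the triple up in the bits of `rule` (0..255)."""
    n = len(state)
    return [(rule >> (4 * state[(i - 1) % n]
                      + 2 * state[i]
                      + state[(i + 1) % n])) & 1
            for i in range(n)]
```

The same local rule is applied everywhere at once (shift-invariance), and a cell's next state depends only on its radius-1 neighbourhood (causality), which are exactly the two symmetries retained in the graph setting.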
URL sujet détaillé : http://pageperso.lif.univ-mrs.fr/~pablo.arrighi/projects/StageCGD.pdf

Remarques : Remuneration possible, to be discussed.

The default internship location is the LIF (Luminy, Marseille).

Carrying out the internship at the IXXI in Lyon is subject to office availability at that institute.

Do not hesitate to call to discuss: 0611257467.



SM207-157 Quantum Walking in a Discrete Geometry  
Admin  
Encadrant : Pablo ARRIGHI
Labo/Organisme : LIF (Marseille) & IXXI (Lyon)
URL : http://pageperso.lif.univ-mrs.fr/~pablo.arrighi/
Ville : Marseille & Lyon

Description  

Quantum Walks are the natural quantum version of Random Walks. According to Feynman, quantum mechanics is but a theory of probabilities "with minus signs". Indeed, one could say that probabilities get replaced by complex numbers. More specifically, whereas a random walker moves either left or right with probability one half, a quantum walker moves left with complex amplitude α and right with amplitude β, where |α|² + |β|² = 1. Why consider such a generalization? Simply because it turns out that this is how particles behave. For instance, one can easily show [1] that the continuum limit of some quantum walk is the Dirac equation, the equation which governs the propagation of the electron.
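As a minimal illustration of the flat one-dimensional case (before any triangulation), here is a coined quantum walk on a cycle; we use the textbook Hadamard coin, which is an assumption of ours and is not tuned to recover the Dirac equation:

```python
import math

def hadamard_walk(steps, n_sites=None):
    """Discrete-time quantum walk on a cycle with the Hadamard coin.
    State: amp[site] = [left-moving amplitude, right-moving amplitude]."""
    n = n_sites or (2 * steps + 3)           # big enough to avoid wrap-around
    amp = [[0j, 0j] for _ in range(n)]
    amp[n // 2] = [1 / math.sqrt(2) + 0j, 1j / math.sqrt(2)]  # symmetric start
    h = 1 / math.sqrt(2)
    for _ in range(steps):
        new = [[0j, 0j] for _ in range(n)]
        for x in range(n):
            l, r = amp[x]
            cl, cr = h * (l + r), h * (l - r)   # unitary coin toss
            new[(x - 1) % n][0] += cl           # left-mover shifts left
            new[(x + 1) % n][1] += cr           # right-mover shifts right
        amp = new
    return [abs(a) ** 2 + abs(b) ** 2 for a, b in amp]
```

Since the coin is unitary and the shift is a permutation, the position probabilities always sum to 1; unlike a random walk, the distribution spreads ballistically.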
Discrete Geometry approximates geometry via polygons. For instance, a surface (e.g. a sphere) can be approximated by gluing triangles side by side; that is an example of a 2D simplicial complex. The aim of this internship is to study the continuum limit of quantum walks over simplicial complexes. Why do this?
First of all, one may wonder if there is a quantum walk over the 2D euclidean plane, but triangulated, which still has the Dirac equation as its continuum limit. The steps to follow towards this aim seem relatively clear to us, yet answering this question would be the student's main task, and achieving it would already mean a successful internship.
Still, once this is done, one may wonder about the continuum limits of quantum walks over simplicial complexes that are less regular. For instance, when an electron propagates on a graphene sheet, the deformations of that sheet have an influence upon its propagation, which is then governed by the Dirac equation in "curved space" [2]. This is also the case whenever general relativity comes into play. Thus, one could hope that the continuum limit of a quantum walk on a triangulated sphere, for instance, coincides with the Dirac equation in the curved space that corresponds to the sphere.
It may moreover be the case that the geometry of the simplicial complex is able to encode fields other than the gravitational field, such as the electromagnetic field. We have already seen this sort of effect happen with the mass, for instance: it turns out that a quantum walk on a line with mass m is equivalent to a quantum walk without mass, but on a cylinder [3]. What happens if the cylinder is irregular?
Quantum walks over discrete geometries seem to constitute a simple, discrete-time, discrete-space model à la Computer Science, but one that is able to grasp, with much elegance, a number of physical phenomena. They also constitute quantum algorithms (numerical schemes) for simulating physics on a Quantum Computer or other quantum simulation device.
URL sujet détaillé : http://pageperso.lif.univ-mrs.fr/~pablo.arrighi/projects/StageQW.pdf

Remarques : Remuneration possible, to be discussed.

The default internship location is the LIF (Luminy, Marseille).

Carrying out the internship at the IXXI in Lyon is subject to office availability at that institute.

Do not hesitate to call to discuss: 0611257467.



SM207-158 Dérivation symbolique d'interpréteur abstrait en Coq  
Admin  
Encadrant : David PICHARDIE
Labo/Organisme : ENS Rennes/Inria/IRISA
URL : http://people.irisa.fr/David.Pichardie/
Ville : Rennes

Description  


Static analysis is an essential tool for verifying the security of software or for proving the absence of runtime errors. Abstract interpretation [3] provides a rich mathematical framework for proving the correctness of static program analyses.

The correctness of an analysis is proved with respect to a formal semantics. The theory of abstract interpretation provides a framework for proving correctness from an abstraction function α and a concretization function γ that relate concrete properties and abstract values. The correctness property is then expressed as P ⊆ γ ◦ P♯ ◦ α, stating that all the concrete behaviours P of a program must be over-approximated by the abstract interpretation P♯ of the program.
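This correctness statement is executable on the classical interval domain. The Python sketch below is ours, for illustration only (the function names are not from the subject): it defines the abstraction/concretization pair for intervals and a sound abstract addition, and checks an instance of the inclusion P ⊆ γ ◦ P♯ ◦ α.

```python
def alpha(values):
    """Abstraction: a finite, non-empty set of integers -> its enclosing interval."""
    return (min(values), max(values))

def gamma(interval):
    """Concretization: an interval -> the set of integers it denotes."""
    lo, hi = interval
    return set(range(lo, hi + 1))

def abstract_add(i1, i2):
    """Abstract counterpart of +: a sound over-approximation on intervals."""
    return (i1[0] + i2[0], i1[1] + i2[1])

# Soundness check, an instance of P ⊆ γ ◦ P♯ ◦ α:
xs, ys = {1, 3, 7}, {-2, 4}
concrete = {x + y for x in xs for y in ys}
assert concrete <= gamma(abstract_add(alpha(xs), alpha(ys)))
```
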

In previous work, we showed how to use this theoretical framework to prove static analysers in the Coq proof assistant [2, 4]. Until now, however, an important part of this framework has been ignored: the abstraction functions (written α), which make it possible to formally specify the optimal choices that can be made when designing a necessarily approximate analysis. Using such a theoretical tool in Coq opens the door to the symbolic derivation of an implementation of an analysis from its specification. More precisely, the optimal analysis of a function is defined by the expression α ◦ P ◦ γ, but this expression does not give an explicit definition of the optimal abstraction. Symbolic derivations, however, sometimes make it possible to obtain a good approximation, which is correct by construction.
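On a finite toy domain, the optimal abstract operator α ◦ P ◦ γ can even be computed by brute force, which gives a useful oracle against which a symbolically derived implementation can be checked. The Python sketch below is ours: it uses a sign abstraction restricted to a small universe, the restriction being there only to make γ finite (and f is assumed to map the universe into itself).

```python
UNIVERSE = range(-5, 6)

SIGNS = {"neg": {x for x in UNIVERSE if x < 0},
         "zero": {0},
         "pos": {x for x in UNIVERSE if x > 0},
         "top": set(UNIVERSE)}

def gamma(sign):
    return SIGNS[sign]

def alpha(values):
    """Best (smallest) sign element covering a set of concrete values."""
    for name in ("neg", "zero", "pos", "top"):
        if values <= SIGNS[name]:
            return name
    return "top"  # fallback if f escapes the toy universe

def optimal_transformer(f):
    """The optimal abstraction alpha ∘ f ∘ gamma, computed by enumeration."""
    def f_sharp(sign):
        image = {f(x) for x in gamma(sign)}
        return alpha(image)
    return f_sharp

neg_sharp = optimal_transformer(lambda x: -x)
# negation maps positives to negatives, and fixes zero
assert neg_sharp("pos") == "neg" and neg_sharp("zero") == "zero"
```

The internship's point is precisely that such enumeration is impossible on infinite domains (γ of an interval is infinite), which is why symbolic derivation in Coq is needed.
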

The objective of this internship is to study the computation of such an operator in the logic of the Coq proof assistant. The goal is to develop a set of theorems and tactics that closely follow recent symbolic derivation techniques [5]. These libraries will be put to the test on a simple programming language and on classical abstractions such as intervals. The work will build on a previous internship [1], in particular by adding fixpoint operators and other relevant constructions.

[1] Dominique Barbe. Dérivation symbolique d'interpréteur abstrait en Coq. Rapport de stage, http://perso.eleves.ens-rennes.fr/people/Dominique.Barbe/derivationAI_long.pdf, 2015.
[2] David Cachera and David Pichardie. A certified denotational abstract interpreter. In Proc. of International Conference on Interactive Theorem Proving (ITP-10), volume 6172 of Lecture Notes in Computer Science, pages 9-24. Springer-Verlag, 2010. http://people.irisa.fr/David.Pichardie/papers/itp10.pdf.
[3] Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In 4th ACM Symposium on Principles of Programming Languages (POPL 1977), pages 238-252, 1977. http://cs.nyu.edu/~pcousot/publications.www/CousotCousot-POPL-77-ACM-p238--252-1977.pdf.
[4] Jacques-Henri Jourdan, Vincent Laporte, Sandrine Blazy, Xavier Leroy, and David Pichardie. A formally-verified C static analyzer. In 42nd Symposium on Principles of Programming Languages, pages 247-259. ACM Press, 2015. http://people.irisa.fr/David.Pichardie/papers/popl15.pdf.
[5] Jan Midtgaard and Thomas P. Jensen. A calculational approach to control-flow analysis by abstract interpretation. In 15th International Symposium on Static Analysis (SAS 2008), 2008.
URL sujet détaillé :

Remarques :



SM207-159 Compilation et programmation constant-time  
Admin  
Encadrant : David PICHARDIE
Labo/Organisme : ENS Rennes/Inria/IRISA
URL : http://people.irisa.fr/David.Pichardie/
Ville : Rennes

Description  


Some programs manipulate sensitive data (for instance, secret keys in implementations of cryptographic primitives) that must be protected from an attacker observing the computation. A particularly effective side-channel attack consists in observing differences in computation time between several executions in order to learn sensitive information. Such attacks have been mounted against implementations of RSA [3] and AES [5].

The constant-time programming discipline is a widely used counter-measure against these flaws. It forbids branching, memory accesses, and even multiplications on data tainted by the secret. Recent work proposes automatic verification techniques to ensure that a program adheres to such a discipline [1,2]. These techniques rely on advanced static analyses that operate on a high-level representation of programs. One must then make sure that compiling these programs to executable code does not break the security property. Indeed, some optimisations can invalidate the constant-time property of a program [4, 8].
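The discipline itself is easy to illustrate. The sketch below is ours, in Python for readability: real constant-time code is written in C or assembly, and Python big integers give no timing guarantees, so this only shows the structure of the discipline, namely replacing a secret-dependent branch by a branchless, mask-based computation.

```python
MASK64 = (1 << 64) - 1  # work on 64-bit values

def ct_select(flag, a, b):
    """Branchless selection: returns a if flag == 1, b if flag == 0.

    There is no `if` on the (secret) flag: a mask of all-ones or
    all-zeros is built arithmetically, mimicking constant-time C code.
    """
    mask = (-flag) & MASK64          # all ones if flag == 1, zero if flag == 0
    return (a & mask) | (b & ~mask & MASK64)

def ct_eq(x, y):
    """Branchless 64-bit equality test: 1 if x == y, else 0."""
    d = (x ^ y) & MASK64
    # d == 0 iff x == y; fold any set bit of d into its top bit via d | -d
    return 1 - (((d | ((-d) & MASK64)) >> 63) & 1)
```

A naive `a if flag else b` would take a secret-dependent branch, which is exactly what a compiler optimisation might also reintroduce behind the programmer's back.
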

The objective of this internship is to study the LLVM [6] and CompCert [7] compilers for the C language, in order to understand how the transformations they perform interact with the constant-time property. Depending on the results of this study, the student will either propose a modification of the offending transformation so that it respects the handling of tainted values, or propose a correctness proof (without a proof assistant) for the most interesting transformations. We are looking for an intern motivated by symbolic program manipulation (translation, optimisation, static analysis, proof) and comfortable with OCaml and C++ programming. The internship should result in a first prototype in the CompCert compiler and a small semantic formalisation (possibly with a proof assistant).

[1] José Bacelar Almeida, Manuel Barbosa, Gilles Barthe, François Dupressoir, and Michael Emmi. Verifying constant-time implementations. In 25th USENIX Security Symposium, USENIX Security 16, 2016.
[2] Sandrine Blazy, David Pichardie, and Alix Trieu. Verifying constant-time implementations by abstract interpretation. In 22nd European Symposium on Research in Computer Security, ESORICS'17, 2017.
[3] David Brumley and Dan Boneh. Remote timing attacks are practical. In Comp. Networks, 48(5), 2005.
[4] Bart Coppens, Ingrid Verbauwhede, Koen De Bosschere, and Bjorn De Sutter. Practical mitigations for timing-based side-channel attacks on modern x86 processors. In Security and Privacy, pages 45-60, 2009.
[5] David Gullasch, Endre Bangerter, and Stephan Krenn. Cache games - bringing access-based cache attacks on AES to practice. In 32nd IEEE Symposium on Security and Privacy, 2011.
[6] Chris Lattner and Vikram S. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO, 2004.
[7] Xavier Leroy. Formal verification of a realistic compiler. Commun. ACM, 52(7), 2009.
[8] David Molnar, Matt Piotrowski, David Schultz, and David Wagner. The program counter security model: Automatic detection and removal of control-flow side channel attacks. In Info. Security and Crypto., 2006.

URL sujet détaillé :

Remarques :



SM207-160 Identification of DNA in metagenomic samples  
Admin  
Encadrant : Fabio PARDI
Labo/Organisme : LIRMM
URL : https://sites.google.com/site/fabiopardi/contact
Ville : Montpellier

Description  

BACKGROUND. Large-scale biological monitoring through DNA sequencing is a powerful tool, applied today from agronomy (soil ecosystems) to medicine (human and animal pathogens) and ecology (biodiversity screening). Such diagnostics rely heavily on algorithms that attempt to identify the species to which the DNA samples belong. A bottleneck comes from their complexity: thorough evolutionary modelling is desired, but this clashes with the sheer size of the datasets to analyse, leading to serious scalability issues. Recently, we developed a new approach aiming for more scalable species identification. The main advantage comes from the reduction of heavy evolutionary inferences to a preprocessing phase related to the construction of a database (DB) of short DNA words that were likely present in the ancestors of the current genomes. By using this DB, the need for sequence alignment between the newly sequenced DNA and reference genomes is avoided. As a result, our method, a software package already distributed to several users, is an order of magnitude faster than competing tools.

OBJECTIVES. These results are encouraging, but the full potential of our approach remains to be fulfilled. Depending on the interests and skills of the intern, her/his work will take one of the following two directions:

1. Currently, no preprocessing is applied to the millions of DNA sequences to identify, and no flexibility is allowed when seeking a match between the DB and the query DNA. To deal with these needs, the intern will consider devising better indexing techniques, allowing approximate string matching between the DB and the DNA sequences, or adapting techniques such as "winnowing schemes".

2. A second potential application of our method is the detection of recombination within the metagenomic sample. Recombination is a phenomenon causing the emergence of a new genome combining pieces of other, well-distinct genomes. Its detection is particularly relevant to viruses, for which recombination is widespread, and plays a key role in the evolution of virulence, and in the adaptation to particular hosts and host tissues.
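The "winnowing schemes" mentioned in the first direction select, in every window of w consecutive k-mer hashes, the position of the minimal hash, yielding a sparse, deterministic fingerprint of a sequence. The Python sketch below is ours and purely illustrative (the hash function and parameters are arbitrary choices, not those of the team's software):

```python
def kmer_hashes(seq, k):
    """Hash every k-mer (Python's built-in hash, for simplicity)."""
    return [hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def winnow(seq, k, w):
    """Winnowing: for each window of w consecutive k-mer hashes, keep the
    position of the minimal hash (rightmost on ties).  The selected
    positions form a sparse fingerprint of `seq`; identical substrings
    of length >= w + k - 1 are guaranteed to share fingerprints."""
    hs = kmer_hashes(seq, k)
    selected = set()
    for start in range(len(hs) - w + 1):
        window = hs[start:start + w]
        m = min(window)
        # rightmost minimal position inside the window
        pos = start + max(i for i, h in enumerate(window) if h == m)
        selected.add(pos)
    return sorted(selected)
```

Because every window contributes a position, the fingerprint density is bounded, which is what makes such schemes attractive for indexing large sequence collections.
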

Regardless of the chosen direction of work, a first part of the internship will focus on a bibliographic review of the most important available techniques. Devising suitable algorithms and implementing them within the software developed by the MAB team (http://www.lirmm.fr/recherche/equipes/mab) will be the next steps.

URL sujet détaillé : https://sites.google.com/site/fabiopardi/internships-stages

Remarques :



SM207-161 Recherche et Sélection de motifs multiples dans des graphes attribués hétérogènes  
Admin  
Encadrant : Henry SOLDANO
Labo/Organisme : LIPN, Université Paris 13, SPC
URL : https://lipn.univ-paris13.fr/fr/a3-presentation
Ville : villetaneuse

Description  

For some years, the A3 team at LIPN has been interested in knowledge discovery in networks. Recently we have focused on attributed network mining, addressing tasks such as community discovery and attribute pattern mining. Particular attention was paid to lifting attribute pattern mining to networks by associating attribute patterns with core subgraphs, where the notion of "core" is flexible and adapted to the kind of network we address. As a recent example, we have proposed a new core designed to investigate advice or influence processes in directed graphs.
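One classical instance of such a core is the k-core: the maximal subgraph in which every node keeps at least k neighbours, computed by iteratively peeling low-degree nodes. The Python sketch below is ours, for illustration; the team's flexible cores generalise this idea to attributed and directed settings.

```python
def k_core(adjacency, k):
    """k-core of an undirected graph: the maximal subgraph in which every
    node has at least k neighbours, computed by iteratively removing
    (peeling) nodes of degree < k.

    `adjacency` maps each node to the set of its neighbours.
    """
    adj = {u: set(vs) for u, vs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            if u in adj and len(adj[u]) < k:
                for v in adj[u]:
                    if v in adj:
                        adj[v].discard(u)   # detach u from surviving nodes
                del adj[u]
                changed = True
    return set(adj)

# a triangle a-b-c with a pendant node d attached to a
graph = {"a": {"b", "c", "d"}, "b": {"a", "c"},
         "c": {"a", "b"}, "d": {"a"}}
assert k_core(graph, 2) == {"a", "b", "c"}  # the pendant d is peeled off
```
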

We are currently interested in enriched graphs in a wide sense (multiplex, multimode, edge-attributed networks), for which we want to investigate new pattern mining ideas: we consider graphs in which nodes or links are attributed and also have different types or roles, leading us to consider multipatterns associated with extended cores. The practical motivation comes from the growing number of domains where such networks appear and represent the knowledge or information we have about various kinds of objects related in various ways. Examples of such domains are life science and sociology.

The Master 2 subject concerns the formulation, algorithmics and development of methodologies associated with such multipattern mining of enriched graphs.

URL sujet détaillé : http://wwwabi.snv.jussieu.fr/people/soldano/sujetM2MultiPatternMining.pdf

Remarques : The internship is paid according to the Master 2 internship scale, the minimum currently being 577 euros per month.
The internship takes place in the A3 team, several of whose members will be involved in the co-supervision, in particular Guillaume Santini and Rushed Kanawati.



SM207-162 Approximate Learning of Monotone Functions  
Admin  
Encadrant : Oded MALER
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~maler/
Ville : Grenoble

Description  

Develop new algorithms for learning monotone functions between partially ordered domains.
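As a baseline intuition (our illustration, not the method of the proposal): for a monotone 0/1-valued function, membership queries along a single chain locate the decision boundary by binary search, and learning over a partial order amounts to doing this coherently across many chains.

```python
def boundary_on_chain(f, lo, hi):
    """Smallest t in [lo, hi] with f(t) == 1, for a monotone 0/1-valued f
    with f(hi) == 1, found with O(log(hi - lo)) membership queries."""
    assert f(hi) == 1, "a monotone f must be 1 at the top of the chain"
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) == 1:
            hi = mid        # boundary is at mid or below
        else:
            lo = mid + 1    # boundary is strictly above mid
    return lo

# a monotone threshold function on the chain 0..1000
f = lambda x: 1 if x >= 137 else 0
assert boundary_on_chain(f, 0, 1000) == 137
```
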
URL sujet détaillé : http://www-verimag.imag.fr/~maler/proposal-grand-ec-1-monotone.pdf

Remarques : Co-supervised with Nicolas Basset http://www-verimag.imag.fr/~bassetni/



SM207-163 Learning Automata over Large Alphabets as an Alternative to Recurrent Neural Networks  
Admin  
Encadrant : Oded MALER
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~maler/
Ville : Grenoble

Description  

Extend our previous work on learning languages over large alphabets (using symbolic automata) as an alternative to recurrent neural networks. This may also lead to fundamental studies in automata theory.
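For reference, a symbolic automaton labels its transitions with predicates over a (possibly huge) alphabet rather than with individual letters. The minimal Python sketch below is ours; the example language is purely illustrative.

```python
class SymbolicAutomaton:
    """Deterministic automaton whose transitions carry predicates over a
    large alphabet instead of individual letters."""

    def __init__(self, initial, accepting, transitions):
        # transitions: state -> list of (predicate, target state)
        self.initial = initial
        self.accepting = accepting
        self.transitions = transitions

    def accepts(self, word):
        state = self.initial
        for letter in word:
            for pred, target in self.transitions.get(state, []):
                if pred(letter):
                    state = target
                    break
            else:
                return False          # no transition fires: reject
        return state in self.accepting

# Example over the alphabet of all integers: words alternating
# negative / non-negative values, starting with a negative one.
alt = SymbolicAutomaton(
    initial="N", accepting={"N", "P"},
    transitions={"N": [(lambda x: x < 0, "P")],
                 "P": [(lambda x: x >= 0, "N")]})
assert alt.accepts([-3, 10, -7, 0])
assert not alt.accepts([5])
```

A single predicate-labelled transition here stands for infinitely many letter-labelled ones, which is what makes learning over large alphabets tractable.
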
URL sujet détaillé : http://www-verimag.imag.fr/~maler/proposal-grand-ec-2-large-NN.pdf

Remarques : Co-supervised with Nicolas Basset http://www-verimag.imag.fr/~bassetni/



SM207-164 Temporal Data Mining  
Admin  
Encadrant : Oded MALER
Labo/Organisme : VERIMAG
URL : http://www-verimag.imag.fr/~maler/
Ville : Grenoble

Description  

How to generalize from positive examples in a non-trivial way.
URL sujet détaillé : http://www-verimag.imag.fr/~maler/proposal-grand-ec-3-positive.pdf

Remarques : Co-supervised with Nicolas Basset http://www-verimag.imag.fr/~bassetni/



SM207-165 Extraction d'une ligne de produits de tests logiciels  
Admin  
Encadrant : Tewfik ZIADI
Labo/Organisme : Laboratoire d'Informatique de Paris 6 (LIP6)
URL : https://pages.lip6.fr/Tewfik.Ziadi/
Ville : Paris

Description  

Variability inference, from source code or execution traces, for building software product lines (SPL) is an essential approach to ease SPL adoption. While the inference is now well established and applied to build product lines of actual software, it does not yet explicitly consider test code. This internship is about the design, development, experimentation, and evaluation of the extraction of a software product line from test source code, associated with the software product line under test.
URL sujet détaillé : https://drive.google.com/file/d/13Q0b2QCJrS_vhV7NxIQNL05tKShaicYr/view?usp=sharing

Remarques : Project proposed in collaboration with and co-supervised by Lom HILLAH (same team at LIP6, lom-messan.hillah.fr).
Standard gratification according to current regulations: about €546 per month.
Transport allowance of €35 per month + canteen allowance of about €100 per month (20 meals).



SM207-166 Correct datatypes for geo-replicated distributed applications  
Admin  
Encadrant : Sreeja Nair
Labo/Organisme : UPMC-LIP6
URL : https://team.inria.fr/regal/
Ville : Paris

Description  

Developing a distributed application is far from trivial, because of the trade-offs between consistency and availability. All developers want their application to be as available as possible. Still, they need to ensure that the application invariants are respected. The CISE logic [1] can check whether an application specification upholds its expected invariant. This can be used to design applications with just the right amount of consistency.
Marcelino et al. [2] have developed the CEC tool, based on the CISE logic. The user specifies the application in the Boogie language [3], extended with some tool-specific annotations. The tool analyses the specification and provides the user with information on any consistency issues. An experience report on using the tool [4] shows that writing specifications is difficult and error-prone.
The goal of this internship is to enable users to write specifications more easily. The outcome will be a library of common datatypes, together with reusable templates for common cases. The datatypes can be classified into normal datatypes, such as set, list, graph, tree, etc., and replicated datatypes, such as G-Set, PN-Set, etc. The normal datatype specifications will be verified using plain Boogie, and the replicated ones using the CEC tool. Since many object types recur in most specifications, and it is repetitive and error-prone to define their properties multiple times, we will develop templates that can be reused later when writing full specifications.
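For concreteness, here is what the simplest replicated datatype mentioned above, a grow-only set (G-Set), looks like as a state-based sketch in Python (ours, for illustration; the internship's specifications are written in Boogie for the CEC tool, not in Python):

```python
class GSet:
    """Grow-only set: a state-based replicated data type.
    Updates only add elements; merge is set union, which is commutative,
    associative and idempotent, so replicas converge regardless of the
    order in which they exchange states."""

    def __init__(self):
        self.elements = set()

    def add(self, x):
        self.elements.add(x)

    def merge(self, other):
        self.elements |= other.elements

    def value(self):
        return frozenset(self.elements)

# two replicas updated concurrently, then merged in both directions
a, b = GSet(), GSet()
a.add(1); a.add(2)
b.add(2); b.add(3)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == frozenset({1, 2, 3})
```

Proving that an *application* built on such datatypes preserves its invariants under concurrency is exactly what the CISE logic addresses; the datatype library is the reusable building block.
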

URL sujet détaillé : https://team.inria.fr/regal/job-offers/masters-internship-correct-datatypes-for-geo-replicated-distributed-applications/

Remarques : Rémunéré
Encadrant: Sreeja Nair



SM207-167 Developing a geo-replicated document store  
Admin  
Encadrant : Dimitrios Vasilas
Labo/Organisme : UPMC-LIP6
URL : https://team.inria.fr/regal/
Ville : Paris

Description  

The document-based data model is a popular way of storing data, particularly for applications that handle semi-structured data or unstructured data accompanied by metadata, such as social media posts and multimedia. Document stores, including MongoDB and Apache CouchDB, store data objects as JSON-like documents that have varying sets of fields, with different types for each field. This model is attractive as it makes application code easier to write and enables applications to change their database schema as demands evolve.
A basic functionality of document stores is that, beyond the simple key-to-document lookup interface, they offer a query language that allows users to retrieve documents by their semi-structured content. To achieve efficient and scalable semi-structured search, document stores maintain secondary indexes on document attributes.
AntidoteDB is a cloud database developed by the Regal team at LIP6. It provides geo-replication and guarantees both high availability and a high level of consistency. AntidoteDB supports replicated data types such as counters, sets and maps that are designed to work correctly in the presence of concurrent updates and network failures.
Implementing a document store API backed by AntidoteDB can allow a geo-distributed deployment where users update their documents concurrently in multiple data centres with low latency and high availability.
More importantly, Antidote's replicated data types can provide a flexible mechanism for users to explicitly control how conflicts caused by concurrent updates will be resolved, by choosing the appropriate data types based on their application semantics.
Research objectives and methods

The objective of this internship is to implement a document store interface and a query language that supports document retrieval by semi-structured content, based on AntidoteDB. The query language should support point queries on text attributes and interval queries on numerical attributes, and allow complex queries including logical operators (AND, OR, NOT). In order to achieve efficient and scalable search, the system should maintain secondary indexes on document attributes.
The intern shall study different indexing techniques (e.g., inverted indexes, B-trees) [1], different strategies for organising distributed indexes (global or local indexes) [2], and different strategies for updating the index structures [3, 4], select the most appropriate ones for the system, implement the described interface, and perform benchmarks.
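As a minimal illustration of the indexing side (ours; AntidoteDB's actual API is not shown here), an inverted index over one text attribute answers point queries by lookup and the logical operators by plain set operations:

```python
class InvertedIndex:
    """Minimal inverted index on a text attribute: term -> set of doc ids.
    Point queries are dictionary lookups; AND / OR / NOT are set operations."""

    def __init__(self):
        self.postings = {}
        self.docs = set()

    def add(self, doc_id, text):
        self.docs.add(doc_id)
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(doc_id)

    def lookup(self, term):
        return self.postings.get(term.lower(), set())

    def and_query(self, *terms):
        sets = [self.lookup(t) for t in terms]
        return set.intersection(*sets) if sets else set()

    def or_query(self, *terms):
        return set().union(*(self.lookup(t) for t in terms))

    def not_query(self, term):
        return self.docs - self.lookup(term)

ix = InvertedIndex()
ix.add(1, "quantum walk on graphs")
ix.add(2, "random walk")
assert ix.and_query("quantum", "walk") == {1}
assert ix.not_query("quantum") == {2}
```

The internship's real difficulty starts where this sketch stops: distributing such postings lists across data centres and keeping them consistent under concurrent updates.
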

URL sujet détaillé : https://team.inria.fr/regal/job-offers/masters-internship-developing-a-geo-replicated-document-store-above-antidotedb/

Remarques : Rémunéré
Co-Encadrant: Dimitrios Vasilas



SM207-168 Synchronisation-free mobile gaming  
Admin  
Encadrant : Paolo Viotti
Labo/Organisme : UPMC-LIP6
URL : https://team.inria.fr/regal/
Ville : Paris

Description  

In multiplayer online games, the inherent challenges of mobile and edge computing are worsened by the requirements of real-time user interaction. Thus, developers of such applications face difficult design choices on data storage and distribution that ultimately affect the gaming experience: adopting a weak consistency model across replicas and mobile terminals will expose anomalies to the users, whereas a strong consistency model will increase latency, which may be unacceptable.
Antidote is a data store that provides an adequate consistency semantics with optimal performance by minimizing the need for synchronization between storage replicas. It offers a causally consistent transactional API and a toolkit of convergent data types that accommodates the typical needs of distributed applications.
Research objectives and methods

The aim of this project is to design and develop a simple yet fully-functional mobile gaming application to evaluate the pros and cons of the synchronization-free approach. Besides being an interesting pilot study for mobile gaming, this application will allow us to perform a thorough experimental evaluation of Antidote.
We break down the project into the following steps: 1. brief study of the state of the art; 2. design of game strategy, mockup; 3. development; 4. basic functional testing.

URL sujet détaillé : https://team.inria.fr/regal/job-offers/masters-internship-synchronisation-free-mobile-gaming/

Remarques : Rémunéré
Co-Encadrant: Paolo Viotti



SM207-169 Shared, persistent and mutable big-data structures  
Admin  
Encadrant : Gaël Thomas, Télécom SudParis
Labo/Organisme : UPMC-LIP6
URL : https://team.inria.fr/regal/
Ville : Paris

Description  

In current computer systems, sharing or persisting data is the job of the file system. However, the file system interface is narrow and slow. This penalises applications with a large memory footprint, such as big-data analytics. Consider for instance a map-reduce computation. The map processes produce some large data structure, e.g., a social-network graph, which the reduce processes will consume. Currently, the only practical approach is for the mapper to serialise (marshall) the graph and write it into a file, and for the reducers to read the file and deserialise (unmarshall) the graph. This repeated serialise-output-input-deserialise sequence is extremely costly.
However, a recent and exciting development in computer architecture is the advent of very large (gigabytes) main memories, including persistent memories. This has the potential to make traditional file systems obsolete, since sharing data and making it durable can now be done directly in main memory. Returning to the example, the graph can be shared and made persistent directly in main memory instead. A related problem is that of lazily mutating or copying a large pointer-based data structure in memory.
The basic techniques are well known (e.g., mmap or copy-on-write) but they are either too difficult or impractical for application programmers. They are possible but difficult in low-level languages such as C or C++, and practically impossible in managed languages such as Java or Scala.
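The mmap technique mentioned above can be shown in a few lines of Python (a sketch of the mechanism only: the hard part, doing this safely for rich pointer-based structures, across processes and in managed runtimes, is precisely the subject of the internship):

```python
import mmap
import os
import tempfile

def demo_shared_mapping():
    """Two independent memory mappings of the same file: a write through
    one mapping is visible through the other without any explicit
    serialise/deserialise step, because the kernel shares the pages."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4096)            # reserve one page of backing store
        m1 = mmap.mmap(fd, 4096)          # "producer" mapping
        m2 = mmap.mmap(fd, 4096)          # "consumer" mapping
        m1[0:5] = b"graph"                # write through mapping 1
        visible = bytes(m2[0:5])          # read through mapping 2
        m1.close()
        m2.close()
        return visible
    finally:
        os.close(fd)
        os.unlink(path)

assert demo_shared_mapping() == b"graph"
```

Raw bytes are the easy case; pointers inside a shared structure are only meaningful if every process maps the region at a compatible address, which is one of the problems the intern would have to address.
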
Therefore, the aim of the internship is to explore how to enable direct sharing between processes, and/or lazy copying/mutation, of a rich pointer-based data structure. This consists of two related sub-problems: how to implement these techniques efficiently inside the execution environment, and how to expose them to the application programmer in a safe and simple fashion.
The intern shall study the state of the art and perform experiments. The intern will build a proof-of-concept prototype, initially making simplifying assumptions, e.g., no concurrent writes, no garbage collection, no JIT, which shall be relaxed little by little. The findings are to be published in the scientific literature.

URL sujet détaillé : https://team.inria.fr/regal/job-offers/masters-internship-shared-persistent-and-mutable-big-data-structures/

Remarques : Rémunéré
Co-Encadrant: Gaël Thomas, Télécom SudParis



SM207-170 Static analysis of fault-tolerant distributed protocols  
Admin  
Encadrant : Cezara DRAGOI
Labo/Organisme : ENS Paris/INRIA Paris team Antique
URL : http://www.di.ens.fr/~cezarad/
Ville : PARIS

Description  

Fault-tolerant distributed data structures are at the core of distributed systems, e.g., Amazon Dynamo, Apache Zookeeper, blockchain. Due to the multiple sources of non-determinism, their development is challenging and error-prone. We aim to develop automated verification techniques that will increase the confidence we have in these systems. This requires identifying a protocol modelling framework that formalizes and simplifies the structure and specification of distributed systems, enabling the automation of their verification. Regarding the verification technique, we focus on static analysis based on SMT solvers and abstract interpretation. The targeted class of protocols includes solutions for consensus, e.g., Paxos, solutions for weaker qualitative forms of agreement, e.g., Lattice Agreement, k-set agreement, or qualitative solutions for agreement, e.g., blockchain algorithms. The internship is financed by the ANR project SAFTA - Static analysis of fault-tolerant algorithms.
URL sujet détaillé : http://www.di.ens.fr/~cezarad/AutomatedVerif.pdf

Remarques :



SM207-171 Compilation schemes for high-level fault-tolerant distributed protocols  
Admin  
Encadrant : Cezara DRAGOI
Labo/Organisme : ENS Paris/INRIA Paris team Antique
URL : http://www.di.ens.fr/~cezarad/
Ville : PARIS

Description  

Fault-tolerant distributed data structures are at the core of distributed systems. Due to the multiple sources of non-determinism, their development is challenging. The project aims to increase the confidence we have in distributed implementations of data structures. We think that the difficulty does not only come from the algorithms but from the way we think about distributed systems. We will investigate partially synchronous programming abstractions that reduce the number of interleavings, simplifying the reasoning about distributed systems and their proof arguments. We will use partial synchrony to define reduction theorems from asynchronous semantics to partially synchronous ones, enabling the transfer of proofs from the synchronous world to the asynchronous one. Moreover, we will define a domain-specific language that allows the programmer to focus on the algorithmic task, compiles into efficient asynchronous code, and is equipped with automated verification engines. The internship is financed by the ANR project SAFTA - Static analysis of fault-tolerant algorithms.
URL sujet détaillé : http://www.di.ens.fr/~cezarad/compilation.pdf

Remarques :



SM207-172 Placement dynamique d'applications dans une architecture Fog/Edge  
Admin  
Encadrant : Frédéric DESPREZ
Labo/Organisme : LIG
URL : https://fdesprez.github.io/
Ville : Grenoble

Description  

Future Cloud architectures will consist of micro-datacenters located at the edge of the network. These platforms will meet users' needs while reducing their energy footprint and minimizing communication costs. However, several research problems must be addressed in order to reach the performance required by tomorrow's applications. Where should the micro-datacenters be deployed (and with what capacities)? How can the scalability of these platforms be managed (a centralized solution is out of the question)? How should the applications using them be placed, given that in the Internet of Things the systems generating computations and data are bound to move geographically (connected cars, smartphones, tablets, ...)?

The internship will focus on a few fundamental problems concerning these platforms and these highly dynamic, distributed applications:

- how to model the infrastructure and the applications;
- how to design (dynamic) placement algorithms satisfying quality-of-service, energy-consumption and performance constraints;
- how to take into account the geographical movements of users;
- how to evaluate and compare the algorithms with the SimGrid simulator and, if possible, carry out full-scale experiments on the Grid'5000 research platform.
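As a strawman baseline for the placement problem (our illustration, not a proposed solution), a greedy heuristic assigns each application to the nearest micro-datacenter that still has enough capacity:

```python
def greedy_placement(apps, datacenters, distance):
    """Assign each application to the nearest micro-datacenter that still
    has enough remaining capacity.

    `apps` maps app -> required capacity, `datacenters` maps
    dc -> available capacity, and `distance(app, dc)` gives e.g. the
    network latency between the app's users and the datacenter.
    Returns app -> chosen dc (or None if no datacenter fits).
    """
    remaining = dict(datacenters)
    placement = {}
    # place the most demanding applications first
    for app, need in sorted(apps.items(), key=lambda kv: -kv[1]):
        candidates = [dc for dc, cap in remaining.items() if cap >= need]
        if not candidates:
            placement[app] = None            # no feasible datacenter
            continue
        best = min(candidates, key=lambda dc: distance(app, dc))
        remaining[best] -= need
        placement[app] = best
    return placement
```

Such a static heuristic ignores user mobility and energy constraints, which is exactly where the dynamic algorithms studied in the internship would depart from it, and where a SimGrid evaluation becomes necessary.
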

References:
- The Discovery project: http://beyondtheclouds.github.io

[1] A. Lebre, J. Pastor, A. Simonet and F. Desprez. Revising OpenStack to Operate Fog/Edge Computing Infrastructures. Proceedings of IEEE International Conference on Cloud Engineering (IC2E) 2017, April 2017.
[2] L. Atzori, A. Iera, and G. Morabito, "The internet of things: A survey," Computer Networks, vol. 54, no. 15, pp. 2787-2805, 2010.
[3] A. Ahmed and E. Ahmed, "A survey on mobile edge computing," in 2016 10th International Conference on Intelligent Systems and Control (ISCO), Jan 2016, pp. 1-8.
[4] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, "Fog computing and its role in the internet of things," in Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing. ACM, 2012, pp. 13-16.
[5] S. Yi, C. Li, and Q. Li, "A survey of fog computing: Concepts, applications and issues," in Proceedings of the Workshop on Mobile Big Data, ser. Mobidata '15. ACM, 2015.

URL sujet détaillé :

Remarques : Co-supervised with Farah Ait Salaht, postdoc in the Corse team. Gratification possible.
The internship will take place in the Corse team, joint with Inria and located in the Minatec premises in Grenoble.




Last modification : 2018-06-12 11:53:10 laurent.lefevre@ens-lyon.fr