SM2071 Performance evaluation of embedded systems using formal methods


Description


The topic concerns the use of formal methods for performance analysis. Existing techniques are either not very expressive (analytical methods) or algorithmically complex (computational methods). We are looking for trade-offs between these two classes of methods. Detailed subject URL: http://wwwverimag.imag.fr/Master2R20132014Performance.html?lang=fr
Notes: Co-supervised with Karine Altisen, from the same team.





SM2072 Dedicated compilation techniques for a domain-specific language (SystemC)


Description


SystemC is a "language" widely used for modeling embedded systems.
The reference implementation is a library for the C++ language, but this prevents the compiler from exploiting all the specifics of SystemC.
The goal of the internship is to develop a dedicated compiler for SystemC. It may build on the existing tool Tweto, which already does part of the work. Detailed subject URL: http://wwwverimag.imag.fr/~moy/?Dedicatedcompilationtechniques
Notes:





SM2073 Formal verification of SystemC programs


Description


SystemC is a widely used tool for modeling embedded systems. This internship focuses on the formal verification (by model checking) of SystemC programs. Detailed subject URL: http://wwwverimag.imag.fr/~moy/?FormalVerificationofSystemC
Notes: Possible collaboration with Kevin Marquet, associate professor at CITI/INSA.





SM2074 Genotype imputation using large-scale statistical machine learning


Description


The amount of generated biological data grows at an unprecedented pace. Statistical and machine-learning research efforts are increasingly needed to analyse large-scale biological data. As part of this research effort, the objective of the internship is to develop statistical algorithms that can scale with the massive dimension of genomic data. One major use of genomic data is to make associations between genomic features and disease traits. Key technical steps in making associations between genes and diseases consist of identifying the genetic variants that are co-located on the same chromosome (phasing) and imputing missing genetic variants based on reference panels (genotype imputation). These two steps are performed with statistical algorithms that may soon hit a computational wall because of the massive dimension of the genomic data. The internship will therefore focus on statistical models and algorithms for handling massive data. A first direction will be to adapt matrix completion techniques, which are popular for web recommender systems and which can also provide accurate and fast results for genotype imputation.
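The matrix-completion idea can be sketched on toy data. The snippet below is a minimal SoftImpute-style iteration (alternating an SVD with soft-thresholded singular values and re-imposing observed entries); the function name, parameters, and data are our own illustrative assumptions, not part of the internship subject.

```python
import numpy as np

def soft_impute(X_obs, mask, shrink=0.1, n_iters=100):
    """Fill missing entries of X_obs with a low-rank approximation.

    X_obs: matrix with arbitrary values where mask is False
    mask:  boolean matrix, True on observed entries
    """
    filled = np.where(mask, X_obs, 0.0)           # initialize missing entries to 0
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - shrink, 0.0)           # soft-threshold (low-rank bias)
        low_rank = (U * s) @ Vt
        filled = np.where(mask, X_obs, low_rank)  # keep observed entries fixed
    return filled

# Toy "genotype" matrix (values 0/1/2), rank 1, with one hidden entry.
truth = np.outer(np.array([0.0, 1.0, 2.0, 1.0, 2.0]), np.ones(4))
mask = np.ones_like(truth, dtype=bool)
mask[2, 1] = False                                # hide one genotype
imputed = soft_impute(np.where(mask, truth, 0.0), mask)
# imputed[2, 1] ends up close to the hidden value 2
```

On real genotype panels the matrix is far larger and far sparser, which is exactly where the scalability questions of the internship arise.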
Notes: Internship co-supervised by Julien Mairal, INRIA researcher in machine learning (http://lear.inrialpes.fr/people/mairal/)





SM2075 A contract-based approach for the correct integration of components in embedded real-time systems


Description


Real-time embedded systems are increasingly common in industry. Depending on the intended objective, the subsystems (software, hardware, etc.) are developed in different programming and specification languages. Hence the need for a composition that is correct with respect to dependability properties, including control of real-time behavior. The objective of the internship is to study the correct integration of Ravenscar components (a subset of the Ada language targeted at embedded real-time systems) and OASIS/PharOS components (originally used in the nuclear domain; a synchronous language for data control in real-time systems), while preserving their intrinsic dependability properties. The techniques and methodology adopted in this internship are based on contract-based approaches [R1, R2, R3]. Contract-based design is a way of specifying a system that is often used to prove the correct integration of components, their consistency, and the preservation of properties. Integration takes place both at the component level and at the code level, the latter being generated from the components. Both aspects will be taken into account in the study. The internship comprises a preliminary study, the implementation of a software/system prototype, and collaboration with the French SME KronoSafe.
Bibliography
[R1] Alberto Sangiovanni-Vincentelli, Werner Damm and Roberto Passerone. Taming Dr. Frankenstein: Contract-Based Design for Cyber-Physical Systems. European Journal of Control, 18(3):217-238, 2012. [R2] L. de Alfaro and T. A. Henzinger. "Interface automata". ACM Press, 2001. [R3] D. Cancila, R. Passerone, T. Vardanega and M. Panunzio. Toward Correctness in the Specification and Handling of Non-Functional Attributes of High-Integrity Real-Time Embedded Systems. IEEE TII, 2010.
Internship duration: 4-6 months. Expected level: Bac+4/5. Field: computer science. Keywords: dependability, real time, contracts, components. Location: CEA Saclay, DRT/DACLE/LaSTRE.
Notes: remuneration set by the CEA, end-of-internship bonus, 50% of the Navigo pass, reimbursement of expenses if the student travels within the Paris region.
Additional information to be discussed directly with the contact person.





SM2076 3D Copy-Paste


Description


The goal of this internship is to design, build, and evaluate an automatic technique for the registration of animation skeletons between different 3D characters, for computer graphics.
The full description of the internship (context, work plan, funding, possible continuation as a PhD) is available at the following address: http://perso.telecomparistech.fr/~tierny/stuff/openPositions/internship2014_a.html Detailed subject URL: http://perso.telecomparistech.fr/~tierny/stuff/openPositions/internship2014_a.html
Notes:





SM2077 3D Time-Traveler


Description


The goal of this internship is to design, implement, and evaluate a general and efficient technique for the temporal interpolation of time-varying 2D or 3D data, in the context of scientific visualization.
The full description of the internship (context, work plan, funding, possible continuation as a PhD) is available at the following address: http://perso.telecomparistech.fr/~tierny/stuff/openPositions/internship2014_b.html Detailed subject URL: http://perso.telecomparistech.fr/~tierny/stuff/openPositions/internship2014_b.html
Notes:





SM2078 Generic internship offer for Technicolor Research & Innovation


Description


Objectives: Our interns' mission is to help researchers identify new approaches, implement new algorithms, perform tests and benchmarks, and contribute to the implementation of demos and prototypes within research teams. We offer research internship opportunities in multiple fields:
- Computer Graphics (CG)
- Computer Vision (CV)
- Data Mining (DM)
- Distributed Computing (DC)
- Human Computer Interaction (HCI)
- Human Factors (HF)
- Security and Privacy (SP)
- Video Processing (VP)
Details of the internships offered within these thematic areas: http://www.technicolor.com/en/innovation/studentday/jobinternshipopportunitiesrilabs
Profile:
- Student in Master 2 in an engineering school or university
- Specialization in one of the fields above
- Inquiring mind, inventive, passionate
- Skills in prototyping applications on PC or embedded platforms
- Good English level
To apply: if you are interested in joining us, please visit our web site, consult our internship open positions at http://www.technicolor.com/en/innovation/studentday/jobinternshipopportunitiesrilabs and apply at stage.rennes.com
Group: Technicolor (www.technicolor.com) / Research & Innovation. Position: Intern. Location: Rennes, FRANCE. Deadline: Open. Compensation: 1200 € per month (before taxes). Detailed subject URL:
Notes: Apply at stage.rennes.com





SM2079 Generic internship offer for Connected Home Rennes (STB, gateways, etc.)


Description


Objectives: Our interns' mission is to help R&D engineers enable multiple-play video and communications through:
- the design/supply of set-top box platforms for satellite, cable and telecom operators
- the design/supply of access devices to deliver multi-play services to subscribers
We offer internship opportunities in multiple fields around integrated solutions:
- Software
- Modems & gateways
- Video gateway
- Connected Home Devices
- Set Top Boxes
- Professional services
If you are interested in joining us, please visit our website to see the detailed internship open positions at: http://www.technicolor.com/en/innovation/studentday/internshipopportunitiesconnectedhome/connectedhomeinternships
Profile:
- Preferably Master 2 student in an engineering school / university
- Specialization in computing, software, system architecture, embedded real-time software development, testing, UI, etc.
- User-oriented mindset
- Inquiring mind, inventive, passionate
- Good English level
To apply: if you are interested in joining us, please visit our web site, consult our internship open positions at http://www.technicolor.com/en/innovation/studentday/internshipopportunitiesconnectedhome/connectedhomeinternships and apply at stage.rennes.com
Group: Technicolor (www.technicolor.com) / Research & Innovation. Position: Intern. Location: Rennes, FRANCE. Deadline: Open. Compensation: 1200 € per month (before taxes). Detailed subject URL:
Notes: Apply at stage.rennes.com





SM20710 Intelligent Task Allocation in Multi-Core Processors for Processing Complex Jobs on Big Datasets


Description


In recent years, Hadoop has gained enormous attention from corporate users, especially in companies that deal with Big Data [1] such as Facebook, Twitter, and Google. It enables implementing scalable solutions that partition large datasets and distribute them randomly across the available clusters of processing nodes. In Hadoop [2], a job (e.g. a query) is submitted to a so-called job tracker, which instantiates a job client for computing the input splits for the job (notably, the input split is done by the FileInputFormat class of Hadoop). Then, the job is broken into tasks, which are then allocated to the data nodes (also known as compute nodes) for processing.
An inefficient approach to allocating tasks will surely penalize job-processing performance. For instance, performance metrics such as throughput and query-processing time will be heavily affected. It is worth noting that, in a real-life Big Data processing scenario, both throughput and processing time are critical performance metrics. Thus, the challenge is to find the most efficient approach for allocating a large number of tasks across the clusters of nodes.
Hadoop comes with a default general-purpose task allocator. This module is unaware of any specificity of the allocated jobs and thus does not perform any optimization. When it comes to efficient task allocation, Hadoop mostly leaves this work to the programmer. Efficient task allocation for processing queries on Big Data is difficult. This gives rise to the need for an approach that allocates tasks across the data nodes of Hadoop clusters in an intelligent manner, exploiting specific knowledge about the jobs being allocated and the task nodes. The CEDAR project proposes a master thesis project to address this requirement.
Objective: The objective of this project is to develop a solution for Hadoop task allocation that will do the following:
- Generate a task graph for a specific job based on Hadoop's InputSplit (which references the data), available task nodes, and other runtime factors.
- Allocate tasks across the clusters using the graph.
- Migrate tasks dynamically to another task node upon failure of a task node, or schedule task processing in a lazy manner.
A further goal of this project is to combine the power of Hadoop with general-purpose computing on Graphics Processing Units (GPGPU) [3]: the solution should allocate tasks to clusters of nodes whose computational units are CPUs [4] or GPUs. Since Hadoop is CPU- and I/O-bound, this approach has great potential for dramatically improving performance in terms of job-processing time (if the task allocation can be carried out in an intelligent manner).
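The allocation step above can be illustrated with a toy greedy, locality-aware scheduler. This is a hypothetical sketch under our own assumptions (data structures and names are ours, not Hadoop's API): each task carries a cost estimate and the list of nodes holding a replica of its input split.

```python
import heapq  # not strictly needed here, but typical for larger schedulers
from collections import defaultdict

def allocate_tasks(tasks, nodes):
    """Greedy, locality-aware allocation: place each task on the
    least-loaded node that holds a replica of its input split,
    falling back to the globally least-loaded node.

    tasks: list of (task_id, cost, replica_nodes)
    nodes: list of node ids
    Returns {node_id: [task_id, ...]}.
    """
    load = {n: 0.0 for n in nodes}
    plan = defaultdict(list)
    # Schedule big tasks first (longest-processing-time rule) to balance load.
    for task_id, cost, replicas in sorted(tasks, key=lambda t: -t[1]):
        local = [n for n in replicas if n in load]
        candidates = local if local else nodes
        best = min(candidates, key=lambda n: load[n])
        load[best] += cost
        plan[best].append(task_id)
    return dict(plan)

plan = allocate_tasks(
    tasks=[("t1", 4.0, ["n1"]), ("t2", 3.0, ["n1"]), ("t3", 2.0, ["n2"])],
    nodes=["n1", "n2", "n3"])
# t1 and t2 stay on their data node n1; t3 runs on n2
```

A real allocator would also weigh network distance, heterogeneous CPU/GPU nodes, and failure-driven re-allocation, which is exactly the subject of the project.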
References:
[1] Big Data: http://en.wikipedia.org/wiki/Big_data
[2] Hadoop: http://hadoop.apache.org/
[3] GPGPU: http://en.wikipedia.org/wiki/GPGPU
[4] CPU: http://en.wikipedia.org/wiki/Central_processing_unit
Detailed subject URL: http://cedar.liris.cnrs.fr/
Notes:





SM20711 Multiplexing Hadoop into a Peer-to-Peer Cache-Only Memory Architecture


Description


The Hadoop architecture [1] is gaining popularity as a system for concurrent programming. At its core, it consists of a one-level tree of processor nodes, the topmost being a master node that delegates and assigns parts of a partitioned task to the lower-level slave nodes. A characteristic of this architecture is that while the partitioned jobs are distributed over the nodes, the data, on the other hand, are copied to all the nodes. This creates obvious inefficiencies. Another aspect of the Hadoop architecture is that the master/slave status of nodes is fixed and immutable. This means that if the master node crashes, the whole setup does too.
A peer-to-peer (P2P) architecture [2] is a decentralized distributed network of processors where nodes are not subject to an immutable master/slave configuration.
A Cache-Only Memory Architecture (COMA) [3] is a multiprocessor distributed-memory architecture where data is shared by all the nodes, each node keeping a virtual-memory cache of the data.
The proposed research topic is to study ways of multiplexing Hadoop into a COMA such as the Data Diffusion Machine (DDM) [4], and to develop a proof-of-concept prototype.
Following are pointers to tools and technologies that are relevant for such a study and the implementation of a proofofconcept prototype.
As it turns out, Hadoop already has what is needed to experiment with distributed caching [5]. See also [6], a Master's thesis done at IBM Zurich in 2012, for how this can be leveraged for the Hadoop Distributed File System (HDFS) [7]. Also relevant are so-called "in-memory accelerators" (see [8] for example, or [9], based on Apache Spark [10]). There are a few Java libraries for the coordination of distributed caching (e.g., ZooKeeper [11] and its higher-level interface Curator [12]).
Looking for existing peer-to-peer versions of Hadoop, one clearly emerges: Cassandra [13, 14]. To cite [13], Cassandra's architecture is:
Decentralized Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request.
Supports replication and multi data center replication Replication strategies are configurable. Cassandra is designed as a distributed system, for deployment of large numbers of nodes across multiple data centers. Key features of Cassandra's distributed architecture are specifically tailored for multipledata center deployment, for redundancy, for failover and disaster recovery.
It has a Python client interface called Pycassa [15] and a PHP client interface called Phpcassa [16]. One system has used the idea of combining Cassandra and Hadoop for P2P: it is called Brisk [17] (watch the video at [18]). You will notice that Brisk uses a ring-based scheduler (therefore Disruptor [19] may come in handy). Brisk has been deployed by major industrial players (Facebook, Amazon, ...).
References:
1. http://en.wikipedia.org/wiki/Apache_Hadoop
2. http://en.wikipedia.org/wiki/Peertopeer
3. http://en.wikipedia.org/wiki/Cacheonly_memory_architecture
4. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.48.2301&rep=rep1&type=pdf
5. http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/filecache/DistributedCache.html
6. ftp://ftp.tik.ee.ethz.ch/pub/students/2012FS/MA201204.pdf
7. http://www.ibm.com/developerworks/library/waintrohdfs/waintrohdfspdf.pdf
8. http://www.gridgain.com/products/inmemoryhadoopaccelerator/
9. http://gigaom.com/2013/04/17/welcometoberkeleywherehadoopisntnearlyfastenough/
10. http://spark.incubator.apache.org/
11. http://zookeeper.apache.org/
12. http://curator.incubator.apache.org/
13. http://en.wikipedia.org/wiki/Apache_Cassandra
14. http://cassandra.apache.org/
15. http://pycassa.github.io/pycassa/
16. http://thobbs.github.io/phpcassa/
17. http://cdn.parleys.com/p/5148922a0364bc17fc56c5eb/attach_201202171957209853697.pdf
18. http://www.youtube.com/watch?v=RhDJebe2dJQ
19. http://lmaxexchange.github.io/disruptor/
Detailed subject URL: http://cedar.liris.cnrs.fr/
Notes:





SM20712 Study of the spectral properties of compact packings with several sphere sizes


Description


Sampling theory plays a central role in image synthesis. Advanced multidimensional sampling systems with controllable spectral properties can also find applications in other fields that rely on multidimensional Monte Carlo and quasi-Monte Carlo methods, for example in simulation systems. We propose that the student examine the spectral properties of compact packings with several sphere sizes, as well as fast methods for generating such arrangements in 2D, in 3D, and in higher dimensions. This project is exciting and very ambitious. To understand how non-trivial the compact packing problem is, read the article by Pfender and Ziegler: you will see that some answers to problems posed centuries ago were only provided in 2003, by a little-known mathematician. Or read the books by Thompson or by Conway & Sloane to see how deeply the compact packing problem is connected to other branches of science, for example group theory or the theory of error-correcting codes.
References: Pfender, Florian; Ziegler, Günter M. (September 2004), "Kissing numbers, sphere packings, and some unexpected proofs", Notices of the American Mathematical Society: 873-883, http://www.ams.org/notices/200408/feapfender.pdf. Thomas M. Thompson (1983), "From Error-Correcting Codes through Sphere Packings to Simple Groups". Conway, J.H. & Sloane, N.J.A. (1998), "Sphere Packings, Lattices and Groups". Bibliography of Professor Salvatore Torquato: http://cherrypit.princeton.edu/sal.html
Notes: Possibility of remuneration. Possibility of continuing at the doctoral level.





SM20713 A Theory of Complexity, Condition and Roundoff


Description


Complexity theory has primarily been developed for discrete computations. For the numerical computations common in numerical analysis, a theory was offered in 1989 by Blum, Shub and Smale. In the pioneering paper where this was done, the introduction closes with a list of open problems, the last of which points out that at least two aspects were left out of their exposition: firstly, the consideration of roundoff errors and their effect on computation; secondly, the complexity of iterative methods. Both issues are related to the notion of condition (the analyses of both are expressed in terms of a condition number) and are the warp and woof of numerical analysis. They therefore propose, "to bring machines over R closer to the subject of numerical analysis, it would be useful to incorporate roundoff error, condition numbers and approximate solutions into our development."
In the last year I laid the groundwork for this agenda. In doing so, a number of open questions naturally arose. The goal of the project is to answer some of these questions.
Notes: Joint supervision is possible (Pascal Koiran would be an appropriate supervisor at the LIP). Financial support is also possible.





SM20714 Approximation algorithms for social data exploration


Description


We are interested in developing approximation algorithms for the exploration of social datasets and in studying their complexity.
We model a social dataset as a bipartite graph with users, items, and edges between users and items. An edge represents the action of a user on an item, such as rating or tagging; each record is thus a (user, item, action) triple. Users have demographic attributes such as their age and occupation.
We define a user group as a set of users who share at least one value for an attribute: e.g., "female users" is a group of users who share a value for gender, and "young students in CA" is a group of users who share age, occupation and location values. The goal of our exploration is to find "interesting" user groups.
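As a minimal illustration of this notion of user group (our own toy code, not the algorithms of the references below), groups can be enumerated by hashing users on combinations of (attribute, value) pairs:

```python
from collections import defaultdict
from itertools import combinations

def user_groups(users, min_size=2):
    """Enumerate user groups: sets of users sharing values on one or
    more demographic attributes (a simple first-level enumeration).

    users: {user_id: {attribute: value}}
    Returns {((attr, value), ...): set_of_user_ids} for groups with
    at least min_size members.
    """
    groups = defaultdict(set)
    for uid, attrs in users.items():
        items = sorted(attrs.items())  # canonical order for group labels
        # every non-empty combination of this user's (attribute, value) pairs
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                groups[combo].add(uid)
    return {label: members for label, members in groups.items()
            if len(members) >= min_size}

groups = user_groups({
    "u1": {"gender": "F", "occupation": "student"},
    "u2": {"gender": "F", "occupation": "student"},
    "u3": {"gender": "M", "occupation": "student"},
})
# e.g. the "female students" group contains u1 and u2
```

The exponential number of attribute combinations is precisely why the exploration problems below call for approximation algorithms rather than exhaustive enumeration.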
We formulate two exploration problems (see references 1 and 2):
 identifying groups under conditions on their labels
 identifying pairs of groups under conditions on their labels
An instance of the first problem would find groups whose ratings are high or low, or whose tags are semantically consistent or not. An instance of the second problem would find pairs of groups who rate or tag items similarly or differently.
The problems defined above are hard ones and require the development of approximation algorithms. The candidate will be asked to formalize those problems and implement algorithms. We will be using real social datasets such as Last.fm, BookCrossing, and MovieLens.
References
1) M. Das, S. Amer-Yahia, G. Das, C. Yu: MRI: Meaningful Interpretations of Collaborative Ratings. PVLDB 5(11): 1063-1074 (2011)
2) M. Das, S. Thirumuruganathan, S. Amer-Yahia, G. Das, C. Yu: Who Tags What? An Analysis Framework. PVLDB 5(11): 1567-1578 (2012)
3) Last.fm: http://www.last.fm/api
4) BookCrossing: http://www.informatik.unifreiburg.de/~cziegler/BX/
5) MovieLens: http://www.grouplens.org/
6) Hochbaum, Approximation Algorithms for NP-hard Problems, 1996 (first edition). Detailed subject URL: http://projetsmastermi.imag.fr/pcarre/CDExportProjetHtml?idSujet=1643
Notes: co-supervised internship; paid internship.





SM20715 Integrating offsets into network calculus for avionics embedded systems


Description


Critical embedded systems (a.k.a. cyber-physical systems) are made of calculators communicating through a shared network. The real-time behavior of the global system depends on the behavior of each subsystem and on that of the network. The functions must be performed within a guaranteed time, and this constraint applies to the network as well as to the calculators.
Network calculus is a generic formal method, based on the (min,+) dioid, designed to compute bounds on delays and memory usage. It has been used to certify the A380 backbone.
The objective of this internship is to study how to handle offsets between flows, which is necessary to analyse networks such as TDMA or ARINC 825. The challenge is to build a model that is tight enough to compute the exact worst-case response time.
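For reference, the classical (offset-free) delay bound of network calculus [NC01] can be stated as follows; the token-bucket/rate-latency instance is the standard textbook example, not specific to the offsets studied here:

```latex
% Delay bound: horizontal deviation between the arrival curve \alpha
% and the service curve \beta.
h(\alpha, \beta) \;=\; \sup_{t \ge 0} \, \inf \{\, d \ge 0 : \alpha(t) \le \beta(t + d) \,\}
% For a token-bucket flow \alpha(t) = \sigma + \rho t served by a
% rate-latency node \beta(t) = R\,[t - T]^{+}, with \rho \le R:
d_{\max} \;=\; T + \frac{\sigma}{R}
```

Offsets between flows effectively shift the arrival curves relative to each other, which is why a naive analysis that ignores them over-approximates the aggregate arrivals and yields pessimistic bounds.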
Bibliography:
[NC01] J.-Y. Le Boudec and P. Thiran. Network Calculus, volume 2050 of LNCS. Springer Verlag, 2001. http://lrcwww.epfl.ch/PS_files/NetCal.htm
[RTNS12] Combining network calculus and scheduling theory to improve delay bounds. RTNS '12: Proceedings of the 20th International Conference on Real-Time and Network Systems.
Notes: Paid internship. PhD opportunity.





SM20716 Distributed verification of real-time systems


Description


Real-time systems have become ubiquitous in the past few years. Some of them (automated plane and unmanned systems control, banking systems, etc.) are critical in the sense that no error must occur. Testing these systems can possibly detect the presence of bugs, but cannot guarantee their absence. It is necessary to use formal methods such as model checking so as to formally prove the correctness of a system.
Real-time systems are characterized by a set of timing constants, such as the reading period of a sensor on an unmanned aircraft system, the traversal time of a circuit by the electric current, or the delay before retransmitting data in a cellphone. Although numerous techniques exist to verify a system for one set of constants, formally verifying the system for many values of these constants can require a very long time, or even an infinite time if one aims at verifying dense sets of values.
It is therefore interesting to reason in a parametric manner, by considering these constants as unknowns, i.e., parameters, and to synthesize a constraint on these parameters guaranteeing the system's correctness. A method, the inverse method, has been proposed for time Petri nets, a widely used formalism. Starting from a reference valuation of the parameters corresponding to a correct behavior, this method synthesizes a constraint on the parameters guaranteeing the same correct behavior. As a consequence, the system is guaranteed to be correct for any parameter valuation satisfying this constraint.
In order to take advantage of multicore processors and clusters, the verification algorithm shall be redefined and adapted to the distributed case. The goal is, for a processor with n cores, that the improved algorithm be (almost) n times faster than on a single-core processor, and similarly for clusters. It may be interesting to build on a modular approach proposed for timed Petri nets. The intern will also produce an implementation, so as to validate the proposed approach. Detailed subject URL: http://lipn.univparis13.fr/~andre/201311ENSLIMdistributedfr.pdf
Notes: Co-supervised with Laure Petrucci. Paid internship.





SM20717 Adaptation of a method for the analysis of attributed dynamic graphs, and its evaluation for describing Vélo'V system usage


Description


The focus of this internship is the development and study of methods for the analysis of complex systems seen as attributed dynamic graphs. More precisely, it will consist in the study of a particular complex system, the bicycle sharing system of Lyon called Vélo'V. The objective is to analyze the use of Vélo'V from a large body of data including all trips over several years, the socio-demographic characteristics of stations and related information on system users. These data can be structured as a dynamic graph whose vertices represent stations and whose edges represent the flow of movements during a time interval. Numeric attributes also describe the vertices and the edges. The internship goal will be to disclose the characteristics of Vélo'V usage at certain times of the week according to socio-economic characteristics of the station areas. For this, the usage characteristics encoded by the graph topology will be considered in conjunction with the values of the attributes of vertices and edges.
Objectives and methodology: We recently developed a method to analyze static attributed graphs [1], which finds strong covariations between vertex descriptors, these being either vertex attributes or measures derived from the graph topology. This method gave good results on benchmark graphs, and we want to evaluate it on the Vélo'V graph. The objective is thus to provide, test and explore new topological descriptors of the flow encoded by the edges of the graph and of the dynamics of the graph, so as to reflect the relations between attribute-value changes and topological changes of the graph. Subsequently, the strong covariations will be used in a predictive model of bicycle sharing systems.
[1] Mining Graph Topological Patterns: Finding Covariations among Vertex Descriptors. A. Bechara Prado, M. Plantevit, C. Robardet, J.-F. Boulicaut. IEEE Transactions on Knowledge and Data Engineering 25(9):2090-2104, IEEE, ISSN 1041434. 2013.
Notes: Co-supervised with Marc Plantevit





SM20718 Computation of weighted cliques in graphs


Description


The focus of this internship is to develop and study methods for the analysis of complex systems seen as dynamic attributed graphs. More specifically, it consists in studying the computation of weighted cliques in dynamic graphs whose edges are weighted by several measures. The intended application of this work is the analysis of the bicycle sharing system of Lyon called Vélo'V, for which we have a large body of data including all trips over several years, as well as several features that describe the stations and the users. These data can be structured in a dynamic graph whose vertices represent stations and whose edges represent the movements as flows during a time interval. Numeric attributes also describe vertices and edges. The aim of this master thesis is to disclose characteristic behaviours in Vélo'V usage based on these attributes.
Objectives and method: The extraction of cliques in graphs is an essential tool to highlight strong associations between the entities described by the graph. The main challenge of these approaches is to be able to deal with large graphs and to control the computational complexity associated with the enumeration of all such cliques. In addition, when several cliques are extracted from the same graph, we want them to be as diverse as possible, so that the result is complete, representative of the entire data, and significant. In a recent paper [1] this problem has been formulated as an optimisation problem shown to be NP-hard, but which can be approximated to within a constant factor thanks to its submodularity property. The idea of this master thesis is to explore and extend this approach to other forms of weighted cliques and to assess their relevance to the study of the dynamic attributed Vélo'V graph.
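To make the submodularity argument concrete, here is a toy greedy selector; this is our own simplification, not the exact objective of [1]. The score "total weight + α · number of vertices covered" is monotone submodular, so greedy selection of k cliques carries the classical (1 - 1/e) approximation guarantee.

```python
def select_diverse_cliques(cliques, k, alpha=1.0):
    """Greedily pick k cliques, trading off weight against diversity.

    cliques: list of (frozenset_of_vertices, weight)
    The marginal gain of a clique is its weight plus alpha times the
    number of vertices it adds to the coverage, so overlapping cliques
    are penalized on later rounds.
    """
    selected, covered = [], set()
    remaining = list(cliques)
    for _ in range(min(k, len(remaining))):
        def gain(c):
            verts, weight = c
            return weight + alpha * len(verts - covered)  # marginal gain
        best = max(remaining, key=gain)
        remaining.remove(best)
        selected.append(best)
        covered |= best[0]
    return selected

chosen = select_diverse_cliques(
    [(frozenset({1, 2, 3}), 5.0),
     (frozenset({1, 2}), 4.0),
     (frozenset({4, 5}), 3.0)],
    k=2)
# the heavy clique {1,2,3} is picked first; on the second round the
# diversity term lets the disjoint {4,5} beat the overlapping {1,2}
```

The research question of the internship is how such objectives extend to edges carrying several weight measures and to graphs evolving over time.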
[1] As Strong as the Weakest Link: Mining Diverse Cliques in Weighted Graphs. Petko Bogdanov, Ben Baumer, Prithwish Basu, Amotz Bar-Noy, and Ambuj K. Singh. ECML/PKDD (1), volume 8188 of Lecture Notes in Computer Science, pages 525-540. Springer, 2013.
Notes: Co-supervised with Marc Plantevit





SM20719 Matching large graphs


Description


Graph data have become large, and manipulating them based on a similarity measure is essential for many applications. Finding an exact matching of two graphs is NP-hard and cannot be done even for medium-sized graphs. Several approximations have been proposed in the literature, but they do not scale well to big graph data. The aim of this master project is to investigate how well existing solutions scale with big graphs and to implement a new approach for graph comparison.
Notes:





SM20720 Resilient and energy-aware scheduling algorithms for large-scale distributed systems


Description


Advisors: Anne Benoit, ENS Lyon and IUF (Anne.Benoitlyon.fr); Yves Robert, ENS Lyon and IUF (Yves.Robertlyon.fr)
(1) Motivation
Largescale distributed systems correspond to wide range of platforms, such as highperformance supercomputers, sensor networks, volunteer computing grids, etc. All these systems have a very large scale, and include millions of components. Such a large scale induces two major problems: resilience and energyconsumption. Resilience is (loosely) defined as surviving to failures. For instance, a failure will occur every 50 minutes in a system with one million components, even if the MTBF (mean time between failures) of a single component is as large as 100 years. Obviously, failure handling is critical for highly parallel applications that use a large number of components for a significant amount of time, because such applications are likely to experience at least one failure during execution. But failure handling cannot be ignored for other applications.
Failures are usually handled by adding redundancy, either continuously (replication) or at periodic intervals (migration from a faulty node to a spare node, rollback and recovery). In the latter case, the state of an application must be preserved (checkpointing), and the system must roll back to the last saved checkpoint. However, the amount of replication and/or the frequency of checkpointing must be optimized carefully: checkpointing too frequently degrades performance, but the application is at risk between two checkpoints, and the longer the checkpoint interval, the larger the penalty paid for re-executing work after a failure.
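The checkpoint-interval trade-off described above has a classical first-order answer, the Young/Daly formula. The sketch below (with purely illustrative numbers, not from this proposal) computes it, assuming the checkpoint cost is much smaller than the platform MTBF:

```python
import math

def young_daly_period(checkpoint_cost, mtbf):
    """First-order optimal compute time between two checkpoints
    (Young 1974 / Daly 2006): W ~ sqrt(2 * C * MTBF), valid when the
    checkpoint cost C is much smaller than the platform MTBF."""
    return math.sqrt(2 * checkpoint_cost * mtbf)

# Illustrative numbers: 10-minute checkpoints on a platform with a 1-day
# MTBF. (One million components with a 100-year individual MTBF give a
# platform MTBF under an hour, calling for a much shorter period.)
period = young_daly_period(checkpoint_cost=600, mtbf=86400)
print(f"checkpoint every {period / 60:.0f} minutes")
```

This is only the starting point: the internship's multi-criteria setting (energy, DVFS, task graphs) changes both the failure rate and the cost model, so the optimal period is no longer given by this closed form.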
Large-scale distributed systems face a second important challenge: power consumption. Power management is necessary due to both monetary and environmental constraints. Presently, large computing centers are among the largest consumers of energy. Energy is needed to power the individual cores and also to cool the system. In future distributed systems, it is anticipated that the power dissipated to perform communications and I/O transfers will make up a much larger share of the overall power consumption. In fact, the relative cost of communication is expected to increase dramatically, both in terms of latency/overhead and of consumed energy. Dynamic voltage and frequency scaling (DVFS) is a widely used technique to decrease energy consumption, but it can severely degrade performance and increase execution time.
Redesigning scheduling algorithms to ensure resilience and to minimize energy consumption is a difficult proposition, especially because higher resilience often calls for redundant computations and/or redundant communications, which in turn consume extra energy. In addition, reducing the energy consumption with DVFS increases execution time, and hence the expected number of failures during execution.
The main goal of this M2 training period is to investigate trade-offs between performance, resilience, and energy consumption. This is a challenging but unavoidable multi-criteria optimization problem, whose solution is critical for many applications and large-scale systems.
(2) Technical setting
Initial studies will focus on task graph scheduling and checkpointing, starting with an application task graph deployed on a large-scale platform. In terms of traditional scheduling, this corresponds to scheduling with unlimited resources and has polynomial complexity. With resilience in play, what is the complexity of the problem? Which tasks should we checkpoint?
Then, we will consider several applications running on the platform, which means that tasks are already assigned to processors. The same problems should be addressed. Subsequent studies will include another technique (replication) and another constraint (management of a critical resource, such as I/O bandwidth or main memory capacity).
(3) Background
We have extensive expertise in the area of parallel algorithms, scheduling techniques, checkpointing strategies and multi-criteria optimization problems. Relevant literature will be provided to the student for assimilation and synthesis during the first part of the training period.
(4) Required Skills
Algorithm design, complexity, probabilities
(5) Comments
Start and end dates of the training period are flexible.
Continuation into a PhD program is expected. The training period takes place at École Normale Supérieure de Lyon, within the LIP laboratory (CNRS, ENS Lyon, INRIA and Université Claude Bernard de Lyon). The hosting team is ROMA (see http://www.enslyon.fr/LIP/ROMA/index.php).
URL sujet détaillé :
Remarques : Co-supervision with Anne Benoit, ENS Lyon and IUF, Anne.Benoitlyon.fr





SM20721 Compilation and diagnosis for discrete controller synthesis


Description


Discrete Controller Synthesis (DCS) is a formal method which operates on a nondeterministic program containing controllable variables, and a behavioral property a priori not verified by this program. From this, DCS computes a controller, which, composed with the original program, will enforce the behavioral property on it.
BZR is a reactive language belonging to the synchronous languages family. It is based on dataflow and automata styles, and its main feature is to include discrete controller synthesis within its compilation. This feature makes it possible to describe programs with an imperative part, which describes the possible behaviors, and a declarative part, which specifies the behavioral constraints to be respected.
The latter is a behavioral contract mechanism, where assumptions can be described, as well as an "enforce" property part: its semantics is that the property should be enforced by controlling the behaviour of the node equipped with the contract. This property will be enforced by an automatically built controller, which will act on free controllable variables given by the programmer.
This Master's project will focus on the problem of diagnosis for BZR programs, which is made special by the fact that discrete controller synthesis can find a solution when one exists, but it is not easy to precisely diagnose the cases where no solution can be found. This will be studied in relation with related work on declarative and constraint languages; diagnosis techniques will be defined and integrated with the imperative aspects of the language. URL sujet détaillé : http://popart.inrialpes.fr/people/delaval/positions/sujetsdcdiagnosis.php
Remarques : Co-supervision with Éric Rutten (Inria Grenoble Rhône-Alpes)





SM20722 Abstraction methods for compilation using discrete controller synthesis


Description


Discrete Controller Synthesis (DCS) is a formal method which operates on a nondeterministic program containing controllable variables, and a behavioral property a priori not verified by this program. From this, DCS computes a controller, which, composed with the original program, will enforce the behavioral property on it.
BZR is a reactive language belonging to the synchronous languages family. It is based on dataflow and automata styles, and its main feature is to include discrete controller synthesis within its compilation. This feature makes it possible to describe programs with an imperative part, which describes the possible behaviors, and a declarative part, which specifies the behavioral constraints to be respected.
The latter is a behavioral contract mechanism, where assumptions can be described, as well as an "enforce" property part: its semantics is that the property should be enforced by controlling the behaviour of the node equipped with the contract. This property will be enforced by an automatically built controller, which will act on free controllable variables given by the programmer.
This Master's project will focus on programming and modelling methodologies for obtaining logical (Boolean) abstractions that can improve the use of discrete controller synthesis. This is needed, for example, to program controllers for systems where the data useful for control can be of arbitrary types (integer, real, ...), or for systems that are naturally distributed and require a decentralized controller. On the basis of our previous experiments on this topic, the work consists in generalizing manual techniques and defining methods to construct these abstractions. URL sujet détaillé : http://popart.inrialpes.fr/people/delaval/positions/sujetsdcabstraction.php
Remarques : Co-supervision with Éric Rutten (Inria Grenoble Rhône-Alpes)





SM20723 Control of Smart Environments and Probabilistic Models


Description


This work proposal focuses on extending the existing discrete controller synthesis methodology, in an environment described by behavioral contracts, with the introduction of probabilistic transitions in the resulting synchronized automata. The application framework for this research is reactive and adaptive environments. URL sujet détaillé : http://popart.inrialpes.fr/people/delaval/positions/sujetmasterpersyval.html
Remarques : Co-supervision with Stéphane Mocanu (Gipsa-lab) and Éric Rutten (Inria Grenoble Rhône-Alpes)





SM20724 Cellular automata and R. Thompson's group F


Description


Richard Thompson's group F has several representations: it is generated by two generators subject to two relations; it is the group of piecewise-linear homeomorphisms of [0,1] whose slopes are powers of 2 and whose breakpoints are dyadic rationals; it can be described by rectangular diagrams, by pairs of rooted binary trees, or by automorphisms of the free magma on one generator (see [1] for a survey). It is an important example in group theory for several reasons. It was originally used by McKenzie and Thompson to build examples of finitely presented groups with undecidable word problem [2]. It also gave the first example of an infinite simple yet finitely presented group. The group is still the subject of much research, including the question of its amenability, a problem that remains open and divides the community.
We propose here to approach this problem through the cellular automata (CA) defined on the group F. Indeed, the amenability of a group can be expressed through properties of its CA. The first characterization appeared in [3]: a group is amenable if and only if every surjective CA is pre-injective and preserves the product measure on configurations. This was completed in [4], where it is further shown that a group is amenable if and only if every surjective CA is balanced, or again if and only if every surjective CA is non-wandering.
One question addressed during the internship could be to study CA on the group F. Another question that may be considered is the following: the previous results show that all amenable groups satisfy Myhill's theorem (pre-injective implies surjective), but are they the only ones? Finally, one may look at variations around this group, such as Monod's group [5] or its subgroups [6].
Prerequisites:
Good knowledge of CA on Z^d and of the classical results.
Internship objectives:
- Become familiar with R. Thompson's group F and its various definitions. - What can be said about pre-injective CA on the group F? About surjective CA? - Is Myhill's theorem true only on amenable groups? On the group F?
[1] J. W. Cannon, W. J. Floyd and W. R. Parry: Introductory notes on Richard Thompson's groups. Enseign. Math. (2), 42(3-4):215–256, 1996.
[2] Ralph McKenzie and Richard J. Thompson: An elementary construction of unsolvable word problems in group theory. In Word problems: decision problems and the Burnside problem in group theory (Conf., Univ. California, Irvine, Calif., 1969; dedicated to Hanna Neumann), volume 71 of Studies in Logic and the Foundations of Math., pages 457–478. North-Holland, Amsterdam, 1973.
[3] Laurent Bartholdi: Gardens of Eden and amenability on cellular automata. J. Eur. Math. Soc. (JEMS), 12(1):241–248, 2010.
[4] Silvio Capobianco, Pierre Guillon and Jarkko Kari: Surjective cellular automata far from the Garden of Eden. Discrete Mathematics & Theoretical Computer Science, 15(3), 2013.
[5] Kate Juschenko and Nicolas Monod: Cantor systems, piecewise translations and simple amenable groups. Ann. of Math. (2), 178(2):775–787, 2013.
[6] Y. Lodha and J. Tatch Moore: A geometric solution to the von Neumann-Day problem for finitely presented groups. ArXiv e-prints, August 2013.
[7] Tullio Ceccherini-Silberstein and Michel Coornaert: Cellular automata and groups. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2010. URL sujet détaillé : http://perso.enslyon.fr/nathalie.aubrun/stage_M2_2014.pdf
Remarques : This internship will be co-supervised by Étienne Ghys of UMPA (etienne.ghyslyon.fr).





SM20725 Optimal combination of data consistency protocols depending on the application and the execution platform.


Description


Many-core processors are made of hundreds to thousands of cores embedded on a single chip and interconnected by a dedicated network. On-chip memory can be spread among the cores and possibly shared within clusters. Concurrent accesses to shared data are managed thanks to consistency protocols and mechanisms that offer guarantees on data freshness. Within our laboratory, work has started on designing and building an experimental platform that makes it possible to use several data consistency protocols dynamically. These protocols have to be chosen and adapted at compilation time, depending on the application and the targeted hardware. This internship aims at designing a decision tool that helps in choosing the right consistency protocol and its associated parameters, based on the execution context. Several research directions can be explored. A first direction is to characterize different existing consistency protocols with respect to cache use rate and network communications, based on a high-level representation of the memory topology. A second direction is to find optimal protocol combinations that can be deployed on the chip, using operational research algorithms. Finally, this work will be evaluated with a prototype running on a many-core testbed such as the Adapteva Parallella. URL sujet détaillé :
Remarques :





SM20726 Volume mesh generation for physical simulation


Description


The M2DisCo and SAARA research teams of the LIRIS laboratory are currently working, within the GenSim project, on volume mesh generation adapted to real-time mechanical simulations.
The main goal is to obtain an automatic volume mesh generator capable of integrating the estimated error of the last few simulations, while taking into account other constraints such as the maximum number of volume elements. The produced mesh should have elements adapted to the type of simulation (e.g. mass-spring, mass-tensor or finite elements) and should take into account the mechanical law of each element (e.g. linear or non-linear).
Main objectives: - Adaptive volume remeshing to improve the current volume mesh quality while respecting some constraints - Volume mesh generation to get a good initial volume mesh
Development takes place within the MEPP platform in C++; good practice of object-oriented and generic programming is needed. URL sujet détaillé : http://liris.cnrs.fr/gensim/sujetstage.html
Remarques : Internship co-supervised by Fabrice Jaillet, LIRIS researcher in the field of real-time biomechanical simulation.
Remuneration: about €436 per month





SM20727 Community detection in multiplex networks


Description


In this project we are interested in the problem of community detection in multiplex complex networks. In the area of complex network analysis, the community detection problem refers to partitioning a graph into a set of loosely connected dense subgraphs. The problem is known to be NP-hard. A lot of heuristics-based algorithms have been proposed to handle this problem for simple graphs. However, in real-world settings, interaction networks are mostly represented by multiplex graphs: multi-slice graphs where each slice has the same set of nodes but different slices contain different sets of links.
Few approaches have addressed this problem for multiplex networks. Most existing approaches consist in redefining the problem in terms of community detection in simple graphs. This can be done either by aggregating the different slices into one, or by applying classical community detection algorithms on each layer and then merging the obtained structures.
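A minimal sketch of the first (aggregation) strategy, in plain Python with made-up toy data; connected components stand in here for a real community detection algorithm:

```python
from collections import defaultdict

# Layer-aggregation baseline for multiplex community detection: collapse
# all slices into one weighted graph (edge weight = number of slices
# containing the edge), then run any single-graph method on the result.
def aggregate(layers):
    weights = defaultdict(int)
    for edges in layers:
        for u, v in edges:
            weights[frozenset((u, v))] += 1
    return weights

def components(weights):
    """Stand-in for a community detection step: connected components."""
    adj = defaultdict(set)
    for e in weights:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in list(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

layers = [[(0, 1), (1, 2), (3, 4)], [(0, 1), (0, 2), (3, 4), (4, 5)]]
print(sorted(map(sorted, components(aggregate(layers)))))
```

The sketch also makes the known drawback visible: once slices are collapsed, all layer-specific structure is lost, which is precisely what the approaches studied in this internship aim to preserve.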
In this work we are interested in exploring new approaches that take the multiplex nature of the target networks into account throughout the whole process. Seed-centric approaches are of particular interest. URL sujet détaillé : http://lipn.univparis13.fr/~kanawat/Mastercommultiplex.pdf
Remarques : Internship allowance provided.





SM20728 Co-construction of meaning in a complex socio-technical system


Description


The proposed research will be undertaken in the context of the CoSiSocioTechS project. This project aims to investigate and develop the foundations of controllable and intelligible self-organisation as an operating basis for decentralized and autonomic complex socio-technical systems. We consider a socio-technical system as a complex system composed of humans and heterogeneous technical components that interact in volatile, dynamic, and open environments. For instance, one can consider smart environments like smart cities, smart buildings or smart homes, where humans interact and co-evolve with devices, software applications or robots, in order to perform different kinds of activities and achieve different sorts of objectives. In the context of this project, we will consider an area of smart homes with its inhabitants as a socio-technical system and consider the issue of sustainable energy management, which we address in a decentralized way with a human-in-the-loop approach that ensures intelligibility and controllability. Intelligibility of such socio-technical systems means that we need to bridge the "semantic gap" between the human and the autonomic components of a socio-technical system, through symbol grounding and concept-alignment techniques for the co-construction of a human-machine communication language. Efficient user communication, between autonomic components and human actors, is essential for the success of any socio-technical system. On the one hand, it enables users to express their high-level objectives, policies and preferences, which the technical system must follow. On the other hand, it enables the autonomic system to justify its behaviour, provide an evaluation of its goal achievements, explain the causes of possible failures and suggest alternative actions. Such exchanges must be both natural and comprehensible for the user, and sufficiently clear and precise for the technical system.
The proposed research is an instantiation of these scientific questions in the framework of a developmental learning approach for Social Ambient Intelligence. The main objective is to contribute to this research agenda by addressing the following objectives: - Translate software agents' internal representations and patterns into concepts that are intelligible to human agents. This will be achieved using reinforcement learning techniques and adaptation algorithms applied to databases of initial concepts from the knowledge domain. - Define mechanisms for the co-construction of meaning, by interaction and composition of existing concepts and internal representations, in order to produce new intelligible concepts. - Use incentive mechanisms to guide and boost the learning of new concepts. This work is part of a collaboration with the UBIANT company on the topic "Social Ambient Intelligence, in the context of smart home and energy consumption self-management, using a constructivist approach".
A PhD Funding (CIFRE) is offered for a successful candidate on this research work.
URL sujet détaillé :
Remarques : Co-supervision: - Frédéric Armetta (LIRIS-CNRS, UCB Lyon 1)
- Olivier Lefevre (Ubiant)
This internship will be paid by the UBIANT company.





SM20729 Behavioural equivalences, coinduction and higherorder calculi




SM20730 Study of human posture and locomotion in a rehabilitation context


Description


Context: This topic is part of a global project on the analysis and modelling of human gesture, with an application of augmented reality to rehabilitation.
Objectives: In this work we propose to focus on the tracking, analysis, and simulation of a movement, more specifically in the context of physical rehabilitation, in order to bring the subject towards more independence and autonomy in daily life. This includes: - Support for rehabilitation, for instance through the restitution (reconstruction) of the real movement versus the simulation of a target movement (following an accident, or an operation requiring rehabilitation, ...). - Improvement of the posture/gesture through augmented-reality visualization and the use of metaphors: simulation of the displacement/trajectory of a virtual ball to follow or catch.
Work plan: The candidate will first have to become familiar with the techniques used in rehabilitation practice, in order to better understand the problem and its stakes. The work will then focus on the (real-time) processing of the information produced by video data acquisition. We will thus be interested in several aspects of locomotion. - The posture will be characterized at each phase of the study from a kinematic point of view. To improve the precision of camera-based capture, an inverse kinematics phase will probably have to be considered, taking into account the exact biomechanics of the pelvis and legs. Indeed, the configuration of bones, muscles, cartilage and tendons favors certain postures, which the capture phase will have to take into account. - The dynamics of the movement will then be exploited in order to follow the evolution of the biomechanical parameters over the course of a treatment, that is, over several consecutive sessions.
Keywords: vision, motion analysis and tracking, activity recognition, occupational therapy, virtual and/or augmented reality
URL sujet détaillé :
Remarques :





SM20731 Performance of wireless substitution networks


Description


Wireless substitution networks are envisioned to come to the rescue of a wired or wireless network that is down or overloaded. They can be made of mobile entities equipped with wireless communication interfaces. The mobile entities are controlled, and it is possible to position them at well-chosen locations. The placement of these entities is a critical step with a strong impact on the performance of the applications carried by these networks. Indeed, on the one hand, the wireless medium used for communication is a shared medium, highly error-prone, whose performance strongly depends on the environment (relative positions of the communicating entities, surrounding context, etc.). On the other hand, these networks often have a multi-hop wireless topology, which makes end-to-end data delivery more complex. To obtain good performance with these networks, it is therefore important to know where to position the mobile entities. Some existing works have studied the performance of multi-hop wireless networks with a chain topology, but they very often rely on simplifying assumptions, such as an ideal physical layer or ignoring some MAC mechanisms. We recently proposed a more realistic model to evaluate the performance of chain networks with 3 or 4 hops [1]. The results obtained with this model are very promising, and we would like to generalize and exploit it. The goals of this internship are: - to extend the model we proposed to chain configurations with k hops carrying traffic in both directions of the chain; - to compare the developed model with Bianchi's model, which is very popular for wireless LANs [2]; - to use the model to determine a good placement of the wireless entities in the substitution network.
Knowledge of networking, and more specifically of wireless networks, is important. Familiarity with the NS-2 simulator is a plus.
Bibliography:
[1] T. Abreu, B. Baynat, T. Begin and I. Guérin Lassous: Hierarchical Modeling of IEEE 802.11 Multi-hop Wireless Networks. Accepted to ACM MSWiM 2013.
[2] Giuseppe Bianchi: Performance analysis of the IEEE 802.11 distributed coordination function. IEEE Journal on Selected Areas in Communications, 18(3):535-547, 2000. URL sujet détaillé :
Remarques : Isabelle Guérin Lassous (http://perso.enslyon.fr/isabelle.guerinlassous/)





SM20732 Detection of events and/or anomalies in graph dynamics


Description


Detecting frauds, failures, errors and/or attacks is a challenge in many contexts. Available data often has the form of a dynamic graph: links between entities that evolve over time. Typical examples include communications between machines in a network, financial transactions, and software execution traces.
Current methods make poor use of this information, for lack of techniques to deal with graph dynamics and to detect events in such dynamics. We propose here to work on the definition of appropriate concepts to help analyze real-world data like the ones cited above.
URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/evenements.pdf
Remarques :





SM20733 A radar for the Internet


Description


From a machine on the Internet (a source), it is possible to know the path followed by the information leaving this machine towards any other machine on the Internet (a destination). This approach is widely used to explore the topology of the Internet (machines and the links between them). We propose here a new point of view: we will see each execution of traceroute as a probe which, used periodically, will allow us to build a radar for the Internet. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/radar.pdf
Remarques :





SM20734 Modelling real-world graphs


Description


In many contexts, one encounters large graphs with no apparent simple structure, which we will call real-world graphs (as opposed to graphs explicitly built from a model or a theory). Examples include the topology of the Internet (routers and the cables between them), web graphs (web pages and the hyperlinks between them), various kinds of exchanges (peer-to-peer, e-mail, etc.), but also social, biological or linguistic networks.
It has recently become apparent that most of these large graphs share common statistical properties, and that these properties strongly distinguish them from the random graphs used until then to model them. This internship takes place in this context, exploring new modelling approaches capable of capturing the properties observed in practice. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/radar.pdf
Remarques :





SM20735 Building Proofs without Proof Languages


Description


This internship will investigate the formal links between deep inference proofs in the calculus of structures and formal proofs in natural deduction or the sequent calculus, as used in interactive theorem provers such as Coq and Isabelle.
Specifically, we have been building an interactive proof construction tool called Profound (see detailed description) that can be used to build proofs by direct manipulation of formulas instead of indirectly through a proof language. Nevertheless, it retains all the benefits of formal proofs, most importantly the guarantee of correctness.
Internally, these deep inference proofs are represented in the form of a rewrite system built on formula zippers, with a small number of algebraic properties. The underlying calculus has a pen-and-paper proof of correctness, but these rewrite systems have yet to be formally shown to be structurally correct. This will be the main task of the internship candidate.
Besides the theoretical background, there is also scope in this project for practical tool-building work with immediate application to existing interactive theorem proving systems. See the detailed description. URL sujet détaillé : http://www.lix.polytechnique.fr/~kaustuv/stages/pwopl/
Remarques :





SM20736 Graph dynamics


Description


Graphs (sets of nodes and of links between them) are among the most classical objects in computer science. They model objects as diverse as computer networks, protein interactions, or social relations.
Yet in most cases the objects they model are dynamic: links and/or nodes appear and/or disappear over time. Graph theory remains today largely unable to model these dynamics.
The goal of this internship is to study graph dynamics, starting from various concrete cases, in order to introduce notions and concepts capable of describing such dynamics. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/dynamique.pdf
Remarques :










SM20738 Proof Systems for XPath


Description


XPath is a widely employed language for querying XML data. One of the main computational problems associated with XPath is _satisfiability_: given an XPath query (and optionally a schema for the kind of XML documents we want to consider), does there exist an XML document on which this query would select some node? The problem has been studied in depth for many fragments of XPath with data, with complexities ranging from ExpSpace to Ackermannian to undecidable.
An issue with the results on the complexity of satisfiability is that each fragment has its own decidability proof, using either some form of data tree automaton or "model massaging" techniques. The objective of the internship is to develop more syntactic approaches using proof systems, with the long-term objective of unifying the decidability proofs in a single proof system. URL sujet détaillé :
Remarques :





SM20739 Algorithms and data structures for community cores


Description


In many application contexts, one encounters large graphs. Examples include online or real-world social networks, graphs from the web (web pages and the hyperlinks between them), and peer-to-peer exchanges.
These graphs are generally structured in groups of strongly interconnected vertices, called communities, with few links towards the vertices of other groups. Inside these communities, some subgroups are particularly strongly linked from this point of view.
We have carried out initial work on these community cores, which this internship aims to explore in more depth. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/cores_sujet.pdf
Remarques :





SM20740 Analysis and Extrapolation of CPU processing Rate for the Simulation of Parallel Applications


Description


Analyzing and understanding the performance behavior of parallel applications on various compute infrastructures is a long-standing concern in the High Performance Computing community. When the targeted execution environments are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In offline simulation, a trace of a previous execution of the application is "replayed" on a simulated platform. While the simulation of communications is a well-covered area, that of computations usually remains rather simple. The main objective of this internship is to design an efficient way to calibrate the CPU processing rate for the offline simulation of MPI applications. To achieve this, the candidate will have to study an existing analysis of the distribution of the processing rate versus the number of instructions computed by each action in the trace. The candidate will then have to:
+ Acquire execution traces on the Grid'5000 experimental platform. Some traces are already available, but more might be needed;
+ Conduct a thorough statistical analysis of the distribution of the processing rates. The R language is likely to be used at this step;
+ Propose a way to extrapolate information gathered on calibration instances to target, i.e., large, instances of the studied application(s);
+ Conduct a validation study of the proposed approach by comparing simulation results obtained with the SimGrid toolkit to actual experiments.
To favor an Open Science approach, the candidate will be encouraged to document and make available all the processes and data used during the internship. This may be eased by tools such as the Org mode of the Emacs editor. URL sujet détaillé : http://graal.enslyon.fr/~fsuter/extrapolation.pdf
Remarques :





SM20741 The pancake flipping problem applied to strings and 2D permutations


Description


The pancake flipping problem (also called, in more serious terms, the "prefix reversal problem") is defined as follows: given a permutation P on [1..n], sort it to the identity permutation I = 1 2 3 ... n by a minimum number of prefix reversals (a prefix reversal of length k consists in modifying P = P_1 P_2 ... P_k P_{k+1} ... P_n into P' = P_k P_{k-1} ... P_1 P_{k+1} ... P_n). This number is called the prefix reversal distance of P, or prd(P).
The pancake flipping problem has been extensively studied in the past, notably in terms of diameter (what is the greatest value that prd(P) can achieve, and for which P?). We also recently answered a long-standing open question by proving that the problem of determining prd(P), given P, is NP-hard [Bulteau, Fertin, Rusu, MFCS 2012].
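The prefix reversal operation defined above, together with the classical greedy strategy (bring the largest unplaced element to the front, then flip it into place), can be sketched as follows. Note that the greedy strategy uses at most 2(n-1) flips, which is only an upper bound on prd(P), not the optimum:

```python
# Sketch of prefix reversals and the classical greedy pancake sort.

def prefix_reversal(p, k):
    """Reverse the first k elements of the sequence p."""
    return p[:k][::-1] + p[k:]

def greedy_pancake_sort(p):
    """Return a list of flip lengths sorting p to the identity
    (at most 2(n-1) flips; not necessarily prd(p) flips)."""
    p = list(p)
    flips = []
    for size in range(len(p), 1, -1):
        i = p.index(size)            # position of the largest unplaced value
        if i == size - 1:
            continue                 # already in place
        if i != 0:
            flips.append(i + 1)      # bring it to the front...
            p = prefix_reversal(p, i + 1)
        flips.append(size)           # ...then flip it into position
        p = prefix_reversal(p, size)
    return flips

p = [3, 1, 4, 2]
for k in greedy_pancake_sort(p):
    p = prefix_reversal(p, k)
# p is now the identity permutation
```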
Here, we want to extend the definition of pancake flipping in 2 directions:
1) Pancake flipping on strings: P is no longer a permutation, since a letter can occur several times. In this context, we will focus on constant-sized alphabets. The questions to answer here are the following: a) determine the complexity of the problem; b) provide constant-ratio approximation algorithms; c) compute the diameter, or give lower and upper bounds for it.
2) Pancake flipping in 2D: instead of dealing with permutations, here the input is an n x m matrix M. Several variants can be defined, depending on the contents of M, the definition of the identity matrix I, and how we define a prefix reversal. For each such variant, it could happen that some matrices cannot be transformed to I by prefix reversals. The first question is then to characterize the matrices that can. Then, for each such matrix, questions a), b) and c) from 1) above need to be investigated.
The goal of the internship is to study all the above questions, and give answers to as many as possible. Extensions 1) and 2) described above are actually fairly new, thus many new results can be expected. URL sujet détaillé :
Remarques :





SM20742 Semantic Enrichment of Entities in a Large Scale Context


Description


With the exponential growth of data, discovering relevant information about a given topic becomes a difficult task, particularly when decision-makers have to quickly obtain relevant information to make the correct decisions. Thus, this internship aims at automatically building knowledge bases following an event (e.g., a natural disaster, or a cultural event such as a new painting exhibition). The building of such knowledge bases relies on the enrichment of entities potentially relevant for the observation or the study of a geographical area, of a museum collection, etc. The multiplication of data sources from the Linked Open Data Cloud (LOD) provides opportunities for integrating complementary or contradictory information. In addition, new types of data sources (e.g., blogs, Twitter) can also be exploited due to their frequent updates. A major problem for integrating these data sources on the fly lies in the large-scale aspect. The objective of this internship is the definition of a collaborative strategy for processing streams of semantic entities. The intuition is to rely on the "MapReduce" paradigm to pre-process the streaming entities. Expected results are a report about related work in distributed integration of semantic entities, an original proposition for the collaborative strategy, and an experimental validation of the proposition. URL sujet détaillé : http://liris.cnrs.fr/~fduchate/docs/SujetM2BDDuchateauLumineauanglais.pdf
Remarques : Co-supervised with Nicolas Lumineau http://liris.cnrs.fr/~nluminea/
Internship stipend





SM20743 Sparse Voxel Directed Acyclic Graph and fast skeletonization of discrete objects


Description


In many areas of computer graphics, the size of the objects to be represented or analyzed is increasingly problematic and raises many algorithmic issues. Here we are interested in the analysis of discrete objects defined as regions in 2D, 3D or higher-dimensional images. In medical or materials imaging, objects in 2048^3 volumetric images are common. In geometric modeling, we may have to handle binary volumes of 64k^3, or even 128k^3 (modeled worlds, Minecraft maps...). For such examples, data structures that are compact in memory and efficient to access are necessary.
Recently, a particularly interesting data structure has been proposed: the "Sparse Voxel DAG" [1]. The goal of the internship is twofold. First, it will consist in implementing and extending this structure in an open-source library (DGtal [2]) developed at the laboratory. This integration will notably allow a fine-grained evaluation of the performance of this structure against other optimized structures available in DGtal.
Second, it will consist in proposing a skeletonization algorithm for discrete volumetric objects that exploits the "Sparse Voxel DAG" structure. Indeed, this structure handles multiple occurrences of the same local voxel configuration optimally (in a multiresolution fashion). This specificity makes it relevant for skeletonization problems. The goal is thus to obtain an extremely fast skeletonization algorithm on large data volumes.
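The core idea behind the Sparse Voxel DAG [1], merging identical octree subtrees by hash-consing so that repeated local voxel configurations are stored only once, can be sketched on a tiny volume. The names and structure below are illustrative and not DGtal's API:

```python
# Sketch of Sparse Voxel DAG construction: build an octree over a
# binary volume, but give identical subtrees a single shared id
# (hash-consing), so repeated voxel configurations are stored once.

def build_dag(volume, x, y, z, size, nodes):
    """Return a canonical id for the subtree covering the cube of the
    given size whose lowest corner is (x, y, z)."""
    if size == 1:
        key = ('leaf', volume[x][y][z])
    else:
        h = size // 2
        children = tuple(
            build_dag(volume, x + dx, y + dy, z + dz, h, nodes)
            for dx in (0, h) for dy in (0, h) for dz in (0, h))
        key = ('node', children)
    if key not in nodes:          # hash-consing: reuse identical subtrees
        nodes[key] = len(nodes)
    return nodes[key]

# A 4x4x4 volume: solid lower half, empty upper half -> many identical
# subtrees, hence far fewer DAG nodes than plain octree nodes.
volume = [[[1 if x < 2 else 0 for z in range(4)] for y in range(4)]
          for x in range(4)]
nodes = {}
root = build_dag(volume, 0, 0, 0, 4, nodes)
octree_nodes = 1 + 8 + 64      # node count of the full octree for 4^3
dag_nodes = len(nodes)         # shared (DAG) representation
```

On this toy volume the DAG needs only 5 distinct nodes against 73 octree nodes; on real volumes with repetitive microstructures the compression is the whole point of the structure.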
The internship will take place in the context of several projects for which handling large real-world data is essential (volumetric images of snow microstructures, ANR digitalSnow; tomographic images of catalytic foams, PALSE Emergent). We may also use modeled worlds (Minecraft) to test the stability of the structure and of the developed algorithms.
The internship will take place at LIRIS, Bat. Nautibus, Lyon 1.
[1] "High-Resolution Sparse Voxel DAGs", Viktor Kämpe, Erik Sintorn, Ulf Assarsson, SIGGRAPH 2013. [2] http://libdgtal.org URL sujet détaillé :
Remarques : Internship stipend





SM20744 Stable and guaranteed reconstruction of 3D fibrous structures


Description


Skeletonization is a process aiming to represent a 3D object by a one-dimensional structure having certain properties with respect to the initial object (same topology, well centered in the shape, ...) [1]. The idea is then to use this skeleton to compute measurements on the 3D object and thus simplify the computations. In the literature, many approaches exist to extract such skeletons (thinning, distance functions, energy-based models...).
In the context of a project with chemists, we are interested in the analysis of 3D fibrous structures [2], [3]. Here the objective is a process that reconstructs the branches of these fibers as 1D structures, but above all a robust and stable reconstruction of the "nodes". One can then speak no longer of skeletonization but of geometric graph reconstruction. By stable, we mean being able to prove that the positions of the nodes and their degrees are correct, even on imperfect data. Recently, some work on the robust reconstruction of one-dimensional structures has been proposed in the setting of point clouds [4]. The goal of the internship will be to address this problem on discrete data (sets of voxels in volumetric images). In this context, we can hope both for fast extraction algorithms and for theoretical convergence bounds on the structures. This will also be an opportunity to connect homotopic skeletonization in digital geometry with geometric stability results from computational geometry.
The internship will take place at LIRIS, Bat. Nautibus, Lyon 1.
[1] http://liris.cnrs.fr/dgtal/wordpress/wpcontent/uploads/2010/10/Captured'écran20110106à21.39.21.png [2] http://liris.cnrs.fr/m2disco/ld/wordpress/wpcontent/uploads/2013/07/mousse_init.png [3] http://liris.cnrs.fr/m2disco/ld/wordpress/wpcontent/uploads/2013/07/mousse_surface.png [4] Aanjaneya, M., Chazal, F., Chen, D., Glisse, M., Guibas, L., & Morozov, D. (2012). Metric graph reconstruction from noisy data. International Journal of Computational Geometry & Applications, 22(04), 305-325. URL sujet détaillé :
Remarques : Internship stipend





SM20745 Granulometry and thickness function on GPU


Description


In the geometric analysis of shapes in 2D or 3D images, granulometry analysis (or the thickness function) has proved to be a very relevant tool. The underlying geometric problem is extremely simple, but designing efficient algorithms for it is not trivial. Optimizing the algorithms involved is all the more interesting as we want, for instance, to be able to use this granulometry on high-resolution volumetric images (2048^3 for 3D snow microstructure images).
In practice, it consists in building a distance function that associates to each point the radius of the largest disk (or ball in 3D) that contains this point and is contained in the shape. This computation naturally exploits the medial axis, or skeleton, of the shape. The granulometry is then obtained by a statistical analysis of these distances.
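A naive sketch of this thickness function on a tiny 2D binary image makes the definition concrete (brute force over all pixel pairs, with no medial-axis pruning; function names are illustrative):

```python
import math

# Naive 2D thickness function: each shape pixel gets the radius of the
# largest disk that contains it and is included in the shape.
# O(n^2) brute force, meant only to illustrate the definition; the
# internship targets an optimized GPU variant.

def thickness_map(img):
    """img: 2D 0/1 grid, img[x][y]. Return {(x, y): thickness}."""
    h, w = len(img), len(img[0])
    shape = [(x, y) for x in range(h) for y in range(w) if img[x][y]]
    bg = [(x, y) for x in range(h) for y in range(w) if not img[x][y]]
    # r(q): distance from q to the nearest background pixel; the open
    # disk of radius r(q) centered at q lies inside the shape
    r = {q: min(math.dist(q, b) for b in bg) for q in shape}
    # thickness(p): radius of the largest such disk covering p
    return {p: max(r[q] for q in shape if math.dist(p, q) < r[q])
            for p in shape}

# A 3x3 square of shape pixels surrounded by background.
img = [[0] * 5] + [[0, 1, 1, 1, 0] for _ in range(3)] + [[0] * 5]
tm = thickness_map(img)
```

Every pixel of the small square is covered by the central disk of radius 2, so the thickness is uniform here; on real shapes the histogram of these values is exactly the granulometry described above.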
The goal of the internship is to propose a GPU implementation of a naive variant of the extraction of this thickness function. The idea is to exploit the massively parallel computing capabilities of the GPU to obtain this analysis quickly. Since the memory capacity of graphics cards is often limited, it will also be necessary to split the initial problem into independent subproblems.
The internship will take place at LIRIS, Bat. Nautibus, Lyon 1. URL sujet détaillé :
Remarques : Internship stipend





SM20746 Lower bounds for univariate polynomials


Description


The cost of evaluating a polynomial can be measured by the number of arithmetic operations performed by an evaluation algorithm. This notion can be made precise using the model of arithmetic circuits. The internship will be devoted to the study of this arithmetic cost for univariate polynomials, and especially to lower bound techniques.
Given the difficulty of obtaining lower bounds for general arithmetic circuits, we will focus on a restricted class of circuits: we will try to obtain lower bounds for polynomials represented as sums of products of sparse polynomials.
This representation is general enough to lead to challenging lower bound problems, but seems significantly weaker than general arithmetic circuits. The intern will try to apply existing lower bound techniques to this representation, will compare them and will possibly develop new lower bound techniques. URL sujet détaillé : http://perso.enslyon.fr/pascal.koiran/sujet_M2_univariate.pdf
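As a concrete illustration of this representation, the sketch below evaluates a polynomial written as a sum of products of sparse factors and counts the arithmetic operations performed, i.e., a crude proxy for the cost measure above (the encoding of factors as (coefficient, exponent) lists is an illustrative choice):

```python
# Evaluate f = (1 + x)(1 + x^2) + x^3 given as a sum of products of
# sparse polynomials, each factor being a list of (coefficient,
# exponent) monomials, while counting arithmetic operations: one
# power, one multiplication and one addition per monomial, one
# multiplication per factor, one addition per product term.

def eval_and_count(terms, x):
    ops = 0
    total = 0
    for product in terms:
        value = 1
        for factor in product:
            s = 0
            for c, e in factor:
                s += c * x ** e   # power, multiply, add
                ops += 3
            value *= s            # multiply the sparse factors together
            ops += 1
        total += value            # sum the product terms
        ops += 1
    return total, ops

terms = [
    [[(1, 0), (1, 1)], [(1, 0), (1, 2)]],   # (1 + x) * (1 + x^2)
    [[(1, 3)]],                              # x^3
]
value, ops = eval_and_count(terms, 2)        # (3)(5) + 8 = 23
```

Lower bounds for this model ask how small the total number of monomials across all factors can be for a given target polynomial, whatever the splitting into products.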
Remarques : The internship could also be supervised by Natacha Portier, or co-supervised.





SM20747 Smart extraction of documents for finance and biology


Description


Motivation: Authorities receive every year an extremely large amount of documents related to insurance companies and fund management. These documents include accounting information but also assess the compliance of the proposed funds with the law. The idea is to automate the document processing and verification as much as possible.
Problem formulation: Most of the provided documents are paper-based or unstructured PDF. The idea would be to recover document structures based on information about the expected structure. This structure will be expressed as a graph, so that the problem can be related to the graph isomorphism problem. The proposed tool can be seen as a decision-aid tool, not a fully automated one.
Proposed solutions: We benefit from large amounts of documents and document-processing experience in the biology and finance fields, and also from direct access to domain experts. These two domains can be used to validate the approach. Classical document structures will be extracted from existing information, which can be seen as a training set. A set of the already structured documents will be turned into pure text and become the testing set. This will enable graph-based and machine learning approaches to work jointly. Finally, the approach will be tested on imported documents for which no structure information is available.
Candidate's profile:
- Excellent programming skills. Knowledge of Java, XML, parsing.
- Knowledge of graph theory and machine learning techniques.
- Great interpersonal skills, as many interactions with domain experts are foreseen.
- Fluent in English.
- Master-level student
URL sujet détaillé :
Remarques : Expense coverage of about 850 euros/month is provided.





SM20748 Connectivity inference in structural proteomics


Description


CONTEXT AND MULTIDISCIPLINARY NATURE
Molecular assemblies involving from tens to hundreds of macromolecules (proteins or nucleic acids) are commonplace, yet very little is known about their structure. To fill this gap, a number of biophysical experiments (mass spectrometry, tandem affinity purification, etc.) providing information on the composition of subsystems called oligomers are being developed [Robinson07]. More precisely, given an assembly, one such experiment gives access to the composition of one subsystem, called an oligomer, so that experiments for multiple overlapping oligomers convey information on the overall connectivity. For example, if one oligomer reduces to two proteins, these proteins touch in the assembly. Connectivity inference is the problem concerned with the elucidation of this connectivity. Instances of the connectivity inference problem can be modeled as a graph in which each vertex corresponds to a protein, and an assembly (oligomer, i.e., a set of proteins) is a connected subset of vertices.
The problem is of multidisciplinary nature, since developing solutions requires a virtuous circle between biophysics on the one hand, and algorithmic  optimization  graph theory on the other hand.
GOALS
The connectivity inference problem in the particular case where each protein species is represented by a single copy has recently been studied using tools from graph theory and integer linear programming [esa13]. Roughly, the minimum connectivity inference problem is to find the smallest subset of edges of a complete graph such that each subset of vertices associated with an assembly gets connected. This optimization problem has been proved NP-hard and hard to approximate in general. Nonetheless, when the number of proteins is moderate (fewer than 20), the problem can be solved efficiently using mixed integer linear programming (MILP) and a greedy approximation algorithm.
These two types of algorithms generate ensembles of (optimal) solutions, from which consensus solutions were defined. Tests conducted on systems with up to 20 proteins show an almost perfect agreement between the predicted contacts and the experimentally determined ones. The goal of this internship is to go beyond this state of the art, so as to:
- Generalize the modeling in order to handle the cases where multiple copies of a protein are present;
- Propose exact and approximate algorithms for instances involving hundreds of proteins (currently unreachable with existing solutions);
- Validate the algorithms on systems involving hundreds of proteins (viruses, and the nuclear pore complex [Dreyfus13]).
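The minimum connectivity inference objective can be illustrated with a simple greedy heuristic, distinct from the algorithms of [esa13]: repeatedly add the candidate edge that most reduces the total number of connected components over the oligomers' induced subgraphs (all names are illustrative):

```python
from itertools import combinations

# Toy greedy heuristic for minimum connectivity inference: pick edges
# until every oligomer (vertex subset) induces a connected subgraph.

def n_components(vertices, edges):
    """Connected components of the graph induced on `vertices`."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for a, b in edges:
        if a in parent and b in parent:   # keep only induced edges
            parent[find(a)] = find(b)
    return len({find(v) for v in vertices})

def greedy_inference(oligomers):
    """Greedily choose edges until each oligomer is connected."""
    chosen = set()
    def score(es):   # total component count; equals len(oligomers) when done
        return sum(n_components(o, es) for o in oligomers)
    candidates = {e for o in oligomers for e in combinations(sorted(o), 2)}
    while score(chosen) > len(oligomers):
        best = min(candidates - chosen, key=lambda e: score(chosen | {e}))
        chosen.add(best)
    return chosen

# Toy instance: three overlapping oligomers over proteins A, B, C.
edges = greedy_inference([{'A', 'B'}, {'B', 'C'}, {'A', 'B', 'C'}])
```

Here the heuristic infers the two contacts A-B and B-C; the exact MILP formulation and the approximation guarantees studied in [esa13] are of course more involved.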
SUPERVISORS
F. Cazals, ABS; http://team.inria.fr/abs D. Coudert and C. Caillouet, COATI; http://team.inria.fr/coati
REFERENCES
[esa13] D. Agarwal, J. Araujo, C. Caillouet, F. Cazals, D. Coudert, and S. Perennes, Connectivity Inference in Mass Spectrometry based Structure Determination, European Symposium on Algorithms, 2013. http://hal.inria.fr/hal00837496
[Dreyfus13] T. Dreyfus, V. Doye, F. Cazals, Probing a Continuum of Macromolecular Assembly Models with Graph Templates of Subcomplexes, Proteins: structure, function, and bioinformatics (81), 2013.
[Robinson07] M. Sharon and C. Robinson, The role of mass spectrometry in structure elucidation of dynamic protein complexes, Annu. Rev. Biochem. 76, 2007.
URL sujet détaillé : ftp://ftpsop.inria.fr/abs/fcazals/positions/master14mcc.pdf
Remarques : - Co-supervision with David Coudert and Christelle Caillouet (COATI project)
- Remuneration according to the Inria grid.
- Duration: 6 months; possibility to follow up with a PhD thesis.





SM20749 Energyefficient cloud elasticity for datadriven applications


Admin


Encadrant : Anne-Cécile ORGERIE 
Labo/Organisme : IRISA (Myriads team) 
URL : http://www.irisa.fr/myriads 
Ville : Rennes 



Description


Advisors: Anne-Cécile Orgerie and Christine Morin
Description: Distributed and parallel systems offer users tremendous computing capacity. They rely on distributed computing resources linked by networks, and require algorithms and protocols to manage these resources transparently for users. Recently, the maturity of virtualization techniques has allowed the emergence of virtualized infrastructures (Clouds). These infrastructures provide resources to users dynamically, adapted to their needs. By benefiting from economies of scale, Clouds can efficiently manage and offer virtually unlimited numbers of resources, reducing the costs for users.
However, the rapid growth of Cloud demand leads to a worrying and uncontrolled increase in their electricity consumption. In this context, we will focus on data-driven applications, which require processing large amounts of data. These applications have elastic needs in terms of computing resources, as their workload varies over time. While reducing energy consumption and improving performance are orthogonal goals, this internship aims at studying possible trade-offs for energy-efficient data processing without performance impact. As elasticity comes at the cost of reconfigurations, these trade-offs will take into account the time and energy required by the infrastructure to dynamically adapt the resources to the application needs.
The validation of the proposed algorithms may rely on the French experimental platform Grid'5000. This platform comprises about 8,000 cores geographically distributed over 10 sites linked by a dedicated gigabit network. Some of these sites have wattmeters which provide the consumption of the computing nodes in real time. This validation step is essential, as it will ensure that the selected criteria are well observed: energy efficiency, performance and elasticity. This work will be done in collaboration with the Lawrence Berkeley National Laboratory (USA) in the context of the DALHIS associate team.
References: [RDG11] N. Roy, A. Dubey and A.S. Gokhale. "Efficient Autoscaling in the Cloud Using Predictive Models for Workload Forecasting". In Proceedings of IEEE CLOUD, 2011. [Koo11] J. Koomey, "Growth in data center electricity use 2005 to 2010", Analytics Press, 2011. [ADE10] D. Agrawal, S. Das and A. El Abbadi, "Big data and cloud computing: new wine or just new bottles?", Journal of the Very Large Data Base Endowment, vol. 3, no. 1-2, pp. 1647-1648, Sept. 2010. [ODL13] A.-C. Orgerie, M. Dias de Assunção, L. Lefèvre, "A Survey on Techniques for Improving the Energy Efficiency of Large Scale Distributed Systems", ACM Computing Surveys, 2013.
URL sujet détaillé : http://www.irisa.fr/myriads/openpositions/internships1314/document.20130925.9133673154
Remarques : Contact: Anne-Cécile Orgerie annececile.orgerie.fr Christine Morin christine.morin.fr





SM20750 Physical attacks and codebased cryptosystems


Description


Domain
Cryptography, coding theory and secure implementation
Presentation
Most of the cryptographic schemes used and studied today are based on number theory problems such as factorisation or the discrete logarithm. In 1994, Shor proposed an algorithm which can factorise in polynomial time using a quantum computer, so RSA and several other schemes are threatened by the quantum computer.
Code-based cryptography is one of the branches of post-quantum cryptography, along with lattice-based, multivariate-based and hash-based cryptography. Schemes based on problems such as syndrome decoding or decoding random codes have been well studied for years, and no polynomial-time algorithm is known to solve these problems, even in a post-quantum world. McEliece was the first to propose a code-based cryptosystem, and several improvements and derivations have been proposed since.
To consider the use of code-based cryptosystems in real life, they must be resistant to physical attacks such as power analysis or fault injection. To date, studies of such attacks on these schemes are rare, and there is a lot of work to do in this area.
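The syndrome decoding problem underlying these schemes can be made concrete with a brute-force toy, here on the [7,4] Hamming code. Exhaustive search is only feasible at toy sizes; its intractability at cryptographic sizes is precisely the security assumption:

```python
from itertools import combinations

# Toy syndrome decoding: given a binary parity-check matrix H and a
# syndrome s, find an error vector e of weight <= w with H e^T = s.

def syndrome(H, e):
    """Compute H e^T over GF(2)."""
    return tuple(sum(hi * ei for hi, ei in zip(row, e)) % 2 for row in H)

def decode(H, s, w):
    """Exhaustive search for an error pattern of weight at most w."""
    n = len(H[0])
    for weight in range(w + 1):
        for support in combinations(range(n), weight):
            e = [1 if i in support else 0 for i in range(n)]
            if syndrome(H, e) == s:
                return e
    return None

# Parity-check matrix of the [7,4] Hamming code (column i is the
# binary representation of i+1).
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
e = decode(H, (1, 0, 1), 1)   # locate a single-bit error
```

For this structured code the syndrome directly points at the erroneous position; for a random-looking code, as used in McEliece-type systems, no such shortcut is known.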
Purpose
After a state of the art of code-based schemes and side-channel attacks, the student will implement an attack on a smart card, FPGA, graphics card or CPU. Both practical and theoretical, this internship proposes to study physical attacks, also called side-channel attacks, such as Simple Power Analysis, Differential Power Analysis, Higher-Order Differential Power Analysis or Fault Attacks, and to show how these techniques can be applied to attack code-based cryptosystems such as the Courtois-Finiasz-Sendrier signature scheme, Stern's zero-knowledge identification scheme or the McEliece public-key cryptosystem.
We have a laboratory for the implementation and the application of the attacks. The student will have to write an article on his research in LaTeX format and give an English presentation to the team. Working with specialists in physical attacks and code-based cryptography, the student will be able to collaborate with different teams and improve his background in both areas.
Goals
The expected outcome of the internship is a publishable result on physical attacks on code-based cryptosystems. URL sujet détaillé : http://www.cayrel.net/enseignement/20132014/article/physicalattacksandcodebased261
Remarques :





SM20751 Code based signature schemes with special properties


Description


Domain
Cryptography and coding theory
Presentation
Most of the cryptographic schemes used and studied today are based on number theory problems such as factorisation or the discrete logarithm. In 1994, Shor proposed an algorithm which can factorise in polynomial time using a quantum computer, so RSA and several other schemes are threatened by the quantum computer. Code-based cryptography is one of the branches of post-quantum cryptography, along with lattice-based, multivariate-based and hash-based cryptography. Schemes based on problems such as syndrome decoding or decoding random codes have been well studied for years, and no polynomial-time algorithm is known to solve these problems, even in a post-quantum world. McEliece was the first to propose a code-based cryptosystem, and several improvements and derivations have been proposed since. There exist different code-based signature schemes (Stern's identification and signature scheme, the Courtois-Finiasz-Sendrier signature scheme), and several schemes offer additional properties like identity-based constructions or threshold ring signatures.
Purpose
After a state of the art of code-based signature schemes, the different properties that a signature scheme can offer, and the generic constructions in each context, the student will have to propose a code-based signature scheme with one of these properties (undeniable, designated-verifier or time-released, for example). A proof of security of the scheme in the random oracle model or in the standard model would be appreciated. The student will have to write an article on his research in LaTeX format and give an English presentation to the team.
Goals
The expected outcome of the internship is a publishable result on a code-based signature scheme with a special property. URL sujet détaillé : http://cayrel.net/IMG/pdf/Code_based_signature_schemes_with_special_properties2.pdf
Remarques :





SM20752 Declarative Design and validation of Information Systems


Description


Many information systems (IS) are distributed over the Web and use distributed components: databases, external services, etc. In a decade, the vision of such systems has moved from a centralized approach, where data and services are all located on a single site with access control, to a more open and distributed vision, in which services are open to their environment, accept invocations from the external world, and realize their tasks by composing services and aggregating contents provided by distant subsystems. One current trend in IS design is called "declarative data-centric services". Software architectures of this kind are distributed, open to the external world, and exchange data. In such applications, the workflow of a given operation is not static and may depend on the manipulated data. The declarative aspect is also valuable during design, as it is almost impossible to design a complete workflow for a system built out of multiple heterogeneous components. However, such systems can be defined using declarative formalisms, that is, using rules depicting how and when data should be modified or transferred from one site to another. Rule-based definitions of systems also allow interaction with all stakeholders, whose knowledge of business rules is essential to build a correct system. The topic considered during this internship is the study of a declarative model called "communicating rewriting rules", which allows for an intuitive design of data-centric systems. In particular, we want to consider verification techniques applied to this formalism.
URL sujet détaillé : http://www.irisa.fr/sumo/Sujets/SecuWebRules_Master_English.pdf
Remarques : Co-supervisor: Loïc Hélouët loic.helouet.fr





SM20753 Anonymous blacklisting without trusted third party


Description


There are many systems that allow private communication over the Internet, Tor (https://www.torproject.org/) being the most famous. The problem with such anonymizing networks is that there is no easy way to establish accountability of anonymous users. Therefore many services and web site administrators block all known exit nodes of anonymizing networks, since they can't rely on the IP addresses of misbehaving clients. In the end this challenges the success of anonymizing networks. Some solutions have been presented in order to block misbehaving users while keeping their anonymity protected. Anonymous blacklisting is a very active research field. The first aim of the internship is to make a state-of-the-art survey of this field. A second aim would be to develop a system allowing blacklisting while maintaining anonymity in a completely decentralised manner, for instance by reusing ideas developed for distributed electronic cash à la Bitcoin (http://bitcoin.org/). Don't hesitate to contact me for more information.
Biblio : [1] Formalizing Anonymous Blacklisting Systems by R. Henry and I. Goldberg, in proceedings of the 2011 IEEE Symposium on Security and Privacy, May 2011. [2] Nymble: Blocking Misbehaving Users in Anonymizing Networks by P. Tsang, A. Kapadia, C. Cornelius, and S. W. Smith. In IEEE Transactions on Dependable and Secure Computing 8(2), March-April 2011, pages 256-269. URL sujet détaillé :
Remarques :





SM20754 Privacy and social networks


Description


Social networks have modified our society: Facebook, for instance, has more than a billion users, and every day one can witness their impact on privacy. Lately, the design of search engines based on data provided by social networks has magnified these issues. This internship is centered around a theoretical study of privacy in such a context. Privacy problems are, more or less, well understood in standard programming through noninterference, which is the mathematical basis for confidentiality analyses. The problem is simple: how to shape and control the information flow in such a way that data labeled private cannot be publicly observed? The noninterference approach is intuitive: data are tagged with a security level, and it aims at proving that modifications of data tagged "high" cannot lead to modifications of data tagged "low". The system has to enforce noninterference. The aim of this internship is to study how these notions can be properly defined and analyzed in large social graphs. A path to consider is to look at how graph transformation techniques can help in this perspective. Don't hesitate to contact me for more information. URL sujet détaillé :
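A minimal sketch of this noninterference discipline: values carry a security label, any value computed from "high" data stays "high", and a "low" sink rejects "high"-labeled values (labels and names are illustrative):

```python
# Toy information-flow monitor with two security levels. Taint
# propagates through addition; publishing at "low" checks the label.

LEVELS = {'low': 0, 'high': 1}

class Labeled:
    """A value carrying a security label."""
    def __init__(self, value, label):
        self.value, self.label = value, label
    def __add__(self, other):
        # the result is as secret as its most secret operand
        label = max(self.label, other.label, key=LEVELS.get)
        return Labeled(self.value + other.value, label)

def publish(x):
    """A public ('low') observation point: secret data may not flow here."""
    if x.label != 'low':
        raise PermissionError("high data cannot flow to a low sink")
    return x.value

friend_count = Labeled(3, 'low')
private_score = Labeled(7, 'high')

ok = publish(friend_count + Labeled(1, 'low'))   # allowed: low + low
try:
    publish(friend_count + private_score)        # rejected: tainted by high
    leaked = True
except PermissionError:
    leaked = False
```

In a social-graph setting, the interesting question is precisely how such labels and flows should be defined over graph transformations rather than over a fixed program.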
Remarques :





SM20755 Normalization by Completeness


Description


In logical systems, normalization of proof terms and semantic cut admissibility are two ways to show that cuts are eliminable. This amounts either to showing that a certain process of cut elimination (for instance on proof terms) terminates, or to showing a completeness theorem for the cut-free calculus, two techniques that have been developed independently.
The goal of this internship is to study the links between both methods, and more precisely to investigate how semantic cut admissibility proofs generate cut-free proofs. URL sujet détaillé : http://www.cri.ensmp.fr/people/hermant/internships.html
Remarques :





SM20756 Template Attacks on Cryptographic Algorithm Leakages


Description


Context: Embedded cryptography may be vulnerable to Side Channel Attacks (SCA), which use information obtained during algorithm executions. In general this leakage is physically measured via current consumption or electromagnetic radiation signals. Many statistical methods exist to exploit these signals and retrieve secret elements. The efficiency of an SCA depends on the observed information and the attacker's capabilities. Two attack families are generally considered. The first one assumes that the attacker has little knowledge of the targeted device and consists in the acquisition and exploitation of many signals. The second one, called Template Attacks, is composed of two phases: first, the attacker collects enough information into a database constructed with full knowledge of the device; second, the secret elements are deduced from only a few signals. Template Attacks are more efficient, but a good database is difficult to build: all the parameters are important and depend on the acquired signals. Work: At first, the student may use existing signals to become familiar with Template Attacks. Then the student may rely on the state of the art in order to test and compare other promising methods with curiosity and creativity. Finally, it will be interesting to improve the attack efficiency through a thorough understanding of the role of each parameter. URL sujet détaillé :
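The two phases of a Template Attack can be sketched with univariate Gaussian templates on simulated Hamming-weight leakage; a real attack uses multivariate statistics over selected points of interest of measured traces, so everything below is a toy with illustrative names and parameters:

```python
import math
import random

# Toy template attack. Assumed leakage model: Hamming weight of a
# 4-bit intermediate value v = key ^ plaintext, plus Gaussian noise.

random.seed(1)
SIGMA = 0.5

def hamming_weight(v):
    return bin(v).count('1')

def leak(key, plaintext):
    """One simulated side-channel measurement on the target device."""
    return hamming_weight(key ^ plaintext) + random.gauss(0, SIGMA)

def profile(samples_per_value=50):
    """Phase 1 (device under full control): estimate one Gaussian
    template (mean, variance) per intermediate value."""
    templates = {}
    for v in range(16):
        xs = [hamming_weight(v) + random.gauss(0, SIGMA)
              for _ in range(samples_per_value)]
        m = sum(xs) / len(xs)
        templates[v] = (m, sum((x - m) ** 2 for x in xs) / len(xs))
    return templates

def attack(templates, trace):
    """Phase 2: given (plaintext, leakage) pairs from the target,
    return the key hypothesis maximizing the trace log-likelihood."""
    def log_lik(k):
        total = 0.0
        for p, x in trace:
            m, var = templates[k ^ p]
            total += -((x - m) ** 2) / (2 * var) \
                     - 0.5 * math.log(2 * math.pi * var)
        return total
    return max(range(16), key=log_lik)

secret = 11
trace = [(p, leak(secret, p)) for _ in range(3) for p in range(16)]
guess = attack(profile(), trace)
```

With only 48 noisy measurements the matching phase singles out the secret, illustrating why template attacks need so few target traces once a good profiling database exists.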
Remarques : This paid internship offers the possibility of continuing with a PhD thesis. Applications must reach us at least three months before the start of the internship.





SM20757 A processor computing just right


Description


Current processors compute on 32- or 64-bit words. In floating point, this means a precision of 7 or 15 decimal digits. This is far more than typical embedded applications need: is the trajectory you input to Angry Birds really accurate to 1e-7? Even in scientific computing, such accuracy is not needed for each and every operation, but only in very localized code sections.
Computing too accurately is a waste of computing power and, perhaps more importantly, a waste of energy. The purpose of this internship is to study the design of a general-purpose computer that could compute just right.
With current sub-micron technology, more energy is spent on data movement than on computation itself. A processor computing just right should therefore also move only the bits necessary to the computation. This implies moving them serially, as opposed to using the buses of conventional processors. This is a radical redesign of the von Neumann architecture, and it should also lead us to rethink instruction encoding and the memory subsystem. Building digit-serial operators is not trivial either, but it was addressed by a wide body of literature between the seventies and the nineties.
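The digit-serial principle can be sketched in a few lines (a toy illustration under simplified assumptions; a real design would be pipelined hardware, not software):

```python
def digit_serial_add(a_digits, b_digits, base=2):
    """Add two numbers presented least-significant digit first, one digit
    per cycle: only one digit of each operand moves at a time, and the
    only state carried between cycles is a single carry digit."""
    carry, out = 0, []
    for da, db in zip(a_digits, b_digits):
        s = da + db + carry
        out.append(s % base)
        carry = s // base
    out.append(carry)
    return out
```

The serial granularity mentioned above corresponds to the `base` parameter here: wider digits mean fewer cycles but a wider datapath.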
The intern will study these issues and develop a functional simulator of such a processor. The processor design will be validated by implementing well-chosen numerical kernels, such as elementary functions and digital filters, for which the precision required by each computation step is known at compile time. The simulator will then be refined to provide relevant estimates of performance and switching activity. This should enable a quantitative study of energy/performance trade-offs under various architectural options. In particular, the impact of design parameters such as the serial granularity will be studied. The simulator should also, from the start, address a massively parallel multicore.
Depending on the intern's progress and taste, a successful processor design could be translated to VHDL and emulated on FPGAs. URL sujet détaillé :
Remarques :





SM20758 Collaborative query processing inside a user community


Description


The context of this internship is a large number of data sources (e.g., web data, RSS streams, sensor networks) queried by a large number of users. These data sources are autonomous, distributed, and often have bounded capacities. The queries we consider may have a complex structure (e.g., aggregate queries) and are issued by many remote users. In such a context, many scientific challenges arise. For instance, how can we ensure that query processing succeeds despite its own complexity and the limited availability of user resources? How can we ensure the scalability of data access?
In this context, it is worth noting that some users have similar needs, which leads us to consider equivalent query processing. The proposition we are currently studying is based on building user communities around some queries, so as to handle them collaboratively and thus answer a need shared by all community members. Inspired by classic query processing (based on three steps: data loading, query execution, result loading), our approach is also split into three steps. To handle a query collaboratively, a user community must be able to: i) retrieve and load the data relevant to the query; ii) compute an execution plan (possibly distributed) to obtain the results; and iii) broadcast the query result to the members of the community.
This internship focuses on the query-execution step performed inside a community. Expected results are: a state of the art on distributed stream-based query processing, and an original contribution allowing a query to be executed collaboratively and in an optimized way, taking into account the features and resources of users. This solution should be tested via a simulator to demonstrate the scalability and robustness of the proposed query processing. Performance will be evaluated by comparison with other solutions. URL sujet détaillé : http://liris.cnrs.fr/nicolas.lumineau/stages/M2R/2014/SujetM2BDLumineauLamarreVUK.pdf
Remarques : Co-supervisor: Philippe Lamarre (philippe.lamarrelyon.fr)
Paid internship.
A PhD funded by the ANR through the SocioPlug project (2013-2017) is possible.





SM20759 Spectral clustering and extensions


Description


Unsupervised clustering of high-dimensional data is a very important problem with many applications in various fields such as learning theory, bioinformatics, etc. Many methods have been proposed and the literature on the subject is extensive. However, most methods suffer from the unpleasant drawback of having to minimize a nonconvex function, with the consequence of possibly getting stuck at non-global minimizers.
Recently, polynomial-time algorithms have been proposed to circumvent this problem, based on the geometry of graphs, namely the Laplacian and its eigenvalues. For two-class problems, a discrete analog of the Cheeger inequality provides straightforward guarantees on the quality of the method. Extensions to multiclass problems have been proposed, together with generalized versions of Cheeger's inequality.
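For the two-class case, the spectral method can be sketched as follows (a toy illustration with hypothetical names; the Cheeger-type quality guarantee is not computed here):

```python
import numpy as np

def spectral_bipartition(W):
    """Two-class spectral clustering: build the graph Laplacian L = D - W
    and split the vertices on the sign of the Fiedler vector, i.e. the
    eigenvector associated with the second-smallest eigenvalue of L."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]
    return (fiedler > 0).astype(int)
```

Because the Laplacian is symmetric, the eigenvector computation is convex-free and polynomial time, which is exactly the appeal noted above.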
During this internship, we propose to study these recent breakthroughs. If time permits, we will also study a new algorithm based on recent work on nuclear-norm minimization for spectrally sparse problems, in order to address the question of estimating the number of clusters and the point labels at once. URL sujet détaillé :
Remarques :





SM20760 A distributed search structure for georeplicated storage


Admin


Encadrant : Marc SHAPIRO 
Labo/Organisme : The research will take place at Université Pierre et Marie Curie (UPMC - Paris 6), Laboratoire d'Informatique de Paris 6 (LIP6), in the Latin Quarter in the centre of Paris. The internship is fully funded. 
URL : https://team.inria.fr/regal/ 
Ville : Paris 



Description


Modern data storage engines such as key-value stores (KVSes) are deployed across machines within a data centre (sharding and replication), and across several data centres in various geographical locations (geo-replication). Since geo-replication improves both locality and availability, it would be desirable to replicate all objects of a KVS in all available data centres. However, because of storage and inter-data-centre transfer costs, objects are usually replicated in some (but not all) data centres.
A KVS locates its objects using a mapping mechanism such as consistent hashing: the node storing the object with primary key K is the one identified by applying the consistent hash function H(K). (This generalises easily to an object replicated on multiple nodes.) Accessing object K is as simple as sending a message to node H(K). This approach is usually highly scalable, as long as queries are performed on the primary key K. However, when accessing an object other than through its primary key, for instance based on its content or on its position relative to another object, these systems essentially send a request to all nodes in the system. This is not scalable, and it is very slow.
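The lookup just described can be sketched as follows (an illustrative toy; `HashRing` is a hypothetical name, and real KVSes add virtual nodes and replication):

```python
import bisect
import hashlib

def _h(s):
    """Deterministic hash onto the ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Consistent hashing: key K is owned by the first node clockwise from
    H(K), so a primary-key lookup touches exactly one node, and removing a
    node only remaps the keys that node owned."""
    def __init__(self, nodes):
        self.ring = sorted((_h(n), n) for n in nodes)

    def node_for(self, key):
        positions = [h for h, _ in self.ring]
        i = bisect.bisect(positions, _h(key)) % len(self.ring)
        return self.ring[i][1]
```

Note that nothing in this structure helps with non-primary-key access: the hash destroys any ordering or content relation between keys, which is why a separate search index is needed.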
To sidestep this problem, one solution is to build a distributed search index, for instance a distributed B+-tree or skip list, for every non-primary key of interest. Such an index supports efficient discovery of the node(s) storing an object with a given non-primary key. This problem has been studied in the context of single-site KVSes, but not for geo-replicated systems, where latency is high and failures are not uncommon, which exacerbates the issues of consistency and fault tolerance. One direction of interest is leveraging weak-consistency techniques such as non-monotonic snapshot isolation or conflict-free replicated data types. URL sujet détaillé : https://team.inria.fr/regal/joboffers/mastersinternshipadistributedsearchstructureforgeoreplicatedstorage/
Remarques : The research will take place at Université Pierre et Marie Curie (UPMC - Paris 6), Laboratoire d'Informatique de Paris 6 (LIP6), in the Latin Quarter in the centre of Paris. The internship is fully funded.
Requirements:
 Enrolled in a Master's in Computer Science / Informatics or a related field.
 Excellent academic record.
 Motivated by experimental research.
 Skills in algorithms, data structures, concurrency, distributed computing, highperformance computing, operating systems, distributed systems, databases, or a related area.
Please provide a CV, a list of courses and your marks, an essay relevant to the topic (1 to 4 pages), and at least two references (whom we will contact ourselves for a recommendation).





SM20761 Exploring the design space of transactional consistency models


Admin


Encadrant : Marc SHAPIRO 
Labo/Organisme : The research will take place at Université Pierre et Marie Curie (UPMC - Paris 6), Laboratoire d'Informatique de Paris 6 (LIP6), in the Latin Quarter in the centre of Paris. The internship is fully funded. 
URL : https://team.inria.fr/regal/ 
Ville : Paris 



Description


In distributed systems and databases, there is a fundamental trade-off between the performance and the guarantees offered by a consistency model. The strongest consistency model (strict serializability) offers the guarantees of a highly fault-tolerant sequential system, and is therefore very easy to understand and use, but performs poorly. Performance can be improved, often by orders of magnitude, by weakening the consistency guarantees; at the extreme, Eventual Consistency (EC) ensures minimal latency and maximal performance, but is extremely challenging to program against.
Practical database systems cover the whole spectrum of intermediate consistency models. For instance, Oracle and other database engines famously do not implement serializability, for performance reasons, but only the weaker Snapshot Isolation (SI). As SI itself has inherent bottlenecks, several alternatives have been proposed that weaken the guarantees somewhat while still disallowing update conflicts. In distributed settings, NoSQL databases such as Riak or Cassandra are highly scalable, at the cost of guarantees no stronger than EC.
Despite the plethora of models, both in research and in practical systems, their differences are poorly understood. How does an application developer navigate this space? How can they tell which is the right model for their problem?
The proposed research aims for a systematic and exhaustive exploration of the design space of transactional consistency models, in order to characterise the advantages and limitations of each one. Many distributed transactional protocols share a common structure, called Deferred Update Replication (DUR). We have shown that protocols of this family differ only in the specific behavior of a few generic functions. We have implemented a generic DUR framework, called Jessy, along with a library of finely optimized plugin implementations of the required behaviors. By mixing and matching the appropriate plugins, the framework can be customized into a high-performance implementation of diverse transactional protocols, which in turn enables a fair, apples-to-apples comparison between them.
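A toy sketch of the DUR plugin idea (the class names and the certification rule below are simplified illustrations, not Jessy's actual API):

```python
class DURStore:
    """Deferred Update Replication skeleton: a transaction reads from a
    snapshot, buffers its writes, and a pluggable certification function
    decides at commit time whether the writes may be applied."""
    def __init__(self, certify):
        self.data = {}          # key -> (value, commit version)
        self.version = 0
        self.certify = certify  # the plugin defining the consistency model

    def begin(self):
        return self.version     # snapshot taken at transaction start

    def commit(self, snapshot, writes):
        if not self.certify(self, snapshot, writes):
            return False        # certification aborts the transaction
        self.version += 1
        for k, v in writes.items():
            self.data[k] = (v, self.version)
        return True

def si_certify(store, snapshot, writes):
    """Snapshot Isolation plugin: abort on a write-write conflict with any
    transaction committed after our snapshot was taken."""
    return all(store.data.get(k, (None, 0))[1] <= snapshot for k in writes)
```

Swapping `si_certify` for a different rule (e.g. one that also validates read sets) changes the consistency model without touching the rest of the engine, which is the point of the plugin design.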
The research consists in implementing a number of consistency models of interest through an appropriate selection of plugins (and implementing any missing plugins). Each such combination will be studied in detail, both analytically and experimentally, in order to precisely characterise its performance and guarantees, its bottlenecks, and the class of applications it supports (or not). Exploring plugin combinations might generate new consistency models with unexpected properties. URL sujet détaillé : https://team.inria.fr/regal/joboffers/mastersinternshipadistributedsearchstructureforgeoreplicatedstorage/
Remarques : The research will take place at Université Pierre et Marie Curie (UPMC - Paris 6), Laboratoire d'Informatique de Paris 6 (LIP6), in the Latin Quarter in the centre of Paris. The internship is fully funded.
Requirements:
 Enrolled in a Master's in Computer Science / Informatics or a related field.
 Excellent academic record.
 Motivated by experimental research.
 Skills in algorithms, data structures, concurrency, distributed computing, highperformance computing, operating systems, distributed systems, databases, or a related area.
Please provide a CV, a list of courses and your marks, an essay relevant to the topic (1 to 4 pages), and at least two references (whom we will contact ourselves for a recommendation).





SM20762 Behavioural equivalences for the rho-calculus


Description


The rho-calculus [1] extends the lambda-calculus with rewriting notions, such as first-order patterns, which can be passed to functions as arguments and manipulated by terms. The calculus is useful for studying the theory of rule-based programming languages such as Tom [2]. The goal of this internship is to study the behavioral theory of the rho-calculus, i.e., to find how to prove formally that two rho-calculus terms behave in the same way. To do so, one has to define relations called bisimulations and prove their soundness. More precisely, we want to extend the existing bisimulations for the lambda-calculus to the rho-calculus. We aim at proving equivalences between patterns or strategies in the rho-calculus, or perhaps between programs in Tom.
[1] rho.loria.fr [2] tom.loria.fr URL sujet détaillé :
Remarques : Internship co-supervised with Horatiu Cirstea





SM20763 Fast and accurate 3D X-ray image reconstruction for Non-Destructive Testing industrial applications


Description


2D and 3D X-ray Computed Tomography (CT) is very useful in medical imaging as well as in Non-Destructive Testing (NDT) industrial applications. The main research subject nowadays in medical imaging is dose reduction while preserving the quality of the reconstructed images. In NDT, the main subject is computational cost, which requires reducing the number of projections and developing fast methods suited to on-line testing. The common point of both domains is to limit the number of projections while proposing fast and accurate reconstruction methods. The Bayesian estimation approach to the inverse problem of image reconstruction is appropriate for this objective, because it provides the tools to combine a priori information, to account for measurement noise and modeling errors, and to deliver an estimate together with measures of the remaining uncertainty in the reconstruction results. At L2S we have acquired extensive expertise both in the main Bayesian approaches and in the parallel implementation of the proposed algorithms on many-core processors such as GPUs.
The main subject of this PhD proposal is, as a first step, to implement on GPU some of the main methods we have developed (Gauss-Markov prior modeling and MAP estimation) and to evaluate their performance. Then, to adapt these methods to NDT applications, more appropriate priors such as Gauss-Markov-Potts or Infinite Gaussian Mixture (IGM) models have to be used. The main difficulty then becomes the computational cost, so Variational Bayesian Approximation (VBA) tools are needed to keep computations fast. For real-size 3D applications, multi-GPU implementations and extensive memory-transfer optimization will also be necessary. This PhD will be co-supervised by Ali Mohammad-Djafari, Research Director at CNRS, and Nicolas Gac, Assistant Professor at the University of Paris-Sud, Orsay. 
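As a hedged illustration of the Gauss-Markov / MAP ingredient mentioned above (a 1-D toy with hypothetical names; the actual work targets 3-D volumes on GPU):

```python
import numpy as np

def map_reconstruct(H, y, lam, iters=200):
    """MAP estimate under Gaussian noise and a quadratic, Gauss-Markov-like
    smoothness prior: minimise ||y - Hx||^2 + lam * ||Dx||^2 by gradient
    descent, where D takes differences between neighbouring pixels."""
    n = H.shape[1]
    D = np.eye(n) - np.eye(n, k=1)        # 1-D finite differences
    x = np.zeros(n)
    step = 1.0 / (np.linalg.norm(H, 2) ** 2 + 4 * lam)  # safe step size
    for _ in range(iters):
        grad = H.T @ (H @ x - y) + lam * (D.T @ D @ x)
        x -= step * grad
    return x
```

Here `H` plays the role of the projection operator and `lam` the prior weight; the per-iteration products `H @ x` and `H.T @ r` are exactly the forward/back-projections that the GPU work accelerates.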
The main work will be done at the Signal and Systems Laboratory (L2S), located at SUPELEC. The PhD is part of the doctoral program of "Université Paris-Sud, Orsay". URL sujet détaillé : http://djafari.free.fr/Sujets/sujet_these_2014_CT_a.pdf
Remarques : Nicolas GAC





SM20764 B


Description


The B method is a formal approach for developing mission-critical software. It is based on a variant of set theory. In the context of the BWare project, this set theory is encoded in the Why3 environment. This internship aims at verifying that this encoding is faithful. URL sujet détaillé : http://www.lri.fr/~marche/stagesemBenCoq.pdf
Remarques : Co-supervised by Catherine Dubois, CEDRIC, Evry





SM20765 Unknown non-circular trajectory Computed Tomography


Description


In some Non-Destructive Testing (NDT) industrial applications of X-ray Computed Tomography (CT), the trajectory of the detectors may not be perfectly circular. The reconstruction results, which are mainly based on the Radon or X-ray transform, may then be greatly altered, so estimating the trajectory becomes important. A first solution is to estimate this trajectory from the projections and then apply the necessary corrections before using a standard Filtered Back-Projection (FBP) method for reconstruction. A second approach is to estimate the trajectory and the image jointly, in an iterative way. This problem is mathematically equivalent to moving-object CT in medical imaging: the movement of the object with fixed detectors and the movement of the detectors around a fixed object are equivalent, so the proposed methods can also be used in medical imaging applications. In this PhD, the first approach will be explored initially, to propose fast methods for practical applications, in particular when the number of projections is large and the image reconstruction can be done with classical FBP methods. In a second part, we will extend this to new CT settings where the number of projections is limited and dose reduction is important. We then propose to consider Bayesian estimation approaches for the joint estimation of the body movement (or, equivalently, the detector trajectory) and the image, because this approach provides the tools to combine a priori information, to account for measurement noise and modeling errors, and to deliver an estimate together with measures of the remaining uncertainty in the reconstruction results. At L2S we have acquired extensive expertise both in the main Bayesian approaches and in the parallel implementation of the proposed algorithms on many-core processors such as GPUs.
The main subject of this PhD proposal is, as a first step, to implement some of the main methods of the first approach for immediate applications. In a second step, we want to replace the classical FBP by the Bayesian image reconstruction methods developed in our group (Gauss-Markov prior modeling and MAP estimation) and implemented on GPU, and to evaluate their performance. In a third step, to adapt these methods to NDT applications, more appropriate priors such as Gauss-Markov-Potts or Infinite Gaussian Mixture (IGM) models have to be used. The main difficulty then becomes the computational cost, so Variational Bayesian Approximation (VBA) tools are needed to keep computations fast. For real-size 3D applications, multi-GPU implementations and extensive memory-transfer optimization will also be necessary. This PhD will be supervised by Ali Mohammad-Djafari, Research Director at CNRS and Professor at the University of Paris-Sud, Orsay. The main work will be done at the Signal and Systems Laboratory (L2S), located at SUPELEC. The PhD is part of the doctoral program of "Université Paris-Sud, Orsay". URL sujet détaillé : http://djafari.free.fr/Sujets/sujet_these_2014_NCCT_a.pdf
Remarques :





SM20766 Estimation of the opacity degree of a system holding a secret


Description


We consider a dynamic system and a secret property defined over the runs of this system. Through its interactions with the outside world, the system produces visible events and may also accept some inputs, which allows one to partially guess its internal behavior. Autonomous or driven runs of the system may thus disclose some information about its secret property. The problem is to determine whether the secret of the system is always preserved along its interactions (opacity problem), and to quantify the information leak about this secret (opacity degree evaluation).
More formally, the opacity problem is stated in the setting of discrete event systems, with models such as automata, stochastic automata (or Markov chains), weighted automata, (weighted) Petri nets, etc. As a first step, one can assume that the system does not accept inputs, but performs a hidden run in which only some transitions are visible (i.e., produce a visible event). The secret property can be expressed as a partition of the set of possible trajectories. The secret is revealed if, from the collected observations, an observer can determine which subset the hidden run belongs to. Conversely, if ambiguity remains in all cases, the system is said to be opaque. In many situations of interest, the system may only be weakly opaque, in the sense that, given the collected observations, it is much more likely that the hidden run belongs to one subset than to another. Although such a system is opaque, there is nevertheless a leak of information about the secret: an observer does not get all the bits of information needed to guess the hidden property with certainty, but does get some of them. An important issue is then to quantify this amount of revealed bits, and to measure how many bits away one is from knowing the secret. For systems accepting inputs, strategic considerations immediately appear, as an attacker may want to drive the system in order to collect as much information as possible about the secret.
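The set-based definition above can be sketched directly (an illustrative toy on enumerated runs; the weights or probabilities needed for weak opacity and the opacity degree are omitted):

```python
def observation(run, visible):
    """Project a run onto its visible events."""
    return tuple(e for e in run if e in visible)

def is_opaque(runs, visible, secret):
    """The system is opaque if every observation produced by a secret run
    is also produced by some non-secret run: the observer then remains
    ambiguous about which subset the hidden run belongs to."""
    obs_secret = {observation(r, visible) for r in runs if secret(r)}
    obs_public = {observation(r, visible) for r in runs if not secret(r)}
    return obs_secret <= obs_public
```

Estimating the opacity degree would replace the set inclusion by a quantitative comparison of the weights of ambiguous versus revealing observations.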
The objective of this internship is first of all to formalize these questions adequately, and to design an algorithmic solution to estimate the opacity degree of a system. Depending on the progress accomplished, one may go as far as computing optimal strategies that extract the most information possible from an open system. The background will be a set of classical results about weighted/stochastic automata, for example algorithms to compute volumes of trajectory sets, and of course basic results in information theory.
URL sujet détaillé : http://people.rennes.inria.fr/Eric.Fabre/masterRI2013_aE.pdf
Remarques : Co-supervised by Nathalie Bertrand.





SM20767 Optimal control of a complete urban train system


Description


We consider the optimal control problem for a very large system, which may model, for example, a complete subway line or a set of subway lines. This comprises the topology of railway tracks, the signaling devices, the trains themselves, low-level automatic control systems (for train speeds and travel times), and also high-level regulation policies driven by human operators. The model must also capture random events that may alter the normal operation of trains. These events range from small incidents responsible for train delays up to major failures that cause severe traffic disruptions. Such train systems are designed to meet the requirements of timetables. To do so, several control or regulation strategies are possible, depending on the distance between the actual system state and the target timetable. These strategies range from the correction of small delays, to the modification of timetables, and to heavy decisions such as injecting/extracting trains or partially closing a line.
The objective of this internship is to design a first simple model of the entire system (including control and perturbations), presumably in the form of a stochastic timed Petri net. An important task will be to propose a thorough modeling methodology, based on a modular and hierarchical approach, and selecting the most appropriate abstraction level. One will then address questions such as the performance evaluation of different regulation/control strategies when random perturbations occur on the line. The objective will be to progress toward the recommendation of corrective actions for a given perturbed situation. On the theoretical side, another objective will be to explore the interest of modular/distributed strategies for the simulation and optimization of large systems. The interest of abstraction methods will also be explored.
This internship will take place in the SUMO team of the INRIA lab (Rennes, France), and is scheduled for 6 months. This work is part of a research contract with Alstom Transport (research), and may lead to a PhD. URL sujet détaillé : http://people.rennes.inria.fr/Eric.Fabre/masterRI2013_bE.pdf
Remarques : Co-supervised by Loic Helouet.





SM20768 Diagnosis of large systems by progressive model refinement


Description


The software, applications and systems that we use every day involve a large number of interconnected hardware and software resources, all of which must behave correctly for the expected service to be delivered. Consider, for example, a large web server located in a distant cloud and reached through a network provider and several successive network operators. The complexity of such systems is such that a detailed understanding of malfunctions has become inaccessible to human operators. This issue remains even when the scope is limited to a specific software layer, or to the data transport layer. The objective of this internship is to contribute to a formalism that helps model and monitor such large systems.
The proposed approach consists in representing a large system as a set of interconnected resources (or components) that provide services to one another. We wish to focus specifically on the hierarchical nature of such models: each component can be described at different levels of granularity, or can be represented by a refinable model. A first approach will use Bayesian networks as the underlying formalism, as they provide a natural setting to encode the covariances and dependencies of resources. The system will be assumed static, or frozen in time, and thus observed as a snapshot. Observations consist in reading some state variables of the system, or symptoms on some components. The general issue consists in guessing the state of the remaining unobserved variables or, more generally, in inferring some property of the system, for example the cause of a malfunction, given the collected observations. This problem is standard for Bayesian networks, and numerous dedicated algorithms exist.
The problem we wish to address here is the definition of hierarchical and generic Bayesian networks, where the model of a component can be replaced at any time by a more detailed one, in order to uncover new state variables, new subcomponents, or a finer description of its interactions with other components, and at the same time new observations on the state of the system. In this formalism, the issue will then be to design inference algorithms that not only take advantage of all the observations available at some granularity level, but also decide online whether the model should be refined to give access to new observations, and thus yield a more accurate result.
The work will be either focused on the theoretical aspects of the problem, or oriented toward the implementation of a simulator/demo for a given toy example. A typical application domain of this approach is the diagnosis of failures in telecommunication networks (see ref. below). URL sujet détaillé : http://people.rennes.inria.fr/Eric.Fabre/masterRI2013_cE.pdf
Remarques :





SM20769 Robust supervision and diagnosis of discrete event systems


Description


The supervision and diagnosis of (possibly distributed) discrete event systems is a research domain that has found several applications in automatic control, and specifically in the management of telecommunication networks. Formally, it is a natural extension of classical results in formal language theory and automata theory toward problems and applications raised in the community of discrete event dynamic systems.
This internship focuses on the questions of diagnosis and state estimation, which are very well understood. The diagnosis problem can be seen as the detection of specific events in a system (typically an automaton) from the output or observations provided by this system. In the classical approach, all observations issued by the system are used to answer the question (i.e., detect the failure). Here, one is rather interested in making the approach robust against model and observation errors. The idea is to keep only the last N observations produced by the system, and to forget observations that are too far in the past (and that may have misguided us, or may no longer be relevant). The problem consists in building an observer/diagnoser that corresponds to these assumptions. The starting point will be the classical derivation of observers, which one will try to transform into sliding-window observers taking into account only the last N observations. Specific attention will be paid to the relations between such objects for window sizes N and N+1, and to the limiting case as N goes to infinity. One will also study the relation of this problem to the notion of local automaton.
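A sliding-window state estimator of this kind can be sketched as follows (a simplified illustration with hypothetical names; a real diagnoser would also track fault labels):

```python
def window_estimate(transitions, states, window):
    """Possible current states given only the last N observations: since
    everything older is forgotten, the hidden run may have started in any
    state, and each observed event narrows the candidate set."""
    current = set(states)
    for event in window:
        current = {t for s in current
                   for t in transitions.get((s, event), ())}
    return current
```

Comparing the results for windows of size N and N+1 on the same run gives a concrete handle on the relations between observers that the subject asks about.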
Extensions of this work are then possible in different directions, for example to address the specific case of distributed/modular systems, or the case of stochastic systems. The links with classical methods for linear Gaussian systems (Kalman filtering) could also be examined. URL sujet détaillé : http://people.rennes.inria.fr/Eric.Fabre/masterRI2013_dE.pdf
Remarques : Co-supervised by Herve Marchand.





SM20770 Formalization of the executable semantics of the ANSI/ISO C Specification Language


Description


Frama-C is a suite of tools dedicated to the analysis of the source code of software written in C (http://framac.com). Its kernel implements a specification language called ACSL. Executable ACSL (E-ACSL) is the subset of ACSL whose annotations can be executed. A plugin translates annotated programs into equivalent C code in which the annotations are checked at runtime. The goal of this internship is to provide guarantees that the translation of E-ACSL annotations is sound. URL sujet détaillé : http://www.lri.fr/~marche/M2eacsl.pdf
Remarques : Co-supervised by Claude Marché, Inria Saclay - Île-de-France.





SM20771 Semantic aspects in the WoT avatar lifecycle


Description


This Master's thesis is part of the ASAWoO project.
The ASAWoO project aims at building an infrastructure for the Web of Things (WoT) based on the concept of avatar. An avatar is a logical artifact that supports the functional and semantic behavior of an object in the WoT. Based on previous work by first-year UCBL computer science Master's students, the avatar architecture is mostly specified, and the basis of the semantic data distribution and reasoning has been set up.
The aim of this thesis is to identify the knowledge and reasoning processes involved in the different stages of the avatar lifecycle (initialization, discovery, functional requests, ...), as well as to investigate existing solutions for deploying semantic web services between the avatar and a physical appliance.
To this end, the student will implement a prototype that fits this architecture and demonstrates the semantic aspects of the work. The avatar shall be generic, but the prototype will be linked to a Lego Mindstorms EV3. URL sujet détaillé :
Remarques : Co-supervisor: Michaël Mrissa (LIRIS). Funded by the ASAWoO project.
A PhD funding on the same topic opens in September 2014.





SM20772 Big Data Anonymisation Techniques


Description


Public datasets are used in a variety of applications, from genome and web-usage analysis to location-based and recommendation systems. Publishing such datasets is important, since they can help us analyze and understand interesting patterns. For example, mobility trajectories have become widely collected in recent years and have opened the possibility of improving our understanding of large-scale social networks by investigating how people exchange information, interact, and develop social interactions. With billions of handsets in use worldwide, the quantity of mobility data is gigantic. When aggregated, these data can help understand complex processes such as the spread of viruses, build better transportation systems, and prevent traffic congestion. While the benefits provided by these datasets are indisputable, they unfortunately pose a considerable threat to individual privacy: mobility trajectories might be used by a malicious attacker to discover potentially sensitive information about a user, such as their habits, religion or relationships. Because privacy is so important to people, companies and researchers are reluctant to publish datasets for fear of being held responsible for potential privacy breaches. As a result, only very few such datasets are actually released and available, which limits our ability to analyze the data and derive information that could benefit the general public.
The main goal of this internship is to design, implement and evaluate novel and practical anonymisation schemes under the k-anonymity and differential privacy models. Different datasets, with different (sometimes complex) data structures, will be considered. We will, for example, consider mobility traces (provided by a telecom operator), electricity traces (provided by an electricity provider), and several other types of datasets. The intern will contribute to the development of an anonymisation toolkit. URL sujet détaillé : https://team.inria.fr/privatics/privatebigdatapublication/
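The internship subject does not prescribe an algorithm, but the differential privacy model mentioned above can be illustrated with the classic Laplace mechanism for counting queries. Everything below (function names, example data) is an illustrative sketch, not part of the subject:

```python
import math
import random

def laplace_sample(scale):
    """Draw one sample from a centred Laplace distribution (inverse-CDF method).

    The probability-zero edge case u = -0.5 (which would give log(0)) is
    ignored in this sketch."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon):
    """Release a counting query under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so Laplace noise of scale 1/epsilon
    suffices."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_sample(1.0 / epsilon)

# Hypothetical example: number of trajectories passing through a zone.
trajectories = [{"zone": z} for z in ["A", "B", "A", "C", "A"]]
noisy = dp_count(trajectories, lambda r: r["zone"] == "A", epsilon=1.0)
```

The released value is centred at the true count 3; a smaller epsilon gives stronger privacy at the cost of more noise.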
Remarques : Grenoble or Lyon (with a preference for Grenoble).
Required skills:
Strong analytical and mathematical skills, and good programming skills (in particular in Python, Java and C). Some knowledge of security or privacy would also be appreciated but is not a prerequisite for the position.





SM20773 Practical secure multiparty computation


Description


The problem studied in the field of Secure Multi-Party Computation (SMPC) is as follows: there are n parties, or nodes, in a network. Each node has a private input that it does not wish to reveal to any other node. A secure multi-party computation protocol enables the nodes to collaboratively compute a function over their inputs such that all inputs remain private, i.e., known only to their owners. An example of such a protocol is private maximum: given n nodes, each with a private input, determine the maximum without revealing who holds it or any of the other private inputs. Other examples of private computations include geometric functions, set operations, etc.
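The private-maximum protocol mentioned above is non-trivial, but a simpler SMPC building block, additive secret sharing for a private sum, can be sketched in a few lines. This is a standard textbook construction, not a protocol from the subject; all names are illustrative:

```python
import random

P = 2**61 - 1  # public prime modulus; all arithmetic is mod P

def share(secret, n):
    """Split `secret` into n additive shares summing to secret mod P.
    Any n-1 of the shares are uniformly random, so they reveal nothing."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def private_sum(inputs):
    """Each party i shares its input; party j adds up the j-th share it
    receives from everyone. Publishing the n partial sums reveals only
    the total, never any individual input."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]   # row i: party i's shares
    partial = [sum(all_shares[i][j] for i in range(n)) % P for j in range(n)]
    return sum(partial) % P
```

Summing the partial sums telescopes back to the sum of the inputs, because each party's mask shares cancel modulo P.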
The objective of this project is to study secure multi-party computation protocols. Such protocols are often resource-heavy, which limits their practical application. The specific objective of the project is to analyze and propose methods for making these protocols resource-efficient and thus practical. A study of cryptographic techniques and privacy-enhancing technologies such as zero-knowledge proofs, homomorphic encryption, electronic cash, etc. will be part of the project. URL sujet détaillé :
Remarques :





SM20774 Systematic elimination of clocks in polyhedral X10 programs


Description


This work is related to parallel programming languages and program transformations, and is set in a collaboration between members of two INRIA teams (CAMUS and COMPSYS) interested in developing optimization techniques for the efficient use of parallel architectures.
This project is particularly concerned with optimizing X10 programs. The X10 language is a promising recent parallel language developed by IBM Research. The sequential core of X10 is an object-oriented language in the Java family. This core is augmented with a few parallel constructs that create activities and synchronize them. As in many other parallel languages, these constructs are partly redundant and may be used interchangeably in some circumstances, allowing a programmer to choose among several program shapes or viewpoints. In particular, synchronization can be achieved using clocks, which can be seen as a generalization of classical barriers. Synchronization on a clock is specified by the advance() method call: activities that execute advance() stall until all registered activities have done the same, and are then released at the same (logical) date.
We recently introduced a program transformation applicable to a large class of parallel programs (called "polyhedral programs"). This systematic transformation is able to switch between two extreme program shapes: a parallel composition of sequential activities that synchronize using clocks, and a sequence of parallel unclocked activities. These two extreme cases can be combined to produce many intermediate solutions, which opens up a large space of new potential optimizations. The transformation is based on the polytope model, which is widely used to model program control; polyhedral operations are used to manipulate the program.
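No X10 code is shown in the subject, but the advance() semantics described above (all activities stall, then all are released at the same logical date) can be mimicked with a plain barrier. This Python analogue is purely illustrative:

```python
import threading

NUM_ACTIVITIES = 3
PHASES = 2
clock = threading.Barrier(NUM_ACTIVITIES)  # plays the role of an X10 clock
log = []
log_lock = threading.Lock()

def activity(ident):
    for phase in range(PHASES):
        with log_lock:
            log.append((phase, ident))
        clock.wait()  # analogue of X10's advance(): stall until all arrive

threads = [threading.Thread(target=activity, args=(i,)) for i in range(NUM_ACTIVITIES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The barrier guarantees that no activity enters phase 1 before every
# activity has finished phase 0, so all phase-0 log entries precede
# all phase-1 entries.
assert [p for p, _ in log] == sorted(p for p, _ in log)
```

The transformation described in the subject trades this clocked shape for a sequence of unclocked parallel phases (one "phase" per logical date).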
Objective
The objective of this project is to implement this source-to-source transformation for the X10 language, in the simplest case where all dates are affine index expressions. (A further goal is to include this transformation in a compiler, within either static or dynamic optimizations.) The student will have to:
- study the class of polyhedral X10 programs and the algorithm for transforming them;
- use existing libraries for modeling programs and applying polyhedral operations (CLooG, ISL);
- manipulate an AST (Abstract Syntax Tree) corresponding to an X10 program: extract and then gather the required syntactic elements;
- implement the algorithm: compute the symbolic date expressions, build a new AST, and produce a new X10 program in concrete syntax. URL sujet détaillé : http://icubeicps.unistra.fr/index.php/M2_internships
Remarques : Other researchers involved in this internship: Paul Feautrier (COMPSYS [referent contact at ENS Lyon]), Alain Ketterlin (CAMUS), Tomofumi Yuki (Colorado State University).
For students who are not normaliens, a stipend of about 400 euros per month is planned (the usual rate for this type of internship).





SM20775 Contracting factor for Markov Chains


Description


Markov chains form a simple formalism for describing stochastic (probabilistic) systems. They can be used to model uncertainty in a railway system, or uncertainty over different initial conditions in a biological system (a signaling pathway such as the apoptosis pathway). A Markov chain can be seen as a linear transform (a matrix) acting on distributions over states. A distribution is a function associating a probability with each state, such that these probabilities sum to 1. Although the formalism is very simple, the behaviors a Markov chain creates can be quite complex. Recently, we described behaviors of Markov chains approximately as a regular language. To compute this language exactly, an important tool is the contracting factor of the chain. It can also be used to bound the error incurred by approximate analysis of biological systems [PALGT12]. More precisely, the graph of states can be decomposed into components, each with a unique stationary distribution (M x distrib_stat = distrib_stat). Moreover, on each component, for each initial distribution µ, (M^n x µ) converges towards distrib_stat as n goes to infinity. Computing this factor is thus very important. The goals of this internship are: 1) understand the decomposition of Markov chains; 2) find different algorithms to over-approximate the contracting factor (from the literature, potentially inventing new techniques); 3) compare them (theoretically and/or experimentally). URL sujet détaillé : http://www.crans.org/~genest/StageProba.pdf
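As a concrete illustration of the convergence statement above (not part of the subject), a small aperiodic chain shows the total-variation distance to the stationary distribution shrinking by a constant factor at each step; that factor is the contracting factor. The example matrix is assumed:

```python
def apply_chain(M, mu):
    """One step of a row-stochastic chain: nu[j] = sum_i mu[i] * M[i][j]."""
    n = len(M)
    return [sum(mu[i] * M[i][j] for i in range(n)) for j in range(n)]

def tv_distance(p, q):
    """Total-variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# A small irreducible, aperiodic 2-state chain (assumed example).
M = [[0.9, 0.1],
     [0.2, 0.8]]
# Its stationary distribution solves pi M = pi:
pi = [2 / 3, 1 / 3]

mu = [1.0, 0.0]          # arbitrary initial distribution
ratios = []
for _ in range(20):
    nu = apply_chain(M, mu)
    ratios.append(tv_distance(nu, pi) / tv_distance(mu, pi))
    mu = nu
# Each step shrinks the distance to pi by the same contraction factor
# (here the second eigenvalue of M, which is 0.7).
```

On larger chains the second eigenvalue is not available in closed form, which is exactly why over-approximation algorithms are of interest.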
Remarques :





SM20776 Optimizing and certified compilation for numerical accuracy


Description


Since exact computation is in general not tractable on computers, it is approximated by floating-point computation. This is the source of many errors in numerical programs. Because floating-point arithmetic is not intuitive, these errors are very difficult to detect and correct by hand, and we consider the problem of automatically synthesizing accurate formulas. We consider that a program would return an exact result if the computations were carried out using real numbers. In practice, roundoff errors arise during execution, and these errors are closely related to the way formulas are written. Our approach is based on abstract interpretation. We use Abstract Program Equivalence Graphs (APEGs) to represent, in polynomial size, an exponential number of mathematically equivalent expressions. The concretization of an APEG yields expressions of very different shapes and accuracies. We then extract optimized expressions from APEGs by searching for the most accurate concrete expressions among the set of represented expressions.
The current internship concerns the certification of the optimizations performed with APEGs. We aim at automatically generating certificates which can be automatically checked by a theorem prover. These certificates need to assert the mathematical equivalence of the original and synthesized formulas, and they also need to give a proof of the error bounds on the results of the computations of the programs. The proofs will rely on the Gappa tool. URL sujet détaillé : http://perso.univperp.fr/mmartel/stageM2compilfloatcertif.pdf
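The APEG machinery itself cannot be reproduced here, but the underlying phenomenon, that mathematically equivalent expressions carry different roundoff errors, is easy to demonstrate. The values below are a standard catastrophic-absorption example, not taken from the subject:

```python
from fractions import Fraction

def exact_error(computed, *exact_terms):
    """Rounding error of a float result against exact rational arithmetic."""
    return abs(Fraction(computed) - sum(Fraction(t) for t in exact_terms))

a, b, c = 1e16, -1e16, 1.0   # all three are exactly representable doubles

e1 = (a + b) + c   # a + b cancels exactly to 0.0, then + c gives 1.0
e2 = a + (b + c)   # b + c rounds back to -1e16 (c is absorbed), giving 0.0

# Both expressions are mathematically a + b + c = 1, yet their
# floating-point errors differ; an APEG-style optimizer would pick e1,
# and the certificate would have to prove both the equivalence and the
# error bound of the chosen expression.
```

The certificate generation targeted by the internship must establish such error bounds mechanically (here via the Gappa tool) rather than by the ad hoc reasoning in the comments above.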
Remarques : Co-advised by David Delmas (Airbus). An internship stipend is planned.
This internship is intended to lead to a CIFRE PhD (University of Perpignan and Airbus).





SM20777 Partitioning special classes of hypergraphs


Description


Topic:
Hypergraph partitioning based models are used for various complex and irregular problems arising in diverse domains, such as parallel scientific computing, VLSI design, software engineering, and database design. These models formulate an original problem as a hypergraph partitioning problem, trying to optimize a certain objective function (e.g., minimizing the total volume of communication in parallel iterative solvers for linear systems, optimizing the placement of circuitry on a die area, minimizing accesses to disk pages in processing GIS queries) while maintaining a constraint (e.g., balancing the computational load in a parallel system, using disk page capacities as an upper bound in data allocation) imposed by the problem. In general, the solution quality of the hypergraph partitioning problem directly relates to the performance of the modeled problem. Efficient and effective hypergraph partitioning algorithms are therefore important for many applications.
The hypergraph partitioning problem is NP-complete, and therefore heuristic algorithms are used. Recent work suggests that methods which take advantage of the underlying structure of a given hypergraph, when there is one, are far better (in partition quality, execution time, or both) than general-purpose methods. The subject of the internship is to study the hypergraph partitioning problem on special classes of hypergraphs from an algorithmic point of view. We will consider hypergraphs corresponding to tree-like structures. Novel partitioning algorithms that exploit the properties of those hypergraphs are to be developed. In the longer term, algorithmic and discrete-mathematical investigations are to be carried out to detect those properties, or approximations of them, in a given hypergraph; the overall approach will then be used to partition general hypergraphs.
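For reference, the communication-volume objective mentioned above is usually formalized as the connectivity-1 cut metric, which is simple to compute for a given partition. The toy hypergraph below is an assumed example:

```python
def connectivity_cut(hyperedges, part):
    """Connectivity-1 cut metric: a hyperedge spanning k parts costs k - 1.
    This models total communication volume in parallel sparse solvers."""
    cost = 0
    for edge in hyperedges:
        parts_touched = {part[v] for v in edge}
        cost += len(parts_touched) - 1
    return cost

def part_weights(part, k):
    """Weight of each part (unit vertex weights), for the balance constraint."""
    w = [0] * k
    for p in part.values():
        w[p] += 1
    return w

# Toy hypergraph (assumed): 6 vertices, 4 hyperedges, 2-way partition.
edges = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
```

Here two hyperedges are cut (each spanning 2 parts), so the cut cost is 2 and the partition is perfectly balanced; a structure-aware heuristic for tree-like hypergraphs would search this trade-off space far more efficiently than a general method.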
Candidate:
The ideal candidate is enthusiastic and motivated; has a background in algorithms and a working knowledge of graph theory, discrete mathematics, and combinatorial optimization; and enjoys implementing algorithms.
Some annotated references: + See a special hypergraph partitioning problem in: P. R. Amestoy, I. S. Duff, J.-Y. L'Excellent, Y. Robert, F.-H. Rouet, and B. Uçar, On computing inverse entries of a sparse matrix in an out-of-core environment, SIAM Journal on Scientific Computing, Vol. 34, No. 4, pp. A1975-A1999, 2012.
+ See another very related one in: J. Gil and A. Itai, How to pack trees, J. Algorithms, Vol. 32, No. 2, pp. 108-132, 1999.
+ A classical application is depicted in a set of course notes: http://perso.enslyon.fr/bora.ucar/papers/murcia.pdf
Keywords: hypergraphs, partitioning, NP-completeness, heuristics. URL sujet détaillé :
Remarques :





SM20778 C++ Development Engineer: prototype an innovative and very general method for handling arbitrary tool shapes in 3- to 5-axis machining


Description


The Machining & Composite Apps department of DASSAULT SYSTEMES PROVENCE is looking for an intern to take part, within the DELMIA V6 environment, in the study and realization of an industrial Machining project. The project consists in prototyping an innovative and very general algorithmic method for computing tool trajectories of numerically controlled machines with tools of arbitrary shape. The intern will join the development team, supervised by an experienced engineer, will develop in C++ under Visual Studio, and will be trained in Dassault Systèmes PLM technology and products. URL sujet détaillé :
Remarques : A computational-geometry subject requiring a strong capacity for abstraction. Do not expect to publish (professional secrecy is the rule).





SM20779 Development of a signal-alignment algorithm for finding quality markers in agricultural products


Admin


Encadrant : CoEncadrants 
Labo/Organisme : LIMOS, UMR 6158 CNRS et INRA, UR370 QuaPA 
URL :
Ville : Clermont-Ferrand



Description


INRA conducts research aimed at optimizing the properties of agricultural products in order to best meet the requirements of sectors as diverse as food processing, energy production, and biomedicine. In this context, identifying the constituents that mark the functional, nutritional, sanitary or sensory properties of these biological matrices is a major challenge. One of the most commonly used instrumental techniques for finding quality markers is gas chromatography coupled with mass spectrometry (GC-MS). For a given sample, the GC-MS signal can be represented as a matrix giving the abundance of each mass fragment generated by the spectrometer at each chromatographic time. The GC-MS signal over a set of samples is therefore a four-dimensional cube: the first dimension is the residence time in the chromatographic column, the second the different fragments analyzed by the mass spectrometer, the third the different samples, and the fourth the signal to be registered by correlation. The signals to be registered by correlation must be compared at the same dates, which requires a temporal alignment of the signals. The current algorithm works in a combinatorial mode and has exponential complexity, which drastically limits its use. As part of an INRA-LIMOS project called TADoSS, the internship will aim to: 1) improve the complexity of REALCOW to obtain a polynomial version, or failing that a good heuristic of low-degree polynomial complexity; 2) explore and possibly implement parallel solutions on GPU;
3) validate and possibly adapt REALCOW to allow efficient alignment of 5 other datasets provided by INRA on various problems (human nutrition, health, renewable energy production). URL sujet détaillé : http://www.malgouyres.org/activrech/data/sujetM2_inra.pdf
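REALCOW itself is not described in enough detail here to reproduce, but dynamic time warping (DTW) is a classical polynomial-time signal-alignment algorithm of the kind sought by objective 1, running in O(nm) by dynamic programming. This sketch is illustrative only:

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Dynamic time warping: cost of the optimal monotone alignment of two
    signals, computed in O(len(x) * len(y)) time and space by dynamic
    programming -- a classical polynomial-time alternative to exhaustive
    combinatorial alignment search."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three predecessor alignments.
            D[i][j] = dist(x[i - 1], y[j - 1]) + min(D[i - 1][j],
                                                     D[i][j - 1],
                                                     D[i - 1][j - 1])
    return D[n][m]

# A time-shifted copy of a chromatographic-style peak aligns at zero cost.
signal = [0, 0, 1, 5, 1, 0, 0]
shifted = [0, 1, 5, 1, 0, 0, 0]
cost = dtw(signal, shifted)
```

Whether a DTW-like recurrence can express REALCOW's correlation-based objective is precisely the kind of question objective 1 raises.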
Remarques : Co-advisors: Erwan Engel, Directeur de Recherche INRA, UR370 QuaPA, INRA de Theix
Saïd Abouelkaram, Ingénieur de Recherche INRA, UR370 QuaPA, INRA de Theix





SM20780 Semi-computability of geometric figures


Description


This work is a fundamental research project. The main theme is computability theory over the natural numbers and the real numbers, and its interaction with Euclidean geometry. Semi-computability notions appear everywhere in logic and theoretical computer science (the halting problem, provability of a logical formula, tiling the plane with a finite tile set, etc.). However, their interaction with geometry has hardly been investigated and is not well understood. The goal of this work is to improve our understanding of this interaction through specific, well-defined questions, among which:
- Is it possible to characterize semi-computable triangles in terms of the properties of their coordinates?
- Is it possible to find a set of independent parameters describing the semi-computable triangles?
- What is the minimal size of such a set of parameters? URL sujet détaillé : www.loria.fr/~hoyrup/geo.pdf
Remarques : This internship will be co-advised by Emmanuel Jeandel.





SM20781 Preimage search for the MD4 hash function


Description


Keywords: Hash function, cryptanalysis, preimage, MD4.
At CRYPTO 2012, Knellwolf and Khovratovich proposed a new differential view of the meet-in-the-middle techniques used to attack the preimage resistance of various hash functions of the MD4 family. An advantage of their approach is that it makes the search for attack parameters easier; this was demonstrated by the improvements they made to the previously best attacks on SHA-1.
The purpose of this internship is to study this framework and to apply it to the MD4 hash function. More generally, it will be interesting to understand how it can be used to attack functions other than SHA-1. A major motivation is to use these new techniques to try to improve the best known attacks on MD4.
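The Knellwolf-Khovratovich framework is beyond a short sketch, but the basic meet-in-the-middle idea, trading memory for time by matching two half-computations in the middle, can be shown on a toy double-encryption example. The 8-bit "cipher" below is invented for illustration and has no security whatsoever:

```python
def enc(k, b):
    """Toy invertible 8-bit round: illustration only, no real security."""
    return (((b ^ k) * 7) + k) % 256

def dec(k, c):
    inv7 = 183  # 7 * 183 = 1281 = 1 (mod 256), so 183 inverts the *7 step
    return (((c - k) * inv7) % 256) ^ k

def mitm_double(p, c):
    """Meet-in-the-middle key recovery on double encryption:
    2 * 2^8 half-computations instead of 2^16 full trials.
    Tabulate all forward half-encryptions of p, then match each
    backward half-decryption of c against the table."""
    forward = {}
    for k1 in range(256):
        forward.setdefault(enc(k1, p), []).append(k1)
    hits = []
    for k2 in range(256):
        mid = dec(k2, c)
        for k1 in forward.get(mid, []):
            hits.append((k1, k2))
    return hits

p = 0x3A
c = enc(0x21, enc(0x5C, p))   # "unknown" key pair (0x5C, 0x21)
```

Preimage attacks on MD4-family functions apply the same meet-in-the-middle principle inside the compression function, with the differential view of Knellwolf and Khovratovich guiding the choice of the two halves.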
Reference:
Simon Knellwolf and Dmitry Khovratovich. New Preimage Attacks against Reduced SHA-1. In Reihaneh Safavi-Naini and Ran Canetti, editors, CRYPTO, volume 7417 of Lecture Notes in Computer Science, pages 367-383. Springer, 2012. URL sujet détaillé : http://perso.eleves.ensrennes.fr/~pkarp892/md4preimagesEN.pdf
Remarques : Co-adviser: Pierre Karpman.





SM20782 Verification of equivalence properties in security protocols


Description


Context. Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and, more recently, electronic elections. Since the properties must hold even when the protocol is executed over untrusted networks (such as the Internet), these protocols have proven extremely difficult to get right. Formal methods have proven very useful for detecting errors and ensuring correctness. Many results exist in the literature for analyzing reachability properties, such as confidentiality and authentication. Recently, indistinguishability properties have received a lot of attention. The notion of indistinguishability is particularly useful for modeling different flavors of anonymity, strong versions of confidentiality, and the specification of security properties as ideal systems. Indistinguishability is conveniently modelled by process equivalences in extended pi calculi that allow the modeling of cryptography by means of equational theories.
Objectives of the internship. Recently, a new procedure [CCK12] has been proposed for verifying such equivalence properties. The tool is based on a modeling of protocols as first-order Horn clauses, techniques from rewriting theory, and a dedicated resolution procedure. The aim of the internship is to extend the scope of this procedure, which currently does not support else branches in conditionals. We expect that techniques used in the tool ProVerif [BAF08] can be adapted to enable the tool to reason about protocols that require else branches. This extension may also lead to an implementation in the tool AKisS, which implements (in OCaml) the procedure described in [CCK12]. URL sujet détaillé : http://www.lsv.enscachan.fr/Stages/Fichier/m214sdsk1.pdf
Remarques : We are looking for candidates with good skills in the foundations of computer science (logic, automated deduction, concurrency theory, etc.). Some knowledge of security is an asset but is not mandatory; the candidate will acquire this knowledge during the internship.
The internship will be located at Nancy or at ENS Cachan depending on the choice of the candidate. This internship may also lead to a PhD thesis on similar topics.





SM20783 Verification of cryptographic protocols


Description


The Internet is a large common space, accessible to everyone around the world. As in any public space, people should take appropriate precautions to protect themselves against fraudulent people and processes. It is therefore essential to obtain as much confidence as possible in the correctness of the applications that we use.
Because security protocols are notoriously difficult to design and analyse, formal verification techniques are extremely important. Formal verification of security protocols has seen significant success over the last two decades. The techniques have matured, and several tools for protocol verification are now available. However, nearly all studies focus on trace-based security properties, and thus do not allow one to analyse the privacy-type properties that play an important role in many modern applications.
The goal of this internship is to study privacy-type properties, and especially modularity issues. The idea is to combine existing decision procedures in order to deal with various cryptographic primitives. See the attached PDF. URL sujet détaillé : http://www.lsv.enscachan.fr/Stages/Fichier/m214sdsk2.pdf
Remarques : We are looking for candidates with good skills in the foundations of computer science (logic, automated deduction, concurrency theory, etc.). Some knowledge of security is an asset but is not mandatory; the candidate will acquire this knowledge during the internship.
The internship will be located at Nancy or at ENS Cachan depending on the choice of the candidate. This internship may also lead to a PhD thesis on similar topics.





SM20784 Verification of protocols with loops


Description


Context. Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and, more recently, electronic elections. Since the properties must hold even when the protocol is executed over untrusted networks (such as the Internet), these protocols have proven extremely difficult to get right. Formal methods have proven very useful for detecting errors and ensuring correctness. Many automated tools exist for analyzing protocols. However, while classical authentication protocols are a simple sequence of inputs and outputs, recent protocols tend to have more complicated program structures. Examples include protocols with loops. For instance, the TESLA protocol [PCTS00] is a broadcast stream authentication protocol which sends a stream of data in a loop without a bound on the number of iterations. Automatically analyzing such a protocol is out of the scope of most existing tools. Recently, a new tool, called tamarin [SMCB12], has been proposed. In contrast to most other fully automated tools, it allows a user to interact with it and guide it in order to avoid non-termination. (Note that non-termination is unavoidable, as the problem of verifying security protocols is undecidable in general.) It has been shown that the tool is indeed able to analyze protocols such as TESLA. However, a shortcoming of tamarin is that its input language, labelled multiset rewriting rules, is a very low-level language. This makes the process of specifying protocols a difficult, error-prone task, yielding specifications that are difficult to read.
To overcome this shortcoming, we recently presented a tool [KK13] which translates specifications written in a high-level specification language, a variant of the applied pi calculus, into the input language of tamarin, using tamarin as a backend.
Objectives of the internship. The aim of the internship is to verify protocols with loops, such as TESLA, written directly in the high-level specification language. This language has no direct loop construct, but loops can be encoded in different ways. During the internship, the student will explore different possible encodings of loops in this variant of the applied pi calculus, as well as the possibility of extending the language with a loop construct and extending the translation accordingly. The ability to prove protocols with loops is also related to the analysis of protocols with lists [BP13], which provides another possible set of target applications. URL sujet détaillé : http://www.lsv.enscachan.fr/Stages/Fichier/m214sdsk3.pdf
Remarques : We are looking for candidates with good skills in the foundations of computer science (logic, automated deduction, concurrency theory, etc.). Some knowledge of security is an asset but is not mandatory; the candidate will acquire this knowledge during the internship.
The internship will be located at Nancy or at ENS Cachan depending on the choice of the candidate. This internship may also lead to a PhD thesis on similar topics.





SM20785 Application of efficient encryption primitives to searching over data


Description


The type of cryptographic primitives used has a significant impact on the overall efficiency of a protocol, especially in the case of searching over encrypted data; a compromise between security and efficiency is required. The purpose of this internship is to study schemes that limit the overhead of encryption through optimized symmetric encryption primitives. In particular, the concept of Order-Preserving Encryption (OPE) has been proposed to allow efficient comparison of numerical data (see [1]). Recent advances in searchable encryption have also been obtained [2].
The internship will involve a state-of-the-art survey of these specific primitives, a comparison of the results against practical constraints, and their adaptation to new biometric applications.
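The actual OPE construction of [1] samples its order-preserving map lazily via the hypergeometric distribution; the toy sketch below only illustrates the functional property that comparisons work directly on ciphertexts. It is not secure and is not the scheme of [1]; all names are illustrative:

```python
import random

def ope_keygen(domain_size, range_size, seed=0):
    """Toy order-preserving 'encryption' key: a secret strictly increasing
    random map from [0, domain_size) into [0, range_size).

    Illustrative only: materializing the whole map does not scale and the
    construction leaks far more than a real OPE scheme is meant to."""
    rng = random.Random(seed)
    image = sorted(rng.sample(range(range_size), domain_size))
    return dict(enumerate(image))

key = ope_keygen(16, 1000, seed=42)
enc = lambda m: key[m]

# Order comparisons work directly on ciphertexts, which is exactly what
# lets an untrusted server answer range queries over encrypted data:
assert all((enc(a) < enc(b)) == (a < b) for a in range(16) for b in range(16))
```

The security price of this convenience (the order leakage itself) is precisely what the internship's comparison against practical constraints has to weigh.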
[1] Alexandra Boldyreva, Nathan Chenette, Adam O'Neill: Order-Preserving Encryption Revisited: Improved Security Analysis and Alternative Solutions. CRYPTO 2011: 578-595
[2] D. Cash, S. Jarecki, C. Jutla, H. Krawczyk, M. Rosu, and M. Steiner. Highly-scalable searchable symmetric encryption with support for boolean queries. Crypto'2013. URL sujet détaillé : http://cryptrl.net/stages/MORPHOopestage201314.pdf
Remarques :





SM20786 Proof theory for term representations


Description


Basic to the construction of any symbolic system is the structure of terms, which are used to represent expressions, programs, formulas, types, etc. One way to define the structure of terms is as small proofs of their types. Such a view of terms provides a direct way to model substitutions as well as bindings within terms. For example, a common representation of λ-terms uses the head normal form, where every (normal) term has the structure (λ x1 ... λ xi. h t1 ... tj),
where h (the head of the term) is a variable or constant, t1, ..., tj are head-normal terms (the arguments), and x1, ..., xi are variables bound in the body of this term (the binder). This structure for terms corresponds to particular cut-free proofs (variously called uniform proofs or LJT proofs) of implicational formulas viewed as types. Given that "terms are proofs", the encoding and dynamics of substitution can be addressed using cut and cut-elimination.
One drawback of the head-normal form structure of λ-terms is that it offers no direct support for either sharing or parallel structure in terms. Direct support for sharing makes it possible to have much more compact encodings of some structures than the head normal form provides, since the latter requires copying and recopying the same substructure. However, sharing raises its own questions: Can sharing in terms be enforced? Can sharing persist during substitutions and normalizations? Can sharing help or hinder operations on term structures such as equality checking and unification?
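One standard way to realize the sharing discussed above is hash-consing: structurally equal subterms are interned so each exists only once, and equality checking becomes pointer comparison. The minimal sketch below is illustrative and is not taken from the cited papers:

```python
class Term:
    """First-order terms with maximal sharing via hash-consing:
    structurally equal terms are represented by a single object."""
    _pool = {}  # keeps interned terms alive, so their ids stay valid

    def __new__(cls, head, *args):
        # Since subterms are themselves interned, identity of the args
        # suffices as a structural key.
        key = (head,) + tuple(id(a) for a in args)
        t = cls._pool.get(key)
        if t is None:
            t = super().__new__(cls)
            t.head, t.args = head, args
            cls._pool[key] = t
        return t

x = Term("x")
fx = Term("f", x)
t1 = Term("g", fx, fx)                                    # f(x) stored once
t2 = Term("g", Term("f", Term("x")), Term("f", Term("x")))

assert t1 is t2                  # equality check is pointer comparison
assert t1.args[0] is t1.args[1]  # sharing is visible inside the term
```

In the proof-theoretic reading sketched in the internship description, positively polarized focused proofs would play the role this interning table plays operationally.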
Recent advances in proof theory related to focused proof systems should have immediate application to these issues of term representation. Some of these recent advances include:
Focusing in intuitionistic logic allows two polarities to be given to atomic propositions. If such propositions are assigned a negative polarity, then cut-free focused proofs correspond to head-normal form terms. If positive polarity is used instead, then focused proofs encode terms with sharing.
Focused proofs can easily allow for "multi-focusing", and there is strong evidence in recent papers that multi-focusing provides a good encoding of parallelism in proofs and terms.
The purpose of this internship is to address some of these issues in term representation by making use of the recent results in focusing proof systems.
This internship will take place within the funding context of the ProofCert project (an ERC Advanced Investigator Award). This five-year project (2012-2016) provides funding for PhD students covering a broad range of topics in a foundational approach to proof certificates. Continuing to work on this topic for a PhD is possible and encouraged.
Bibliography
The following bibliography is indicative of the kind of papers that will be part of this research effort.
[CHM13] A multi-focused proof system isomorphic to expansion proofs, by Kaustuv Chaudhuri, Stefan Hetzl, and Dale Miller. Accepted to the J. of Logic and Computation. [DL06] LJQ: A Strongly Focused Calculus for Intuitionistic Logic, by Roy Dyckhoff and Stéphane Lengrand. CiE 2006: 173-185. [LM09] Focusing and Polarization in Linear, Intuitionistic, and Classical Logic, by Chuck Liang and Dale Miller. Theoretical Computer Science, 410(46), pp. 4747-4768 (2009). [S07] Completing Herbelin's Programme, by José Espírito Santo. TLCA 2007: 118-132. URL sujet détaillé :
Remarques :





SM20787 Secure computation for biometrics


Description


Secure multi-party computation (MPC) is a set of cryptographic techniques enabling interactive computation without revealing the inputs of the computation. One of the most widely used primitives in such computations is oblivious transfer (OT), which allows a first party to retrieve an item of its choice from a list held by a second party, without learning anything about the other elements of the list and without the second party learning which item was chosen.
In a first step, the intern will focus on different techniques for efficient oblivious transfer (e.g., based on [1,2]), the aim being their application to the secure computation of biometric comparisons [3,4]. In a second step, the intern will study techniques for moving from minimal security (the semi-honest model) to strong security (the malicious model) [5] for the same applications. Implementations to assess practical performance are anticipated.
[1] Yuval Ishai, Joe Kilian, Kobbi Nissim, Erez Petrank: Extending Oblivious Transfers Efficiently. CRYPTO 2003: 145-161 [2] Gilad Asharov, Yehuda Lindell, Thomas Schneider, Michael Zohner: More Efficient Oblivious Transfer and Extensions for Faster Secure Computation. ACM CCS 2013. [3] Julien Bringer, Mélanie Favre, Hervé Chabanne, Alain Patey: Faster secure computation for biometric identification using filtering. ICB 2012: 257-264 [4] Julien Bringer, Hervé Chabanne, Alain Patey: SHADE: Secure HAmming DistancE computation from oblivious transfer. WAHC 2013 [5] Yehuda Lindell: Fast Cut-and-Choose Based Protocols for Malicious and Covert Adversaries. CRYPTO 2013: 1-17 URL sujet détaillé : http://www.cryptrl.net/stages/MORPHOsecurecomputationstage201314.pdf
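To make the OT interface concrete, the sketch below uses an ideal-functionality stand-in for OT (a trusted third party, not a cryptographic protocol) and applies it in a SHADE-style additive-masking pattern, in the spirit of [4], to compute a Hamming distance. All function names are illustrative:

```python
import secrets

def ideal_ot(messages, choice):
    """Ideal 1-out-of-2 oblivious transfer functionality (a trusted-party
    stand-in, NOT a cryptographic protocol): the receiver gets exactly
    messages[choice]; the sender learns nothing about `choice`."""
    return messages[choice]

def secure_hamming(a_bits, b_bits):
    """SHADE-style sketch: for each bit position, the sender offers the
    two messages r_i + (a_i XOR 0) and r_i + (a_i XOR 1); the receiver's
    bit b_i selects one via OT. Summing and removing the masks yields
    the Hamming distance without revealing individual bits."""
    masks = [secrets.randbelow(1 << 16) for _ in a_bits]
    receiver_sum = 0
    for mask, (a, b) in zip(masks, zip(a_bits, b_bits)):
        m0 = mask + (a ^ 0)          # message if the receiver's bit is 0
        m1 = mask + (a ^ 1)          # message if the receiver's bit is 1
        receiver_sum += ideal_ot((m0, m1), b)
    sender_sum = sum(masks)
    return receiver_sum - sender_sum  # joint output: the Hamming distance
```

Replacing `ideal_ot` by an actual OT protocol (and OT extension [1,2] to amortize its cost over many bit positions) is exactly where the internship's efficiency questions arise.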
Remarques :





SM20788 Isotropic Meshing of NURBS Surfaces


Description


NURBS (Non-Uniform Rational Basis Spline) is a mathematical model that has become a standard in the modeling community for generating and representing curves and surfaces. It offers flexibility, precision and intuitive human interaction through control points [1]. NURBS surfaces are nowadays widely used in computer graphics, computer-aided design, manufacturing and engineering. Nevertheless, for some specific applications such as computational engineering or real-time rendering, NURBS surfaces must be converted into isotropic triangle surface meshes. Such conversion is still a scientific challenge, and current mesh generators suffer from a lack of control over the shape and size of the mesh elements, as well as from a lack of topological guarantees. The goal of this internship is to devise a reliable algorithm for meshing NURBS surfaces within the generic meshing framework of the CGAL library [2]. Experiments and comparisons with the Workbench software from Ansys will be carried out.
In order to control the quality of the output surface triangle mesh, notably in terms of the shape and size of the triangles, we will investigate an approach based on Delaunay refinement that uses the notion of restricted Delaunay triangulations. This approach consists of sampling a point set on the NURBS surface to initialize the restricted Delaunay triangulation, then refining it until all mesh elements meet some user-specified criteria [2]. The refinement process inserts new points that are computed as intersections between the NURBS surface and line segments that are dual Voronoi edges of Delaunay triangles.
The first objective is to devise a reliable and efficient line/NURBS intersection oracle. For this purpose, we will investigate a new matrix-based representation of NURBS surfaces relying upon recent advances in algebraic techniques for eliminating variables in polynomial systems [3]. This representation, adapted to trimmed NURBS, i.e., NURBS that contain holes, makes it possible to compute intersections by means of state-of-the-art methods in numerical linear algebra (matrix kernel, singular value decomposition, eigencomputation). The second objective is to specialize the initialization of the mesh generation process to NURBS surfaces. The third objective, more prospective, is to assess the common defects of NURBS surfaces (e.g., cracks, islands, self-intersections) in order to devise an algorithm resilient to defect-laden inputs.
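The refine-until-criteria loop can be sketched in a drastically simplified setting, where the unit circle stands in for the NURBS surface, chord length stands in for the user-specified criterion, and the arc midpoint stands in for the intersection point returned by the oracle (all of this is a toy analogue, not CGAL code):

```python
import math

def refine_circle(eps):
    # Delaunay-refinement-style loop in a toy 1D setting: keep a sorted
    # sample of angles on the unit circle and split every arc whose
    # chord exceeds eps by inserting its midpoint (the analogue of a
    # surface/dual-edge intersection point).
    angles = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]  # coarse initial sample
    while True:
        new, split = [], False
        n = len(angles)
        for i in range(n):
            a = angles[i]
            b = angles[i + 1] if i + 1 < n else angles[0] + 2 * math.pi
            new.append(a)
            if 2 * math.sin((b - a) / 2) > eps:  # chord of the arc (a, b)
                new.append((a + b) / 2)
                split = True
        angles = new
        if not split:
            return angles

pts = refine_circle(0.2)
n = len(pts)
chords = [2 * math.sin(((pts[i + 1] if i + 1 < n else pts[0] + 2 * math.pi)
                        - pts[i]) / 2) for i in range(n)]
perimeter = sum(chords)  # approximates the true length 2*pi
```

All final chords respect the criterion, and the polyline length approaches 2π, the analogue of the shape/size control sought on the triangle mesh.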
References:
[1] L. Piegl and W. Tiller. The NURBS Book. Springer, Monographs in Visual Communication, 2nd edition, 1997.
[2] C. Jamin, P. Alliez, M. Yvinec and J.-D. Boissonnat. CGALmesh: a Generic Framework for Delaunay Mesh Generation. Inria Technical Report 8256. Available at http://hal.inria.fr/hal-00796052/en/.
[3] L. Busé. Implicit matrix representations of rational Bézier curves and surfaces. To appear in Computer-Aided Design, preprint available at http://hal.inria.fr/hal-00847802/en/.
URL sujet détaillé :
Remarques : Internship co-advised by Pierre Alliez (http://team.inria.fr/titane/pierrealliez/), TITANE team (http://team.inria.fr/titane/), and Laurent Busé (http://wwwsop.inria.fr/members/Laurent.Buse/), GALAAD team (http://wwwsop.inria.fr/galaad/). Possible stipend, to be confirmed.





SM20789 Multiresolution behavior of length and surface-area estimators in the 3D discrete space


Description


A way to evaluate length estimators of curves is to observe their behavior on curves discretized from continuous reference curves. When the resolution tends to zero, the discretized curve "tends" to the continuous reference curve from which it originates. It is then possible to measure the convergence speed of the length estimator on the reference curve. A new class of so-called semi-local estimators was proposed in our team a few years ago. These estimators converge towards the reference length, and we are studying their convergence speed according to the assumptions made on the reference curve. The first objective of the internship is to extend to skew curves the results obtained on planar curves with this class of estimators, and to develop a program for both validating the results and putting this class of estimators to practical use. A second objective is to investigate the extension of both the theoretical properties and the implementation to surface-area computation (in Z^n with n greater than or equal to 3). URL sujet détaillé :
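The multigrid-convergence setup can be illustrated with a toy estimator (not one of the team's actual estimators): digitize the unit circle on a grid of step h, then sum chords between samples taken at an arc spacing of about sqrt(h), i.e. look at the curve at a scale between the pixel and the whole curve, which is the semi-local idea:

```python
import math

def semilocal_length(h):
    # Toy "semi-local" estimator on a digitization of the unit circle:
    # points are rounded to a grid of step h, but chords join samples
    # at arc spacing ~ sqrt(h), which lets the estimate converge to the
    # true length 2*pi as the resolution h tends to 0.
    n = int(2 * math.pi / math.sqrt(h))
    prev = first = None
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * k / n
        # rounding to the grid models the discretization
        p = (round(math.cos(t) / h) * h, round(math.sin(t) / h) * h)
        if first is None:
            first = p
        else:
            total += math.hypot(p[0] - prev[0], p[1] - prev[1])
        prev = p
    return total + math.hypot(first[0] - prev[0], first[1] - prev[1])

for h in (1e-2, 1e-4):
    print(h, abs(semilocal_length(h) - 2 * math.pi))  # error shrinks with h
```

A purely local estimator (chords between adjacent grid points) would not be multigrid convergent; the sqrt(h) window is what makes the error vanish.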
Remarques : Co-advised by Étienne Baudrier and Mohamed Tajine. Stipend: €436.80 per month.





SM20790 Verification of concurrent systems with data structures


Description


We are interested in verification techniques for realistic and powerful (physically) distributed programs that may access unbounded data structures such as queues or stacks. More precisely, each process is described by a finite state machine and, when performing a transition, a process may read from or write to some unbounded data structure. The architecture of such a system is defined by the set of processes, the set of data structures and their types (queues, stacks, bags, ...), and the relations Reader and Writer between data structures and processes.
Such systems are Turing-powerful as soon as there is a process with a self-queue, or with two self-stacks, or there are two processes with queues between them, etc. Therefore, all interesting verification problems are undecidable in general.
To regain decidability, a very promising approach is to restrict the possible behaviors of systems by bounding some parameter. Our choice is to leave the data structures unbounded (otherwise we are in the well-studied setting of finite-state systems) and not to compromise on their reliability (as, e.g., with lossy channels). Instead, we restrict to behaviors with bounded tree-width, or equivalently with bounded split-width. This parameter subsumes many under-approximation techniques used so far (e.g., bounded context, bounded phase, bounded scope, ...) and gives decision procedures with good complexities.
The PhD thesis of Aiswarya Cyriac will be the main reference for this internship. Our aim is to address some of the many problems left open there. For instance, there is a need to design good controllers for the various decidable restrictions. Ideally, we aim at controllers that are distributed, deterministic, non-blocking and that, when synchronized with a system, allow all and only those behaviors satisfying the considered restriction. Also, in the thesis, every data structure admits a single writer and a single reader. Is it possible to relax this hypothesis and to allow, for instance, several readers or several writers for some data structure? URL sujet détaillé : http://www.lsv.enscachan.fr/~gastin/Verif/2013stageM2CPDS.pdf
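A minimal concrete instance of such a system, with one FIFO queue written by p0 and read by p1, and a bounded-context under-approximation of reachability (a hypothetical toy; note that the naive exploration below also caps the queue length, whereas the actual decision procedures keep queues unbounded and represent their contents symbolically):

```python
from collections import deque

# Two finite-state processes: p0 writes to, p1 reads from, one FIFO queue.
P0 = {0: [("send", "a", 1)], 1: [("send", "b", 0)]}
P1 = {0: [("recv", "a", 1)], 1: [("recv", "b", 0)]}

def reachable(target, max_switches, max_len=4):
    # Bounded-context reachability: explore only runs in which control
    # passes between the two processes at most `max_switches` times.
    start = (0, 0, (), 0, 0)  # (state of p0, state of p1, queue, active, switches)
    seen, todo = {start}, deque([start])
    while todo:
        s0, s1, q, active, sw = todo.popleft()
        if (s0, s1) == target:
            return True
        for nxt in (0, 1):
            nsw = sw + (nxt != active)
            if nsw > max_switches:
                continue
            for kind, msg, dst in (P0[s0] if nxt == 0 else P1[s1]):
                if kind == "send" and len(q) < max_len:
                    c = (dst, s1, q + (msg,), nxt, nsw)
                elif kind == "recv" and q and q[0] == msg:
                    c = (s0, dst, q[1:], nxt, nsw)
                else:
                    continue
                if c not in seen:
                    seen.add(c)
                    todo.append(c)
    return False

print(reachable((1, 1), 1))  # reachable with one context switch
print(reachable((1, 1), 0))  # unreachable without switching: a recv is needed
```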
Remarques : co-advised by Benedikt Bollig





SM20791 Probabilistic specifications


Description


Current needs in the verification of systems are evolving from boolean properties to finer quantitative properties. The overall theory behind automatic verification is actively being extended by the scientific community from the boolean to the quantitative setting. One of the foremost extensions concerns the study of probabilistic systems and probabilistic specifications.
In this internship, we mainly focus on probabilistic specification formalisms. In the boolean case, a specification assigns true or false to each possible behavior. In the probabilistic extension, the specification assigns probabilities to behaviors. Recently, a fragment of weighted regular expressions was proposed and shown to have the same expressive power as probabilistic automata. Since the definition of these probabilistic expressions uses rewriting rules, it is not obvious how to decide whether a given weighted expression is actually a probabilistic one. Solving this problem is the first aim of this internship. A second objective is to search for an equivalent definition of probabilistic expressions for which membership would be easier to check.
There are other interesting specification formalisms, in particular those like PDL (Propositional Dynamic Logic) that mix state formulae with regular path expressions. A weighted extension of PDL has been proposed in the recent PhD thesis of Benjamin Monmege. In this internship, we will also try to define and study a natural fragment of weighted PDL dedicated to probabilities. As a special case, probabilistic PDL should subsume the probabilistic extension of LTL defined in [P. Gastin and B. Monmege. Adding pebbles to weighted automata. CIAA'12]. URL sujet détaillé : http://www.lsv.enscachan.fr/~gastin/Verif/2013stageM2probaspec.pdf
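For reference, a probabilistic automaton assigns a probability to each word by multiplying per-letter stochastic matrices; a toy two-state example (the matrices are made up):

```python
# A two-state probabilistic automaton over {a, b}: one row-stochastic
# matrix per letter; the probability of a word w is init . M(w) . final.
M = {
    "a": [[0.5, 0.5], [0.0, 1.0]],
    "b": [[1.0, 0.0], [0.3, 0.7]],
}
INIT = [1.0, 0.0]
FINAL = [0.0, 1.0]  # acceptance weight of each state

def word_prob(word):
    v = INIT[:]
    for letter in word:
        v = [sum(v[i] * M[letter][i][j] for i in range(2)) for j in range(2)]
    return sum(v[j] * FINAL[j] for j in range(2))

print(word_prob("a"))   # 0.5
print(word_prob("aa"))  # 0.75
```

The membership question mentioned above asks, conversely, whether a given weighted expression denotes such a behavior-to-probability assignment.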
Remarques : co-advised by Benedikt Bollig





SM20792 Synthesis of Distributed Systems with Parameterized Network Topology


Description


We consider distributed programs that can be run on an arbitrary network topology from a class of topologies (e.g., on all pipeline, all grid, or all ring topologies). Typically, such a program is given by a single sequential process, and a copy of that process can be run on any node in the given network topology. During its execution, each process may receive signals from an (uncontrollable) environment.
In parameterized synthesis, a specification S is given (e.g., a temporal-logic formula) describing the desired system behavior. The goal is then to synthesize a program that implements S. Thus, the program acts as a collection of local controllers that force the system to satisfy S, independently of the network topology and independently of the behavior of the environment. There have been, so far, only a few approaches to parameterized synthesis. Existing works consider specifications that are interpreted over sequentialized executions of a system. The aim of the internship is to develop a framework for parameterized synthesis in a setting with true concurrency, where the specification is interpreted over partially ordered behaviors reflecting the parallelism. URL sujet détaillé : http://www.lsv.enscachan.fr/~gastin/Verif/2013stageM2parameterized.pdf
Remarques : co-advised by Benedikt Bollig





SM20793 Compiling music into FPGA


Description


FAUST (Functional Audio Stream, http://sourceforge.net/projects/faudiostream/) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plugins for a variety of platforms and standards. It is developed in Lyon by the GRAME laboratory (http://www.grame.fr/).
In some situations, sound synthesis with FAUST requires more processing power than classical computers can offer. The objective of this internship is to provide an FPGA-accelerated backend for FAUST that offers lower latency and can run on stage on a small FPGA board (we will use a ZedBoard based on the Xilinx Zynq-7000, see http://www.zedboard.org/).
Currently, FAUST compiles a high-level description of sound and music into C++, with signal processing performed in floating point. In a first step, we will generate functionally equivalent dataflow descriptions of the core DSP loops as VHDL pipelines. This VHDL backend will be built upon the FloPoCo project (http://flopoco.gforge.inria.fr/).
In a second step, automatic conversion to fixed point will be studied for improved performance. Very probably, this will require interaction with sound experts at GRAME. URL sujet détaillé :
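The kind of core DSP loop concerned, and the fixed-point question of the second step, can be sketched as follows (a hypothetical one-pole low-pass filter, written in Python for illustration; this is neither FAUST nor FloPoCo code):

```python
def lowpass_float(x, a=0.9):
    # Reference one-pole low-pass filter in floating point:
    # y[n] = a*y[n-1] + (1-a)*x[n]
    y, out = 0.0, []
    for s in x:
        y = a * y + (1 - a) * s
        out.append(y)
    return out

def lowpass_fixed(x, a=0.9, frac_bits=16):
    # Same filter in Q-format fixed point with `frac_bits` fractional
    # bits, the kind of arithmetic a VHDL pipeline would implement.
    scale = 1 << frac_bits
    A = round(a * scale)
    B = scale - A
    y, out = 0, []
    for s in x:
        y = (A * y + B * round(s * scale)) >> frac_bits
        out.append(y / scale)
    return out

sig = [1.0] * 50  # step input
err = max(abs(u - v) for u, v in zip(lowpass_float(sig), lowpass_fixed(sig)))
print(err)  # small: 16 fractional bits suffice here
```

Choosing the number of fractional bits per signal, automatically and safely, is exactly the difficulty behind the second step.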
Remarques : Co-advised with the author of FAUST, Yann Orlarey (GRAME).





SM20794 A design environment for computing just right


Description


The objective of this training period is to help save energy and resources by computing just right. In an ideal computing world, no bit should be flipped and no bit should be transmitted that is not actually useful to the application. However, for half a century, Moore's law and the very successful fixed-architecture processor have kept us lazy and away from this ideal. As an illustration, double-precision arithmetic (64-bit resolution, 15 decimal digits of precision) is vastly overkill for most applications, but... it has been free so far.
However, the party is now over. Power consumption issues increasingly constrain the performance of computing cores, and people are beginning to reassess the cost of computing and transmitting useless bits.
The FloPoCo project (http://flopoco.gforge.inria.fr/) helps circuit designers compute just right by offering arithmetic operators for a wide (and growing) range of application-specific operations. Moreover, all these operators are parameterized, so a designer can choose the precision that best suits her application requirements.
However, this raises a new question: how to determine the precision requirements of each line of code of an application? This training period will attempt to address this question. Worrying about precision is an unusual exercise, since programmers have been spoilt by decades of free double precision. After a review of the few existing tools for this purpose, we will attempt to sketch a design environment for computing just right. It will consist of a declarative language (expressing what we want to compute, not how to compute it) and an associated compiler using FloPoCo as a backend. As the precision requirements are inherited from the context, this experiment will initially focus on a given application domain: either electrical simulation (as in the SPICE simulator) in collaboration with a team in Singapore, or high-energy physics in collaboration with CERN in Geneva, or music-oriented digital signal processing in collaboration with GRAME in Lyon. URL sujet détaillé :
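A tiny software illustration of why precision must be matched to the application (emulating IEEE binary32 with the standard library; nothing here is FloPoCo-specific):

```python
import struct

def to_single(x):
    # Round a Python float (IEEE binary64) to binary32 precision by
    # packing and unpacking it as a 32-bit float.
    return struct.unpack("f", struct.pack("f", x))[0]

# An increment of 1e-8 is representable relative to 1.0 in double
# precision, but silently disappears in single precision.
print(1.0 + 1e-8 > 1.0)          # True  (double keeps the increment)
print(to_single(1.0 + 1e-8))     # 1.0   (single drops it)
```

Whether that lost increment matters is entirely application-dependent, which is why a tool must infer precision requirements from the context.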
Remarques : Co-advised with the author of FAUST, Yann Orlarey (GRAME).





SM20795 Large induced forests in sparse graphs


Description


The aim of this project is to study the following conjecture of Albertson and Berman:
Conjecture [Albertson and Berman, '76]. Every planar graph admits an induced subgraph on at least half of the vertices that is a forest.
Some details on this conjecture can be found on the page of Douglas West : http://www.math.uiuc.edu/~west/openp/planforest.html
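The conjecture is easy to test by brute force on small instances; a sketch (exponential time, for illustration only):

```python
from itertools import combinations

def is_forest(vertices, edges):
    # Acyclicity check of the induced subgraph via union-find.
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, v in edges:
        if u in parent and v in parent:  # edge is induced
            ru, rv = find(u), find(v)
            if ru == rv:
                return False  # edge closes a cycle
            parent[ru] = rv
    return True

def max_induced_forest(n, edges):
    # Largest induced forest of a graph on vertices 0..n-1.
    for size in range(n, 0, -1):
        for vs in combinations(range(n), size):
            if is_forest(set(vs), edges):
                return size
    return 0

# K4 is planar; its largest induced forest has 2 of its 4 vertices,
# so the conjectured n/2 bound is tight.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(max_induced_forest(4, k4))  # → 2
```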
Firstly, the student will survey the state of the art, keeping in mind that this problem goes by several names: decycling number, feedback vertex set, etc.
Secondly, we will work on a generalization of a formula of Alon, Mubayi, and Thomas concerning sparse graphs.
Some references:
[1] L. Kowalik, B. Luzar and R. Skrekovski. An improved bound on the largest induced forests for triangle-free planar graphs. Discrete Mathematics and Theoretical Computer Science 12(1):87–100, 2010.
[2] N. Alon, D. Mubayi, and R. Thomas. Large induced forests in sparse graphs. J. Graph Theory, 38:113–123, 2001.
URL sujet détaillé :
Remarques : Co-advised by Alexandre Pinlou (Alexandre.Pinlou.fr)





SM20796 Weak models for data consistency


Admin


Encadrant : Thomas GAZAGNAIRE 
Labo/Organisme : OCaml Labs (http://ocaml.io), at the Computer Laboratory of the University of Cambridge.
URL : http://ocaml.io 
Ville : Cambridge, UK 



Description


We have a complete operating system library written in OCaml that we can use to run pure OCaml applications in the cloud.
We want to formalise and implement a new kind of persistence layer with a weak consistency semantics, inspired by Git, and we are looking for candidates with a strong interest in building safe and efficient systems using Functional Programming to help us to do so.
The internship will be located in the OCaml Labs (http://ocaml.io), at the Computer Laboratory of the University of Cambridge. This will be part of the Mirage project (http://www.openmirage.or). URL sujet détaillé : http://gazagnaire.org/pub/m2.pdf
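The Git-inspired idea can be sketched as a content-addressed object store (a toy in Python rather than OCaml, with hypothetical names; the real work concerns the weak-consistency merge semantics built on top of such a store):

```python
import hashlib

class Store:
    # Minimal content-addressed (Git-like) object store: values are
    # immutable and keyed by the hash of their contents, so replicas
    # can exchange objects and later reconcile histories without
    # coordinating on every write.
    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]

s = Store()
k = s.put(b"hello")
print(s.get(k))          # b'hello'
print(s.put(b"hello") == k)  # True: identical contents deduplicate
```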
Remarques :





SM20797 Dynamic reconfiguration of Systems-of-Systems by a pattern-based approach


Description


A System-of-Systems (SoS) can be perceived as a composition of systems in which the constituents, themselves systems, are separately discovered, selected, and composed, possibly at runtime, to form a more complex system. As a consequence of the independence of the constituent systems, an SoS as a whole may evolve over time to respond to changing characteristics of its environment, of its constituent systems, or of its own mission. When architecting a system, patterns provide basic building blocks that are known to organize components in a manner well-suited to some frequent problems. The dynamic reconfiguration of a system consists in introducing modifications without service disruption. One objective of this project is to study dynamic reconfiguration for a particular class of pattern-based systems: software-intensive Systems-of-Systems. A focus will be put on the relation between patterns and reconfiguration. In particular, to characterize the SoS architecture, architectural patterns will be defined to model interactions between components of a system, and between components of an SoS. An inventory of existing architectural patterns must first be made. For each pattern, we must identify instrumentation and reconfiguration capabilities and model the dynamic behaviour. URL sujet détaillé : http://wwwarchware.irisa.fr/joboffers/
Remarques : Co-advised by Jérémy Buisson (jeremy.buisson.fr); internship stipend possible (to be confirmed).





SM20798 Automatic Software Repair


Description


Automated software repair is the process of fixing software bugs automatically. It is a research field related to software testing, program synthesis, fault tolerance, data mining for software engineering, empirical software engineering and many others. It may involve automatically generating patches when a test case fails [Weimer2009], or avoiding crashes at runtime in production [qin2005rx]. Automated software repair can address all phases of the software life cycle, from development time to maintenance and production time (to survive failures).
Automated software repair can manipulate behavior (by manipulating code, whether source or binary) or data (e.g. by manipulating the running state of programs). It can use randomized techniques (e.g. genetic algorithms) or deterministic reasoning (e.g. SAT or SMT solvers). According to the state of the art, there is no way to compare those families of approaches with each other.
You will work on the automatic repair approaches developed in our team. We explore two directions: randomized techniques (e.g. probabilistic and genetic algorithms) and deterministic reasoning (e.g. SAT or SMT solvers). You will further explore one of these families of techniques. URL sujet détaillé :
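A toy of the generate-and-validate family (far simpler than the approaches of [Weimer2009]; the buggy function, the tests and the mutation space are all made up): mutate a suspect comparison operator until the test suite passes.

```python
# The buggy program under repair: with op="<" it returns the minimum.
BUGGY = "def max2(a, b):\n    return a if a {op} b else b\n"

TESTS = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]

def passes(src):
    ns = {}
    exec(src, ns)  # compile and load the candidate patch
    return all(ns["max2"](*args) == want for args, want in TESTS)

def repair():
    # Enumerate candidate mutations of the comparison operator and
    # keep the first one validated by the whole test suite.
    for op in ("<", "<=", ">", ">="):
        candidate = BUGGY.format(op=op)
        if passes(candidate):
            return candidate
    return None

print(repair())  # the first passing candidate uses "a > b"
```

Real repair tools search a vastly larger patch space (statement insertion, deletion, replacement) and use the failing tests to localize the fault first.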
Remarques : Possible co-advising with:  Westley Weimer (University of Virginia, USA)
 Claire Le Goues (Carnegie Mellon University, USA)
 Houari A. Sahraoui (Université de Montréal, Canada)
 Earl Barr (University College London, UK)





SM20799 Malleable tree-shaped task-graph scheduling under memory constraints


Description


As scientific applications involve larger and larger data, memory becomes an important constraint and has to be taken into account when scheduling the application. This internship considers the problem of scheduling task graphs that come from a particular application domain (direct solvers for sparse factorization) and thus have special characteristics: the task graph is a tree, and the tasks have a common performance profile and can be modeled as malleable tasks.
The objective of the internship is to study the complexity of the bi-criteria scheduling problem (makespan and memory minimization) and to propose efficient algorithms to solve it in practical settings (limited amount of memory). The candidate will have to build on the proposed bibliography (see the detailed subject).
As a second objective, the intern may also consider the problem of scheduling such a tree of malleable tasks on a hybrid platform consisting of two types of resources, such as CPUs and GPUs. URL sujet détaillé : http://graal.enslyon.fr/~lmarchal/sujetstageM2.pdf
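The interplay between traversal order and peak memory can be sketched with a simplified model (the task sizes and the tree are made up, and the memory model is a simplification of those used in the sparse direct solver literature):

```python
def peak_memory(children, out, order):
    # Simulate a sequential traversal of a task tree: when a task runs,
    # its result (of size out[task]) is allocated while its children's
    # results are still live; the children's results are freed right
    # after the task completes.
    mem, peak = 0, 0
    for node in order:
        mem += out[node]
        peak = max(peak, mem)
        for c in children[node]:
            mem -= out[c]
    return peak

# A small tree: 0 is the root with children 1 and 2; 1 has children 3, 4.
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
out = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}
print(peak_memory(children, out, [3, 4, 1, 2, 0]))  # 11
print(peak_memory(children, out, [2, 3, 4, 1, 0]))  # 14: same tree, worse order
```

Already in this toy model the traversal order changes the peak; adding malleability (tasks that run faster on more processors) couples this memory question with makespan, which is the bi-criteria problem of the internship.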
Remarques : Co-supervised with Frédéric Vivien.
English version of the detailed subject available upon request.





SM207100 Characterization of cooperation mechanisms in System of Systems


Description


Systems-of-Systems (SoS) are an emergent class of evolving software-intensive systems that is increasingly shaping the future of our software-reliant world. An SoS can be perceived as a composition of systems in which the constituents, themselves systems, pre-exist and are separately discovered, selected, and composed, possibly at runtime, to form a more complex system. This composition forms a larger system that performs a mission not performable by any of the constituent systems alone, i.e. it creates an emergent behavior. This research project focuses on the characterization of SoS, that is to say, their ability to produce an expected functionality under given quality-of-service conditions. To contribute to general issues regarding SoS, its goal is to provide a platform to describe, characterize and simulate SoS. Real case studies (linked to societal needs) will guide the design of the platform. The expected results will provide assistance to SoS architects and developers. The concepts and tools proposed in this study will base their cooperation mechanisms on the notion of role. Role appears to us to be a fundamental concept for defining and manipulating SoS: it replaces the concept of a cooperating system with that of a cooperating logical role, where a role may be materialized dynamically by any system capable of playing it. The aim of the Master's internship will be to study cooperation mechanisms based on the concept of role in a system of systems, to carry out experiments with a formal language such as a process algebra supporting dynamicity for defining systems and communications, and finally to participate in the definition of the initial expression framework based on SysML. URL sujet détaillé : https://wwwarchware.irisa.fr/files/2013/11/projet_C3_SDS.pdf
Remarques : Co-advised by Régis Fleurquin (regis.fleurquin.fr)
Internship stipend possible (to be confirmed).





SM207101 Theories as inference rules with applications to modal logics


Description


Proof theory has a successful and mature history of describing proofs and their properties for first-order classical and intuitionistic logics. Mathematics and computer science, however, require much more than is offered by these logics alone. For example, if one needs to formalize notions such as sets, algebras, and order relations, one can introduce theories to axiomatize the properties needed of these notions.
A more sophisticated treatment of axioms in a theory is to convert them into inference rules. Recent work by Negri et al. [NvP01, N05] and Ciabattoni et al. [CGT08] has made progress in doing exactly that for a range of theory types.
The main goal of this internship is to discover the extent to which recent work in focused proof systems for classical and intuitionistic logics can be used as a general framework for converting axioms into inference rules.
Goal 1: Show how a range of common theory types (for example, Horn clauses and geometric theories) can be viewed as inference rules for which both the cut and initial rules can be eliminated. Use the framework of focused proof systems in classical and intuitionistic logic [LM09] to prove these results. What limits to this method can you find?
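As a concrete instance of the axioms-as-rules method of [NvP01, N05] (a sketch; the focused-proof-system rendering of such rules is precisely what the internship should work out), the transitivity axiom for a relation R becomes a left sequent rule whose conclusion atoms are repeated in the premise:

```latex
% Axiom: \forall x\, \forall y\, \forall z\; (Rxy \land Ryz \rightarrow Rxz)
% As a rule (conclusion atoms repeated in the premise, which is what
% keeps cut and initial eliminable in the resulting system):
\[
\frac{Rxz,\ Rxy,\ Ryz,\ \Gamma \vdash \Delta}
     {Rxy,\ Ryz,\ \Gamma \vdash \Delta}
\;\text{(Trans)}
\]
```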
There are several ways to judge whether or not one's treatment of axioms-as-rules is of ``high quality''. One such way involves applying that treatment to, say, modal logics. The semantics of such logics are usually given by classes of (Kripke) models in which the accessibility relation is axiomatized by various theories.
Goal 2: Proof systems for modal logics can be made with or without explicit reference to possible worlds and the theories that define their accessibility relations. Use the results above to show how explicit codings can be abstracted away to yield the ``implicit'' proof system.
Goal 3: Show how an explicit proof representation for modal logic formulas (in both classical and intuitionistic logics) can be checked using proof checkers that do not have underlying modal operators.
This internship will take place within the funding context of the ProofCert project (an ERC Advanced Investigator Award). This five-year project (2012–2016) provides funding for PhD students covering a broad range of topics in a foundational approach to proof certificates. Continuing to work on this topic for a PhD is possible and encouraged.
Bibliography
The following bibliography is indicative of the kind of papers that will be part of this research effort.
[CGT08] From axioms to analytic rules in non-classical logics, by Agata Ciabattoni, Nikolaos Galatos, and Kazushige Terui. LICS 2008, IEEE Computer Society Press, 229–240.
[LM09] Focusing and Polarization in Linear, Intuitionistic, and Classical Logic, by Chuck Liang and Dale Miller. Theoretical Computer Science, 410(46), pp. 4747–4768 (2009).
[N05] Proof Analysis in Modal Logic, by Sara Negri. Journal of Philosophical Logic, 34(5–6), 2005, 507–544.
[NvP01] Structural Proof Theory, by Sara Negri and Jan von Plato. Cambridge University Press, 2001.
URL sujet détaillé :
Remarques :





SM207102 Symbolic execution and software vulnerability detection


Description


The detection and analysis of software vulnerabilities is an active research area in computer security. Its purpose is to find programming flaws that could be exploited by an external user. A possible detection technique is based on concolic execution, which consists in generalizing some "concrete" program executions by considering some input variables as "symbolic". However, many vulnerabilities are revealed only by executing some program loops intensively, which requires extending concolic execution by finding loop invariants.
The work proposed consists in: 1. evaluating existing loop summarization techniques in the particular context of vulnerability detection; 2. proposing a solution that could be automated using existing (source-level) program analysis platforms, such as Frama-C or LLVM; 3. extending this approach to the binary level, targeting classical vulnerabilities.
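The concolic idea can be sketched in a few lines (the program under test is a hypothetical toy, and the "solver" below is a brute-force search over a small input domain where a real engine would call an SMT solver):

```python
def program(x, trace):
    # Program under test: run concretely while recording, for each
    # branch taken, a (symbolic condition, concrete outcome) pair.
    c1 = x > 10
    trace.append(("x > 10", c1))
    if c1:
        c2 = x % 2 == 0
        trace.append(("x % 2 == 0", c2))
        if c2:
            return "vulnerable"
    return "safe"

def new_input(trace, domain=range(0, 100)):
    # Flip the last recorded branch: look for an input whose trace
    # agrees with the prefix but takes the other outcome there
    # (a stand-in for constraint solving).
    prefix, (last_cond, last_out) = trace[:-1], trace[-1]
    for x in domain:
        t = []
        program(x, t)
        if (t[:len(prefix)] == prefix and len(t) > len(prefix)
                and t[len(prefix)] == (last_cond, not last_out)):
            return x
    return None

trace = []
print(program(13, trace))   # "safe": x > 10 true, x % 2 == 0 false
print(new_input(trace))     # 12: smallest input taking the other branch
```

The loop-invariant problem of the internship arises when such branch flipping must cross many loop iterations, where path-by-path exploration blows up.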
This work will take place within the French ANR BinSec project, dedicated to the binary code analysis techniques for software security. URL sujet détaillé : http://projetsmastermi.imag.fr/pcarre/CDExportProjetHtml?idSujet=1648
Remarques : An internship stipend (around 420 euros/month) is planned.





SM207103 Verification of Infinite Safety Games and Supervisory Control Problems


Admin


Encadrant : Francesco BELARDINELLI 
Labo/Organisme : This project is co-advised by Francesco Belardinelli (IBISC Laboratory, Université d'Evry) and Guillaume Aucher (INRIA/IRISA, Université de Rennes).
URL : http://www.irisa.fr/ 
Ville : Evry and Rennes 



Description


Although epistemic temporal logics and imperfect-information games are two theoretical frameworks which address different issues, they both deal with the same basic and intuitive notions of action, time and knowledge. On the one hand, game theory involves complex notions which can be expressed by a combination of terms dealing more or less implicitly with actions, time and knowledge. The complex notion of an observation-based winning strategy is a paradigmatic example of such a combination. On the other hand, epistemic temporal logics are logics in which it is possible to express complex statements that are explicit combinations of these notions. So far, there has been little interaction between these two theoretical frameworks. In (Aucher, 2013), we embedded the framework of infinite two-player turn-based games played on graphs with safety objectives into an epistemic temporal logic. This embedding was made possible thanks to two intermediate embeddings: the first is from infinite games with safety objectives to supervisory control theory, and the second is from supervisory control theory to epistemic temporal logic. Thereby, we were able to determine whether a player has a winning strategy from a given state, and in that case we were able to elicit and synthesize all his winning strategies. The originality of our results stems from the fact that they are all formulated in terms of model checking problems in the epistemic temporal logic CTLK. The reformulation of infinite game theory and supervisory control theory in epistemic temporal logic highlights their main underlying intuitions in terms of formal expressions involving the modalities of action, time and knowledge.
The objectives of this internship are twofold: 1. investigate the exact computational complexity of the different model checking problems elicited in (Aucher, 2013); 2. implement the theoretical framework and results of (Aucher, 2013) in model checkers of epistemic temporal logics such as MCK (Model Checking Knowledge) (Gammie and van der Meyden, 2004b,a) or MCMAS (Model Checking Multi-Agent Systems) (Lomuscio and Raimondi, 2006, 2009).
A number of software tools have been developed for supervisory control theory, such as Supremica, Sigali and GIDDES (Graphical Interface for the Design of Discrete-Event Systems). An interesting line of research would be to investigate whether the reformulation of supervisory control problems into epistemic temporal logic makes the implementation of the solutions of these problems more feasible and efficient. In that respect, the symbolic methods developed for the model checking of epistemic temporal logics might turn out to outdo the algorithms developed by researchers in supervisory control theory. A detailed comparison between these two kinds of implementation is definitely an interesting line of research, and a series of benchmarks may be needed for that. URL sujet détaillé : https://www.ibisc.univevry.fr/~belardinelli/Documents/SujetStageX2014.pdf
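On the game side, the basic object involved is standard: the winning region of a turn-based safety game is a greatest fixpoint, computable by repeatedly discarding losing states (a toy sketch; in (Aucher, 2013) this computation is instead recast as a CTLK model checking problem):

```python
def safety_winning_region(states, edges, owner, safe):
    # Greatest fixpoint: start from the safe states and discard those
    # from which player 0 cannot stay inside the current set: player-0
    # states need some successor in the set, player-1 states need all
    # successors in it.
    W = {s for s in states if s in safe}
    changed = True
    while changed:
        changed = False
        for s in sorted(W):
            succ = [t for (u, t) in edges if u == s]
            ok = (any(t in W for t in succ) if owner[s] == 0
                  else all(t in W for t in succ))
            if not ok:
                W.discard(s)
                changed = True
    return W

# Toy game: player 0 owns states 0 and 2, player 1 owns state 1;
# state 3 is unsafe, and from state 1 player 1 can escape to it.
states = {0, 1, 2, 3}
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 0), (3, 3)]
owner = {0: 0, 1: 1, 2: 0, 3: 0}
print(safety_winning_region(states, edges, owner, {0, 1, 2}))  # {0, 2}
```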
Remarques : This project is co-advised by Francesco Belardinelli (IBISC Laboratory, Université d'Evry) and Guillaume Aucher (INRIA/IRISA, Université de Rennes).





SM207104 Study of a randomized algorithm for the Set-Cover problem, with an application in oncology


Description


Molecular biology is a field giving rise to a high number of mathematical and algorithmic issues. One of them is the detection of the regulatory misbehaviors which transform a normal cell into a tumoral one.
One way to tackle this problem is to consider the gene regulation network as a bipartite graph of regulators and regulatees, each regulatee being governed by a truth table depending on its regulators. Once the truth tables have been learned in normal cells by statistical means, misbehaving regulatees can be detected in tumoral cells as violating them.
An argument of parsimony then allows writing the enumeration of minimal regulator-perturbation scenarios as a Hitting-Set problem, which is equivalent to the NP-hard Set-Cover problem. Solving this problem would yield a list of regulators that are potential therapeutic targets.
The goal of the internship is to study, theoretically and in practice, a new randomized heuristic for this problem. Its mean approximation ratio can be shown to be unbounded, but only on very pathological instances. On real data, this heuristic seems quite promising. The aim is therefore twofold:
(1) Study the heuristic theoretically, by trying to characterize the instances for which it behaves well/bad.
(2) Apply it on real data and think on biologicaldriven ways to improve it.
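A generic randomized heuristic of this flavour (not necessarily the one studied in the internship) repeatedly hits an uncovered set with a randomly chosen element:

```python
import random

def random_hitting_set(sets, seed=0):
    # Randomized greedy: while some set is not yet hit, pick an
    # uncovered set and add one of its elements chosen at random.
    rng = random.Random(seed)
    hit = set()
    remaining = list(sets)
    while remaining:
        s = remaining[0]
        hit.add(rng.choice(sorted(s)))
        remaining = [t for t in remaining if not (t & hit)]
    return hit

# Toy instance: {1,2} and {3,4} are disjoint, so any hitting set
# needs at least 2 elements.
instance = [{1, 2}, {2, 3}, {1, 3}, {3, 4}]
print(random_hitting_set(instance))
```

Characterizing on which instances the random choice can go arbitrarily wrong, versus the structured instances arising from regulation networks, is exactly objective (1).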
The main focus of the internship will be computer science (randomized algorithms, graph theory, approximation ratios), but it will take place in the Statistics team of an applied mathematics lab and will involve discussions with bioinformaticians and biologists. No statistical or biological knowledge is required, but the student should be highly open-minded and interested in multidisciplinarity. URL sujet détaillé :
Remarques : The internship will involve contacts with bioinformaticians (the MEGA team of the institute of synthetic and systems biology in Evry; the Baobab team at INRIA Rhône-Alpes) and biologists (the Molecular Oncology team of the Institut Curie in Paris).





SM207105 Huge-Dimensional Models for Visual Recognition


Description


General Information: The internship will take place at the INRIA Rhône-Alpes research center near Grenoble, in the LEAR team. The intern will be co-advised by Julien Mairal, a young researcher in machine learning and optimization, and by Cordelia Schmid, who heads the team and is one of the most renowned scientists in the computer vision community. Upon successful completion, the intern may possibly be hired as a PhD student by the team.
Research Topic and Objectives: We have now entered an era of massive data acquisition, which raises new opportunities. In the context of computer vision, massive data will allow us to learn better image and video models, which involve a huge number of parameters. Such models are richer than traditional ones, but raise difficult computational challenges, which need to be solved during the internship.
During the last decade, many computer vision approaches have represented images under the form of a ``bag of visual words'', or by using some variants of it. In a nutshell, each local patch (small regions of typically times 16) of an image is encoded by a descriptor which is invariant to small deformations. Unsupervised machine learning is then used for discovering the underlying structure of local descriptors, and for defining a set of typical visual patterns called ``visual words''. The image is then described by computing histogram of word occurences, yielding a powerful representation for discriminative tasks. Various attempts have been made in the past for improving this representation, such as highorder models, where images are represented by cooccurences of visual words in small neighborhoods. As a result, the number of parameters in the model grows quadratically, resulting in a computational burden and risk of overfitting. For these two reasons, cooccurences approaches have not been successful yet, even though they correpond to richer models than the traditional bags of words. We believe that this fact should change in the big data context.
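As an illustration, the bag-of-visual-words encoding described above can be sketched in a few lines (a toy version with made-up descriptors and codebook; real pipelines use SIFT-like descriptors and a k-means codebook):

```python
from collections import Counter

def bow_histogram(descriptors, codebook):
    """Encode an image as a normalized histogram of visual-word occurrences.

    descriptors: list of local patch descriptors (feature vectors).
    codebook:    list of visual words (e.g., k-means centroids).
    """
    def nearest(desc):
        # index of the nearest visual word (squared Euclidean distance)
        return min(range(len(codebook)),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(desc, codebook[j])))

    counts = Counter(nearest(d) for d in descriptors)
    n = len(descriptors)
    return [counts[j] / n for j in range(len(codebook))]

# Toy example: four 2-D descriptors, a codebook of two visual words.
desc = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1)]
book = [(0.0, 0.0), (1.0, 1.0)]
print(bow_histogram(desc, book))  # -> [0.5, 0.5]
```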
To deal with the large number of parameters, we will use hashing techniques. These techniques, originally developed in computer science, can help us implicitly reduce the number of features, and are starting to be used in machine learning, with promising preliminary results on large problems. To deal with the massive amount of data, we will also use stochastic optimization, which has proven more scalable than traditional approaches in the big-data context.
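A minimal sketch of the hashing trick (illustrative only; the function names and bucket count are made up): co-occurrence features are hashed into a fixed number of buckets, so the parameter vector keeps a constant size however many word pairs occur.

```python
import hashlib

def stable_hash(s):
    # Python's built-in hash() is salted per process; use a deterministic one.
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

def hashed_cooccurrences(pairs, n_buckets=16):
    """Map an unbounded set of co-occurrence features (w1, w2) into a
    fixed-size vector.  Signed hashing makes collisions cancel in
    expectation instead of accumulating."""
    vec = [0.0] * n_buckets
    for w1, w2 in pairs:
        h = stable_hash(f"{w1}|{w2}")
        sign = 1.0 if (h >> 32) % 2 == 0 else -1.0
        vec[h % n_buckets] += sign
    return vec

pairs = [("sky", "tree"), ("sky", "car"), ("sky", "tree")]
vec = hashed_cooccurrences(pairs)
```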
Desired Skills: The intern should have a strong background in mathematics and computer science. The intern should be motivated by difficult computational challenges, and should thus be familiar with a high-level programming language (Python or Matlab) and well skilled in a low-level one (C or C++). Some knowledge of parallel programming will be appreciated.
URL sujet détaillé : http://lear.inrialpes.fr/people/mairal/bigdata_vision.pdf
Remarques : The intern will be coadvised by Cordelia Schmid, who is the head of the LEAR team at INRIA.
The intern will receive compensation and may be hired as a PhD student if the internship is successful.





SM207106 Distributed algorithms, self-stabilization, modeling


Description


** Subject
*** Scientific Context
Modern distributed systems are challenging because they can be resource-constrained (e.g., Wireless Sensor Networks, WSN), dynamic (e.g., peer-to-peer systems), large-scale (e.g., the Internet), and prone to faults (e.g., in WSN, processes are subject to crashes because of their limited battery, and communications use radio media subject to intermittent message loss).
The ability of a distributed algorithm to be fault-tolerant in this context is thus important. In this work, we focus on self-stabilization, a lightweight fault-tolerance technique. Self-stabilization is a property that allows a system to cope with transient faults that may occur in a network: after faults occur (message loss, memory corruption, topology change, ...), a self-stabilizing algorithm guarantees that the network will converge within finite time to a correct behavior, without external intervention.
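To make the convergence and closure ideas concrete, here is a toy simulation (not part of the subject) of a classical self-stabilizing algorithm, Dijkstra's K-state token ring: started from an arbitrary, possibly corrupted configuration, it converges to configurations with exactly one privileged machine (the token).

```python
def privileged(S):
    """Privileged machines in Dijkstra's K-state token ring."""
    n = len(S)
    priv = [i for i in range(1, n) if S[i] != S[i - 1]]
    if S[0] == S[n - 1]:
        priv.insert(0, 0)
    return priv

def move(S, K, i):
    """Let privileged machine i take a step."""
    if i == 0:
        S[0] = (S[0] + 1) % K
    else:
        S[i] = S[i - 1]

# Arbitrary initial (faulty) configuration; K > n guarantees convergence.
S, K = [3, 1, 4, 1, 5], 7
for _ in range(200):                  # central daemon: lowest privileged moves
    move(S, K, privileged(S)[0])
assert len(privileged(S)) == 1        # legitimate: exactly one token
```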
The correctness of self-stabilizing solutions is usually established by handwritten proofs. Nowadays, however, there exist tools, namely proof assistants, able to mechanically check mathematical proofs. For example, Coq is an assistant for semi-automatic proofs. It allows one to formally express the properties of an algorithm; interaction with the user then builds a certified proof of those properties.
Compared with handwritten proofs, the use of Coq greatly improves the confidence in those proofs (in particular, there can be no logical or computational mistakes). The level of guarantee of Coq proofs is high enough for use in critical areas such as avionics or cryptography.
Beyond the verification of existing proofs of self-stabilizing algorithms, we aim to identify recurring proof schemas, in order to reuse them and mechanize the development of other proofs. For example, many proofs of self-stabilizing algorithms share common arguments based on induction schemas. The overall idea of this project is to contribute to the certification of distributed self-stabilizing algorithms and to the partial automation of their proofs. This subject is a first step in that direction: a Coq proof of a self-stabilizing algorithm.
** Work
- Bibliographical study: understand the computational model, the algorithm to be proved, and its hand proof.
- Define the semantics of the model in Coq.
- Implement the algorithm in the Coq framework.
- Prove the self-stabilization of the algorithm in Coq, self-stabilization being defined by the three following properties: 1. correctness, 2. convergence, and 3. closure.
** Useful background
- Caml programming
- Sequential algorithms, hand proofs of algorithms, complexity of algorithms
- Distributed algorithms
** Where and who?
- Verimag Lab
- Teams: DCS and Synchrone
- Persons involved in the project: Karine ALTISEN, Pierre CORBINEAU, Stéphane DEVISMES, Michaël PÉRIN (email : prénom.nom.fr) URL sujet détaillé :
Remarques :





SM207107 Lambda Calculus on the reals compared to other real models of computation


Description


Contrary to discrete computation, there is no Church-Turing thesis for models of computation on the reals. Various models exist: Recursive Analysis [1], which uses Turing machines to compute reals at any precision; the General Purpose Analog Computer (GPAC) [2], which combines basic continuous building blocks in a fashion similar to circuits; the Blum-Shub-Smale (BSS) model [3], which modifies the RAM model so that it can read and store real numbers in registers; Abstract Geometrical machines [4], inspired by cellular automata... Some comparability and incomparability results between them are known, but none of these models is clearly the right one.
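For intuition: the functions generated by the GPAC are exactly the solutions of polynomial ODE systems, so a two-integrator circuit generates the sine function. A naive forward-Euler simulation of that circuit (an illustration only, not part of the subject):

```python
import math

# GPAC circuit for sine:  y' = z,  z' = -y,  y(0) = 0, z(0) = 1.
# Each integrator is simulated by a forward-Euler step.
def gpac_sine(t, h=1e-5):
    y, z = 0.0, 1.0
    for _ in range(int(round(t / h))):
        y, z = y + h * z, z - h * y
    return y
```

With a small step size the simulation tracks sin closely, e.g. `gpac_sine(1.0)` agrees with `math.sin(1.0)` to about three decimal places.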
Recently, an extension of the lambda calculus was introduced by Di Gianantonio and Edalat [5] to capture continuous functions with the help of a differential operator. This model presents similarities with the GPAC, as differentiation is a core operator; however, it can also produce non-continuous functions through a test construct, which is otherwise found only in the BSS model. Where this model sits on the map of models of real computation is hence an open question.
The goal of this internship is to compare this Lambda Calculus with Differential (LCD) with the GPAC: are all functions generated by the GPAC definable in LCD, and conversely? For the possible simulations, do they preserve computing time? URL sujet détaillé : http://www.loria.fr/~hainry/sujet2_en.html
Remarques : co-advised by Romain Péchoux





SM207108 Complexity Analysis in Java using a Type System


Description


The aim of implicit complexity is to design criteria (type systems, semantic interpretations) proving that programs belong to given complexity classes. A new implicit complexity analysis based on a type system for imperative and object-oriented languages was proposed in [1], [2] and [7]. This analysis is inspired by data-ramification techniques [3, 4] and by non-interference [5]. It ensures that if a program can be typed and terminates, it will run in polynomial time (or, in a different context, polynomial space).
The goal will be to extend the existing type system to capture more features of object-oriented programs, while still ensuring polynomial-time or polynomial-space execution.
This work should focus on the following ideas:
- increase the number of programs that can be analysed, through the use of program-transformation techniques;
- improve the expressivity of the considered language by handling, for example, forks, threads, functions, and exceptions;
- explore the common cases in real-world programs on which this analysis fails, and extend the type system to capture them;
- provide new characterizations of complexity classes (FPspace, PP, BPP, NC, ...).
1) J.-Y. Marion. A Type System for Complexity Flow Analysis. LICS 2011, pp. 123-132, 2011.
2) J.-Y. Marion, R. Péchoux. Complexity Information Flow in a Multithreaded Imperative Language. CoRR abs/1203.6878, 2012.
3) S. Bellantoni, S. Cook. A new recursion-theoretic characterization of the polytime functions. Computational Complexity 2, pp. 97-110, 1992.
4) D. Leivant, J.-Y. Marion. Lambda calculus characterizations of polytime. Fundamenta Informaticae 19(1-2), pp. 167-184, 1993.
5) D. M. Volpano, C. E. Irvine, G. Smith. A Sound Type System for Secure Flow Analysis. Journal of Computer Security 4(2/3), pp. 167-188, 1996.
6) B. Cook, A. Podelski, A. Rybalchenko. Terminator: Beyond Safety. CAV 2006, pp. 415-418.
7) E. Hainry, R. Péchoux. Type-based heap and stack space analysis in Java. http://hal.inria.fr/hal00773141, 2013.
URL sujet détaillé : http://www.loria.fr/~hainry/sujet1_en.html
Remarques : co-advised by Emmanuel Hainry





SM207109 Lagrangian duality in Online Algorithms


Description


Online algorithms is a domain that studies problems with uncertainty about the future. In general, requests arrive over time and one needs to make irrevocable decisions without any information about the future. Such problems appear in many practical settings, for example decision making, stock investment, and network routing. Hence, it is important to design algorithms/strategies with performance guarantees in uncertain environments.
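A classical warm-up example of such a guarantee (not taken from the subject) is the break-even ski-rental strategy, which is 2-competitive against the offline optimum:

```python
def ski_rental_online(days, buy_price, rent_price=1):
    """Rent until the total rent paid would reach the purchase price,
    then buy.  The number of skiing days is not known in advance."""
    cost = 0
    for _ in range(days):
        if cost + rent_price < buy_price:
            cost += rent_price          # still below break-even: rent
        else:
            return cost + buy_price     # buy once, ski free afterwards
    return cost

# 2-competitiveness: online cost <= 2 * offline optimum, for every horizon.
B = 10
for d in range(1, 50):
    assert ski_rental_online(d, B) <= 2 * min(d, B)
```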
The most successful technique in online algorithms so far is the potential argument. However, this technique has its limits (for instance, it does not give much insight into the problems). Currently, there is a trend of looking for a principled approach to designing and analyzing online algorithms. One direction [1] is to exploit the power of mathematical programming and use Lagrangian duality. The method is powerful and flexible, as it is based on natural relaxations (even non-convex ones), and it has successfully given theoretical evidence for practical algorithms.
Our project is to develop this technique, and the internship is part of the project. The goal is to study concrete online problems (online scheduling/matching/paging, etc.) using this approach. Ideally, a research publication could result from the internship.
Requirements: motivation and basic knowledge of algorithms and discrete mathematics.
[1] Nguyen Kim Thang. Lagrangian Duality in Online Scheduling with Resource Augmentation and Speed Scaling. European Symposium on Algorithms (ESA), 2013. URL sujet détaillé : https://www.ibisc.univevry.fr/~thang/Papers/lagrange_ESA.pdf
Remarques : Remuneration (standard): about 440 euros/month.





SM207110 Grid exclusion theorems for directed treewidth


Description


Treewidth is considered one of the most important graph parameters, with a great variety of applications in graph theory and in algorithm design. Introduced as a cornerstone concept of the Graph Minors Project developed by Robertson and Seymour (acknowledged as one of the most influential achievements of modern combinatorics), it has been one of the most powerful and widely used tools in modern algorithmic graph theory. One of the most influential results of the Graph Minors Project asserts that for every planar graph H there exists a constant c such that if a graph G has treewidth at least c, then some subgraph of G can be transformed into H by a series of edge contractions. The subject of this stage is to investigate possible extensions (or variants) of this theorem for directed graphs. In this direction, there exists a conjecture of Johnson, Robertson, Seymour, and Thomas based on a suitable notion of treewidth for directed graphs and on a corresponding grid-like digraph pattern. The aim of the stage is to build on the existing progress on this conjecture and to prove variants or special cases of it that will prove useful for algorithmic applications to directed graph problems. The candidate will engage with the combinatorial results on treewidth and their possible extensions to directed graphs. Moreover, he/she will carry out an extensive study of the current advances on this conjecture and of their possible combinations with other techniques, mostly those emerging from the Graph Minors series. (For more details, please check the attached .pdf at http://www.lsi.upc.edu/~sedthilk/stage/stage_2014_1_Thilikos.pdf ) URL sujet détaillé : http://www.lsi.upc.edu/~sedthilk/stage/stage_2014_1_Thilikos.pdf
Remarques :





SM207111 Completion problems for parameterized properties


Description


Graph completion problems are natural to define: take any graph property, represented by a collection of graphs P, and ask, given an input graph G, whether it is possible to add edges to G so that the new graph belongs to P. Such problems belong to the general category of "graph modification problems" and have been studied for a long time; some of the most prominent are HAMILTONIAN COMPLETION, INTERVAL GRAPH COMPLETION, and CHORDAL GRAPH COMPLETION, where P is the set of Hamiltonian, interval, and chordal graphs, respectively. The study of the parameterized complexity of such problems was initiated by a seminal paper of Kaplan, Shamir, and Tarjan, where most (but not all) of them were proven to be fixed-parameter tractable when the problem parameter is the number of edges to be added. In this stage we propose the study of another general family of completion problems, where the parameter is not the number of edges to be added but, instead, the parameter of another parameterized problem. In this way we may define the completion counterpart of every parameterized problem on graphs. This stage proposes the study of several parameterized problems of this type, and its ambition is to detect patterns that can reveal associations between the parameterized complexity of problems and the parameterized complexity of their completion counterparts. Towards this goal, the candidate will gradually become involved with some of the most powerful techniques in parameterized algorithm design and parameterized complexity, such as dynamic programming, Courcelle's theorem, bidimensionality theory, and cross-composition reductions. In parallel, he/she will work on completion problems, starting with the problems mentioned above and their variants.
It is expected that the experience and knowledge obtained during this 6-month training period will provide the basis for a more general research program on the parameterized complexity of graph completion problems.
For more details on this stage proposal, see the attached .pdf file at http://www.lsi.upc.edu/~sedthilk/stage/stage_2014_2_Thilikos.pdf URL sujet détaillé : http://www.lsi.upc.edu/~sedthilk/stage/stage_2014_2_Thilikos.pdf
Remarques :





SM207112 Avoiding long abelian repetitions in words


Description


A word is a (possibly infinite) sequence of letters from a fixed alphabet. A square in a word is a factor of the form uu. Is it possible to construct an infinite word over a binary alphabet without squares (as factors of consecutive letters)? Obviously not... And what about 3 letters? Axel Thue answered this question positively in a 1906 paper, which is considered one of the origins of word combinatorics. Since then, the study of repetitions has been a central focus of this domain.
In 1957, Paul Erdős asked whether abelian squares can be avoided by an infinite word over an alphabet of 4 letters (uv is an abelian square if v is a permutation of the letters of u). Keränen answered Erdős's question positively in 1992, with intensive use of a computer. Erdős also asked whether arbitrarily long squares can be avoided in binary words. Some conjectures are still open on the size of the smallest alphabet required to avoid arbitrarily long abelian squares (resp. cubes).
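A brute-force checker makes the definition concrete (a naive sketch; actual avoidability proofs rely on morphisms, not on search):

```python
from collections import Counter

def has_abelian_square(w, min_len=1):
    """Does w contain a factor uv with |u| = |v| >= min_len such that
    v is a permutation of u?"""
    n = len(w)
    for L in range(min_len, n // 2 + 1):
        for i in range(n - 2 * L + 1):
            # u and v are abelian-equivalent iff their letter multisets match
            if Counter(w[i:i + L]) == Counter(w[i + L:i + 2 * L]):
                return True
    return False

assert has_abelian_square("abba")       # "ab" / "ba" is an abelian square
assert not has_abelian_square("abc")    # square-free, hence abelian-square-free
```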
Recently, Karhumäki et al. introduced a new notion of word equivalence, k-abelian equivalence, which generalizes both abelian equivalence and equality (and bridges the gap between them). We now know, for every k, the size of the smallest alphabet required to avoid all k-abelian squares (resp. cubes).
The subject of this stage is the avoidability of arbitrarily long k-abelian repetitions (squares/cubes). One goal is to answer the following questions:
- Is there a k in N such that one can avoid arbitrarily long k-abelian cubes in binary words?
- Is there a k in N such that one can avoid arbitrarily long k-abelian squares in binary words?
Candidate:
- Background in discrete mathematics / combinatorics.
- Good programming skills. URL sujet détaillé : http://perso.enslyon.fr/michael.rao/stage_kab.pdf
Remarques :





SM207113 Cloud Computing Monitoring


Description


This master's thesis is in the Cloud Computing domain, which is currently growing rapidly. Several resource-monitoring techniques, algorithms and tools can be used to monitor Infrastructure-as-a-Service, Platform-as-a-Service and Software-as-a-Service Cloud applications. The goals of this master's thesis are: 1) to analyse Cloud Computing monitoring models, 2) to develop an innovative Cloud Computing monitoring methodology that addresses the deficiencies found in the existing models, and 3) to implement it as a software prototype and test it.
Reference: Cloud Computing Bible, Barrie Sosinsky, 2011. ISBN-10: 0470903562. This book can be made available.
For more information please contact Catarina Ferreira da Silva: catarina.ferreiradasilva [at] liris.cnrs.fr URL sujet détaillé : http://liris.cnrs.fr/catarina.ferreiradasilva/researchsubjects.html
Remarques :





SM207114 Certifying Executions of an Automatic Program Verifier in Coq


Description


Context
We are developing a generic program-verification approach and tool based on a formalism called Reachability Logic (RL), recently introduced in [1]. In RL one can specify program properties similar to the pre/postconditions and invariants of Hoare logic [2], but in a language-independent way. Our approach amounts to automatically searching for proofs in a deductive system that we introduced in [3] and which, for technical reasons, is better suited to automatic proof search than the original proof system of RL [1]. We have also shown that our proof system is subsumed by the original one, in the sense that each rule of our system is encoded by possibly many rules of the original. The subsumption result, as well as the implementation of an automatic program verifier based on it (kcheck), are quite nontrivial: the intrusion of theoretical and/or implementation errors cannot be formally ruled out at present.
Problem
In order to obtain a strong guarantee that the programs successfully verified by our tool do meet their RL specifications, we propose to certify executions of the verifier using the Coq proof assistant [4]. Specifically, from any successful execution of our verifier on a given program and RL specification, a Coq script stating that the program meets its RL specification (according to the original proof system of RL) will be automatically generated, and then executed in Coq.
Work to be accomplished
* Embed the original RL proof system in the formalism of the Coq proof assistant.
* Identify the relevant information required to automatically map rules of the proof system of [3] to sequences of rules of the original RL proof system.
* Instrument the program verifier so that it returns that information.
* Develop a tool that, following a successful execution of the verifier on a given program and RL specification, collects that information and generates a Coq proof script from it. If Coq accepts the proof, then the verifier's execution (on the given program and specification) can be considered certified.
* Illustrate the resulting approach and tools on several relevant examples.
Bibliography
[1] G. Roşu and A. Ştefănescu. Checking reachability using matching logic. In G. T. Leavens and M. B. Dwyer, editors, OOPSLA, pages 555-574. ACM, 2012.
[2] C. Marché. Introduction, Classical Hoare Logic. Course at the Master Parisien de Recherche en Informatique.
[3] A. Arusoaie, D. Lucanu, V. Rusu. Language-Independent Program Verification Using Symbolic Execution. Research report.
[4] Y. Bertot and P. Castéran. Coq'Art. Chinese and English translations of the book are available; the French version can be freely downloaded from the website. URL sujet détaillé : http://www.lifl.fr/~nowakd/RLinCoq.html
Remarques : Co-supervised with Vlad Rusu





SM207115 Continued Fractions for Special Functions


Description


Continued fractions have been studied since Euler's time for their applications in number theory and approximation.
Computing the first few terms is easy, but general formulas are sporadic. The aim of this work is to use an approach based on experimental mathematics: first guess a pattern (automatically!) and then prove the resulting formula (still automatically). The target functions are classical special functions that are solutions of linear differential equations with polynomial coefficients.
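The "guessing" step can be experimented with in a few lines: compute partial quotients of a high-precision value with exact rational arithmetic and look for a pattern. For e, the well-known regular pattern 2; 1, 2, 1, 1, 4, 1, 1, 6, ... shows up immediately (a toy numerical sketch; the internship targets symbolic formulas):

```python
from fractions import Fraction

def partial_quotients(x, n_terms):
    """First n_terms partial quotients of the continued fraction of x > 0."""
    terms = []
    for _ in range(n_terms):
        a = x.numerator // x.denominator    # floor(x)
        terms.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x                           # exact rational inversion
    return terms

# e truncated to 30 decimal places, as an exact rational:
e = Fraction(2718281828459045235360287471352, 10**30)
print(partial_quotients(e, 10))  # -> [2, 1, 2, 1, 1, 4, 1, 1, 6, 1]
```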
If time permits, the automatic derivation of convergence domains and bounds on the speed of convergence will also be addressed. URL sujet détaillé :
Remarques :





SM207116 Analysis of DNS traffic data for botnet localization


Description


IP packet transport services are exposed to significant risks. Through this internship, we aim to apply machine-learning methods to detect various malicious activities by analysing traffic at a fine grain.
Among the activities we wish to characterize, we will focus on the control of botnets (networks of zombie computers controlled by attackers).
The data consist of DNS traffic traces and can be analysed with various data-mining and machine-learning algorithms (supervised classification with decision trees, Bayesian classifiers, neural networks, support vector machines).
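As an illustration of the kind of feature such classifiers can consume (a hypothetical example, not prescribed by the subject): the character entropy of queried domain names, which tends to be higher for algorithmically generated botnet domains than for human-chosen ones.

```python
import math
from collections import Counter

def char_entropy(label):
    """Shannon entropy (bits per character) of a domain label."""
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A human-chosen label versus a random-looking (DGA-style) one:
assert char_entropy("xk3qzv9f") > char_entropy("google")
```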
We will also rely on various clustering methods to study the different communication profiles. Constraints specific to the data make the design of dedicated methods indispensable.
Activities: getting up to speed on the formalisms and systems, mathematical modelling of the problems, algorithm design and development. URL sujet détaillé :
Remarques : Paid internship.





SM207117 Valued constraint satisfaction against an adversary


Description


We propose to initiate a study of valued constraint satisfaction problems in which some variables are controlled by an adversary. In concrete terms, this means that some variables are universally quantified, and that one needs to build, as efficiently as possible, a strategy for assigning values to the variables such that the cost implied by the constraints is minimal. The main task we propose is to investigate classes of instances that remain tractable because an instance can be reduced to polynomially many (hopefully tractable) instances of the unquantified valued constraint satisfaction problem. URL sujet détaillé : https://www.dropbox.com/s/wq8plmrs9tefeo5/VQCSP.pdf
Remarques : co-supervised with Florent Madelaine (http://www.isima.fr/~madelain/); funding possible through the French-Austrian ANR project ALCOCLAN.





SM207118 Algorithms for Tensor Decomposition


Description


General presentation of the topic::
A symmetric tensor is a higher-order generalization of a symmetric matrix. Tensors appear in many areas of computer science and mathematics. They have many practical applications in applied areas such as statistics, electrical engineering, antenna array processing, telecommunications, chemometrics, psychometrics, data analysis, independent component analysis, and shape analysis. They are also of extreme importance in theoretical computer science, for example in polynomial identity testing and in the study of the arithmetic complexity of computing a set of bilinear forms (an approach that led to fast algorithms for matrix multiplication). Cryptography is also an area where tensors occur (typically, in multivariate cryptography).
Every symmetric tensor of dimension n and order d corresponds to a homogeneous polynomial in n variables of degree d. Decomposing the tensor as a sum of rank-1 tensors corresponds to decomposing the corresponding homogeneous polynomial as a sum of d-th powers of linear forms. The minimal number of linear forms is called the rank of the homogeneous polynomial (or the rank of the tensor).
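In the simplest (order-2, binary) case this decomposition is just an eigendecomposition: a quadratic form q(x, y) = a x^2 + 2b xy + c y^2 is written as a weighted sum of squares of linear forms. A self-contained sketch (illustrative only; the internship concerns higher-order tensors, where this spectral shortcut no longer applies):

```python
import math

def decompose_quadratic(a, b, c):
    """Write q(x,y) = a x^2 + 2b xy + c y^2, i.e. the symmetric matrix
    [[a, b], [b, c]], as sum_i w_i * (alpha_i x + beta_i y)^2 via its
    eigendecomposition."""
    if b == 0:
        return [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(tr * tr - 4 * det)     # real: the matrix is symmetric
    forms = []
    for lam in ((tr + disc) / 2, (tr - disc) / 2):
        vx, vy = b, lam - a                 # eigenvector for eigenvalue lam
        norm = math.hypot(vx, vy)
        forms.append((lam, (vx / norm, vy / norm)))
    return forms

def evaluate(forms, x, y):
    return sum(w * (al * x + be * y) ** 2 for w, (al, be) in forms)

# q(x, y) = x^2 + 4xy + y^2 decomposes as 3/2 (x+y)^2 - 1/2 (x-y)^2.
forms = decompose_quadratic(1, 2, 1)
assert abs(evaluate(forms, 2.0, 3.0) - 37.0) < 1e-9   # q(2,3) = 4+24+9
```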
Objectives::
The successful applicant will perform a survey of the status of known algorithms and approaches for decomposing a homogeneous polynomial as a sum of powers of linear forms. After that, she/he will implement, in a computer algebra system (such as Maple), a prototype of one or more of the existing algorithms. An experimental analysis of the complexity will then be conducted.
She/he will then try to identify the strengths and weaknesses of each approach and classify the algorithms according to their complexity. If time permits, the applicant may also try to introduce efficient algorithms for the case where the rank is small.
Keywords::
algorithmics, cryptology, combinatorics, complexity theory.
Required Skills::
Knowledge of Maple, basic knowledge of algorithms, basic knowledge of complexity theory.
Bibliography ::
N. Kayal. Affine Projections of Polynomials. In STOC, ACM, 2012.
J. Brachat, P. Comon, B. Mourrain, and E. Tsigaridas. Symmetric Tensor Decomposition. Linear Algebra and its Applications, 433(11-12):1851-1872, 2010.
L. Oeding and G. Ottaviani. Eigenvectors of tensors and algorithms for Waring decomposition. Journal of Symbolic Computation, 2013.
P. Comon, G. Golub, L.-H. Lim, B. Mourrain. Symmetric tensors and symmetric tensor rank. SIAM J. on Matrix Analysis and Applications, 30(3):1254-1279, 2008.
URL sujet détaillé :
Remarques :
- Interested candidates can contact Jean-Charles Faugère for further information.
- The internship will take place at the Laboratoire d'Informatique de Paris 6 (LIP6) of UPMC, which is located in the center of Paris.
- The internship will be funded.
- Ludovic Perret and Elias Tsigaridas will also be involved in the supervision.





SM207119 SMT and Temporal Logics


Description


Automated proof techniques are paramount for the verification of algorithms. For certain properties of (parallel and distributed) algorithms, it is useful to reason about formulas in temporal logics that express facts about values of variables that evolve over time. Temporal logics are instances of the larger class of modal logics.
SMT (satisfiability modulo theories) solvers are highly efficient tools for automatically determining the satisfiability of logical formulas in a first-order language where certain symbols are interpreted, such as the symbols of arithmetic. In general, SMT solvers are decision procedures for quite expressive but quantifier-free languages; reasoning about quantified formulas is handled by instantiation. Temporal or modal connectives are not handled natively. We recently showed that SMT solvers can be adapted to obtain a decision procedure for so-called Basic Modal Logic.
The objective of the work proposed here is to study the use of SMT solvers first as decision procedures for extended modal logics, and then for temporal logics. In a first step, our results on Basic Modal Logic should be extended to account for particular properties of accessibility relations (such as transitivity), and to cases where additional operators are used. For this, it is important to understand the standard translations of these logics into classical first-order logic, and to define (and prove correct) a framework in which formulas of the studied logics are decided by a combination of an instantiation procedure and a decision procedure for the quantifier-free fragment. Depending on the interests of the student, the resulting algorithms can be prototyped in the SMT solver veriT.
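The standard translation mentioned above is easy to sketch: modal formulas become first-order formulas over an accessibility relation R, with Box and Diamond turning into quantifiers bounded by R (a toy string-producing version; the actual integration with an SMT solver is the subject of the internship):

```python
def st(phi, x="x", depth=0):
    """Standard translation ST_x of a modal formula (given as a nested
    tuple) into a first-order formula over an accessibility relation R."""
    op = phi[0]
    if op == "prop":                       # p  ~>  p(x)
        return f"{phi[1]}({x})"
    if op == "not":
        return f"~{st(phi[1], x, depth)}"
    if op == "and":
        return f"({st(phi[1], x, depth)} & {st(phi[2], x, depth)})"
    if op == "box":                        # []psi ~> forall y. R(x,y) -> ST_y(psi)
        y = f"y{depth}"
        return f"(forall {y}. R({x},{y}) -> {st(phi[1], y, depth + 1)})"
    if op == "dia":                        # <>psi ~> exists y. R(x,y) & ST_y(psi)
        y = f"y{depth}"
        return f"(exists {y}. R({x},{y}) & {st(phi[1], y, depth + 1)})"
    raise ValueError(f"unknown connective: {op}")

print(st(("box", ("dia", ("prop", "p")))))
# -> (forall y0. R(x,y0) -> (exists y1. R(y0,y1) & p(y1)))
```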
URL sujet détaillé : http://www.loria.fr/~merz/stages/smttemporal.pdf
Remarques : Topic jointly supervised with Pascal Fontaine.





SM207120 Belief revision in decidable fragments of first-order logic


Description


Belief revision is an operation of belief change that consists in modifying old beliefs so that they become consistent with new beliefs. It has been studied in a general logical framework, in particular through postulates that a revision operator is expected to satisfy. It has then been applied in various formalisms, starting with propositional logic, where a family of operators is defined thanks to a distance function on interpretations.
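The best-known such distance-based operator in propositional logic is Dalal's revision, which uses the Hamming distance between interpretations. A small sketch of this propositional baseline (illustrative only; formulas are given as Python predicates):

```python
from itertools import product

def models(formula, n_vars):
    """Models of a propositional formula given as a predicate on a
    tuple of n_vars booleans."""
    return {v for v in product([False, True], repeat=n_vars) if formula(v)}

def dalal_revise(old, new, n_vars):
    """Among the models of the new belief, keep those at minimal
    Hamming distance from some model of the old belief."""
    K, M = models(old, n_vars), models(new, n_vars)
    def dist(m):
        return min(sum(a != b for a, b in zip(m, k)) for k in K)
    d_min = min(dist(m) for m in M)
    return {m for m in M if dist(m) == d_min}

# Old belief: a & b.  New belief: ~a.  The revision keeps b:
result = dalal_revise(lambda v: v[0] and v[1], lambda v: not v[0], 2)
assert result == {(False, True)}        # i.e. ~a & b
```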
The goal of this internship is to define similar operators in first-order logic (FOL) and to design algorithms implementing them in decidable fragments of FOL (e.g., some description logics). This raises a problem: with the usual definition of the semantics of FOL, it is difficult (or even impossible) to define a distance function on interpretations. One idea is then to reuse or define an equivalent semantics based on a more manageable set of interpretations (e.g., the Herbrand interpretations). URL sujet détaillé : http://www.loria.fr/~lieber/revision_in_FOL.pdf
Remarques : Co-advising: Alice Hermann (Alice.Hermann.fr). Remuneration: contact us (by mail).
For any question: contact us (by mail).





SM207121 Certified proof of a distributed and self-stabilizing algorithm


Description


Certified proof of a distributed and self-stabilizing algorithm
** Key words
Distributed algorithms, modeling, proofs in Coq
** Subject
*** Scientific Context
Modern distributed systems are challenging because they can be resource-constrained (e.g., Wireless Sensor Networks, WSN), dynamic (e.g., peer-to-peer systems), large-scale (e.g., the Internet), and prone to faults (e.g., in WSN, processes are subject to crashes because of their limited battery, and communications use radio media subject to intermittent message loss).
Being fault-tolerant in this context is thus important: in this work we focus on self-stabilization. Self-stabilization is a property that allows a system to cope with transient faults that may occur in a network. After a fault occurs (message loss, memory corruption, topology change, ...), a self-stabilizing algorithm guarantees that the network converges again towards a correct behavior, within finite time and without any outside help.
Self-stabilizing solutions and their properties are usually designed and studied using simulation and hand-proved results. Nowadays, however, there exist tools able to mechanically check mathematical proofs. The Coq software is an assistant for semi-automatic proofs. It allows one to formally express the properties of an algorithm; the user then interacts with the software to build a certified proof of those properties. Compared with handwritten proofs, using Coq greatly improves the confidence in those proofs and certifies their validity (in particular, there can be no logical or computational mistakes). The level of guarantee of Coq proofs is high enough for use in highly critical domains such as avionics or cryptography.
Beyond the verification of existing proofs of self-stabilizing algorithms, we aim to identify recurring proof schemas, in order to reuse them and mechanize the development of other proofs. Indeed, proofs of properties of self-stabilizing algorithms often share common arguments based on induction schemas. The general idea of this project is to contribute to the certification of distributed self-stabilizing algorithms and to the partial automation of their proofs. This subject is a first step in this direction: a Coq proof of a self-stabilizing algorithm.
** Work
- Bibliographical study: understand the computational model, the algorithm to be proved and its hand-written proof.
- Define the semantics of the model in COQ.
- Implement the algorithm in the COQ model.
- COQ proofs of the theorems: 1. correctness of the algorithm; 2. self-stabilization properties: convergence and closure.
** Useful background
- Caml language programming
- Sequential algorithms, hand-written proofs of algorithms, complexity of algorithms
- Distributed algorithms (is a plus!)
** Where and who?
- Verimag Lab
- Teams: DCS and Synchrone
- Persons involved in the project: Karine ALTISEN, Pierre CORBINEAU, Stéphane DEVISMES, Michaël PÉRIN (email: prénom.nom.fr)
URL sujet détaillé :
Remarques :





SM207122 Compositional decision procedures for a fragment of Separation Logic


Description


Separation Logic (SL) is a widely used formalism for describing dynamically allocated linked data structures, such as lists, trees, etc. The decidability status of various fragments of the logic constitutes a long-standing open problem. Recent results report on techniques to decide satisfiability and validity of entailments for fragments of SL using a rather large class of inductively defined predicates specifying data structures like (doubly) linked lists, trees, etc. These techniques mainly proceed by reduction to logics whose decidability is known (e.g., first-order logic with reachability and set constraints, or Monadic Second-Order Logic on graphs of bounded tree width). In practice, however, these reduction approaches do not behave well. Better results are obtained when the power of the separation operator of SL is used to split, e.g., an entailment check between big formulas into a set of entailment checks between simple formulas. Such approaches, called compositional, have been proposed for SL fragments with (nested) singly linked lists; they are based on a graph representation of formulas and on graph homomorphism. The internship will study how to extend the compositional approaches to obtain decision procedures (DP) for the entailment problem for SL fragments with more general data structures, including those handled by the reduction approaches. Depending on the abilities of the student, the internship may include implementing the studied DP and/or experimenting with existing DP (some developed in our team). URL sujet détaillé :
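Since the compositional approaches above reduce entailment to a (label-constrained) form of graph homomorphism, a brute-force homomorphism check conveys the core subroutine. The sketch below handles plain digraphs only and ignores the labels and side constraints a real SL procedure needs.

```python
from itertools import product

def homomorphism_exists(g_nodes, g_edges, h_nodes, h_edges):
    """Brute-force search for a homomorphism from digraph G to digraph H:
    a map f such that (f(u), f(v)) is an edge of H for every edge (u, v) of G.
    Returns such a map as a dict, or None if none exists."""
    h_edges = set(h_edges)
    for images in product(h_nodes, repeat=len(g_nodes)):
        f = dict(zip(g_nodes, images))
        if all((f[u], f[v]) in h_edges for (u, v) in g_edges):
            return f
    return None
```

For instance, a single edge maps into a triangle, but a two-edge path does not map into a lone edge (the middle vertex would need two images).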
Remarques : The internship will be co-supervised with Constantin Enea.





SM207123 Execution of a PGAS programming model on an embedded multicore


Description


This internship aims to explore the opportunity of exploiting a parallel programming model for high-performance applications on a recent multicore system-on-chip (SoC).
Usual parallel programming models include the shared-memory vision, e.g., as in OpenMP or Pthreads, and the distributed-memory vision, e.g., as in MPI. A more recent model, named partitioned global address space (PGAS), aims to provide a compromise approach in which threads have a global, logically shared memory space, combined with data-locality awareness and one-sided communication. Beyond performance, this model also aims to improve the programmer's productivity by promoting suitable language constructs and code portability. An example of a language embracing the PGAS model is Unified Parallel C (UPC). It comes with a compiler and runtime system, and is supported on a range of system configurations. The target system in this study is the Odroid-XU platform, which includes an ARM Exynos 5 SoC architecture.
The main objective of this internship is first to address the execution of UPC-based application programs on the Odroid-XU platform. The efficiency of this execution should be demonstrated by exploiting both UPC-related features, such as data locality, and features of the Odroid-XU, such as the ability to switch execution between the A7 energy-efficient multicore and the A15 performance-oriented multicore.
The internship is a good opportunity for the successful candidate to apply his/her skills in programming, compilation and architecture in a pragmatic and very exciting research field. The related research topic could later be continued in a PhD thesis. URL sujet détaillé : http://www.lirmm.fr/~gamatie/pages/PGASIntern2014.pdf
Remarques :





SM207124 Analysis of cellular networks fed by renewable energy sources


Description


General context:
Renewable energy is energy produced by resources that are unlimited (at least on a human timescale) such as wind, sunlight, waves, etc. The ICT (Information and Communications Technology) sector, as one of the main energy consumers worldwide, is becoming more and more interested in renewable energy as a means of reducing its environmental footprint. Past efforts to make communication networks green have focused on improving energy efficiency across the network. A more recent and complementary approach consists in using renewable energy whenever possible. Recent initiatives include the installation of 2,000 cellular base stations powered by renewable energies worldwide (according to Alcatel-Lucent).
Goal of the research:
This research aims at evaluating whether it is feasible to power base stations solely with renewable energy, and more precisely with solar energy. Batteries must be used to store the harvested energy and to compensate for periods when the power consumption exceeds the energy generation rate. We propose to model the battery level as a continuous-time Markov chain (CTMC). A transient analysis is necessary because both the solar radiation that recharges the battery and the traffic intensity that depletes it are non-stationary (daily and seasonal patterns).
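One standard method such a comparative study will certainly encounter is uniformization, sketched below in pure Python (the two-state generator used as an example is a placeholder, not a battery model; the loop is suitable for moderate values of the uniformization rate times t).

```python
import math

def transient_distribution(Q, p0, t, tol=1e-12):
    """Transient distribution of a CTMC by uniformization:
    p(t) = sum_k Poisson(k; L*t) * (p0 P^k), with P = I + Q/L
    and L an upper bound on the exit rates."""
    n = len(Q)
    L = max(-Q[i][i] for i in range(n)) or 1.0   # uniformization rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / L for j in range(n)]
         for i in range(n)]
    p = list(p0)                 # current vector p0 @ P^k
    result = [0.0] * n
    w = math.exp(-L * t)         # Poisson weight for k = 0
    k, acc = 0, 0.0
    while acc < 1.0 - tol:       # stop once the Poisson weights nearly sum to 1
        result = [result[j] + w * p[j] for j in range(n)]
        acc += w
        k += 1
        w *= L * t / k
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return result
```

On the symmetric two-state chain with rates 1, this reproduces the closed form p_1(t) = (1 + e^{-2t})/2.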
Specific tasks:
During this internship, the student will research existing methods for efficiently computing the transient distribution of a CTMC. A comparative study shall be performed, and the most efficient method implemented in a mathematical tool such as Maple or Matlab. Another task consists in finding a suitable model that fits real traces of solar radiation. Depending on the availability of traffic data, a similar task can be planned to find a convenient model for the traffic generated in a cell. URL sujet détaillé :
Remarques : This research topic is conducted in collaboration with Alain Jean-Marie (Inria).





SM207125 Analysis of a real DNS trace


Description


General context:
In-network caching is a widely adopted technique to provide efficient access to data or resources in a worldwide deployed system while ensuring scalability and availability. For instance, caches are integral components of the Domain Name System, the World Wide Web, Content Distribution Networks, and the recently proposed Information-Centric Network (ICN) architectures. Many of these systems are hierarchical. The cached content is managed through expiration-based policies using a time-to-live (TTL), or through replacement algorithms such as Least Recently Used, First-In First-Out, Random, etc.
The Domain Name System (DNS) is a valid application case. The DNS maintains in a distributed database the mappings, called resource records, between names and addresses in the Internet. Servers in charge of managing a mapping are said to be authoritative. Caches, used to avoid generating traffic up the DNS hierarchy, are found in both servers and clients (devices of end users). Caching is however limited in duration, to avoid stale records which may break the domains involved. DNS cache updates are strongly related to how the DNS hierarchy works. When a requested resource record R is not found in the client's cache, the client issues a request up the hierarchy until R is found. The server providing R is called the answerer. The record R is sent back to the client through the reverse path between the answerer and the client, and a copy of R is left in each cache on this path. According to RFC 6195, all copies of R are marked by the answerer with a time-to-live (TTL) which indicates to caches the number of seconds their copy of R may be cached.
Goal of the research:
This research aims at evaluating the performance of TTL-based caching in terms of hit probabilities, cache occupancy and miss processes. Analytical models have been developed and a preliminary validation performed. We have collected traces from a real DNS cache (the cache of one of the two DNS servers of Inria at Sophia Antipolis) over the period from 21 June to 1 July 2013. The trace contains information about 2,313,984 resource records requested by a total of 2,147 users.
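As a minimal starting point for the trace-driven part, a single TTL cache can be simulated in a few lines. The sketch assumes the DNS-style, non-reset semantics described above: the timer is set only on the miss that fetches the record, and hits do not extend it.

```python
def ttl_cache_hits(requests, ttl):
    """Single TTL cache with non-reset semantics.
    `requests`: list of (time, content) pairs sorted by time;
    `ttl`: caching duration in seconds. Returns (hits, misses)."""
    expiry = {}              # content -> expiration time of the cached copy
    hits = misses = 0
    for t, c in requests:
        if expiry.get(c, float("-inf")) > t:
            hits += 1        # a fresh copy is still cached
        else:
            misses += 1      # fetch and cache for `ttl` seconds
            expiry[c] = t + ttl
    return hits, misses
```

For instance, requests for one content at times 0, 1 and 3 with a TTL of 2 give one hit (at time 1) and two misses.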
Specific tasks:
During this internship, the student will analyse this trace in order to compute, for each content, the relevant characteristics (request arrival rate, TTL values). The popularity of the contents will be computed, as well as the best TTL value for each content (according to some specific constraint). In a second step, trace-driven simulations of a single cache and of network scenarios will be performed. The objective is to identify from simulation which TTL distribution is best at higher-level caches. To this end, bottom caches will be fed by the arrival processes recorded in the trace and will assign the best constant TTL to each content. The miss process at higher-level caches will be compared with the one predicted by the analytical model. URL sujet détaillé :
Remarques :





SM207126 Towards a safe subset of JavaScript


Description


JavaScript has become the de facto assembly language of the web. Not only is it used for web applications, but it is now the compilation target of several languages, enabling many programs to run directly in a browser. Unfortunately, JavaScript is a complex and very dynamic language; JavaScript programs are thus difficult to analyze and even harder to prove correct. The combination of widespread use and complexity motivates the goal of this internship: researching formal techniques for safer JavaScript development.
The core of the internship will be the identification and formalization of a subset of the JavaScript language, based on the work of the JSCert project. Building upon this formal foundation, and expanding it as necessary, the internship may then proceed to one of the following two research tracks: certified compilation of higher-level languages to JavaScript, or the development of certified analyses. URL sujet détaillé : http://www.irisa.fr/celtique/aschmitt/internships/jsm2lyon2014.html
Remarques : The student may be paid.





SM207127 Numerical algorithms for certified topological and geometrical description of singular curves


Description


Motivation.
The description of the topology and geometry of curves arises in a wide range of applications, from scientific visualization to the design of robotic mechanisms. Currently, most software either (a) provides numerical approximations without guarantees for singular curves, or (b) relies on heavier computer-algebra tools that handle singular curves but suffer from poor efficiency in practice. This internship addresses the design and implementation of certified numerical algorithms for restricted classes of singular curves, namely the projection of the intersection of two surfaces, or the projection of the silhouette of a surface.
Topological and geometrical descriptions.
In general, a curve C that is the solution of a system of polynomial equations cannot be represented exactly. A common approach is to compute a set of points at distance smaller than a given threshold from C. The main drawback of this approach is that it misses topological information, such as the number of connected components, self-intersections or isolated points.
Another approach is to compute a piecewiselinear graph that has the same topology as C. The main challenge is to perform this computation efficiently with numerical and certified algorithms.
Numerical methods.
When the curve is smooth, numerical methods ([Rum10]) are efficient at representing its topology and geometry with a piecewise-linear graph. However, the projection of a curve of R^3 is not smooth in general, and no numerical method can yet certify its topology. A challenge is to design a numerical method that efficiently represents the topology and geometry of the projection in this case.
Objective of the internship
Let S be a curve of R^3 given by the system (S) of two polynomial equations:
f(x,y,z) = 0,   g(x,y,z) = 0.
The projection of S, denoted by C, is the set of points in the xy-plane above which (S) has a solution. As shown in Figure 1, this set is a curve that can contain self-intersections and isolated points. For any point p in the xy-plane, denote by Np the number of points of S that project onto p. If C is generic, then Np can be zero, one or two. Computing the topology of C can be reduced to isolating, with certification, the points p such that Np = 2.
Recall that fast numerical algorithms based on the Newton method can only certify systems with as many equations as unknowns. The first challenge is thus to find such a square system whose solutions are exactly the points p with Np = 2. From there, according to the motivations of the candidate, two lines of research can be explored.
On the one hand, design of a complete certified numerical algorithm to isolate these points. This requires the implementation of several numerical algorithms based on the Newton method, and their experimental comparison.
On the other hand, generalize the system to higher dimensions. For curves defined by three polynomial equations in four variables, which system describes the points in the xy-plane such that Np = 2? Is it possible to generalize this approach to curves in R^n?
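As a concrete illustration of the first challenge, one natural candidate square system (a sketch under assumptions, not necessarily the system the internship should arrive at) describes the points p = (x, y) with two preimages:

```latex
% Four equations in the four unknowns (x, y, z_1, z_2):
\begin{aligned}
f(x,y,z_1) &= 0, & g(x,y,z_1) &= 0,\\
f(x,y,z_2) &= 0, & g(x,y,z_2) &= 0.
\end{aligned}
```

Its solution set, however, contains the spurious component z_1 = z_2 (every point of S satisfies it trivially), so a certified method must first remove or desingularize this component, e.g. by working with divided differences of f and g with respect to z_1 and z_2, before Newton-based certification applies.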
The candidate should have a taste for both mathematics (geometry or numerical analysis) and computer science. Programming skills are appreciated (C/C++, Python or Matlab).
Reference
[Rum10] Siegfried M. Rump. Verification methods: Rigorous results using floating-point arithmetic. Acta Numerica, 19:287-449, May 2010. URL sujet détaillé : http://webloria.loria.fr/~pougetma/web_files/2014internshipINRIA.pdf
Remarques : Co-supervisor: Marc Pouget marc.pouget.fr
Funding: ANR project SingCAST (topology of singular curves and surfaces)





SM207128 Communication Complexity for twodimensional regular languages


Description


Picture languages are a two-dimensional generalisation of the classical one-dimensional concept of regular (rational) languages, obtained by replacing words with rectangular pictures. They can be defined equivalently in various ways: by logical formulas, regular expressions or tiling systems. However, none of these provides sufficient means to characterise regular languages, i.e. a systematic method to prove whether some language is indeed regular. In fact, while problems about one-dimensional regular languages can often be solved using automata or graph theory, the same questions on two-dimensional languages usually require tools from complexity (or computability) theory. The purpose of the internship is to devise a new necessary condition for a language to be rational, based on communication complexity. Informally, communication complexity studies how much information must be transferred between two peers, Alice and Bob, each holding a different fragment of the input, for them to compute some given function of the whole input. This notion of complexity is well adapted to the study of two-dimensional languages. Indeed, we can represent in this formalism the obvious remark that, when tearing a picture in half, the information each fragment contains about the other half is encoded in the boundary. The intern is expected first to acquire knowledge of two-dimensional languages and communication complexity, and then to use them to study how this tool may help answer some open problems in two-dimensional language theory. In particular, we will be interested in proving that a specific two-dimensional language (the reverse context-free shift) is not regular. A basic knowledge of, and an interest in, formal language theory is expected from the applicant. URL sujet détaillé : http://www.loria.fr/~ejeandel/files/stageM2.2.eng.pdf
Remarques :





SM207129 NonMonotony Impact in Boolean Networks


Description


For about four decades, the modelling of biological regulation networks has raised numerous questions at the frontier of mathematics, fundamental computer science and biology. In this interdisciplinary context, clearly anchored in theoretical computer science, this internship follows recent works dealing with the dynamics and complexity properties of general interaction networks, when the latter are modelled by Boolean automata networks. More precisely, the aim of this internship is to go further towards understanding and characterising the dynamics (transient and limit behaviour) and the complexity (attraction-basin sizes and convergence times) of the class of non-monotone networks.
Most studies and known results on the dynamics of Boolean automata networks assume that the set of local transition functions defining a network contains only locally monotone functions. Intuitively, this means that if an automaton i influences an automaton j in the network, then there is no ambiguity about the nature of this influence, which must be either positive (i tends to activate j) or negative (i tends to inhibit j). This local monotony is a strong property that de facto induces a strong restriction on the networks studied. However, no biological results argue a priori in favour of such a restriction. So, now that many interesting properties have been proven for locally monotone networks, it is important and relevant to study networks that no longer have this feature. Moreover, our specific interest in this class of Boolean automata networks comes from previous works on the differences in their behaviours depending on their synchronous or asynchronous nature [1,2,3]. In particular, we have shown that the smallest Boolean automata networks that are not robust to changes of synchronicity are non-monotone ones. We have also shown that monotone networks can have this property of non-robustness. One of the remarkable points here is that non-robust monotone networks can seemingly always be encoded by a smaller non-monotone network. At present, no general result explaining this phenomenon has been found, although certain conditions for a monotone network to be non-robust to synchronicity perturbations have shown the need for very specific cycles in its definition. In our view, this emphasises the essential role of non-monotone architectural patterns in the complex dynamical behaviour of interaction networks.
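As a toy illustration of these definitions (the network below is our illustrative assumption, not one of the networks from [1,2,3]), consider a two-node Boolean automata network whose first local function is the non-monotone XOR; comparing synchronous and asynchronous updates already shows diverging behaviours.

```python
from itertools import product

# Toy two-node Boolean automata network; node 0's local function (XOR)
# is non-monotone: neither increasing nor decreasing in either input.
f = [lambda x: x[0] ^ x[1],   # node 0: XOR of both nodes
     lambda x: x[0]]          # node 1: copy of node 0

def sync_step(x):
    """Parallel (synchronous) update: all nodes fire at once."""
    return tuple(fi(x) for fi in f)

def async_steps(x):
    """Elementary asynchronous updates: successors obtained by firing one node."""
    n = len(x)
    return [tuple(f[i](x) if j == i else x[j] for j in range(n))
            for i in range(n)]

# Full synchronous transition graph over the 4 configurations.
sync_graph = {x: sync_step(x) for x in product((0, 1), repeat=2)}
```

Here the synchronous dynamics traps (0,1) -> (1,0) -> (1,1) -> (0,1) in a limit cycle, while firing node 1 alone from (0,1) reaches the fixed point (0,0): the dynamics genuinely depend on synchronicity.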
Thus, for biological reasons related to the possibly intrinsic non-monotone nature of regulations, as well as for computer-science reasons related to the influence of synchronicity, studying the impact of non-monotony is very pertinent and promising, and should open other interesting questions. URL sujet détaillé : http://pageperso.lif.univmrs.fr/~sylvain.sene/files/intern_nonmon_en.pdf
Remarques : The subject is available in French at http://pageperso.lif.univmrs.fr/~sylvain.sene/files/stage_nonmon_fr.pdf
- Co-supervised with Adrien Richard (http://www.i3s.unice.fr/~richard/) of the I3S lab in Sophia-Antipolis.
- Two possible locations:
> the LIF in Marseille
> the I3S in Sophia-Antipolis
- Funding: yes
- A continuation as a PhD is envisaged





SM207130 Cyberdeterrence  Why does it work?


Description


On a technical level, the asymmetry of threats in cyberspace and the preference of most countries for offensive strategies over defensive ones should already have ignited many conflicts, some of them escalating to world-scale casualties. On the contrary, we observe only a handful of conflicts, most of them minor. As in other research projects conducted at the Maryland Cybersecurity Center, the student will integrate behavioral, statistical and cybersecurity knowledge to determine the dynamics that are deterring assailants or preventing them from developing cyberattacks to their full potential.
Jason Healey (ed.), "A Fierce Domain: Conflict in Cyberspace, 1986 to 2012", Cyber Conflict Studies Association, 2013. URL sujet détaillé :
Remarques : The student will benefit from cybersecurity events and other events organized by the Atlantic Council in Washington DC. The University of Maryland will support a visa for nationals of France and similar countries. A PhD opening at the University of Maryland will be linked to the work produced by the student and his/her rank at the end of the M2.





SM207131 Selfreducibility in distributed local decision


Description


The objective of the internship is to investigate the potential connections between decision and construction algorithms in the distributed setting, within the framework of local network computing. One question of particular interest is to measure the power given to the system by providing each node with an oracle deciding a language L, with respect to the ability of these nodes to construct an instance of L. URL sujet détaillé : http://www.liafa.univparisdiderot.fr/~pierref/MPRI/internshipmpri2014.pdf
Remarques :





SM207132 Study and implementation of multilinear maps based on Euclidean lattices


Description


Traditional cryptography is reaching its limits in terms of functionality with regard to the new applications offered by cloud storage and cloud computing. Recently, lattice-based cryptography [Ajt96,Reg05] has re-emerged as an alternative to classical (RSA/discrete-log-based) cryptography. It has many appealing features, such as strong security guarantees and efficient underlying arithmetic operations. Moreover, it has very recently been shown [GGH12] that lattices allow the design of multilinear maps, a very powerful tool for building versatile cryptosystems.
The goal of this internship is to understand and implement such multilinear maps, which involves manipulating very large lattices, generating discrete Gaussians, etc. A lot of work remains to prove that such objects have a real practical impact.
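As a first taste of one ingredient mentioned above, here is a naive rejection sampler for a discrete Gaussian over the integers (real implementations use far more careful techniques such as CDT or Knuth-Yao sampling; the parameter values below are illustrative assumptions only).

```python
import math
import random

def sample_discrete_gaussian(sigma, rng, tail=12):
    """Naive rejection sampler for the discrete Gaussian over Z,
    proportional to exp(-x^2 / (2 sigma^2)), truncated at +/- tail*sigma."""
    bound = int(math.ceil(tail * sigma))
    while True:
        x = rng.randint(-bound, bound)           # uniform proposal
        if rng.random() < math.exp(-x * x / (2 * sigma * sigma)):
            return x                             # accept with Gaussian weight

rng = random.Random(0)
samples = [sample_discrete_gaussian(3.2, rng) for _ in range(1000)]
```

The acceptance rate is roughly sigma*sqrt(2*pi) / (2*tail*sigma), i.e. around 10% here, which is fine for a sketch but far from production quality.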
[Ajt96] M. Ajtai. Generating Hard Instances of Lattice Problems. Proc. of STOC 1996, 99-108 (1996)
[GGH12] S. Garg, C. Gentry and S. Halevi. Candidate Multilinear Maps from Ideal Lattices and Applications. Proc. of Eurocrypt 2013, Springer LNCS Vol. 7881, 1-17 (2013)
[Reg05] O. Regev. On lattices, learning with errors, random linear codes, and cryptography. Proc. of STOC 2005, ACM, 84-93 (2005) URL sujet détaillé : http://perso.enslyon.fr/fabien.laguillaumie/sujet_M2ENSL.pdf
Remarques :





SM207133 SMT featuring Separation Logic


Description


One of the main goals of program analysis and verification is to develop fully automated tools that can reason efficiently about as many programs as possible. However, scalable automated reasoning remains one of the main challenges for programs that operate on user-defined data structures on the heap.
Over the last decade, separation logic has turned out to be a very suitable formalism for reasoning about such data structures. While automation of reasoning with separation logic has recently made some impressive progress, this is to a certain extent due to 'hard-coding' the possible shapes of data structures, most notably lists and trees, into the program analysis tools. We now aim to lift this restriction to hard-coded data structures and allow more general definitions of data structures by inductive predicates.
Moreover, automated reasoning has made significant advances, most notably in the areas of SAT and SMT solving. We wish to harness this recent progress and transfer it to separation logic. One prominent feature of SMT solvers is the support for combinations of theories like integers, bitvectors, arrays, etc. Since programs usually operate not only on user-defined data structures but also on built-in data types like integers or bitvectors, we can here reuse and build upon existing technology to obtain a more general program analysis framework.
The goal of this internship project is to develop a theory solver plugin for a stateoftheart SMT solver for a suitable fragment of separation logic. Here we build on a recent result on satisfiability checking for separation logic, which however does not yet consider combinations with other theories:
http://www.cs.ucl.ac.uk/fileadmin/UCLCS/research/Research_Notes/RN_13_15.pdf
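To make the flavour of such satisfiability checks concrete, here is a toy decision procedure (our drastic simplification, far below the fragment targeted by the internship) for symbolic heaps made of equalities and *-conjoined points-to assertions: with no disequalities or nil constraints and an unbounded heap, such a heap is satisfiable iff no two allocated addresses are forced equal.

```python
class UnionFind:
    """Equivalence classes of program variables induced by equalities."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def satisfiable(equalities, points_to):
    """Symbolic heap (x1=y1 /\ ...) /\ (a1 |-> b1 * ... * an |-> bn):
    satisfiable iff the separating conjunction allocates no cell twice,
    i.e. all source addresses lie in pairwise distinct classes."""
    uf = UnionFind()
    for x, y in equalities:
        uf.union(x, y)
    roots = [uf.find(a) for a, _ in points_to]
    return len(roots) == len(set(roots))
```

For example, `x = y /\ x |-> a * y |-> b` is unsatisfiable (the same cell would be allocated twice), while dropping the equality makes it satisfiable.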
This project thus has both a theoretical component in developing the underlying automated reasoning techniques and also a practical component in the form of implementing the novel technique as a theory solver plugin for a stateoftheart SMT solver. URL sujet détaillé :
Remarques :





SM207134 Improving mobile robot algorithms through the use of geolocation


Description


Mobile agent systems are distributed environments in which the nodes of the network are passive, and the mobile agents carry out the execution of the algorithm. The network is represented by a graph, and agents can move from one node to another along the edges of the graph. Recent work on mobile agents assumes that the agents know their initial position and have complete knowledge of the map of the terrain (represented as a graph). Although the results obtained are interesting, complete knowledge of the graph has a memory cost that depends on the size and complexity of the terrain. During this internship, we will therefore try to obtain similar results for the rendezvous problem without knowledge of the map. An agent can access its coordinates in a planar graph at any time, but does not know the exact topology of the graph. URL sujet détaillé : http://pageperso.lif.univmrs.fr/~arnaud.labourel/sujetLYON2013.pdf
Remarques : A continuation as a PhD is possible.





SM207135 Cunningham Chain Mining


Description


Problems and puzzles involving prime numbers have always fascinated humans. A Cunningham chain (of the first kind) is a sequence of the form p, 2p+1, 4p+3, ... such that each term in the sequence is prime. Similarly, sequences of primes built using the relation p_{i+1} = 2p_i - 1 are known as Cunningham chains of the second kind.
Following Dickson's conjecture, it is widely believed that for every k there are infinitely many Cunningham chains of length k. Apart from proving this conjecture, there exist various interesting challenges associated with Cunningham chains. For example, one may want to find the largest prime starting a chain of a given length. Another interesting problem consists in finding Cunningham chains of maximal length; a problem which has recently been proposed as the setting for primecoin mining, a mining process associated with a virtual currency similar to bitcoin.
Investigating efficient algorithms and implementations addressing the above challenges is the main objective of the internship. The algorithms involved are related to sieving techniques, primality proving and the associated arithmetic primitives. If time permits, highly optimized implementations of the proposed algorithms on GPUs and/or multicore processors will also be considered.
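A naive reference implementation (no sieving; deterministic Miller-Rabin, valid for inputs below about 3.3*10^24) against which optimized versions can be checked:

```python
def is_prime(n):
    """Deterministic Miller-Rabin for n < 3.3 * 10**24 (known witness set)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def cunningham_chain(p, kind=1):
    """Maximal Cunningham chain starting at prime p.
    kind 1: next = 2p + 1 ; kind 2: next = 2p - 1."""
    assert is_prime(p)
    step = 1 if kind == 1 else -1
    chain = [p]
    while is_prime(2 * chain[-1] + step):
        chain.append(2 * chain[-1] + step)
    return chain
```

Real mining would first sieve candidate chains by small primes to avoid running the (much costlier) primality test on obvious composites.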
Links:
http://primecoin.org
http://primecoin.org/static/primecoinpaper.pdf
http://primes.utm.edu/glossary/page.php?sort=CunninghamChain
http://users.cybercity.dk/~dsl522332/math/Cunningham_Chain_records.htm
http://en.wikipedia.org/wiki/Cunningham_chain
URL sujet détaillé : http://www.lirmm.fr/~imbert/sujets/cunninghamchains.pdf
Remarques : Subject co-supervised by Pascal GIORGI and Bastien VIALLA
Funding is possible





SM207136 Exploring the behavior of probabilistic cellular automata


Description


Cellular automata are both a computation model and a tool for modelling real-life phenomena (epidemic spreading, wildfires, traffic, crystallography, etc.). A cellular automaton is made of several cells, each characterized by a state. The state of a cell evolves according to the states of its neighbors. Classically, cells evolve synchronously, i.e. all cells are updated and switch their state at the same time. Several works have focused on probabilistic dynamics, mainly for three different reasons: random errors occur and have to be erased, in which case probabilities are adversaries; random computations can solve problems which are impossible in the deterministic case, in which case probabilities are allies; probabilities are inherent to a model introduced to simulate a real-life phenomenon. Generally, studies focus on a specific stochastic cellular automaton. Recent works have started to consider an exploratory approach, studying a great number of stochastic cellular automata in order to characterize and classify the different behaviors encountered. These works mainly focused on simulations of simple cellular automata (2 states, 1D, small neighborhood, simple rules), and they developed tools for analyzing and understanding the behavior of these automata. The aim of this internship is to look for original behaviors using simulations, and to generalize the previous tools to these new behaviors. Interesting automata have already been identified. Thus one part of this internship relies on simulations, while the other consists of theoretical analyses. We are looking for a student with a good knowledge of theoretical computer science and some basics of probability theory; no prior knowledge of cellular automata is needed.
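A minimal simulation sketch in the spirit of such exploratory studies (alpha-asynchronous dynamics of an elementary rule; rule 232, i.e. local majority, is an illustrative choice of ours, not an automaton singled out by the internship):

```python
import random

def step(config, rule, alpha, rng):
    """One step of a 1D binary CA on a ring where each cell applies the
    local `rule` only with probability `alpha`, and keeps its state
    otherwise -- the 'alpha-asynchronous' stochastic dynamics."""
    n = len(config)
    out = []
    for i in range(n):
        left, centre, right = config[i - 1], config[i], config[(i + 1) % n]
        if rng.random() < alpha:
            out.append(rule(left, centre, right))
        else:
            out.append(centre)
    return out

# Elementary rule 232 (local majority) as an example local rule.
majority = lambda l, c, r: 1 if l + c + r >= 2 else 0

rng = random.Random(42)
config = [rng.randint(0, 1) for _ in range(30)]
for _ in range(100):
    config = step(config, majority, alpha=0.5, rng=rng)
```

With alpha = 1 this is the usual synchronous dynamics; homogeneous configurations are fixed points of the majority rule in either regime.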
Either Damien Regnault is the main advisor and the internship takes place in Évry, or Sylvain Sené is the main advisor and the internship takes place in Marseille. Whatever the student's choice, both advisors will interact with him or her. URL sujet détaillé : https://www.ibisc.univevry.fr/~dregnault/Page_dacceuil_files/%5BRegnault,Sene%5DStage_AC.pdf
Remarques : Co-supervised with Sylvain Sené, professor at the LIF. The internship can take place in Évry or Marseille, at the student's choice.





SM207137 Improper colouring of weighted hexagonal graphs


Description


Motivated by channel assignment problems, we study improper colouring of weighted graphs, and more specifically of weighted hexagonal graphs, since they crop up naturally in these channel assignment problems. A weighted graph is a pair (G,p), where G is a graph and p a weight function on the vertex set of G. A k-improper t-colouring of a weighted graph (G,p) is a mapping C from V(G) into the subsets of {1, ..., t} such that (a) for every vertex v, |C(v)| = p(v), and (b) for every colour c in {1, ..., t}, the graph induced by the vertices v having c in their colour set C(v) has maximum degree at most k.
Observe that when k=0, condition (b) is equivalent to imposing that for each edge uv the sets C(u) and C(v) are disjoint.
Hexagonal graphs are subgraphs of the triangular lattice.
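A checker for the definition above is straightforward and useful for testing any future algorithm (this sketch assumes condition (a) reads |C(v)| = p(v), and uses colours 0..t-1 for convenience):

```python
def is_valid(colour_sets, demand, edges, k, t):
    """Check a k-improper t-colouring of a weighted graph.
    colour_sets[v]: set of colours assigned to v (subset of range(t));
    demand[v]: weight p(v); edges: list of vertex pairs."""
    # (a) every vertex receives exactly its demanded number of colours
    if any(len(colour_sets[v]) != demand[v] for v in colour_sets):
        return False
    # (b) for each colour c, the graph induced by the vertices using c
    #     must have maximum degree at most k
    for c in range(t):
        users = {v for v in colour_sets if c in colour_sets[v]}
        deg = {v: 0 for v in users}
        for u, v in edges:
            if u in users and v in users:
                deg[u] += 1
                deg[v] += 1
        if any(d > k for d in deg.values()):
            return False
    return True
```

With k = 0 this degenerates into checking a proper (multi)colouring, matching the observation above about condition (b).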
The aim of this internship is to design distributed approximation algorithms for k-improper colouring of weighted hexagonal graphs. There are many results for k=0, and the challenging question is to find an approximation algorithm with ratio less than 4/3. A conjecture suggests that 9/8 could be possible. For k>0, very little has been done so far, and the goal will be to find algorithms with the best possible approximation ratio. URL sujet détaillé : http://wwwsop.inria.fr/members/Frederic.Havet/sujetstagehexag.pdf
Remarques :





SM207138 Modular process networks and circuit synthesis


Description


Process networks are execution models that naturally express the parallelism of a computation. They are studied as models for parallel systems and support theoretical studies, analyses, and measures for the various problems related to parallelism (scheduling, load balancing, allocation, checkpointing, power consumption, etc.). Process networks are a natural intermediate representation for parallelizing compilers, where the front-end extracts the parallelism and produces a process network, and the back-end compiles the process network to the target architecture.
Highlevel circuit synthesis (HLS) consists in compiling a program written in a highlevel language (as C) to a circuit. The circuit must be as efficient as possible, while using the available resources in the best fashion (consumption, silicon surface, FPGA LUT units, etc). Though many advances were achieved on the backend aspects (pipeline construction, routing), the frontend aspects (I/O, parallelism) are still rudimentary and far less powerful than the approaches developed in the HPC community.
In that context, we have design a model of process network that fit HLSspecific constraints. Our model makes explicit the communications with the central memory and the parallel access to channels, and is close enough to the hardware constraints to be translated directly to a circuit.
This intership aims at studying how to extend our process network model with modularity. That is, the possibility to view a process network as a process in a bigger network, while keeping the efficiency of the global system. Such a feature would enlarge dramatically the class of programs currently handled, allowing bigger kernels and irregular constructions.
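As a toy illustration of the execution model (a hypothetical sketch with made-up node functions, unrelated to the team's actual tool), a process network can be simulated as nodes communicating through FIFO channels, each node firing only when all its inputs carry data:

```python
from collections import deque

def run_network(processes, channels, steps=100):
    """Round-robin simulation: a node (inputs, outputs, fn) fires when
    every one of its input channels is non-empty (dataflow rule)."""
    for _ in range(steps):
        fired = False
        for inputs, outputs, fn in processes:
            if all(channels[i] for i in inputs):
                args = [channels[i].popleft() for i in inputs]
                for o, v in zip(outputs, fn(*args)):
                    channels[o].append(v)
                fired = True
        if not fired:   # network has stabilised
            break

channels = {"src": deque([1, 2, 3, 4]), "mid": deque(), "out": deque()}
processes = [
    (["src"], ["mid"], lambda x: [x * x]),   # square each token
    (["mid"], ["out"], lambda x: [x + 1]),   # then add one
]
run_network(processes, channels)
print(list(channels["out"]))  # [2, 5, 10, 17]
```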
The student will:
- Study the state of the art in automatic parallelization and process networks.
- Invent a module system for our process networks, and propose an efficient compiler algorithm.
- Implement, test and validate the approach on community benchmarks. URL sujet détaillé :
Remarques :





SM207139 Symbolic FloatingPoint Arithmetic and the Optimality of Error Bounds


Description


Part of the Computer Arithmetic group at ENSL works on techniques and tools for designing reliable floating-point arithmetic programs. To construct proofs or counterexamples we often need to derive properties of the errors generated for some input data parametrized by the radix b and the precision p. Such properties are used for example in [2] to establish the optimality of an error bound for Kahan's 2x2 determinant algorithm and, more generally, results like this one are essential for a good understanding of the numerical behavior of computing kernels.
Unfortunately, the proofs of such properties have so far been done by hand, and this pen-and-paper approach is of course time consuming and error prone.
To overcome these difficulties we propose to implement a ``symbolic floating-point'' type in a computer algebra system, in order to be able to represent and compute with floating-point data parametrized by b and p, and eventually to handle examples like the one above automatically.
For this, the student is expected to learn the basics of floating-point arithmetic (starting from [1,3,4]), to define the features that such symbolic floating-point data should have, and to design and implement (for example within Maple) algorithms to handle them. Then, if time permits, the software produced will be applied beyond Kahan's algorithm, namely to the analysis of numerical kernels like Euclidean norms, Givens rotations, or fast Fourier transforms.
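Before any symbolic implementation, rounding in radix b and precision p can already be prototyped exactly with rational arithmetic. The sketch below (an illustrative prototype with made-up names, not the symbolic type the internship targets) implements round-to-nearest and checks the classical relative error bound u = b^(1-p)/2 on sample inputs:

```python
from fractions import Fraction

def round_to_nearest(x, b, p):
    """Round the positive rational x to the nearest precision-p, radix-b
    floating-point number m * b**e with b**(p-1) <= m < b**p (ties to even)."""
    assert x > 0
    e = 0
    # scale the significand into [b**(p-1), b**p)
    while x >= Fraction(b) ** p:
        x /= b
        e += 1
    while x < Fraction(b) ** (p - 1):
        x *= b
        e -= 1
    m = int(x)                    # floor
    frac = x - m
    half = Fraction(1, 2)
    if frac > half or (frac == half and m % 2 == 1):
        m += 1                    # round up (ties to even)
    return Fraction(m) * Fraction(b) ** e

b, p = 2, 4
u = Fraction(b) ** (1 - p) / 2    # unit roundoff
for num in range(1, 200):
    x = Fraction(num, 7)
    err = abs(round_to_nearest(x, b, p) - x) / x
    assert err <= u               # relative error bound |RN(x) - x| <= u*|x|
print("bound", u, "holds on all samples")
```

Replacing the concrete integers b and p by symbols, while keeping this rounding logic, is precisely the step the internship is about.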
[1] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM 2002.
[2] C.-P. Jeannerod, N. Louvet, J.-M. Muller. Further analysis of Kahan's algorithm for the accurate computation of 2x2 determinants. Math. Comp., 82:2245-2264, 2013.
[3] J.-M. Muller et al. Handbook of Floating-Point Arithmetic. Birkhäuser, 2010.
[4] J.-M. Muller. Exact computations with approximate arithmetic. Recent problems related to computer arithmetic, available at http://perso.enslyon.fr/jeanmichel.muller/ConfJourneesKnuth.pdf URL sujet détaillé :
Remarques : Co-supervised by Claude-Pierre Jeannerod and Nicolas Louvet.





SM207140 Toward certified data


Description


Current data management applications and systems involve huge and growing data volumes. Data can be numeric (e.g. the output of measurement instruments) or textual (e.g. corpora studied by social scientists); it may consist of news archives spanning several years, be structured as in astronomy or physics data, or highly unstructured as in medical patient files.
Data, in all its forms, is increasingly large in volume, as a result of computers capturing more and more of companies' activity and of scientists' work, and also as a result of better and more powerful automatic data gathering tools (e.g. space telescopes, focused crawlers, archived experimental data mandatory in some types of government-funded research programs).
The availability and reliability of such large data volumes is a gold mine for companies, scientists and ordinary citizens. It is therefore of crucial importance to protect data integrity and reliability by means of robust methods and tools.
Program verification and certification have been intensively studied over the last decades, yielding very impressive results and highly reliable software (e.g., the CompCert project [1]). Surprisingly, however, while the amount of data stored and managed by data engines has drastically increased, little attention (with the notable exception of [2]) has been devoted to ensuring that such (complex) systems are indeed reliable.
The aim of this internship is to take a first step toward formalizing and/or mechanizing data-intensive systems such as XML/XQuery or JSON/NoSQL engines using formal tools such as Coq or SMT provers (e.g., Alt-Ergo) embedded in the Why(3) platform.
Prerequisite: the candidate should have a very strong background in theoretical computer science with an emphasis on logic, as well as concrete practical experience of functional programming in OCaml-like programming languages. A good knowledge of Coq will be a plus.
[1] Xavier Leroy. A formally verified compiler back-end. J. Autom. Reasoning, 43(4):363-446, 2009.
[2] Gregory Malecha, Greg Morrisett, Avraham Shinnar, and Ryan Wisnesky. Toward a verified relational database management system. In ACM International Conference on Principles of Programming Languages (POPL), 2010. URL sujet détaillé :
Remarques : Co-supervisor: Evelyne Contejean.





SM207141 Exact Real Numbers Computations applied to Geometry


Description


Being able to compute exactly with real numbers is of paramount importance in computer science, especially in the field of computer graphics, for all geometric computations.
In Strasbourg and Poitiers, we work on a discrete model of the continuum, namely the Harthong-Reeb line, to represent reals and compute with them. The first goal of this internship is to link this way of computing exactly with real numbers with another library (Creal) written in OCaml. In addition, we want to investigate how to achieve efficiency and reliability when computing geometric predicates, using well-known techniques such as filters to avoid unnecessary computations. Finally, we intend to prove that all the algorithms and the code we write are correct by formalising them in the Coq proof assistant.
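To see why exactness matters for geometric predicates, compare a floating-point and a rational evaluation of the classical 2D orientation test (an illustrative sketch, independent of the Harthong-Reeb approach):

```python
from fractions import Fraction

def orient2d(ax, ay, bx, by, cx, cy):
    """Sign of the determinant |b-a, c-a|: >0 left turn, 0 collinear, <0 right."""
    d = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (d > 0) - (d < 0)

# Three exactly collinear points on the line y = x/3, whose coordinates
# are not representable as binary floats.
pts = [(Fraction(1), Fraction(1, 3)),
       (Fraction(2), Fraction(2, 3)),
       (Fraction(3), Fraction(1))]

exact = orient2d(*pts[0], *pts[1], *pts[2])
approx = orient2d(*(float(c) for p in pts for c in p))
print(exact)    # 0: the rational computation certifies collinearity
print(approx)   # may be nonzero: rounding makes the float test unreliable
```

Exact real arithmetic generalises this guarantee from rationals to arbitrary reals; filters make it cheap by falling back on exact evaluation only when the floating-point result is in doubt.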
Keywords: Exact Real Computation, Geometric Predicates, Formal Proofs, Coq, OCaml URL sujet détaillé : http://dptinfo.ustrasbg.fr/~magaud/sujetM2Ren.pdf
Remarques : Co-supervised with Laurent Fuchs (XLIM-SIC UMR 7252 CNRS, Université de Poitiers).





SM207142 Mobility Models for NonGeometric Dynamic Networks




SM207143 Relational and spatial data mining


Description


The internship will focus on propositionalisation, a data mining method designed for relational data. This method is used to transform data contained in relational tables into a single table. The problem complexity is thus reduced, and the problem can be processed by classical supervised or unsupervised data mining methods. The aim of the transformation is to capture the relevant information contained in the relational tables and to represent it with relational features. Several methods exist, logic-oriented or database-oriented. The student will first study one or two methods, from both a practical and a theoretical point of view. He or she will then study how to handle spatial relations within these methods; qualitative models of space will be used for that purpose. One of the approaches will finally be implemented and tested on a real dataset from the hydro-ecological domain. URL sujet détaillé :
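A minimal illustration of the idea (with entirely made-up tables, loosely inspired by the hydro-ecological setting): aggregate each individual's rows from a secondary relational table into numeric features attached to a single main table.

```python
# Main table: one row per river site; secondary table: many measurements
# per site. Propositionalisation flattens the 1-N relation into features.
sites = [{"site": "s1", "basin": "A"}, {"site": "s2", "basin": "B"}]
measures = [
    {"site": "s1", "taxon": "mayfly", "count": 4},
    {"site": "s1", "taxon": "midge", "count": 10},
    {"site": "s2", "taxon": "midge", "count": 3},
]

def propositionalise(main, secondary, key, value):
    """Attach count/sum/max aggregates of `value` to each main-table row."""
    out = []
    for row in main:
        related = [r[value] for r in secondary if r[key] == row[key]]
        features = dict(row)
        features["n_measures"] = len(related)
        features["sum_count"] = sum(related)
        features["max_count"] = max(related, default=0)
        out.append(features)
    return out

flat = propositionalise(sites, measures, key="site", value="count")
print(flat[0])  # {'site': 's1', 'basin': 'A', 'n_measures': 2, 'sum_count': 14, 'max_count': 10}
```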
Remarques : Co-supervised by Florence Le Ber (HDR). The internship is part of the ANR Fresqueau project, with a stipend of approximately 436 euros per month. A follow-up PhD is possible.





SM207144 In silico analysis of toxicity for synthetic pathways


Description


Advances in synthetic biology promise to give engineers control over cells, producing predictable behaviors of newly designed biological systems. This field has potential impact on strategic domains such as health, energy, and the environment. Necessarily, this raises a number of safety issues, mainly related to hazardous behaviors or reactions. As in many engineering sciences dealing with sensitive technology, risk detection and prevention must be tackled from the earliest phase of the design and pursued at each stage to guarantee the safety of the products. These problems are analyzed within toxicology. We are mainly interested in toxicogenomics: the study of the relationship between the structure and activity of the genome and the adverse biological effects of exogenous agents; in other words, the toxic response to perturbations of biological pathways.
In order to tackle toxicity risk, we have proposed a framework based on Petri nets with causal time (PeCT). The model allows analyzing safety properties of newly designed pathways by exhibiting reachable hazardous configurations. The analysis is based on the verification of properties formulated in temporal logics, whose satisfiability is automatically checked by the HELENA model checker. Our goal is to integrate the CAD environment GUBS, developed for the Synbiotic project (ANR Blanche 2010), with verification techniques.
The objective of the internship is to contribute to the development of GUBS, addressing the toxicity analysis problems. More specifically, the student will implement an algorithm that generates a PeCT model from a pathway. Then, he/she will develop an algorithm for translating the PeCT model into a finite state automaton and, finally, use the HELENA model checker to test properties written in a suitable temporal logic. URL sujet détaillé : https://www.ibisc.univevry.fr/~digiusto/StageMaster2.pdf
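A toy version of the safety check (a plain place/transition net with a made-up pathway, without the causal-time extension or HELENA) enumerates reachable markings and tests whether a hazardous marking can be reached:

```python
from collections import deque

def reachable_markings(initial, transitions):
    """BFS over markings of a place/transition net.
    A transition is a pair (consume, produce) of dicts place -> tokens;
    markings are stored as sorted tuples of (place, tokens) pairs."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        m = dict(queue.popleft())
        for consume, produce in transitions:
            if all(m.get(p, 0) >= n for p, n in consume.items()):
                m2 = dict(m)
                for p, n in consume.items():
                    m2[p] -= n
                for p, n in produce.items():
                    m2[p] = m2.get(p, 0) + n
                frozen = tuple(sorted(m2.items()))
                if frozen not in seen:
                    seen.add(frozen)
                    queue.append(frozen)
    return seen

# Toy pathway: substrate -> intermediate -> product, plus a side reaction
# turning two intermediates into a hazardous metabolite.
transitions = [
    ({"sub": 1}, {"int": 1}),
    ({"int": 1}, {"prod": 1}),
    ({"int": 2}, {"tox": 1}),
]
init = tuple(sorted({"sub": 2, "int": 0, "prod": 0, "tox": 0}.items()))
markings = reachable_markings(init, transitions)
hazardous = any(dict(m).get("tox", 0) > 0 for m in markings)
print(hazardous)  # True: the hazardous state is reachable
```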
Remarques : Co-supervisor: Cinzia Di Giusto. Stipend provided by the Synbiotic project.
Duration: 5-6 months.
A follow-up PhD is possible.





SM207145 From planar graphs to higher dimensions


Description


Planar graphs form a widely studied class of graphs whose history starts in the 19th century. Yet in the early 90's W. Schnyder found a surprising characterization of these graphs [Sch]: a graph G is planar if and only if its incidence poset has Dushnik-Miller dimension at most 3.
The tools introduced in that proof yield many combinatorial properties, such as bijections between (oriented) planar graphs and contact systems of polygons, TD-Delaunay triangulations, or Dyck words. They also appear in various computer science applications such as succinct encoding, graph drawing, graph spanners, etc.
This proposal aims at generalizing these properties to higher-dimensional objects: CW-complexes whose incidence poset has Dushnik-Miller dimension d. The case d=1 is the single vertex, the case d=2 corresponds to path graphs, and the case d=3 corresponds to 3-connected planar maps (= drawn planar graphs). For higher dimensions, P. Ossona de Mendez has considered simplicial complexes [Oss]. He has proved that any simplicial complex C whose incidence poset has Dushnik-Miller dimension d+1 linearly embeds in R^d.
Several properties of the planar case raise natural questions. For example, in which cases does a complex C admit a dual C^*? And is it always the case that C and C^* have the same Dushnik-Miller dimension? For the planar case the answers are: for 3-connected planar maps, and yes [BT].
For some of these questions we already have partial answers:
- Given a contact system S of d-boxes in R^d, the simplicial complex C(S) obtained from S has DM-dimension at most d+1. Under which conditions does the converse hold? For the planar case the answer is: for submaps of 4-connected planar maps [Tho].
- Given a contact system S of (d-1)-boxes in R^d, the simplicial complex C(S) obtained from S has DM-dimension at most d+1. Under which conditions does the converse hold? For the planar case the answer is: for bipartite planar graphs [FPP]. Note that in general there are d different types of boxes (according to their degenerate dimension), and the skeleton of C(S) is thus a d-colorable graph. Is this also a sufficient condition?
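Schnyder's characterization can be checked by hand on the smallest interesting case (hypothetical helper names): the incidence poset of the triangle K3 is exactly the intersection of three linear orders, so its Dushnik-Miller dimension is at most 3, as expected for a planar graph.

```python
from itertools import combinations

def intersection_order(linear_orders):
    """Pairs (x, y) with x strictly below y in every linear order."""
    elems = linear_orders[0]
    pairs = set()
    for x, y in combinations(elems, 2):
        for a, b in ((x, y), (y, x)):
            if all(L.index(a) < L.index(b) for L in linear_orders):
                pairs.add((a, b))
    return pairs

# Incidence poset of K3: each vertex a, b, c lies below the edges containing it.
poset = {("a", "ab"), ("a", "ac"), ("b", "ab"), ("b", "bc"),
         ("c", "ac"), ("c", "bc")}

# A 3-realizer: in L1 the edge bc (the one not containing a) drops below a, etc.
L1 = ["b", "c", "bc", "a", "ac", "ab"]
L2 = ["a", "c", "ac", "b", "bc", "ab"]
L3 = ["a", "b", "ab", "c", "bc", "ac"]

print(intersection_order([L1, L2, L3]) == poset)  # True
```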
[BT] G. Brightwell and W.T. Trotter, The order dimension of planar maps, SIAM J. Disc. Math. 10 (1997), 515-528.
[FPP] H. de Fraysseix, J. Pach, R. Pollack, How to draw a planar graph on a grid, Combinatorica 10(1) (1990), 41-51.
[Oss] P. Ossona de Mendez, Geometric realization of simplicial complexes, LNCS 1731 (1999), 323-332.
[Sch] W. Schnyder, Planar graphs and poset dimension, Order 5 (1989), 323-343.
[Tho] C. Thomassen, Plane representations of graphs, in: Progress in Graph Theory (Bondy and Murty, eds.), 1984, 336-342.
Prerequisites: basic knowledge of graph theory; knowledge of linear programming is a plus. URL sujet détaillé : http://www.lirmm.fr/~goncalves/pmwiki/index.php?n=Main.SujetM2
Remarques :





SM207146 Tools to help the designer to specify and improve assistance systems


Description


The aim of this internship is to create a method for detecting users' assistance needs for a given application, called the target application, in order to help designers create and improve an assistance system for this target application.
For this, the student will propose a system that exploits the collected traces (through an operational tool developed in the AGATE project) to identify and characterize problematic situations encountered by users.
This work will be used to enrich an editor [2] developed in Java and aimed at designers. This editor allows the design of a rule-based assistance system. The systems specified with the editor are then executed by a generic assistance engine [2], which traces users' actions and keeps a history of the provided assistance. Indicators calculated from use traces of the target application will be presented to the designer to guide him in his task. The system developed by the student should also, in a second phase, be used to identify strengths and weaknesses of an assistance system so that the designer can improve it. The student will develop two types of feedback. On the one hand, feedback will be generated from the observation of the target application and of the assistance system in use. On the other hand, feedback will be created from users' feedback: we will allow users to express their points of view on the relevance of the assistance (opinion on the provided assistance, or identification of an unfulfilled assistance need).
This project is part of the AGATE project (an Approach for Genericity in Assistance To complEx tasks), which aims to facilitate the use of complex systems by adding epiphytic assistance systems. An epiphytic system is a system able to perform actions in an external application without affecting its functioning. Epiphytic systems handled in the AGATE project are generic: they can be used on any application, without needing to modify it. URL sujet détaillé : http://liris.cnrs.fr/stephanie.jeandaubias/enseignement/projets/2014stageM2R_gb.pdf
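As a made-up illustration of such an indicator (the actual indicators are to be designed during the internship), repeated "cancel" actions within a short window of a use trace can flag a problematic situation:

```python
def problematic_situations(trace, window=3, threshold=2):
    """Flag positions where at least `threshold` 'cancel' events occur
    within a sliding window of `window` consecutive actions."""
    flags = []
    for i in range(len(trace) - window + 1):
        chunk = trace[i:i + window]
        if sum(1 for a in chunk if a == "cancel") >= threshold:
            flags.append(i)
    return flags

trace = ["open", "edit", "cancel", "cancel", "edit", "save"]
print(problematic_situations(trace))  # [1, 2]
```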
Remarques :





SM207147 Trace-based assistance


Description


The objective of this internship is to create a method for detecting users' assistance needs for a given application, called the target application. This method will define "typical uses" associated with a target application. A similarity measure will compare the current user's trace with the different defined uses, allowing the system to identify the user's purpose and adapt the proposed assistance. The system will also detect a significant difference between the current user's trace and the practice followed so far, and deduce either a change in the user's aim or a difficulty encountered by the user in achieving the initial goal.
This internship is part of the AGATE project (an Approach for Genericity in Assistance To complEx tasks), which aims to facilitate the use of complex software by adding epiphytic assistance systems (see Figure 1). An epiphytic system is a system able to perform actions in an external application without affecting its operation. Epiphytic systems handled under the AGATE project are generic: they can be used on any application, without having to modify it.
The student will design and develop a tool allowing an expert to define typical uses for a target application (Ⓐ see Figure 1), a system for comparing the user's traces with the defined typical uses using a similarity measure (Ⓑ see Figure 1), and then adapt the assistance provided to the user. URL sujet détaillé : http://liris.cnrs.fr/stephanie.jeandaubias/enseignement/projets/2014stageM2R2_gb.pdf
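One candidate similarity measure (an assumption for illustration, not necessarily the one the internship will adopt) compares the current trace with each typical use by matching-subsequence ratio:

```python
from difflib import SequenceMatcher

def closest_typical_use(current_trace, typical_uses):
    """Return the typical use whose action sequence is most similar to the
    user's current trace, plus the similarity score in [0, 1]."""
    best_name, best_score = None, -1.0
    for name, actions in typical_uses.items():
        score = SequenceMatcher(None, current_trace, actions).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

typical_uses = {
    "write_report": ["open", "edit", "format", "save", "export"],
    "review":       ["open", "comment", "comment", "save"],
}
trace = ["open", "edit", "save", "export"]
name, score = closest_typical_use(trace, typical_uses)
print(name)  # write_report
```

A low best score, or a sudden drop of the score as the trace grows, would be the signal for a change of aim or a difficulty.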
Remarques :





SM207148 Models of shared vehicle networks


Description


Bike sharing systems such as Velib' in Paris or Vélo'v in Lyon have appeared as an alternative green mode of urban transportation and have seen explosive growth over recent years; such programs have been launched in many cities around the world. System sizing and management are far from easy. Users have to face the lack of vehicles, or of available slots, in the stations, due to differences in attractiveness among the stations or simply to randomness. Maintaining both resources, vehicles and empty slots, available to the users in the stations is a challenging problem for the operator, called regulation. Balancing the stations can be achieved either through incentives for users to take resources from a better-chosen station, or through redistribution by trucks. Such systems can be described as stochastic networks, similar to particle systems or communication networks. The aim is to understand their behavior using models and methods developed for large stochastic networks, a hot topic of recent research. The internship is in this research area.
The goals of the internship are the following:
- To propose simple probabilistic models for Velib' and Autolib', in order to address sizing and regulation problems. These models will take into account the main system parameters, describe the system policy and the behavior of the users, and investigate different regulation scenarios.
- To study these models under simple assumptions when the system gets large. The system performance will be measured by the proportion of stations lacking one of the resources. The influence of the parameters will be discussed.
- The heterogeneity of the system unbalances it, and it is difficult to propose an analytically tractable model for this; heuristics will allow us to study the impact of heterogeneity on regulation policies.
- The results will be validated by simulations. JCDecaux has provided open data for Velib' since May 2013. Data analysis will allow us to extract realistic parameters for calibrating the simulator. Furthermore, from data analysis, realistic models including the influence of geometry will be designed. URL sujet détaillé : https://team.inria.fr/rap/files/2011/07/ENS_Lyon_2014_detaille.pdf
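A first toy model in this spirit (simplifying assumptions: a single station of capacity K in isolation, Poisson vehicle returns at rate lam and departures at rate mu) is a birth-death chain whose stationary distribution gives the proportion of time the station lacks one of the resources:

```python
def stationary_station(K, lam, mu):
    """Stationary distribution of the number of vehicles at a station of
    capacity K: returns at rate lam, departures at rate mu (M/M/1/K chain)."""
    rho = lam / mu
    weights = [rho ** n for n in range(K + 1)]
    Z = sum(weights)
    return [w / Z for w in weights]

K = 9
pi = stationary_station(K, lam=1.0, mu=1.0)
# Proportion of time without one of the resources: no vehicle (state 0)
# or no free slot (state K).
print(round(pi[0] + pi[K], 3))  # 0.2  (uniform over 10 states when lam == mu)
```

Even this caricature shows the sizing trade-off: with balanced rates, the problematic proportion 2/(K+1) decreases only linearly in the station capacity.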
Remarques : The intern may receive a stipend. The internship could possibly take place at INRIA Rhône-Alpes in Grenoble under the supervision of Nicolas Gast (MESCAL team).





SM207149 Decidability and Complexity in Separation Logic


Description


The aim of this topic is to search for and identify new decidable fragments of separation logic (with and/or without recursive and quantitative extensions), if possible less restrictive than the purely existential fragment, using an approach based on proof search and countermodel construction. URL sujet détaillé : http://www.loria.fr/~galmiche/=stages/sujet1M2R2013.pdf
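To make the semantics concrete, satisfaction of a core separation logic fragment over finite heaps can be brute-forced; a checker of this kind is the sort of tool one uses to hunt for countermodels (an illustrative sketch with a made-up formula encoding):

```python
from itertools import combinations

# Formulas: ("emp",), ("pt", l, v) for l |-> v, ("star", f, g).
def sat(heap, f):
    """Does the finite heap (dict loc -> val) satisfy the formula f?"""
    tag = f[0]
    if tag == "emp":
        return heap == {}
    if tag == "pt":                       # l |-> v : exactly the singleton heap
        return heap == {f[1]: f[2]}
    if tag == "star":                     # separating conjunction: try all splits
        locs = list(heap)
        for r in range(len(locs) + 1):
            for part in combinations(locs, r):
                h1 = {l: heap[l] for l in part}
                h2 = {l: heap[l] for l in locs if l not in part}
                if sat(h1, f[1]) and sat(h2, f[2]):
                    return True
        return False
    raise ValueError(tag)

h = {1: 10, 2: 20}
print(sat(h, ("star", ("pt", 1, 10), ("pt", 2, 20))))  # True
print(sat(h, ("star", ("pt", 1, 10), ("emp",))))       # False: cell 2 is left over
```

The exponential enumeration of heap splits is exactly where proof-search methods aim to do better.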
Remarques : Co-supervised with Daniel Méry.





SM207150 Magic wand and septraction in BI logic


Description


The objective of this topic is to study the intuitionistic version of BI logic in which the magic wand operator is replaced by the septraction operator. In particular, we will seek to adapt a proof-search method based on tableaux and constraint solving to this logic and, where applicable, to establish an embedding into Boolean BI. URL sujet détaillé : http://www.loria.fr/~galmiche/=stages/sujet2M2R2013.pdf
Remarques : Co-supervised with Dominique Larchey-Wendling.





SM207151 Alliances in graphs


Description


We are interested in the study of defensive alliances and secure sets of a graph. Alliances occur in many contexts: members of different political parties, people united by friendship, companies with common economic interests, nations wishing to form defensive coalitions. A defensive alliance in a graph, introduced in [KHH04], is a set of vertices S such that each vertex of S has at least as many neighbors in S as outside S; this means that each vertex of S, together with its neighbors from S, can defend against every attack coming from its neighbors outside S. Finding a defensive alliance of minimum size in a graph is NP-hard. Nothing is known about the approximation of this problem.
A defensive alliance involves the defense of each single vertex. In many contexts, alliances must be formed so that any attack on any subset X of S can be defended by X and its neighbors in S. In this case, the set S is called secure. The security number of a graph, introduced in [BDT07], is the smallest size of a secure set. Nothing is known about the complexity and the approximation of this number. Another interesting question is, given an insecure set N, how to add a minimal number of vertices in order to transform N into a secure set. The goal of the internship is to study the complexity and the approximation of these problems. Some related complexity and approximation results were given in [BTV10].
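Following the definition as stated above, membership in the class of defensive alliances is easy to check, even though finding a minimum one is NP-hard (illustrative sketch):

```python
def is_defensive_alliance(adj, S):
    """Each v in S has at least as many neighbours in S as outside S."""
    return all(len(adj[v] & S) >= len(adj[v] - S) for v in S)

# Path a-b-c-d: {b, c} is a defensive alliance, but {b} alone is not
# (b has no ally inside and two neighbours outside).
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(is_defensive_alliance(adj, {"b", "c"}))  # True
print(is_defensive_alliance(adj, {"b"}))       # False
```

Checking that a set is secure is already much harder: the condition quantifies over all subsets X of S and all attack patterns, which is one reason its complexity is open.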
[BTV10] C. Bazgan, Zs. Tuza and D. Vanderpooten, Satisfactory graph partition, variants, and generalizations, European Journal of Operational Research, 206, 271-280, 2010.
[KHH04] P. Kristiansen, S. M. Hedetniemi, S. T. Hedetniemi, Alliances in graphs, Journal of Combinatorial Mathematics and Combinatorial Computing, 48, 157-177, 2004.
[BDT07] R.C. Brigham, R.D. Dutton, S.T. Hedetniemi, Security in graphs, Discrete Applied Mathematics, 155, 1708-1714, 2007. URL sujet détaillé :
Remarques :





SM207152 The Wagner hierarchy and some of its extensions


Description


The Wagner hierarchy is a fine hierarchy classifying the rational omega-languages with regard to their topological complexity. The location of a rational omega-language in this hierarchy can be determined in an effective manner from the structure of a Muller automaton accepting this language. A first part of the internship will consist in the study of the Wagner hierarchy, determined by Wagner in [Wag79]. Then two directions may be considered:
1) The study of the extension of the Wagner hierarchy to one-counter automata without zero test [Fin01]. One could investigate the complexity of the following problem: "determine the position of a given one-counter omega-language in this hierarchy". Further questions concern possible effective extensions of this hierarchy.
2) The omega-powers of rational languages form a strict subclass of the class of rational languages of infinite words. The topological complexity of these omega-powers and their location in the Wagner hierarchy are not yet completely known. This subject might first be investigated starting from the papers [Wag79, LitTim87].
This subject in theoretical computer science has an important mathematical aspect.
References:
[Fin01] O. Finkel, An Effective Extension of the Wagner Hierarchy to Blind Counter Automata, Lecture Notes in Computer Science, Volume 2142, Springer, 2001, p. 369-383.
[LitTim87] I. Litovsky and E. Timmerman, On Generators of Rational omega-Power Languages, Theoretical Computer Science, Volume 53 (2-3), 1987, p. 187-200.
[Sta97] L. Staiger, omega-Languages, in Handbook of Formal Languages, Volume 3, Springer, 1997, p. 339-387.
[Wag79] K. Wagner, On omega-Regular Sets, Information and Control, Volume 43 (2), 1979, p. 123-177. URL sujet détaillé :
Remarques : Co-supervised with Olivier Carton, Professor of Computer Science at LIAFA (Laboratoire d'Informatique Algorithmique : Fondements et Applications), Université Paris 7.
http://www.liafa.jussieu.fr/~carton/





SM207153 Games on graphs for the analysis of the economics of privacy in social networks


Description


Keywords: Network economics, privacy, social networks, game theory, graph theory
Background:
The Internet advertisement industry was worth billions in the first half of 2012, a 14% increase over the same period in 2011 [1]. A large part of this amount is attributable to targeted advertisement, where advertisers use your private data to send advertisements that you are more likely to be interested in, and hence increase the likelihood that you buy the product. As a result, social networks (e.g., Facebook) and web services (e.g., Google) that collect information on users can make huge amounts of money simply by selling this information to advertisers.
This situation has led to a recent increase in public awareness, as witnessed in the media, which has also resulted in intense research activity around the development of privacy models and tools that allow users to gain control of their private data and potentially to sell it [2,3]. This raises two essential questions: how should each piece of data be priced for the user, and how much data should advertisers buy? Such questions have started to be discussed recently [4] but are far from being well understood.
Objectives of the internship:
The goal of this internship is to build a game-theoretic model for investigating the incentives of each user to sell his or her data. We will focus on the case of a social network, where the private data of linked users are correlated, and try to understand how the position of a user in the social network impacts the value of his or her private data. After a short literature survey, the student will propose a model, compute the equilibria and try to find structure in the equilibria related to properties of the social network graph [5]. In particular, the student will characterize the impact of the social network graph on standard mechanisms proposed for auctioning data [4]. Simulations using real social network graphs or standard models (e.g., small world) may also be performed.
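As a minimal starting point (entirely made-up payoffs, not the model to be developed), one can enumerate the pure Nash equilibria of a two-user keep/sell game in which a neighbour's sale leaks correlated private data:

```python
from itertools import product

def pure_nash(payoffs, n_actions=2):
    """Pure Nash equilibria of a 2-player game; payoffs[(a1, a2)] = (u1, u2)."""
    eqs = []
    for a1, a2 in product(range(n_actions), repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        best1 = all(payoffs[(b, a2)][0] <= u1 for b in range(n_actions))
        best2 = all(payoffs[(a1, b)][1] <= u2 for b in range(n_actions))
        if best1 and best2:
            eqs.append((a1, a2))
    return eqs

# Actions: 0 = keep data, 1 = sell. Selling pays 2, but a linked neighbour's
# sale leaks your correlated data, costing 3 in privacy whether or not you sold.
payoffs = {(0, 0): (0, 0), (0, 1): (-3, 2), (1, 0): (2, -3), (1, 1): (-1, -1)}
print(pure_nash(payoffs))  # [(1, 1)]: a prisoner's-dilemma-like outcome
```

Even this caricature exhibits the externality the internship is about: correlation between linked users can push everyone to sell at equilibrium although both would prefer mutual keeping.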
Preferred skills:
Game theory, optimization, graph theory.
References:
[1] Interactive Advertising Bureau. http://www.iab.net.
[2] V. Erramilli. The Tussle around Online Privacy. IEEE Internet Computing, 16(4):69-71, 2012.
[3] C. Riederer, V. Erramilli, A. Chaintreau, B. Krishnamurthy, P. Rodriguez. For Sale: Your Data, By: You. In Proceedings of HotNets, 2011.
[4] A. Ghosh, A. Roth. Selling privacy at auction. In Proceedings of ACM EC, 2011.
[5] K. Bimpikis, O. Candogan, A. Ozdaglar. Optimal Pricing in Networks with Externalities. Operations Research, 60(4):883-905, July-August 2012.
URL sujet détaillé : http://www.eurecom.fr/~loiseau/Internship_EURECOM_Privacy.pdf
Remarques :





SM207154 Game-theoretic statistics for Internet regulation


Description


Keywords: Internet regulation, network neutrality, measurement, game theory, statistics
Background:
Following a period of intense research activity on network performance measurement, the French telecom regulator (ARCEP) recently started to publicly release statistics about the measured quality of ISPs, e.g., upload and download rates. A few websites such as http://grenouille.com/ also provide statistics about ISP quality measured by voluntary end users. Although such measurements are not explicitly used for ISP regulation, they can be used by users to choose their ISP and therefore have a potentially large economic impact.
Similarly, the neutrality issue [1], which is actively debated in political instances in France, at the European Commission and in the USA, has a potentially large economic impact for ISPs. Consequently, researchers are starting to build tools to measure ISP neutrality, which may eventually be used for user information or even for law enforcement purposes.
To be useful, such measurements should be representative of the overall quality or neutrality of an ISP. However, a complete measurement is clearly impossible: subsampling is necessary. Therefore, if the ISP is aware of the method used to measure statistics (e.g., the measurement points), it can optimize its performance or neutrality as perceived by ARCEP or by the voluntary end users, and neglect the rest of the network. In this context, it is difficult to develop an appropriate measurement methodology [2,3,4].
Work description:
The goal of this internship is to build and solve a game-theoretic model analyzing the interaction between an ISP and a regulator measuring its quality or neutrality, in order to deduce an appropriate measurement methodology that can be used for regulation. After a short literature survey, the student will propose a game-theoretic model of the interaction between an ISP and the regulator, using a simple network topology. He/she will analyze the equilibrium and use the results to derive guidelines on measurement methodology. After analyzing the equilibrium, the student will investigate the learning process in the game.
Games of this type, involving strategic agents performing statistical tasks, have recently started to be studied in security applications [5,6]. They are also related to the famous Blotto game [7].
The results are expected to give generic guidelines on the best measurement methodology to enforce a given regulation, but also to contribute to the neutrality debate by highlighting which regulation among the debated choices can actually be efficiently enforced.
The expected outcome of the internship is a solid analysis with theoretical results and illustrative simulations that can lead to a publication.
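A minimal instance of such a game (a hypothetical 2x2 inspection game with made-up payoffs, far simpler than the model the internship targets) already shows the typical structure: both players randomize, at rates fixed by mutual indifference.

```python
def inspection_equilibrium(gain, fine, cost_inspect):
    """Mixed equilibrium of a toy ISP/regulator inspection game.

    ISP: violate (payoff `gain`, minus `fine` if inspected) or comply (0).
    Regulator: inspect (pays `cost_inspect`, collects `fine` on a violation)
    or trust (0, but suffers `fine` worth of harm on an uncaught violation).
    """
    # p = P(inspect) makes the ISP indifferent:  gain - p*fine = 0
    p = gain / fine
    # q = P(violate) makes the regulator indifferent:
    #   q*fine - cost_inspect = -q*fine   =>   q = cost_inspect / (2*fine)
    q = cost_inspect / (2 * fine)
    return p, q

p, q = inspection_equilibrium(gain=2.0, fine=10.0, cost_inspect=4.0)
print(p, q)  # 0.2 0.2

# Indifference check for the ISP: expected payoff of violating equals 0,
# the payoff of complying.
assert abs((1 - p) * 2.0 + p * (2.0 - 10.0)) < 1e-12
```

Randomized measurement schedules in the security-games literature [2,3,4] generalize exactly this indifference logic to many targets.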
Preferred skills:
Game theory, optimization, probability and statistics.
References:
[1] N. Economides, J. Tåg. Network neutrality on the Internet: A two-sided market analysis. Information Economics and Policy (2012) 24(2):91-104.
[2] C. Kiekintveld, M. Jain, J. Tsai, J. Pita, F. Ordóñez, M. Tambe. Computing Optimal Randomized Resource Allocations for Massive Security Games. In Proceedings of AAMAS 2009.
[3] M. Jain, E. Kardes, C. Kiekintveld, M. Tambe, F. Ordóñez. Security Games with Arbitrary Schedules: A Branch and Price Approach. In Proceedings of AAAI 2010.
[4] P. Paruchuri, J. Pearce, J. Marecki, M. Tambe, F. Ordóñez, S. Kraus. Coordinating randomized policies for increasing security of agent systems. Inf Technol Manag (2009) 10:67-79.
[5] L. Dritsoula, P. Loiseau, J. Musacchio. Computing the Nash Equilibria of Intruder Classification Games. In Proceedings of GameSec 2012.
[6] M. Tambe. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press, 2011.
[7] B. Roberson. The Colonel Blotto game. Economic Theory (2006) 29:1-24. URL sujet détaillé : http://www.eurecom.fr/~loiseau/Internship_EURECOM_Regulation.pdf
Remarques :





SM207155 mwp polynomials for rewriting


Description


Implicit computational complexity provides syntactic criteria ensuring that the programs satisfying these criteria meet a certain complexity bound, typically that they compute in polynomial time.
Over the last twenty years, many implicit complexity criteria have been discovered for different programming languages, and it is extremely difficult to compare them.
To make the comparison between several methods easier, the goal here is to adapt one of these methods to another programming language. URL sujet détaillé : http://lipn.univparis13.fr/~moyen/stages_M2/sujMoyen2.pdf
Remarques :





SM207156 Reversible synchronous circuits.


Description


Synchronous circuits are made of the usual Boolean gates (and, or, not) together with registers. The objective of the internship is to characterise the functions that can be implemented by means of reversible synchronous circuits, i.e. circuits made of reversible Boolean gates and a finite number of registers. To this end, thanks to the finiteness condition, finite-state automata can be used to describe the behaviour of such synchronous circuits, reducing the initial problem to the characterisation of the automata that correspond to reversible evolutions. Detailed subject URL:
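As a warm-up, the reversibility requirement on gates can be checked concretely. A minimal Python sketch (the gate choice below is only an illustration, not part of the subject):

```python
from itertools import product

def toffoli(a, b, c):
    # Toffoli (CCNOT) gate: flips c iff both controls a and b are 1.
    # Unlike AND or OR, it is a bijection on bit triples, hence reversible.
    return (a, b, c ^ (a & b))

# A reversible gate permutes its input space: 8 distinct outputs for 8 inputs.
assert len({toffoli(*bits) for bits in product((0, 1), repeat=3)}) == 8

# Toffoli is an involution: applying it twice restores any input.
for bits in product((0, 1), repeat=3):
    assert toffoli(*toffoli(*bits)) == bits
```

Adding registers turns such a gate network into a finite-state machine, which is the automata-theoretic view the subject proposes to exploit.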
Remarks: co-supervisor: Emmanuel Jeandel





SM207157 ε-optimization schemes for multi-objective optimization


Description


Orlin et al. (2008) proposed a new notion of approximation scheme coming from inverse optimization and introduced the L-bit precision model. Some relations between these two notions have been established. The goal of this internship is to study these notions in the context of multi-objective optimization. Detailed subject URL: http://www.lamsade.dauphine.fr/~bazgan/Stages/stagesapprox.pdf
Remarks:





SM207158 A gcc plugin for computing in linear space


Description


Implicit computational complexity provides syntactic criteria and guarantees that programs satisfying these criteria have certain complexity properties (typically, that they compute in polynomial time).
So far, these analyses are mostly theoretical. The proposal here is to take a simple analysis, based on M. Hofmann's Non-Size-Increasing programs, and integrate it into a compiler. The idea is then to have a compilation process that also certifies a complexity bound. Detailed subject URL: http://lipn.univparis13.fr/~moyen/stages_M2/sujMoyen1.pdf
Remarks:





SM207159 ε-optimization schemes for multi-objective optimization


Description


Orlin et al. (2008) proposed a new notion of approximation scheme coming from inverse optimization and introduced the L-bit precision model. Some relations between these two notions have been established. The goal of this internship is to study these notions in the context of multi-objective optimization. Detailed subject URL: http://www.lamsade.dauphine.fr/~bazgan/Stages/stagesapprox.pdf
Remarks:





SM207160 Decidable properties of programs


Description


Rice's theorem states that any extensional property of programs, that is, one depending only on the computed function, is undecidable. But it says nothing about intensional properties, which depend on the program itself: some are decidable (for example, having the same number of lines of code) and others are not (for example, having the same complexity).
The goal here is to characterise more precisely which properties of programs are decidable. In particular, considering the equivalence "computing the same function", Rice's theorem can be reread as proving the undecidability of any equivalence coarser than this one. What about finer equivalences? Detailed subject URL: http://lipn.univparis13.fr/~moyen/stages_M2/sujBoudesMoyen2.pdf
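The contrast can be made concrete (a Python sketch with made-up program snippets): the intensional property below is trivially decidable, while its extensional counterpart is ruled out by Rice's theorem.

```python
def same_line_count(src1: str, src2: str) -> bool:
    # A decidable *intensional* property: it inspects the program text only.
    return len(src1.splitlines()) == len(src2.splitlines())

p1 = "x = 0\nwhile x < 3:\n    x += 1"
p2 = "y = 1\ny += 2\nprint(y)"
assert same_line_count(p1, p2)   # both texts have 3 lines

# The *extensional* analogue, "p1 and p2 compute the same function",
# is undecidable by Rice's theorem: no general checker can exist.
```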
Remarks: internship co-supervised with Pierre Boudes (LIPN, Université Paris 13)





SM207161 Quantum computation with multigraphs.


Description


In the so-called graph state formalism, graphs are used to represent quantum states. These states have various applications in quantum information theory; in particular, a graph state can be used as a resource for performing a quantum computation. The graphs that can be used to perform a quantum computation are those which admit a certain kind of flow called gflow. Gflow is a key notion in graph-based quantum computing, for instance for the parallelisation of quantum operations.
The objective of the internship is to study how multigraph states, a larger class of quantum states, can be used to perform a quantum computation. In particular, an extension of the gflow condition is expected.
No prior knowledge of quantum computing is required. Detailed subject URL:
Remarks: co-supervisor: Emmanuel Jeandel





SM207162 Design and implementation of termination algorithms


Description


Proving the termination of a flowchart program can be done by exhibiting a ranking function, i.e., a function from the program states to a well-founded set which strictly decreases at each program step.
We have some experience in designing such algorithms, but we are faced with scalability issues.
The candidate, according to their skills and preferences, will be expected to:
 - study the (recent) state of the art of the topic, especially the use of recent advances in SMT solving ([4, 5]);
 - theoretically and experimentally compare these approaches with ours;
 - implement extensions of our algorithm, and/or a C front-end to our current implementation;
 - study the theoretical complexity of the underlying problems.
Detailed subject URL: http://laure.gonnord.org/pro/research/stageTerminaison2013.pdf
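For illustration, a ranking argument can be checked by brute force on a toy nondeterministic loop (a Python sketch with invented transitions, not the subject's algorithm):

```python
from itertools import product

N = 5  # arbitrary reset value for the toy loop below

def successors(x, y):
    # Nondeterministic loop body, executed while x > 0:
    # either decrement x and reset y to N, or (if y > 0) decrement y.
    succs = [(x - 1, N)]
    if y > 0:
        succs.append((x, y - 1))
    return succs

def rank(x, y):
    # Candidate ranking function into (N x N, lexicographic order), a
    # well-founded set: termination follows if it strictly decreases.
    return (x, y)

# Check every loop step on a finite grid of states with x > 0.
for x, y in product(range(1, N + 1), range(N + 1)):
    for nxt in successors(x, y):
        assert rank(*nxt) < rank(x, y)
```

The scalability issue the subject mentions arises when such decrease conditions must be discharged symbolically, e.g. by an SMT solver, rather than by enumeration.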
Remarks: co-supervised by D. Monniaux (Verimag), but the internship takes place at LIP. Funding possible.





SM207163 Adaptive Catmull-Clark subdivision under constraints




SM207164 Constraint solving on subdivision-generated curves




SM207165 Scheme stability and rounding errors for ODEs


Description


In practice, a numerical scheme runs on floating-point numbers, not on the reals. It would therefore be interesting to know whether, in general, the numerical schemes in use behave well with respect to floating-point computation, or whether they can exhibit pathological cases. Detailed subject URL: https://www.lri.fr/~sboldo/files/stageM2_an.pdf
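The phenomenon is easy to trigger: even advancing time by a fixed step, the most innocuous part of a scheme, already drifts in binary floating point (a minimal Python illustration):

```python
# Summing the step h = 0.1 ten times does not give exactly 1.0,
# because 0.1 has no finite binary representation.
h = 0.1
t = 0.0
for _ in range(10):
    t += h

assert t != 1.0                 # the computed grid point misses 1.0 ...
assert abs(t - 1.0) < 1e-15     # ... by about one unit in the last place
```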
Remarks:





SM207166 Coalition Formation in Cellular Networks


Admin


Supervisor: Stefano MORETTI
Lab/Organisation: The internship is proposed within the Netlearn project (Orchestration of distributed learning algorithms for resource management in mobile networks), funded by the ANR INFRA programme in partnership with Telecom ParisTech, Orange Labs, PriSM/Université de Versailles Saint-Quentin-en-Yvelines, LAMSADE/Université Paris Dauphine, Alcatel-Lucent (ALU), and Inria/LIG.
URL: http://lamsade.dauphine.fr/
City: Paris



Description


Coordination is an important issue in cellular networks. For instance, cell coordination offers an efficient way to reduce inter-cell interference by allowing several base stations to transmit data simultaneously to the same user. However, cell coordination mechanisms must take into account the trade-off between higher achievable user rates and higher radio-resource consumption (due to joint transmissions to the same user).
The goal of this internship is to propose distributed algorithms, based on coalition formation theory, for coordination among different cells, and to study how the cell network structure evolves and changes under such distributed procedures.
References
Han, Z., Niyato, D., Saad, W., Basar, T., & Hjorungnes, A. (2012). Game Theory in Wireless and Communication Networks. Cambridge University Press.
Khlass, A., Bonald, T., & El Ayoubi, S. (2013). Flow-Level Performance of Intra-site Coordination in Cellular Networks. In WiOpt (Vol. 1, No. 1, pp. 1-8).
Saad, W., Han, Z., Basar, T., Debbah, M., & Hjorungnes, A. (2011). Coalition formation games for collaborative spectrum sensing. IEEE Transactions on Vehicular Technology, 60(1), 276-297. Detailed subject URL:
Remarks: The internship is proposed within the Netlearn project (Orchestration of distributed learning algorithms for resource management in mobile networks), funded by the ANR INFRA programme in partnership with Telecom ParisTech, Orange Labs, PriSM/Université de Versailles Saint-Quentin-en-Yvelines, LAMSADE/Université Paris Dauphine, Alcatel-Lucent (ALU), and Inria/LIG.





SM207167 Attack on a classical variant of McEliece's cryptosystem. Application to the Chor-Rivest cryptosystem.


Description


1 Topic
McEliece's cryptosystem [2], relying on coding theory, is one of the alternatives to discrete-logarithm- or factorization-based cryptosystems in the event of the advent of the quantum computer. McEliece's original proposal used "binary Goppa codes", and many variants have been proposed, for instance by replacing binary Goppa codes with Reed-Solomon codes. This variant is weak, as shown by Sidelnikov and Shestakov [3], but an implementation of their attack has never been made public. On the other hand, Vaudenay [4], when building his attack on the Chor-Rivest cryptosystem [1], considered many ways of attacking Chor-Rivest, including coding theory, as quoted below:
"By using the parity check matrix H of the code spanned by M (which is actually a Reed-Solomon code), this can be transformed into a permuted kernel problem H Vπ−1 [...] Unfortunately, there exists no known efficient algorithms for solving this problem"
This is exactly the problem the cryptanalyst has to solve when attacking McEliece's cryptosystem based on Reed-Solomon codes, and thus the Sidelnikov-Shestakov attack should apply. As shown by the bibliography below, the Sidelnikov-Shestakov attack was already published, but unknown to Vaudenay. It may be the "efficient algorithm" required by Vaudenay.
2 Objectives
First, the student will have to implement the Sidelnikov-Shestakov attack and execute it on McEliece variants based on Reed-Solomon codes, at cryptographic sizes. This will be done using the computer algebra system Sage. The objects to deal with (algebraic codes, matrices and polynomials over finite fields) already exist in Sage. Then, the student will have to understand the Chor-Rivest cryptosystem, a knapsack whose trapdoor relies on a hidden finite field, and to determine whether the Sidelnikov-Shestakov attack really applies to it. The attack will then have to be carried out using the previous implementation.
References
[1] Benny Chor and Ronald Rivest. A knapsack type public key cryptosystem based on arithmetic in finite fields (preliminary draft). In George Robert Blakley and David Chaum, editors, Advances in Cryptology, volume 196 of Lecture Notes in Computer Science, pages 54-65. Springer, 1985.
[2] Robert J. McEliece. A public-key cryptosystem based on algebraic coding theory. Technical report, Jet Propulsion Lab Deep Space Network, 1978.
[3] V. M. Sidelnikov and S. O. Shestakov. On insecurity of cryptosystems based on generalized Reed-Solomon codes. Discrete Mathematics and Applications, 2(4):439-444, 1992.
[4] S. Vaudenay. Cryptanalysis of the Chor-Rivest cryptosystem. 14(2):87-100, 2001. Detailed subject URL: http://pages.saclay.inria.fr/daniel.augot/sujetstage2014en.pdf
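The objects involved are elementary enough to sketch without Sage. Below, a toy generalized Reed-Solomon (GRS) generator matrix over a small prime field; all parameters are illustrative, far from cryptographic sizes:

```python
# GRS code over GF(p): codewords are (v_0 f(a_0), ..., v_{n-1} f(a_{n-1}))
# for all polynomials f of degree < k over GF(p).
p, n, k = 13, 8, 3
alphas = list(range(1, n + 1))   # distinct evaluation points
v = [1] * n                      # column multipliers (all 1: plain Reed-Solomon)

# k x n generator matrix: G[i][j] = v_j * alphas[j]^i mod p
G = [[v[j] * pow(alphas[j], i, p) % p for j in range(n)] for i in range(k)]

def encode(msg):
    # msg holds the k coefficients of f; encoding evaluates f at the alphas
    return [sum(msg[i] * G[i][j] for i in range(k)) % p for j in range(n)]

c = encode([5, 0, 1])            # f(x) = 5 + x^2
assert c[:2] == [6, 9]           # f(1) = 6, f(2) = 9 (mod 13)
```

The Sidelnikov-Shestakov attack recovers the evaluation points and multipliers from a scrambled generator matrix of exactly this shape.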
Remarks:





SM207168 Triangles in extremal graph theory


Description


In extremal graph theory, one often tries to compute the maximal density of some object (graph, digraph, hypergraph) avoiding a specific substructure. For instance, Mantel's theorem (the triangle case of Turán's theorem) asserts that if a graph on n vertices has more than n²/4 edges, then it contains a triangle.
The situation is dramatically worse for oriented graphs and hypergraphs, where the thresholds have been open for decades. Let us mention, for instance, that for 40 years now we have not known whether every directed graph on n vertices with minimum out-degree at least n/3 contains an oriented circuit of length at most 3.
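Mantel's bound can be checked exhaustively for small n (a Python sketch):

```python
from itertools import combinations

def has_triangle(n, edges):
    e = set(edges)
    return any({(a, b), (a, c), (b, c)} <= e
               for a, b, c in combinations(range(n), 3))

n = 6
# The bound is tight: K_{3,3} has n^2/4 = 9 edges and no triangle ...
bipartite = [(i, j) for i in range(3) for j in range(3, 6)]
assert len(bipartite) == n * n // 4 and not has_triangle(n, bipartite)

# ... while every graph on 6 vertices with 10 edges contains one.
all_edges = list(combinations(range(n), 2))
assert all(has_triangle(n, es)
           for es in combinations(all_edges, n * n // 4 + 1))
```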
The goal of this internship is to explore this topic in the light of recently developed tools, such as topological methods or flag algebras. Detailed subject URL:
Remarks:





SM207169 Computing rigorous polynomial approximations in orthogonal polynomial bases


Description


For about fifteen years now, the AriC team has been developing algorithms and software to improve the quality of the evaluation of mathematical functions. So far, this work has mainly focused on the class of so-called elementary functions (exp, the trigonometric functions, the power functions and all their inverses) of one real variable.
We now wish to extend this work to wider classes of functions, notably the classical functions of physics known as special functions, generally given as solutions of linear ordinary differential equations. A key point in the success of these efficient, guaranteed-error evaluation methods lies in the ability to compute rigorous polynomial approximations of these functions: one wants to return a pair consisting of a polynomial and an interval such that, on the domain under consideration, the difference between the solution of the differential equation and the polynomial takes all its values inside the interval.
The first approaches of this kind go back to the work of R. Moore [6], then of E. W. Kaucher and W. L. Miranker [3]. M. Berz and K. Makino [4] studied the case where the polynomial approximant is a Taylor polynomial and, in their software COSY, provide a set of procedures handling the multivariate case.
More recently, members of the AriC and Algorithms (INRIA Rocquencourt) teams have proposed [1,2,5] efficient methods for computing rigorous polynomial approximations in the near-optimal case (in the sense of approximation theory) where the polynomial is either an interpolant at the Chebyshev nodes or a truncated Chebyshev series. This framework is, however, limited to approximation on a closed bounded interval, and the goal is now to extend this work to arbitrary bases of orthogonal polynomials, in order to handle much wider classes of domains of definition for the functions under consideration, as well as applications out of reach of these Chebyshev-type approximants. This work should involve aspects of computer algebra, numerical analysis, computer arithmetic and functional analysis. The intern will also have to implement their results in a tool, written in C or in Sage for example, with the longer-term perspective of a complete software suite for certified and efficient polynomial and rational approximation.
Continuing this work as a PhD is strongly encouraged.
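The Chebyshev-interpolant case that this work sets out to generalize can be sketched in a few lines of Python (exp on [-1, 1]; the interval part of the rigorous pair is stood in for here by the classical a-priori error bound, which the observed error must respect):

```python
import math

n = 8
# Interpolation nodes: the n+1 Chebyshev points of [-1, 1]
nodes = [math.cos((2 * i + 1) * math.pi / (2 * (n + 1))) for i in range(n + 1)]

def p(x):
    # Lagrange form of the degree-n interpolant of exp at the Chebyshev nodes
    total = 0.0
    for i, xi in enumerate(nodes):
        li = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += math.exp(xi) * li
    return total

# Classical bound at Chebyshev nodes: |f - p| <= max|f^(n+1)| / (2^n (n+1)!)
bound = math.e / (2 ** n * math.factorial(n + 1))
observed = max(abs(math.exp(x) - p(x))
               for x in [k / 500 - 1 for k in range(1001)])
assert observed <= bound
```

The subject asks for the analogous pair (polynomial, guaranteed error interval) in arbitrary orthogonal bases, where no such closed-form bound is available.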
[1] A. Benoit. Algorithmique semi-numérique rapide des séries de Tchebychev. PhD thesis, École polytechnique, Palaiseau, France, July 2012.
[2] M. Joldeş. Rigorous polynomial approximations and applications. PhD thesis, École normale supérieure de Lyon, 2011.
[3] E. W. Kaucher and W. L. Miranker. Self-validating numerics for function space problems. Academic Press, 1984.
[4] K. Makino and M. Berz. Taylor models and other validated functional inclusion methods. International Journal of Pure and Applied Mathematics, 4(4):379-456, 2003. http://bt.pa.msu.edu/pub/papers/TMIJPAM03/TMIJPAM03.pdf.
[5] Marc Mezzarobba. Autour de l'évaluation numérique des fonctions D-finies. PhD thesis, École polytechnique, Palaiseau, France, November 2011.
[6] R. E. Moore. Interval Analysis. Prentice-Hall, 1966. Detailed subject URL:
Remarks:





SM207170 Lattice reduction criteria


Description


Lattice reduction has grown into one of the main tools of modern cryptanalysis. It makes it possible to cryptanalyze a wide range of cryptosystems, including classic RSA via Coppersmith's method, NTRU, GGH-type schemes, the hash functions SWIFFT and SWIFFTX based on q-ary lattices, cryptosystems based on Learning With Errors, and all the new fully homomorphic schemes (from ideal lattices to approximate-GCD-based cryptosystems).
Lattice reduction comprises a family of algorithms, from the historical LLL, a polynomial-time algorithm, up to HKZ, an exponential one; the family also includes BKZ, pruning methods, the deep-insertion variant, etc. These algorithms come with an upper bound on their running time and an approximation factor: algorithms that spend more time return results with smaller approximation factors, and the smaller the approximation factor, the better the result. In cryptanalysis, the best result is often not required; a merely good result will often break the cryptosystem. However, even with such upper bounds available, there is to date no answer to the main question: whatever lattice algorithm is used, what is the minimum time needed to obtain a given quality (i.e., approximation factor)? Because of this, cryptographers resort to overkill, choosing extremely large parameters under which they are sure the resulting cryptosystems are secure, thereby compromising their efficiency.
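The two-dimensional case, Lagrange-Gauss reduction, already exhibits this family in miniature and is easy to state exactly (a Python sketch on a made-up basis; LLL generalizes this loop to higher dimensions):

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def gauss_reduce(u, v):
    # Lagrange-Gauss reduction: on output, u is a shortest nonzero
    # vector of the lattice spanned by the input basis.
    if dot(u, u) > dot(v, v):
        u, v = v, u
    while True:
        # size-reduce v against u, then swap as long as v got shorter
        m = round(dot(u, v) / dot(u, u))
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if dot(v, v) >= dot(u, u):
            return u, v
        u, v = v, u

u, v = gauss_reduce((12, 2), (13, 4))
assert dot(u, u) == 5   # (1, 2): the shortest vector of this toy lattice
```

In dimension 2 the output is exactly optimal; the question raised above is precisely how quality degrades against running time once only approximate variants are feasible.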
This project intends to study existing lattice reduction algorithms and to extract criteria for evaluating the average- and best-case evolution of a basis during its reduction. Detailed subject URL:
Remarks: possibility of following up with a PhD.





SM207171 Post-quantum lattice cryptography


Description


The availability of quantum computers would open a new era of computing, in particular through its impact on cryptography and secure communication. Quantum algorithms can efficiently solve the factorization and discrete logarithm problems on which the current standard ciphers rely; the availability of quantum computers would therefore effectively make standard cipher algorithms insecure.
Lattice-based cryptography, code-based cryptography and multivariate quadratic equation cryptography are considered the main possible replacements for RSA and ECC should such quantum computers become available. While these cryptosystems are often connected to NP-hard problems (widely studied, even in the context of quantum computing), their practical versions are based on weaker, less-studied problems.
This project intends to analyze the complexity of the lattice problems used in cryptography in the context of quantum computing. Detailed subject URL:
Remarks: possibility of following up with a PhD.





SM207172 Model-Checking Techniques for Virus and Malware Detection


Description


The number of malware samples that produced incidents in 2010 exceeds 1.5 billion. Malware may cause serious damage: e.g., the MyDoom worm slowed down global Internet access by ten percent in 2004, and authorities investigating the 2008 crash of Spanair flight 5022 discovered that a central computer system used to monitor technical problems in the aircraft was infected with malware. It is thus crucial to have efficient, up-to-date virus detectors. Existing antivirus systems use various detection techniques to identify viruses, such as (1) code emulation, where the virus is executed in a virtual environment to get detected; or (2) signature detection, where a signature is a pattern of program code that characterizes the virus: a file is declared to be a virus if it contains a sequence of binary code instructions matching one of the known signatures, each virus variant having its corresponding signature. These techniques have limitations. Emulation-based techniques can only check the program's behavior within a limited time interval; they cannot check what happens after the timeout, and might thus miss the viral behavior if it occurs after this interval. As for signature-based systems, it is very easy for virus developers to get around them: it suffices to apply obfuscation techniques that change the structure of the code while keeping the same functionality, so that the new version no longer matches the known signatures. Obfuscation techniques can consist in inserting dead code, substituting instructions by equivalent ones, etc. Virus writers update their viruses frequently to make them undetectable by these antivirus systems. To sidestep these limitations, instead of executing the program or making a syntactic check over it, virus detectors need to use analysis techniques that check the behavior (not the syntax) of the program statically, i.e. without executing it. Towards this aim, we propose to use model checking for virus detection.
Model checking has already been applied for this purpose. However, the existing works have some limitations: the specification languages they use have only been applied to specify and detect a particular set of malicious behaviors, and cannot be used to detect all virus behaviors. Thus, one of the main challenges in malware detection is to come up with specification formalisms and detection techniques able to specify and detect a larger set of viruses. The purpose of this internship is to (1) define expressive logics that can compactly express malicious behaviors, including behaviors not considered in the literature; (2) define efficient model-checking algorithms for these logics; (3) reduce the malware detection problem to the model-checking problem of these logics; and (4) define program abstractions that would make these techniques applicable to large programs. Detailed subject URL:
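At its core, behaviour-based detection reduces to model checking a program abstraction. A minimal Python sketch of an EF-style check ("some execution eventually reaches a malicious action") on a hypothetical control-flow graph with invented action labels:

```python
from collections import deque

# nodes -> successors; labels name the action at each node (all made up)
cfg = {0: [1, 2], 1: [3], 2: [4], 3: [], 4: [5], 5: []}
label = {0: "entry", 1: "read_file", 2: "decrypt_payload",
         3: "exit", 4: "write_registry", 5: "connect_c2"}

def ef(cfg, start, prop):
    # CTL formula EF prop: is a prop-labelled state reachable from start?
    seen, todo = {start}, deque([start])
    while todo:
        s = todo.popleft()
        if prop(s):
            return True
        for t in cfg[s]:
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return False

assert ef(cfg, 0, lambda s: label[s] == "connect_c2")      # malicious path exists
assert not ef(cfg, 1, lambda s: label[s] == "connect_c2")  # benign branch is clean
```

The logics the internship targets are far richer than EF, but the pipeline is the same: abstract the binary into a model, then decide a formula on it.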
Remarks:





SM207173 Model-Checking Techniques for Virus and Malware Detection




SM207174 High throughput architecture for managing data streams


Description


Big-data architectures aim at managing huge quantities of data, but this buzzword covers at least two approaches. The first tries to process a huge data store as fast as possible with batch jobs, as is the case for Hadoop's MapReduce. The second works with continuous data flows, where the volume is not available instantaneously but is delivered at a regular pace: a flow-management process with slow or unbounded latency leads to bottlenecks and, after memory exhaustion, to failures. This second approach is typical of social-network architectures, which must manage unstoppable user data flows, both incoming and outgoing. For instance, Twitter's pipeline guarantees a 5-second delay between the production of a tweet and its reception by followers, in conjunction with various processes such as tweet indexing.
We target the second kind of architecture, modelling a multi-node, low-level, predictive and reactive framework that handles data flows and pipelined computations. We do not target raw performance, but focus on performance prediction in the general case. To achieve this, we plan to model our data-flow architecture as a system of fluid dynamics, applying equations from this theory to predict and dynamically adapt the multi-node architecture in order to absorb and quantify in- and out-flows.
We target both theoretical elements from fluid dynamics and a prototype implementation based, for instance, on a JavaScript/Node.js deployment.
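The fluid view of a single pipeline stage can be sketched in a few lines (a Python toy with invented rates; a real study would calibrate them on measurements):

```python
# One pipeline stage as a fluid system: backlog q evolves as
#   dq/dt = inflow(t) - outflow, clamped at 0.
def simulate(inflow, outflow, dt=0.01, horizon=15.0):
    q, t, levels = 0.0, 0.0, []
    while t < horizon:
        q = max(0.0, q + (inflow(t) - outflow) * dt)
        levels.append(q)
        t += dt
    return levels

burst = lambda t: 30.0 if 2.0 <= t < 4.0 else 5.0   # msgs/s, 2-second burst
levels = simulate(burst, outflow=10.0)

assert max(levels) > 30.0   # backlog builds at 20 msgs/s during the burst
assert levels[-1] == 0.0    # and fully drains before the horizon
```

Predicting the peak backlog and the drain time from the rate equations, rather than observing them, is exactly the kind of question the internship addresses at multi-node scale.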
Detailed subject URL:
Remarks: The internship will take place either at CITI or at IXXI.





SM207175 Exact resolution of the Unit Commitment Problem


Description


The production-unit scheduling problem, known as the Unit Commitment Problem (UCP), is a well-known combinatorial optimization problem in the literature. The problem consists in deciding the start-ups and shut-downs of production units under the constraint of meeting demand over a discrete (daily) time horizon, while respecting strong technical constraints, such as minimum up-time and down-time requirements, known as min-up/min-down constraints. This internship falls within the scheduling of EDF's electricity production units. EDF operates a diversified production fleet composed of nuclear thermal, fossil-fired thermal and hydraulic units. Short-term production management methods build, each day for the next, feasible minimum-cost production schedules. EDF's short-term production plan is currently computed by a price-decomposition method, obtained by Lagrangian relaxation of the coupling constraint corresponding to the obligation of distributing the production demand over all units. It turns out that this method can hardly be adapted to take other coupling constraints into account: this is the case, for example, of constraints arising for fossil-fired thermal (THF) units. Indeed, THF units are grouped on a single site and share common resources that must be taken into account in production management. Moreover, Lagrangian decomposition does not aim at finding an exact solution (even though, in practice, it provides good ones). The objective of this internship is to study the combinatorial aspects of the UCP in order to apply discrete mathematical programming techniques to its exact resolution, taking into account the coupling constraints of the THF units.
Detailed subject URL: http://wwwdesir.lip6.fr/~fouilhoux/documents/Stage_UCP.pdf
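A toy instance makes the constraint structure concrete (a Python sketch with invented data, solved by exhaustive search; the internship targets mathematical programming, not enumeration):

```python
from itertools import product

# Toy UCP: 2 units, 4 periods, min-up = min-down = 2 periods.
T, L = 4, 2
cap, cost = [3, 2], [5, 8]          # capacity and per-period running cost
demand = [3, 5, 5, 3]

def min_updown_ok(sched):
    # every maximal run of identical on/off states must last >= L periods
    runs, cur = [], 1
    for a, b in zip(sched, sched[1:]):
        cur = cur + 1 if a == b else (runs.append(cur) or 1)
    runs.append(cur)
    return all(r >= L for r in runs)

best = None
for plans in product(product((0, 1), repeat=T), repeat=2):
    if not all(min_updown_ok(s) for s in plans):
        continue
    if any(sum(cap[u] * plans[u][t] for u in range(2)) < demand[t]
           for t in range(T)):
        continue                     # demand must be met in every period
    c = sum(cost[u] * sum(plans[u]) for u in range(2))
    if best is None or c < best[0]:
        best = (c, plans)

assert best[0] == 52                 # here both units must stay on throughout
```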
Remarks: co-supervised by Fanny Pascual





SM207176 Certifying differential privacy algorithms


Description


The aim of this project is to formally certify the correctness of some basic statistical analyses studied in the theory of differential privacy. Differential privacy offers strong statistical guarantees, ensuring that the presence or absence of a single individual in the data collection has a negligible statistical effect on the program's result. Many important differentially private programs are obtained by composing several basic building blocks; it is therefore important to ensure that the basic building blocks are correct. The aim of this project is to use a special-purpose interactive proof assistant to certify the differential-privacy guarantees provided by those building blocks. Detailed subject URL:
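The most basic such building block, the Laplace mechanism, can be sketched numerically (Python; this is the object to be certified, not the certification itself):

```python
import random, statistics

def laplace_noise(scale):
    # the difference of two Exp(1) variables is Laplace(0, 1)
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def private_count(true_count, epsilon):
    # Laplace mechanism for a counting query (sensitivity 1):
    # releasing count + Laplace(1/epsilon) noise is epsilon-DP
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(0)
releases = [private_count(42, epsilon=0.5) for _ in range(20000)]
assert abs(statistics.mean(releases) - 42) < 0.2   # unbiased; noise scale 1/eps
```

Sequential composition charges k·ε for k such releases, which is why certifying each block suffices to bound the privacy of a whole composed program.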
Remarks: Dundee is a growing Scottish university with an exciting research environment. There will be possibilities of collaboration with other research groups in Dundee as well as in other Scottish universities such as Edinburgh, Strathclyde and St Andrews.





SM207177 Programming privacy-aware machine learning algorithms


Description


The aim of this project is to implement a collection of privacy-aware data analyses using recently developed programming languages for machine learning. Privacy-aware algorithms usually provide both fine-grained data analysis and strong statistical guarantees useful for ensuring the privacy of individuals. Programming languages for machine learning are usually based on Bayesian inference. They have been used in a variety of settings, from statistical learning to data analysis, but applications to privacy-aware analysis are still missing. The aim of this project is to fill this gap and to evaluate the usefulness of the programming facilities provided by these frameworks for implementing privacy-aware analysis tools. Detailed subject URL:
Remarks: Dundee is a growing Scottish university with an exciting research environment. There will be possibilities of collaboration with other research groups in Dundee as well as in other Scottish universities such as Edinburgh, Strathclyde and St Andrews. There are some possibilities to continue the internship in a PhD programme.





SM207178 Type systems for Resource Analysis


Description


The aim of this project is the design and implementation of type-based program analyses for analysing the time complexity of programs. Several applications require programs to respect strong constraints on the amount of resources they can access, in terms of running time and memory space. It is therefore important to develop methods and tools that ensure programs use only the amount of resources they will have at their disposal at run time. Following this approach, the goal of the project is to design and implement program analyses based on type theory, in order to infer the resources that a program will need at runtime. Detailed subject URL:
Remarks: Dundee is a growing Scottish university with an exciting research environment. There will be possibilities of collaboration with other research groups in Dundee as well as in other Scottish universities such as Edinburgh, Strathclyde and St Andrews. There are some possibilities to continue the internship in a PhD programme.





SM207179 Applying graph theory to make information sources untraceable in a complex system


Description


The Dice team is working on a complex system allowing several million participants to cooperate anonymously in producing a shared piece of information. While guaranteeing that the participants did cooperate in producing the information, anonymity preserves its origins: only the author of a piece of information knows that it has been expressed and integrated.
The protocol relies on a tree decomposition of both the information and the group of participants. Each participant can edit a part of it, takes part in its dissemination, and keeps the global structure of the tree, which is thus distributed among the participants, consistent.
The structure of this tree is fundamental to the behaviour of the associated protocols. In particular, we want to enforce the following constraints:
 - Traversal costs. The tree is only a virtual structure organizing the data. Nodes have limited resources and cannot afford to traverse the tree for every request; the tree structure must therefore be chosen so that certain kinds of traversals have low complexity.
 - Fault resilience. Our protocols allow a large number of participants to be invited to collaborate, but do not force all of them to do so. The resulting document must thus be available as soon as it is closed, and each participant holds a partial view of some upper elements of the tree. None of these views may carry more information than the others, and no participant may, through their behaviour, disturb the view the others have of it.
 - Convergence/anonymization trade-off. The tree structure makes it possible to accumulate synthetic information at the root, so as to prune the roots of the tree on the various storage nodes. The goal of the structure is to favour fast convergence as information moves up towards the root, while preserving effective anonymity of the sources.
 - Cost evaluation. The number of participants in the production of the document influences the structure and the convergence speed of the data in the tree. Having no precise estimate of the time needed to establish the final document, we rely on a general evaluation of data synchronization as a function of the number of messages exchanged and a convergence probability. Since the protocol preserves anonymity, it generates a certain number of messages adding noise into the system; this number of messages is bounded in probability, which gives us a global estimate of the delay.
The objective of the internship is to study, from a formal point of view, the structure of the tree and the constraints attached to it. We want to obtain a mathematical object whose properties are clearly defined and on which we can develop our synchronization and transport algorithms. Detailed subject URL:
Remarks: The internship will take place either at CITI or at IXXI.





SM207180 Internship title


Description


The objective of this internship is to study solutions for synthesizing and structuring the graphs representing supercomputer networks into topologies that can be easily understood and used, for example recognizing a topology close to a fat-tree by clearly identifying the switches of the different levels, the widths of the links between levels, and the servers involved in the different hierarchical neighbourhoods. Detailed subject URL: http://runtime.bordeaux.inria.fr/goglin/Goglin2013netloc.html
Remarks:





SM207181 A declarative approach to the formal specification of architectures




SM207182 Cunningham Chains Mining


Description


Problems and puzzles involving prime numbers have always fascinated humans. A Cunningham chain (of the first kind) is a sequence of the form (p, 2p+1, 4p+3, ...), such that each term in the sequence is prime. Similarly, sequences of primes built using the relation p_{i+1} = 2p_i - 1 are known as Cunningham chains of the second kind.
Following Dickson's conjecture, it is widely believed that for every k there are infinitely many Cunningham chains of length k. Apart from proving this conjecture, there exist various interesting challenges associated with Cunningham chains. For example, one may want to find the largest prime starting a chain of a given length. Another interesting problem consists in finding Cunningham chains of the longest possible length; a problem which has recently been proposed as the setting for Primecoin mining, a mining process associated with a virtual currency similar to Bitcoin.
Investigating efficient algorithms and implementations addressing the above challenges is the main objective of the internship project. The algorithms involved are related to sieving techniques, primality proving and the associated arithmetic primitives. If time permits, highly optimized implementations of the proposed algorithms on GPUs and/or multicore processors will also be considered. URL sujet détaillé : http://www.lirmm.fr/~imbert/sujets/cunninghamchains.pdf
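As an illustration of the objects involved, here is a minimal, deliberately naive Python sketch that computes the Cunningham chain starting at a given prime. The trial-division primality test is a placeholder: the internship would replace it with sieving and fast primality proving.

```python
def is_prime(n):
    """Naive trial-division primality test (placeholder for a real
    primality-proving routine)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def cunningham_chain(p, kind=1):
    """Return the Cunningham chain starting at p.
    kind=1 uses p -> 2p+1 (first kind); kind=2 uses p -> 2p-1 (second kind)."""
    step = 1 if kind == 1 else -1
    chain = []
    while is_prime(p):
        chain.append(p)
        p = 2 * p + step
    return chain

print(cunningham_chain(89))         # [89, 179, 359, 719, 1439, 2879]
print(cunningham_chain(2, kind=2))  # [2, 3, 5]
```

For example, 89 starts a chain of the first kind of length 6, since 2*2879+1 = 5759 = 13*443 is composite.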
Remarques : Co-supervision: Pascal Giorgi and Bastien Vialla. Funding possible.





SM207183 Performance models of applications and parallel platforms for process placement


Description


In many areas of science and technology, complex problems with tremendous computational needs must be solved. To fulfil these needs, scientists use parallel computers. However, these computers are becoming extremely complex and hard to master.
Among the difficult issues to address in today's parallel computing lies the "process placement problem". This problem stems from the non-uniform affinity between computing processes and the heterogeneous performance of communications. Indeed, some pairs of processes communicate more than other pairs, and it is therefore important to carefully map processes onto the resources to improve communication or memory access.
To optimise the mapping, we develop algorithms and tools based on simple models that represent the topology of the underlying hardware as a tree. The tree structure models the memory hierarchy. However, the levels of the hierarchy do not all impact performance in the same way; knowing which level has the largest impact on performance is important to improve the accuracy of the model.
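A toy illustration of why the mapping matters (the 4-core topology, the link costs and the message counts below are all hypothetical, chosen only to show the mechanism): a placement can be scored by weighting each communicating pair of processes with the cost of the hierarchy level their two cores share.

```python
# Hypothetical machine: cores 0,1 on socket 0; cores 2,3 on socket 1.
socket_of = [0, 0, 1, 1]

def link_cost(a, b):
    """Intra-socket communication is cheap, inter-socket is expensive."""
    return 1 if socket_of[a] == socket_of[b] else 10

# Number of messages exchanged between process pairs (affinity matrix).
msgs = {(0, 1): 100, (2, 3): 100, (0, 2): 1, (1, 3): 1}

def total_cost(placement):
    """placement[p] = core hosting process p."""
    return sum(m * link_cost(placement[i], placement[j])
               for (i, j), m in msgs.items())

print(total_cost([0, 1, 2, 3]))  # affinity-aware: 100 + 100 + 10 + 10 = 220
print(total_cost([0, 2, 1, 3]))  # naive: 1000 + 1000 + 1 + 1 = 2002
```

The affinity-aware placement keeps the two heavily communicating pairs on the same socket and is almost an order of magnitude cheaper under this model.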
The goal of this work is to improve the architecture model of parallel systems by understanding the interaction between the performance of an application and the restriction of access to some resources. Taking simple applications, we want to see how performance varies if, on a multicore machine, we restrict thread execution to a given level of the hierarchy. To do so, we will use the ORWL read-write lock library, which provides fine-grained resource control.
We target a study on a 160-core multicore machine. This work will be decomposed into 3 parts: 1) experiments on small compute kernels; 2) design of performance models of the application and the execution resources; 3) model validation on larger and different settings.
URL sujet détaillé : http://www.labri.fr/perso/ejeannot/stage_AEN_2013.html
Remarques : This internship is part of a collaboration between the University of Bordeaux and the University of Strasbourg and may lead to a jointly supervised PhD thesis between these two universities.
Funding: Inria (100%)
E. Jeannot, G. Mercier, and F. Tessier, Process placement in multicore clusters: Algorithmic issues and practical techniques, IEEE Transactions on Parallel and Distributed Systems, 2013, to appear.
F. Broquedis, J. CletOrtega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, and R. Namyst. hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications. In PDP2010, Pisa, Italia, February 2010. IEEE.
Pierre-Nicolas Clauss and Jens Gustedt. Iterative computations with ordered read-write locks. Journal of Parallel and Distributed Computing, 70(5):496-504, 2010. RR6685.





SM207184 Using SSA techniques for dynamic binary translation


Description


Simulation is crucial when developing for embedded systems, in particular for multiprocessor systems. It allows programming to start in advance for architectures still under development, and it eases the coding/compiling/testing cycle by performing all of this on a host machine. In order to speed up simulation, new techniques are appearing because, with the increasing computing power of embedded systems, plainly emulating the target's instructions is no longer feasible: the performance is terrible. Among these techniques, dynamic binary translation (DBT) converts instructions on the fly from the target's format to the host's format. In our system, we internally use an intermediate representation resembling what is found in a compiler, only much simpler. In recent work, we have shown that an intermediate representation close to the SSA form (Static Single Assignment) solves many issues that arise when simulating VLIW (Very Long Instruction Word) processors. The goal of this internship is to evaluate what the SSA form could bring to DBT-based simulation, by studying existing algorithms from these two different worlds: simulation on the one hand and compilation on the other. We expect that new algorithms combining this knowledge can be devised to improve simulation performance. URL sujet détaillé : http://nanosim.imag.fr/membres/florent.boucheztichadou/sujet_stage.pdf
Remarques : This internship would be co-supervised by Florent Bouchez Tichadou (SSA compilation expertise) and Nicolas Fournel (DBT simulation expertise). nicolas.fournel.fr





SM207185 Flexible mesh generation


Description


Mesh generation plays a fundamental role in numerical simulation for aeronautics, oil and gas exploration and computational mechanics (to name but a few). This research project concerns the development of a novel, flexible 3D meshing algorithm adapted to certain numerical simulations. The goal of the project is to consider some geometrical aspects of the problem (namely the fast and robust computation of a restricted Voronoi diagram).
URL sujet détaillé : http://alice.loria.fr/index.php/positions/8master/64flexiblemeshgeneration.html
Remarques : This master project is within the framework of the GOODSHAPE/VORPALINE project, funded by the European Research Council (Starting Grant and Proof-of-Concept Grant). There are possibilities for funding a Ph.D. thesis.





SM207186 Profiling inter-process communications on multicore servers


Description


** Description: The context of this internship is a freshly started research project focused on the design and implementation of a new operating system (OS) optimized for emerging hardware architectures and modern workloads. The internship is tightly integrated with the project and offers opportunities to continue the research as a Ph.D. student.
** Context: Modern machines are very difficult to program because of their rich and fast-evolving topologies, which include complex networks-on-chip interconnecting many processing cores, deep memory hierarchies with non-uniform access times (NUMA), and high-speed I/O (storage and networking) devices. Besides, modern workloads (e.g., Cloud-based services and Big Data applications) stand out because they have very demanding requirements on all the hardware resources (CPUs, memory, storage and networking devices). In addition, these workloads also stress the logical facilities of the operating system, for instance the memory management and inter-process communication subsystems.
We believe that current operating systems like Linux are a poor match for such a context. In particular, they suffer from a lack of correlation between chains of resource dependencies. Indeed, current operating systems make task scheduling decisions based on a very narrow view of the logical resources. For instance, a task may be elected to run on a core and almost immediately get blocked because it attempts to acquire an unavailable lock or to write in a shared buffer with no space left. Such a sequence of events is inefficient as it exacerbates the overhead of the OS (e.g., increasing the frequency of context switches) and, when generalized at the scale of thousands of threads, can significantly hurt performance.
**Internship: The goal of the internship is to design and implement a profiling tool for programmers and OS designers. The purpose of this tool is to trace the chains of resource dependencies in order to allow pinpointing inefficient interactions between multiple tasks (i.e., processes and threads). The first version of the tool will target the Linux kernel and focus on interactions via explicit communication channels, such as pipes and Unix domain sockets. The key research problems to be addressed are:
1) The design of a low-overhead and accurate mechanism for tracing interactions between execution flows; 2) the design of algorithms for post-mortem processing, aimed at the automatic detection of inefficient scheduling/interaction patterns; 3) suggestions for adapting the algorithms to online processing.
The internship will involve the following phases:
1) Study of the related work on existing profiling tools and basic building blocks for efficient instrumentation of operating system kernels; 2) Design and implementation of the profiling tool on Linux; 3) Experimental validation of the tool (accuracy, overhead, insight) on synthetic and realistic workloads.
** Keywords: Operating systems; Multicore; Linux kernel; Profiling.
** Two selected publications made by the research team on the internship topic:
 Traffic Management: A Holistic Approach to Memory Placement on NUMA Systems. Mohammad Dashti, Alexandra Fedorova, Justin Funston, Fabien Gaud, Renaud Lachaize, Baptiste Lepers, Vivien Quéma, and Mark Roth. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Houston, USA, March 2013.
 MemProf: A Memory Profiler for NUMA Multicore Systems. Renaud Lachaize, Baptiste Lepers, and Vivien Quéma. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), Boston, USA, June 2012. URL sujet détaillé :
Remarques : Co-supervised by Renaud Lachaize: http://membresliglab.imag.fr/rlachaiz/
The internship can be paid.





SM207187 Profiling inter-process communications on multicore servers


Description


Modern multicore machines are very difficult to program because of their rich and fast-evolving topologies, which include complex networks-on-chip interconnecting many processing cores, deep memory hierarchies with non-uniform access times (NUMA), and high-speed I/O (storage and networking) devices. Besides, modern workloads (e.g., Cloud-based services and Big Data applications) stand out because they have very demanding requirements on all the hardware resources (CPUs, memory, storage and networking devices). In addition, these workloads also stress the logical facilities of the operating system, for instance the memory management and inter-process communication subsystems. We believe that current operating systems like Linux are a poor match for such a context. In particular, they suffer from a lack of correlation between chains of resource dependencies. Indeed, current operating systems make task scheduling decisions based on a very narrow view of the logical resources. For instance, a task may be elected to run on a core and almost immediately get blocked because it attempts to acquire an unavailable lock or to write in a shared buffer with no space left. Such a sequence of events is inefficient as it exacerbates the overhead of the OS (e.g., increasing the frequency of context switches) and, when generalized at the scale of thousands of threads, can significantly hurt performance.
The goal of the internship is to design and implement a profiling tool for programmers and OS designers. The purpose of this tool is to trace the chains of resource dependencies in order to allow pinpointing inefficient interactions between multiple tasks (i.e., processes and threads). The first version of the tool will target the Linux kernel and focus on interactions via explicit communication channels, such as pipes and Unix domain sockets. The key research problems to be addressed are: 1) the design of a low-overhead and accurate mechanism for tracing interactions between execution flows; 2) the design of algorithms for post-mortem processing, aimed at the automatic detection of inefficient scheduling/interaction patterns; 3) suggestions for adapting the algorithms to online processing. URL sujet détaillé : http://membresliglab.imag.fr/quema/sujet_multicore_ENS.pdf
Remarques : Co-supervised by Renaud Lachaize: http://membresliglab.imag.fr/rlachaiz/
The internship can be paid.





SM207188 TrueTime Travel


Description


The TrueTime API is a GPS/atomic-clock based API recently introduced by Google [Corbett et al., OSDI 2012]. On the one hand, TrueTime bridges the gap between the asynchronous and the synchronous distributed computing communication models and is of significant theoretical interest. On the other hand, TrueTime is the future of modern datacenter-oriented distributed systems, being already used in practice in Google's datacenters, where it facilitates increasingly many complex protocols.
The purpose of this internship is to answer one or more of the following questions. What is computable in a distributed system with TrueTime API? What are the minimum synchrony assumptions needed to implement TrueTime? How can we implement TrueTime in different failure models? How can TrueTime be used to augment and simplify existing asynchronous distributed protocols?
This internship will be coadvised by Prof. Vivien Quéma from Grenoble INP and Prof. Marko Vukolic from Eurecom. The location of the internship (Grenoble or Sophia Antipolis) will be agreed in discussions with a successful candidate. URL sujet détaillé : http://membresliglab.imag.fr/quema/sujet_truetime_ENS.pdf
Remarques : Co-supervised by Marko Vukolic (Eurecom) in Sophia-Antipolis: http://www.eurecom.fr/fr/people/vukolicmarko
The internship can be paid.





SM207189 Formal mechanised models for distributed algorithms


Description


Distributed computing is one of the domains where informal reasoning is not an option, in particular when Byzantine failures are involved. Distributed computing is also characterised by its diversity of models, subtle modifications of which induce radical changes in system behaviour. We consider robot networks, that is, swarms of *autonomous* mobile entities that have to accomplish some task in *cooperation*. In this emerging framework, models can be distinguished by the capabilities of the robots, the topology of the considered space, the level of synchrony (that is, the properties of a demon), the types of failures likely to occur, etc.
We are interested in obtaining formal, and moreover mechanical, guarantees of properties for certain protocols, using the Coq proof assistant. A Coq framework for robot networks recently proposed by Auger et al. can express quite a few variants of models for such networks, in particular regarding topology or demon properties. This framework is higher order and based on coinductive types. It allows one to prove in Coq positive results (the task will be fulfilled using the algorithm embedded in all robots, *for all* initial configurations) as well as negative results (*there cannot be any* embedded algorithm that will work for this task for all initial configurations).
An objective of this proposal is to obtain (with the help of this framework) the first mechanical certifications of some theoretical results, mostly negative ones (for example: the impossibility of gathering two robots when the demon is not constrained...).
Then the model will be extended to some particular topologies (higher dimensions), so as to propose a framework in which some new results could be formally established, in particular regarding the election of a leader or the formation of geometrical patterns. These two problems have been shown equivalent (meaning that there is an algorithm solving one of them given, as input, a solution to the other) under some particular assumptions. Extending this equivalence to less constrained models is an interesting open problem. URL sujet détaillé :
Remarques : Co-supervision: Pierre Courtieu





SM207190 Study and implementation of support for scheduling malleable tasks within the DIET middleware on compute farms and Clouds


Description


Over the years, the DIET middleware (http://graal.enslyon.fr/DIET) has established itself as an efficient and reliable solution for deploying HPC (High Performance Computing) applications in heterogeneous environments ranging from a simple PC to Cloud architectures, through parallel machines and the grid. DIET's application field has extended beyond HPC applications, with robotics or the domain of video games.
Currently, DIET is able to use the compute nodes of a cluster (compute farm) or of a Cloud to execute parallel tasks. Its role is then that of a meta workload manager. For instance, in the case of a cluster, a DIET server is able, given a number of processors to use, to submit a job (i.e. a computational task) to a batch scheduler (the local task scheduler). In order to improve the scheduling capabilities of DIET, we want to design and/or implement new scheduling algorithms for malleable jobs, that is, jobs whose number of processors in use changes over time. In the application setting we are interested in, the notion of priority must also be taken into account.
Work to be done
First, the student will survey the state of the art of parallel task scheduling algorithms (with or without priorities) on compute farms and Clouds. Following on from this study, he or she will look at the literature on scheduling malleable tasks with priorities. He or she will implement these existing algorithms in DIET, taking advantage of the plugin-scheduler mechanisms built into the middleware. This work is part of an emerging collaboration on a precise, targeted application domain. Building on this study, we may consider the design of specialised scheduling algorithms exploiting the specifics of the target application. These algorithms will be evaluated by simulation and/or by experiments, depending on our progress. URL sujet détaillé :
Remarques : Co-supervision with Lamiel Toch (Inria postdoc, LIP)





SM207191 Fault-tolerant sparse linear algebra kernels at scale


Description


M2 internship: Fault-tolerant sparse linear algebra kernels at scale.
Advisors: Yves Robert, ENS Lyon and IUF (Yves.Robertlyon.fr); Bora Ucar, CNRS and ENS Lyon (Bora.Ucarlyon.fr).
The subject of this internship is to investigate several algorithms for sparse linear algebra kernels, such as matrix-vector multiplication and the conjugate gradient method (basic or preconditioned). One needs to revisit these classical kernels to derive efficient versions at scale. Current petascale and future exascale platforms include millions of components. Such large-scale systems induce two major problems: resilience and energy consumption.
Resilience is (loosely) defined as surviving failures. For instance, a failure will occur every 50 minutes in a system with one million components, even if the mean time between failures (MTBF) of a single component is as large as 100 years. General approaches to failure mitigation include redundancy and checkpointing. However, specialized methods for linear algebra kernels can also be used: ABFT techniques have recently been extended to deal with sparse problems. All these methods apply to fatal failures rather than to silent errors, which cause additional problems (such as detection latency).
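The 50-minute figure quoted above follows from the standard approximation that, with independent exponentially distributed failures, the platform MTBF is the individual MTBF divided by the number of components. A quick check in Python:

```python
# Platform MTBF under independent, exponentially distributed failures:
#   MTBF_platform = MTBF_component / N
component_mtbf_years = 100
n_components = 10**6
minutes_per_year = 365 * 24 * 60  # 525600
platform_mtbf_minutes = component_mtbf_years * minutes_per_year / n_components
print(platform_mtbf_minutes)  # 52.56 -> roughly one failure every 50 minutes
```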
To further complicate the picture, minimizing power consumption is also mandatory. It is anticipated that the power dissipated to perform communications and I/O transfers will make up a much larger share of the overall power consumption. In fact, the relative cost of communication is expected to increase dramatically, both in terms of latency/overhead and of consumed energy.
Redesigning sparse numerical kernels to ensure resilience and to minimize energy consumption is a difficult proposition, especially because higher resilience often calls for redundant computations and/or redundant communications, which in turn consume extra energy. But failure mitigation, together with silent error detection and correction, are mandatory issues for forthcoming platforms, and algorithmic tradeoffs are to be proposed and carefully assessed.
The main goal of this internship is to investigate trade-offs between performance, resilience and energy consumption. This is a challenging but unavoidable multicriteria optimization problem, whose solution is critical for many applications and large-scale systems. URL sujet détaillé :
Remarques : Co-direction with Yves Robert.





SM207192 A SEMANTICS OF THE SPARK/ADA LANGUAGE IN COQ


Description


SPARK is a programming language based on Ada, together with a platform dedicated to the static analysis of Ada/SPARK programs. Examples of analyses are information-flow analysis and formal program proofs. Ada and SPARK are widely used in safety-critical industrial projects (space and flight systems, railway systems, etc.).
The design of a formal semantics for SPARK has recently been started in the CPR team of the CEDRIC laboratory, in collaboration with the AdaCore company and the SAnToS laboratory (Kansas State University). The objectives are the following:
- Validation of the SPARK platform by certification authorities; - enhanced confidence in the analysis tools, and even, in the longer term, a possible link with the certified compiler CompCert; - the ability to define and prove new formal analyses, in particular some recent automatic loop-invariant generation techniques.
The intended work for this internship is the extension of the currently existing Coq development to a larger subset of the language. One of the original features added to the semantics is the presence of annotations within the code (pre- and postconditions, loop invariants). This work includes proving various properties of the semantics (determinism, verification of run-time checks, etc.).
Interaction with AdaCore and SAnToS will be scheduled on a regular basis. URL sujet détaillé :
Remarques : Regular interaction with AdaCore and the SAnToS laboratory (Kansas State University).





SM207193 Optimal mechanisms for protection of confidential information


Description


Our era is characterized by a massive use of computing devices, increasingly powerful and interconnected, pervading almost every aspect of our lives. These devices routinely collect, store and process information about the user, often without clear guarantees concerning its future use: who will use such information, and in what way. Naturally, this practice poses serious threats to the confidentiality of sensitive data and to the privacy of the user. Not surprisingly, then, there is a growing interest in mechanisms that protect the confidential part of the data while still allowing use of the service offered by the computing device. This is not an easy task, though, because in general the device requires certain information in order to provide the service, and this information may be correlated with the information that the user wishes to keep secret. There is therefore a trade-off between confidentiality and utility, and it is important to design mechanisms that are optimal, i.e., that offer the maximum utility for the desired level of confidentiality.
A typical example of this trade-off is location privacy: imagine a person who wants to use his smartphone to get a list of restaurants in his surroundings, but without revealing his exact location. He can then send an approximate location, or the zone where his location is, but he must be ready to pay the price (the bandwidth) of retrieving restaurants in a larger area, to be sure that his location is covered. An optimal mechanism will minimize the necessary area/bandwidth, while ensuring the required obfuscation of the real location.
In this internship, we plan to investigate classes of utility and confidentiality measures, and to devise methods to design optimal mechanisms for given measures. We will consider a probabilistic setting, in which data are random variables, and the mechanisms themselves can be randomized. We will focus, in particular, on measures that can be expressed as linear functions of the probabilities. This is the case for several notions of confidentiality and privacy considered in the literature. We mention, in particular, the notion of Bayes risk [1,2], g-vulnerability [3], and location privacy [4]. For the utility, an example is the quality of service (loss) formalized in [4].
The fact that we will consider linear functions will allow us to express the bounds on confidentiality as linear constraints. In this way, we will be able to express the problem of maximizing the utility as a linear optimization problem.
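To make this concrete, here is a small hypothetical instance (two secrets, uniform prior, a 2x2 row-stochastic mechanism): utility is the probability of reporting the true value, and confidentiality is enforced by bounding the posterior Bayes vulnerability, sum over outputs y of max over secrets x of pi_x * C[x][y]. A brute-force grid search stands in for the linear-programming solver that would actually be used:

```python
# Hypothetical 2-secret instance: uniform prior; the mechanism C is a
# 2x2 row-stochastic matrix parameterised by its diagonal (c00, c11).
PI = (0.5, 0.5)
V_MAX = 0.75  # bound on posterior Bayes vulnerability (prior vuln. is 0.5)

def vulnerability(c00, c11):
    """Posterior Bayes vulnerability: for each output, the largest
    joint probability pi_x * C[x][y], summed over outputs."""
    c01, c10 = 1 - c00, 1 - c11
    return max(PI[0] * c00, PI[1] * c10) + max(PI[0] * c01, PI[1] * c11)

def utility(c00, c11):
    """Probability that the mechanism reports the true value."""
    return PI[0] * c00 + PI[1] * c11

# Brute-force search over a grid of mechanisms satisfying the bound.
steps = 200
best_u = max(utility(i / steps, j / steps)
             for i in range(steps + 1) for j in range(steps + 1)
             if vulnerability(i / steps, j / steps) <= V_MAX)
print(best_u)  # 0.75: utility cannot exceed the vulnerability bound here
```

Because every term of the utility also appears (under a max) in the vulnerability, utility is bounded by the vulnerability bound in this instance; an LP formulation recovers the same optimum directly from the linear constraints.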
Bibliography
[1] K. Chatzikokolakis, C. Palamidessi, P. Panangaden. On the Bayes risk in information-hiding protocols. Journal of Computer Security, 16(5):531-571, 2008.
[2] G. Smith. On the Foundations of Quantitative Information Flow. Proc. of FOSSACS, 288-302, 2009.
[3] M.S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith. Measuring Information Leakage using Generalized Gain Functions. Proc. of CSF, 265-279, 2012.
[4] R. Shokri, G. Theodorakopoulos, C. Troncoso, J.-P. Hubaux, and J.-Y. Le Boudec. Protecting Location Privacy: Optimal Strategy against Localization Attacks. Proc. of the 19th ACM CCS, 617-627, 2012.
Continuation as PhD: The internship is expected to result in the candidate's master's thesis. There will be the possibility of continuing to work with the team as a PhD student. Preference will be given to students who are willing to continue as PhD students.
Candidate's required level: Master's student (2nd year).
Proposed Duration: 4-6 months
Prerequisites: Aptitude for formal and mathematical reasoning. Some knowledge of Probability Theory would be welcome. URL sujet détaillé : http://www.lix.polytechnique.fr/comete/stages/#om
Remarques : Co-supervised with Serge Haddad, ENS Cachan.
Financed by INRIA and ENS Cachan





