SM2071 Attackdefense trees with dependencies


Description


Context Copying keys, cloning access cards, breakingin, or socialengineering legitimate users are just a few examples of possible attacks aiming to infiltrate restricted access areas. These examples illustrate that the digital security cannot be considered in isolation from the social and physical aspects. On the other hand, a holistic, sociotechnical approach makes the risk analysis and the security predictions much more complex and problematic.
Attack=96defense trees provide a wellfounded methodology for a systematic modeling and assessment of security [2]. They extend the wellknown model of attack trees widely popular in industry [5, 3]. An attack=96defense tree models a race between an attacker trying to compromise a system and a defender whose goal is to protect the system against potential attacks. Thanks to the flexibility regarding the labels of their nodes, attack=96defense trees are especially wellsuited to reason about security scenarios combining digital, physical, and also human aspects. A simple example of an attack=96defense tree illustrating how to disable the RFIDbased communication is given in Figure 1 in an extended description of this project, available at http://people.irisa.fr/Barbara.Kordy/internships/EMSEC_master_1516.pdf
Formal semantics for attack=96defense trees have been developed in [2]. They provide ways to decide which trees represent the same scenario and lay foundations for quantitative analysis of security using attack=96defense trees [4]. Unfortunately, the current model and its existing semantics based on propositional logic and set theory are not powerful enough to express the causeconsequence relationships between actions represented by the nodes of a tree. An initial effort has been made in [1] to extend the simple model of attack trees with sequential dependencies, but more research is necessary to generalize these results to a more complex model of attack=96defense trees.
Objective The objective of this project is to enhance attack=96defense trees with causeconsequence relationships between the nodes and to develop formal techniques for the analysis of the extended model. Augment attack=96defense trees with causeconsequence dependencies will allow the modeler to distinguish between different defensive measures =97 such as preventions, detections, or reactions =97 and to consider multiple colluding attackers. It will therefore provide greater modeling capabilities with respect to standard attack=96defense trees.
Description The project will be conducted in two phases. First, a practical case study consisting on analyzing the security of a recently deployed physical access control system at INSA Rennes will be performed. This phase will cover  the analysis of digital attacks on Mifare DESfire contactless smartcards,  the identification of relevant sociophysical attacks,  the examination of access control policies that have been implemented.
The objective of the case study will be to identify reallife characteristics that need to be modeled using causeconsequence dependencies and to get an insight on how to formally represent them in attack=96defense trees.
The student's tasks in the second phase of the project will be to formally define the extended attack=96defense tree model and to implement the underlying analysis techniques. Finally, the attacks on the access control system of INSA Rennes identified in the case study phase will be modeled using the developed formalism, with the goal of proposing relevant countermeasures to secure the most vulnerable elements of the system.
The EMSEC team plans to have an opening for a Ph.D. position in autumn 2016, to continue the research initiated in this project.
References : [1] Ravi Jhawar, Barbara Kordy, Sjouke Mauw, Sasa Radomirovic, and Rolando TrujilloRasua. Attack Trees with Sequential Conjunction. In Hannes Federrath and Dieter Gollmann, editors, The Proceedings of IFIP SEC 2015, volume 455 of IFIP AICT, pages 339=96353. Springer, 2015.
[2] Barbara Kordy, Sjouke Mauw, Sa=9Aa Radomirovic, and Patrick Schweitzer. Attack=96Defense Trees. Journal of Logic and Computation, 24(1):55=9687, 2014.
[3] Barbara Kordy, Ludovic PiètreCambacédès, and Patrick Schweitzer. DAGBased Attack and Defense Modeling: Don't Miss the Forest for the Attack Trees. Computer Science Review, 13=9614(0):1=9638, 2014.
[4] Barbara Kordy, Marc Pouly, and Patrick Schweitzer. A Probabilistic Framework for Security Scenarios with Dependent Actions. In Elvira Albert and Emil Sekerinski, editors, The Proceedings of iFM 2014, volume 8739 of LNCS, pages 256=96271. Springer, 2014.
[5] Bruce Schneier. Attack Trees: Modeling Security Threats. Dr. Dobb's Journal of Software Tools, 24(12):21=9629, 1999. URL sujet détaillé : people.irisa.fr/Barbara.Kordy/internships/EMSEC_master_1516.pdf
:
Remarques : The internship will be hosted by the EMSEC team (https://www.irisa.fr/en/teams/emsec) from IRISA in Rennes (https://www.irisa.fr/en/). It will be cosupervised by Barbara Kordy (http://people.irisa.fr/Barbara.Kordy/) and Gildas Avoine (http://www.avoine.net/). For more information please check the following website http://people.irisa.fr/Barbara.Kordy/internships.php





SM2072 High Performance category theory implementation for big graph transformations


Description


Graphs are used to describe a wide range of situations in a precise yet intuitive way. Different kinds of graphs are used in modelling techniques depending on the investigated fields, which include computer science, chemistry, biology, quantum computing, etc. When system states are represented by graphs, it is natural to use rules that transform graphs to describe the system evolution. There are two main streams in the research on graph transformations: (i) the algorithmic approaches and (ii) the algebraic approaches which define abstractly a graph transformation step using basic constructs borrowed from category theory. This internship is concerned about the latter approach.
Internship Objectives:
The aim of the internship is to implement categorical tools in order to: * provide a tool box (graph morphism,pushout, pullback, final pullback complement etc.) sufficiently generic in order to easily implement different categorical graph transformation frameworks. Component models will be used to achieve composability of the elements of the toolbox. * Be a parallel implementation: typically rules are local (the effects only act on the matched nodes and neighbors) and can be applied in parallel. Finding matchings that do not overlapp in an efficient way is a goal of this internship. * Manage large set of data. Indeed, many application domains of graph transformation are data intensive. The internship will consider the application of graph transformation to the problem of of social data anonymization and chemical/biological simulation. In both cases graphs bigger than 109 nodes have to be considered.
This internship is at the same time practical and theoretical in the sense that it is focused in the design and implementation of parallel algorithms of a high level programming approach.
Requisite skills and abilities :
Strong inclination to both theory and practise C++ programming, Programing language semantics Rewriting basics
Keywords : Category Theory, Big Data, Parallel programming, Graph Transformation, Rule Based Modeling, Anonymization
URL sujet détaillé : http://perso.enslyon.fr/frederic.prost/m2_hpcct.pdf
Remarques : This intership will be cocoached by Christian Pérez and Frédéric Prost.





SM2073 Algorithme d'inférence de types et rates pour le langage musical Faust


Description


1. Sujet Développement d'un algorithme d'inférence de type et de "rate" (fréquence de signal) dans le langage Faust de traitement du signal audio.
2. Projet Le langage Faust, développé au Grame (Lyon), permet de spécifier de manière fonctionnelle des opérations de traitement du signal ; il est particulièrement adapté aux applications audio et musicales (http://faust.grame.fr) à haute performance.
Dans une nouvelle version à venir de l'environnement Faust, élaborée en partie dans le cadre du projet ANR FEEVER en collaboration avec le CRI (MINES ParisTech), ce langage a fait l'objet d'une extension vectorielle qui propose l'ajout (1) de nouvelles primitives permettant de manipuler des valeurs vectorielles et (2) de signaux de "rates" (nombre d'échantillons par seconde) différents. Les applications visées par cet ajout vectoriel sont les traitements de hautniveau de type FFT par trame.
L'introduction de cette extension a logiquement conduit à enrichir la sémantique statique du noyau du langage Faust ; une spécification formelle a été proposée (voir l'article paru dans le journal Computer Languages, Systems and Structures, http://cri.ensmp.fr/classement/doc/A446.pdf).
3. Descriptif L'objet du stage de niveau Master Recherche, de quelques mois, avec prolongation en thèse possible, est, en partant du système de typage développé par le CRI et Grame :  de concevoir et prototyper un algorithme d'inférence de types et rates, en s'appuyant sur les outils de type SMT (Satisfiability Modulo Theory) ;  de l'implémenter, si possible, dans le compilateur Faust, en C++ ;  et d'en prouver la correction par rapport à la sémantique statique (en fonction du temps disponible, une preuve formelle, en Coq/Ssreflect par exemple, de l'ensemble ou d'une partie de ce processus pourra être envisagée). URL sujet détaillé :
:
Remarques : Exigences : Une connaissance de la programmation fonctionnelle et des techniques de compilation est nécessaire. Un intérêt pour la musique ou les traitements audio est un plus.





SM2074 Firewall rules compilation


Description


Firewalls are important components for securing networks. Filtering packets is usually defined in chains of rules with boolean conjunction condition and an action if the condition is true. Iptables is the tool available on Linux to define these rules. Chains may be quite large, and are applied on every packet going through the firewall, with a cpu load proportional to the flow and the number of rules examined to take a decision.
The objective is to apply static or dynamic optimizations to filtering rules so as to reduce the overall load on the firewall. This may involve reordering rules and chains, possibly based on statistics about incoming packets.
Some references:
Fabien Coelho and François Irigoin. API Compilation for Image Hardware Accelerators. ACM Trans. on Architecture and Code Optimization, 9(4):49:149:25, January 2013. Article 49.
Julien Zory. Contributions à l'optimisation de programmes scientifiques. PhD thesis, École des mines de Paris, December 1999.
Julien Zory and Fabien Coelho. Using algebraic transformations to optimize expression evaluation in scientific codes. In Proceeding of International Conference on Parallel Architectures and Compiler Techniques (IEEE PACT), pages 376384, October 1998
URL sujet détaillé :
:
Remarques :





SM2075 Energyefficient clustering and scheduling of scientific workflows


Description


Scientific workflows are composed of many fine computational granularity tasks, and the dependency among them is represented by task graphs [1]. For example, the Montage workflow generates custom astronomical image mosaics [2]. The workflow tasks may be highly heterogeneous in terms of execution time. A typical Montage workflow contains thousands of tasks most of which have a running time of minutes or less [3].
The physical resources used to run these scientific workflows consume huge amounts of energy as testified by the TOP500 list [4] that ranks the 500 most powerful computer systems, among which many are designed for scientific applications, such as weather forecasting or earthquake modeling. As the electricity bill of these energyhungry systems is reaching new heights, reducing their consumption becomes a major challenge.
Clustering techniques consist in grouping some tasks together in order to decrease the system overhead due to workflow management. Clustering of short tasks has been proved to be an efficient way of reducing the overall execution time [5]. On the other side, once groups of tasks have been done, they need to be scheduled onto physical resources, like regular tasks. Workflow scheduling has also a nonnegligible influence over quality of service and execution time in particular.
This work will revisit classical clustering and scheduling techniques for scientific workflows with energy efficiency as a primary target. For both techniques, our algorithms will adopt a tradeoff between energy consumption and performance. We will also study the consequences of the optimizations done at the task clustering level on the scheduling level in order to have a comprehensive view on our optimized execution of scientific workflows.
Two types of infrastructures will be considered in this work: typical HPC infrastructures such as computational clusters and grids, and Clouds, which have currently become the center of attention and are increasingly used for scientific purposes [6]. Both infrastructures pose different kinds of problems in terms of clustering and scheduling. For instance, Cloud infrastructures resources are tight to virtual machines which have fixed sizes, while HPC infrastructures are more flexible on resource provisioning.
From a software point of view, this work will rely on GinFlow [7], a workflow executor developed within the Myriads team, and TIGRES [8], a template description language designed by the DST department at Lawrence Berkeley National Lab. The validation of the proposed algorithms will use Grid'5000, a French experimental testbed for distributed computing research.
[1] W. Chen, R. Ferreira da Silva, E. Deelman, and T. Fahringer. "Dynamic and FaulTolerant Clustering for Scientific Workflows". IEEE Transactions on Cloud Computing, 2015.
[2] http://montage.ipac.caltech.edu
[3] G. Singh, M.H. Su, K. Vahi, E. Deelman, B. Berriman, J. Good, D. Katz, and G. Mehta. "Workflow Task Clustering for Best Effort Systems with Pegasus", ACM Mardi Gras conference (MG), 2008.
[4] http://www.top500.org
[5] Q. Zhu, J. Zhu, and G. Agrawal. "Poweraware Consolidation of Scientific Wokrflows in Virtualized Environments". International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2010.
[6] C. Hoffa, G. Mehta, T. Freeman, E. Deelman, K. Keahey, B. Berriman, and J. Good, =93On the Use of Cloud Computing for Scientific Workflows,=94 in IEEE International Conference on eScience, 2008, pp. 640=96645.
[7] http://ginflow.inria.fr/
[8] http://tigres.lbl.gov/home URL sujet détaillé :
:
Remarques : Coadvising:  AnneCécile Orgerie (CNRS Researcher)
 Matthieu Simonin (Inria Engineer)
 Cédric Tedeschi (U. Rennes 1 assistant Prof.)
Payment: ~ 500 euros / month





SM2076 Dynamic Adaptation for StreamProcessing Engines


Description


Distributed stream processing has become a leading trend for analysing a large amount of data in realtime. Internet of things, stock trading, web traffic monitoring are all pushing continuously datas for immediate processing. To address the challenge of handling high volume and high velocity of datas, different stream processing engines emerged including Spark Streaming based on Spark[1], Storm[2], Flink[3] or Samza[4]. Those platforms are known to be intensively used at large scale by different actors (e.g. Yahoo!, Twitter, LinkedIn). The reputation of those frameworks are often measured in terms raw performances (number of events treated / second). From our point of view, another aspect to take into account is the capacity of the framework to adapt to changes due to external factors. For example these systems are subject to overload or failures. Different steps have been made to this direction in [5,6,7,8].
These works primarily focus on dynamically adapting the mapping of the operators of the streamprocessing application onto the computing resources dedicated to host the application. Some of them assume a limited amount of resources and propose different techniques to adapt without having to significantly degrade the throughput of the application (but at the cost of a degraded accuracy of the output). Other approaches consider the elasticity of such an approach when cloudbased resources are available, in which case the goal is to minimize the gap between allocated resources and their actual utilisation.
Some work remains to be done as several point are still unclear, in particular:
1) The comparison of these approaches in terms of performance is still a widely open issue, as no actual systematic benchmark or tool has been devised for this issue
2) The optimization of their performance in a dynamic environment has only been addressed in a nonsystematic fashion, where algorithms target a subpart of this software family, having its own specificities.
The internship will first focus on studying several representatives of this family of software and devise a model able to include them all, so as to develop a generic framework for their experimental evaluation. In a second part, the work will try to include the dynamic adaptation into the model devised. Finally, this framework will allow to test the different actors in the field (mentioned above) so as to conduct a more comprehensive experimental campaign. The Grid'5000 platform will be used to perform these experiments.
[1] http://spark.apache.org/streaming/
[2] https://storm.apache.org/
[3] https://flink.apache.org/
[4] http://samza.apache.org/
[5] Waldemar Hummer, Benjamin Satzger, Schahram Dustdar: Elastic stream processing in the Cloud. Wiley Interdisc. Rew.: Data Mining and Knowledge Discovery 3(5): 333345 (2013)
[6] Vincenzo Gulisano, Ricardo JiménezPeris, Marta Pati=F1oMart=EDnez, Claudio Soriente, Patrick Valduriez: StreamCloud: An Elastic and Scalable Data Streaming System. IEEE Trans. Parallel Distrib. Syst. 23(12): 23512365 (2012)
[7] Javier Cervi=F1o, Evangelia Kalyvianaki, Joaqu=EDn Salvach=FAa, Peter Pietzuch: Adaptive Provisioning of Stream Processing Systems in the Cloud. ICDE Workshops 2012: 295301
[8] Daniel J. Abadi, Yanif Ahmad, Magdalena Balazinska, Ugur Çetintemel, Mitch Cherniack, JeongHyon Hwang, Wolfgang Lindner, Anurag Maskey, Alex Rasin, Esther Ryvkina, Nesime Tatbul, Ying Xing, Stanley B. Zdonik: The Design of the Borealis Stream Processing Engine. CIDR 2005: 277289 URL sujet détaillé :
:
Remarques : Coadvising:  Matthieu Simonin (Inria Engineer)
 Cédric Tedeschi (U. Rennes 1 assistant Prof.)
Payment: ~ 500 euros / month





SM2077 Distributionally robust network design


Description


Given an undirected graph and a set of pointtopoint commodities with known demands (traffic matrices), the objective of the network design problem is to find the cheapest capacity installation on the edges of the graph such that the resulting network supports the routing of the commodities. The problem has numerous applications in telecommunications, transportation, and energy management, among many others.
An important aspect of network design problems is related to the knowledge of demands. In a large number of applications, these demands are not available at the time we decide of the capacity installation. The best one can do is to rely on demand forecasts, based, for instance, on population statistics or traffic measurements. In general, these these statistical studies are accurate enough to come up with precise probabilistic distributions of the demand. Hence, the approach traditionally used in the litterature relies on using sets that represent the demand variation, falling into the framework of robust optimization [OZ07,PR13,BN09].
In this project, we intend to apply the more advanced tools of distributionally robust optimization to network design problems [WK14]. These models refine the traditional robust approach by considering sets of random variables (called ambiguity set) that are compatible with the available data. The computational tractability of the resulting models depends on the type of ambiguity sets defined for the problem. More specifically, the prospective student will: * Study distributionally robust optimization * Apply the framework to the available data on network design * Solve the resulting optimization models using commercial solvers
The prospective student should have some background in statistics, programming, and optimization.
References:
[BE09] BenTal, Aharon, Laurent El Ghaoui, and Arkadi Nemirovski. Robust optimization. Princeton University Press, 2009. [OZ07] Fernando Ord=F3=F1ez, Jiamin Zhao: Robust capacity expansion of network flows. Networks 50(2): 136145 (2007) [PR13] Michael Poss, Christian Raack: Affine recourse for the robust network design problem: Between static and dynamic routing. Networks 61(2): 180198 (2013) [WK14] Wolfram Wiesemann, Daniel Kuhn, Melvyn Sim: Distributionally Robust Convex Optimization. Operations Research 62(6): 13581376 (2014) URL sujet détaillé :
:
Remarques :





SM2078 Inferring Device Failures from Online Discussions


Description


Technicolor maintains large datasets of failed and returned hardware equipment (Internet gateways, STBs etc.), including dates and cause of failures. While this dataset provides interesting statistics, it does not allow anticipating clients that return their hardware, nor provides insight on how device failures are perceived by endusers. The objective of this training is to develop a model for the prediction of hardware returns, using the observed online user activity. This might be achieved by crawling the web and by automatically analysing the magnitude of online discussions in forums, blogs and tweets with respect to specific keywords, about the tracked devices. In other words, we would like to verify if the online discussions reflect the failures observed in the returned hardware. We would like to answer the following questions: Can we anticipate and correlate the trends in online discussions with the trends in the returned devices? How are the failures perceived by the enduser? Can we model the online behaviour of users facing device failures? In addition to the model, a service crawling (monitoring) the web for triggering alerts is a possible training outcome.
Some related references: * The Predictive Power of Online Chatter. In KDD 2005. * Deriving marketing intelligence from online discussion. In KDD 2005. URL sujet détaillé :
:
Remarques : Skills : * Analytical and modelling capabilities (a machine learning background is a plus)
* Good programming skills
* Habits in reading research papers
Salary: 1200=80/month gross





SM2079 Algorithms for Cell Reprogramming


Description


Cell reprogramming consists in triggering cell redifferentiation by perturbing particular genes or signal receptors. Such phenotype changes have many applications, notably in regenerative medicine. The prediction of robust targets for cell reprogramming is therefore a major challenge which requires new computer sciences methods for modelling and analysing dynamics of biological networks.
Recent work demonstrated that socalled qualitative models allow the automatic prediction of reprogramming determinants within gene networks. Those Boolean models consider each gene as either active or inactive ; their expression being a function of the binary state of their regulators, therefore inducing changes over time. From such Boolean networks, one can compute the socalled attractors of the system, which model the cell functions, and derive key genes, whose mutation would induce the reachability of a different attractor.
The training will begin with an indepth study of current algorithms for predicting cell reprogramming determinants from Boolean networks, and their applicability to synthetic and real biological networks. The research will then focus on major questions : (1) are there other reprogramming strategies than the ones predicted by the current methods; (2) how to take into account the uncertainty on some part of the model; (3) how to evaluate the robustness of a given reprogramming strategy. Those tasks require a formalisation of the computation of reprogramming determinants, the implementation and the evaluation of the proposed algorithms.
No particular background in biology is required. URL sujet détaillé : http://loicpauleve.name/BNReprogrammingM2.pdf
Remarques : Coadvising with Stefan Haar (Inria Saclay)





SM20710 ACASA


Description


The AGATE project (an Approach for Genericity in Assistance To complEx tasks) aims to propose generic models and unified tools to make possible the setup of assistance systems in any existing application, that we call the targetapplication. The aim of this internship is to create a method for detecting users' assistance needs for a given target application, in order to help designers to create and improve an epiphytic assistance system for this target application. For this, the student will propose a system which will exploit the collected traces (through an operational tool developed in AGATE project) to identify and characterize problematic situations encountered by users. URL sujet détaillé : http://liris.cnrs.fr/stephanie.jeandaubias/enseignement/projets/stageM2R_ACASAgb.pdf
Remarques : sujet en français : http://liris.cnrs.fr/stephanie.jeandaubias/enseignement/projets/stageM2R_ACASA.pdf





SM20711 Shared, persistent and mutable bigdata structures


Description


In current computer systems, sharing or persisting data is the job of the file system. However, the file system interface is narrow and slow. This penalises applications with a large memory footprint, such as bigdata analytics. Consider for instance a mapreduce computation. The map processes produces some large data structure, e.g., a socialnetwork graph, which the reduce processes will consume. Currently, the only practical approach is for the mapper to serialise (marshall) the graph and write it into a file, and the reducers to read the file and deserialise (unmarshall) the graph. This repeated serialiseoutputinput deserialise sequence is extremely costly.
However, a recent and exciting development in computer architecture is the advent of very large (gigabytes) main memories, including persistent memories. This has the potential to make traditional file systems obsolete, since sharing data and making it durable can now be done directly in main memory. Returning to the example, the graph can be shared and made persistent directly in main memory instead. A related problem is that of lazily mutating or copying a large pointerbased data structure in memory.
The basic techniques are well known (e.g., mmap or copyonwrite) but they are either too difficult or impractical for application programmers. They are possible but difficult in lowlevel languages such as C or C++, and practically impossible in managed languages such as Java or Scala.
Therefore, the aim of the internship is to explore how to enable direct sharing between processes, and/or lazy copying/mutation, of a rich pointerbased data structure into the Java Virtual Machine. This consists of two related subproblems: how to implement these techniques efficiently inside the execution environment, and how to expose them to the application programmer in a safe and simple fashion.
The intern shall study the state of the art and perform experiments. The intern will build a proofof concept prototype, initially making simplifying assumptions, e.g., no concurrent writes, no garbage collection, no JIT, which shall be relaxed little by little. The findings are to be published in the scientific literature. URL sujet détaillé : https://team.inria.fr/regal/joboffers/mastersinternshipsharedpersistentandmutablebigdatastructures/
Remarques : Requirements:  Enrolled in a Masters' in Computer Science / Informatics or a related field.
 An excellent academic record.
 A strong interest and good knowledge of memory management techniques, for operating systems or managed runtime systems, and concurrent algorithms.
 Motivated by experimental research.
Applying
The internship is fully funded. It will be coadvised by Dr. Marc Shapiro, Inria and Laboratoire d'Informatique de Paris6 (LIP6), Université Pierre et Marie Curie, and by Prof. Gaël Thomas of Télécom SudParis.
A successful intern will be invited to apply for a PhD.
To apply, contact Marc Shapiro and Gaël Thomas ￼￼￼￼￼sudparis.eu/~thomas_g> with the following information:
 ￼A resume or Curriculum Vit=E6
 A list of courses and grades of the last two years of study (an informal transcript is OK).
 Names and contact details of two references (people who can recommend you). We will contact these people directly.





SM20712 Efficient implementation of highlevel objects in a georeplicated database


Description


Traditional databases ensure that updates appear one at a time in some serial order. In contrast, modern georeplicated databases accept updates at different replicas in parallel, in order to ensure that the data is always quickly available. The principles of ConflictFree Replicated Data Types (CRDTs) [2] ensure that data will converge despite the concurrent updates.
When concurrent updates are allowed, there is no more a linear order of updates and of their corresponding object versions, since two concurrent updates do not depend on one another. Tracking this requires more complex structures, such as multiple operation logs, snapshots, and vector clocks [1].
In order to serve a particular read or update, the system must compute an appropriate version. This may require searching through multiple previously generated snapshots, as well as through buffers of recent updates. Once the version is generated, it may be stored as a snapshot (or discarded, depending on the system's policy) and later updates buffered.
On top of this inherent complexity, the application often requires data types with a rich structure, such as sets, graphs, or maps, for which generating a version requires significant computation and memory management. Breaking down a large object into smaller pieces that can be updated independently again adds more complexity. Finally, implementing these objects is further complicated when data is spread across both main memory and disk.
The subject of this internship is to study the above issues and to propose efficient solutions. After studying the state of the art of databaseoriented memory/disk data structures (e.g., BTrees or LSMTrees) and concurrent data structures, the intern will focus on two main tasks. The first is to design a memory/disk organisation that provide efficient reading and updating of objects in these systems. The second how to parallelise reads and updates to large objects where an operation concerns only small section of the object. URL sujet détaillé : https://team.inria.fr/regal/joboffers/mastersinternshipefficientimplementationofhighlevelobjectsinageoreplicateddatabase/
Remarques : Requirements:  Enrolled in a Masters' in Computer Science / Informatics or a related field.
 An excellent academic record.
 Be strongly interested in, and have good knowledge of, distributed systems, distributed algorithms, or distributed databases, concurrent algorithms, or memory/disk data structures.
 Motivated by experimental research.
Applying
The internship is fully funded. It will be coadvised by Dr. Marc Shapiro, Inria and Laboratoire d'Informatique de Paris6 (LIP6), Université Pierre et Marie Curie, and by Prof. Gaël Thomas of Télécom SudParis.
A successful intern will be invited to apply for a PhD.
To apply, contact Tyler Crain (tyler.crain.fr) with the following information:
 ￼A resume or Curriculum Vit=E6
 A list of courses and grades of the last two years of study (an informal transcript is OK).
 Names and contact details of two references (people who can recommend you). We will contact these people directly.





SM20713 Generating efficient concurrency control for georeplicated storage


Description


Reasoning about concurrent updates in a replicated database is a difficult task because operations might execute in different orders at different replicas. We have developed a tool, called CISE, that verifies whether a given program will maintain its specific correctness invariants under concurrent execution. If not, the tool provides a counterexample, which the program developer can examine, in order to understand which concurrent executions should be disallowed. We call these abstract concurrency control points tokens.
Abstractly, a token is similar to a lock. A possible (but inefficient) implementation would be the following: before a process can execute an operation, it acquires the corresponding locks; when the operation returns, it immediately releases the locks. This is formally correct, but is unlikely to perform well, because of the high cost of acquiring and releasing locks. Alternatively, a process could acquire all the locks it will ever need initially, and never release them. This would be highly efficient and safe, but also inefficient, because other processes would never be able to do any work. A good protocol lies somewhere between these two extremes.
Although the CISE tool is automatic, the later steps of identifying the appropriate tokens, and then to translate this into an efficient concurrency control protocol, are currently manual. This is tedious and errorprone. A first level of improvement would be to automate the analysis of counter examples, in order to suggest a correct token assignment. A second level will be to automate the translation of the above token assignment into an efficient concrete protocol.
There are different ways to optimize a lock implementation, each with own cost and complexity. Examples include multilevel locks, lock coarsening, and early lock release. As suggested above, each operation may be protected by its own finegrain lock, at the cost of constant acquire/release cycles. A wellknown optimization is to coarsen, replacing several finegrain locks with a single coarsegrain one. Although a coarser lock reduces the synchronization cost, it also delays concurrent updates, which costs performance too.
From a performance perspective, there is no single best locking protocol, since this will depend on dynamic characteristics of the workload, namely on how often updates are blocked (contention) vs. how often locks are acquired (overhead).
The internship topic is to address the above issues. Its specific goals may include the following:  Automating the analysis of counterexamples, to derive an appropriate token assignment.  Using heuristics to automatically generate an efficient concurrency control protocol, based on assumptions about the workload. Consider alternatives to locking when appropriate, e.g., leveraging causality, or using escrow. Compare different possible protocols for the same application, quantifying the lock optimization benefits against costs.  Develop profiling or monitoring tools to measure the efficiency of the concurrency control protocol under different workloads, and heuristics to improve it.  Ensuring that the concurrency control protocol does not deadlock, through analysis, heuristics, and/or automated testing.
The approach is likely to combine both static analysis, extending the CISE tool with token assignment and concurrency control generation, and dynamic analysis to evaluate performance, and to ensure absence of deadlock under real conditions. URL sujet détaillé : https://team.inria.fr/regal/joboffers/mastersinternshipgeneratingefficientconcurrencycontrolforgeoreplicatedstorage/
Remarques : The intern must:  Be enrolled in a Masters' in Computer Science / Informatics or a related field.
 Have an excellent academic record.
 Be strongly interested in, and have good knowledge of, distributed
systems, distributed algorithms, distributed databases, or static and
dynamic analysis.
 Be motivated by experimental research.
The internship is fully funded, and will take place in the Regal group, at Laboratoire d'Informatique de Paris6 (LIP6), in Paris. It will be advised by Dr. Marc Shapiro and by Mahsa Najafzadeh. A successful intern will be invited to apply for a PhD.
To apply, contact Mahsa Najafzadeh (mailto:mahsa.najafzadeh.fr), with the following information:
 A resume or Curriculum Vit=E6.
 A list of courses and grades of the last two years of study (an informal
transcript is OK).
 Names and contact details of two references (people who can recommend
you), whom we will contact directly.





SM20714 Protection d'un cryptoprocesseur sur courbes (hyper)elliptiques par randomisation des calculs


Description


L'équipe CAIRN (http://team.inria.fr/cairn/) effectue des recherches sur les moyens matériels pour effectuer les nombreux calculs de la cryptographie sur les courbes elliptiques (ECC) et courbes hyperelliptiques (HECC) de façon sécurisée contre certaines attaques physiques. Les attaques physiques par canaux cachés exploitent les liens qui existent entre les clés secrètes et certaines grandeurs physiques mesurables à l'extérieur du circuit: courant d'alimentation, rayonnement électromagnétique, temps de calcul, etc. Un cryptoprocesseur (H)ECC matériel et ses outils de programmation sont développés.
L'objectif du stage est d'étudier différentes techniques pour randomiser les calculs sans trop affecter les performances. Les attaques exploitent souvent les régularités des séquences de calcul. La randomisation des calculs liés aux valeurs secrètes est donc une contremesure naturelle. Mais sa mise en oeuvre est complexe car elle peut être vue à plusieurs niveaux: représentations arithmétiques, algorithmes de calcul, ordonnancement des instructions, modification de l'architecture, ajout de blocs matériels spécifiques, etc. En fonction des centres d'intérêts du/de la stagiaire, différentes combinaisons de ces niveaux pourront être étudiés. URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2016randomisationecchecc.pdf
Remarques :





SM20715 Optimal Bandwidth on Demand for SDN Networks


Description


SoftwareDefined Networking (SDN) technologies have radically transformed the network architecture of data centers, network overlays, and carrier networks. They provide programmable data planes that can be configured from a remote and logically centralized controller. The high flexibility and reprogrammability of SDN infrastructures enable carrier network operators and Internet service providers to exploit more efficiently their network resources, thus offering new types of access services. Bandwidth Calendaring (BWC) and Bandwidth on Demands (BWoD) services enable enterprises or cloud providers to dynamically establish or resize connectivity from the fixed or wireless access network through the core as necessary, so they pay only for what they consume. However, the lack of automation capabilities as well as the high dynamicity of connection requests from OverTheTop (OTT) operates hinder the widespread adoption of these services. The objective of this internship is to formulate and analyze the BWC/BWoD services as combinatorial problems, proposing efficient algorithms to quickly solve them even with large size instances. To this end, particular attention should be paid to decomposition methods, such as Benders' decomposition and Dantzig=96Wolfe decomposition, to exploit the special block structure of the problems. The candidate will have to study and formulate the problem, design efficient algorithms to compute quickly a solution and evaluate their performance through simulations. URL sujet détaillé :
:
Remarques : Appropriate remuneration for the internship. This internship is in collaboration with the team "Optimisation et Systèmes" of the CERMIS laboratory of the Ecole des Ponts ParisTech.





SM20716 Towards detecting and analyzing the emergent behavior in assuring the security of Systemsofsystems


Description


Context Especially in the last 10 years [Guangrong, 2014], the field of Systemsofsystems (SoS) enjoyed a considerable development, prouved also by the rising number of European projects that target them, e.g. TAREASOS, DANSEIP, COMPASS, AMADEOS, Local4Global, CPSoS. SoS are large, distributed, concurrent systems, composed of complex systems [Jamshidi, 2008]. Traditionally, security deals with confidentiality, integrity and availability of data [Avizienis, 2004].If security is taking into consideration only at the end of the system development cycle, all changes will be expensive. That is why security must be taken into consideration as soon as possible, from the architecture and design stages of a SoS [Lukasik, 2013].
Objectives One of the main attributes specific to SoS is the emergent behavior: SoS have behaviors that are not reducible to constituent systems, but which depend rather on the interactions between the constituent systems. In the context of security, emergent behaviors involve cascading several vulnerabilities of different constituent systems [Eusgeld, 2011]. To analyze such cascading vulnerabilities and their effects, methods such as Development Dependency Network Analysis (DDNA) [Guariniello, 2013], their integration with Functional Dependency Network Analysis (FDNA) [Guariniello, 2014], bigraphs [Wachholder, 2015] or Bayesian Networks [Han, 2013] have been used. Work program: 1.State of the art on existing methods already used to analyze the emergent behavior of SoS. 2.Choosing (or proposing) of a method/algorithm well adapted to analyze the SoS emergent behavior in a security context =96 cascading vulnerabilities. 3.Development of a software implementing the chosen method / algorithm, e.g. as an Eclipse plugin. 4.Apply the method on an attack type, e.g. a Denial of Service in a Smart Grid.
Bibliography [Avizienis, 2004] A. Avizienis, J.C. Laprie, Randell, B. and C. Landwehr. =93Basic concepts and taxonomy of dependable and secure computing=94, Dependable and Secure Computing, IEEE Trans. on, vol.1, no.1, pp.1133, 2004. [Eusgeld, 2011] I. Eusgeld, C. Nan & S. Dietz. Systemofsystems approach for interdependent critical infrastructures. Reliability Engineering & System Safety, vol. 96, no. 6, pp. 679 =96 686, 2011. [Guangrong, 2014] Y. Guangrong You, Xiao Sun, Min Sun, Jimin Wang, and Yingwu Chen. =93Bibliometric and social network analysis of the SoS ﬁeld=94, in Proc. of the 9th International Conference on System of Systems Engineering (SOSE), 2014, Adelaide, Australia. [Guariniello, 2013] C. Guariniello, D. DeLaurentis. =93Dependency Analysis of SystemofSystems Operational and Development Networks=94, Procedia Computer Science, Vol. 16, 2013, pp. 265274. [Guariniello, 2014] C. Guariniello, D. DeLaurentis. =93Integrated Analysis of Functional and Developmental Interdependencies to Quantify and Tradeoff Ilities for SystemofSystems Design, Architecture, and Evolution=94, Procedia Computer Science, Vol. 28, 2014, pp. 728735. [Han, 2013] S. Y. Han, D. DeLaurentis. =93Development Interdependency Modeling for SystemofSystems (SoS) using Bayesian Networks: SoS Management Strategy Planning=94, Procedia Computer Science, Vol. 16, 2013, pp. 698707. [Jamshidi, 2008] M. Jamshidi. =93System of systems engineering  new challenges for the 21st century=94, Aerospace and Electronic Systems Magazine, IEEE, Vol. 23, No. 5, pp. 419, 2008. [Lukasik, 2013] S. J. Lukasik. Vulnerabilities and failures of complex systems. Int. J. Eng. Educ., 19(1):206212, 2003. [Wachholder, 2015] D. Wachholder, C. Stary, "Enabling emergent behavior in systemsofsystems through bigraphbased modeling," in System of Systems Engineering Conference (SoSE), 10th , vol., no., pp.334339, 1720 May 2015. URL sujet détaillé : https://www.sites.google.com/site/vaneachiprianov/researchprojects
Remarques : The internship will be payed according to current French laws, with a sum of around 500 euros / month.





SM20717 Algorithmics of energy functions


Description


Energy and resource management problems are important in areas such as embedded systems or autonomous systems. They are concerned with the question whether a given system admits infinite schedules during which (1) certain tasks can be repeatedly accomplished and (2) the system never runs out of energy (or other specified resources). Formal modelling and analysis of such problems has attracted a lot of attention in recent years. We are interested in modelling systems taking time and energy into account in their behaviour. Such systems, or their semantics, can be seen as finite state based transitions systems, where time elapses when staying in a given state, and energy is gained when time elapses, or spent when taking discrete transitions between states. We call such a model a realtime energy automaton (RTEA). The behaviour of an RTEA depends on the initial energy and time available, and the interplay between time and energy means that even simple models can have rather complicated behaviours.
We have developed an algebraic methodology [1] for deciding reachability issues for such models, i.e., answering the following question: given some initial energy and time, can a system reach a designated state without running out of energy before? This methodology is based on the theory of semiringweighted automata [2] and their extensions [3]. The finite behaviour of an RTEA is seen as a function f(x0,t) which maps initial energy x0 and available time t to a final energy level, intuitively corresponding to the highest output energy the system can achieve when run with these parameters. We define a composition operator on such realtime energy functions which corresponds to concatenation of RTEA and show that, equipped with this composition and with a maximum operator, the set of realtime energy functions forms a mathematical structure where reachability can be decided in a static way. From a theoretical point of view, we have shown that operations on realtime energy functions can be done in time linear in the size of their representation, and the representation size of compositions and suprema of realtime energy functions is a linear function of the representation size of the operands. This makes us conjecture that reachability in RTEA can be decided in exponential time.
The purpose of this internship is to develop, optimise and implement the corresponding algorithms in order to validate this conjecture, and see how this algorithm behaves in practice.
References
[1] D. Cachera, U. Fahrenberg, and A. Legay. An omegaAlgebra for RealTime Energy Problems. In FSTTCS 2015.
[2] M. Droste, W. Kuich, and H. Vogler, eds. Handbook of Weighted Automata. Springer, 2009.
[3] Z. Esik, U. Fahrenberg, and A. Legay. Starcontinuous Kleene omegaalgebras. In DLT, vol. 9168 of LNCS. Springer, 2015. URL sujet détaillé :
:
Remarques : Coadvising: Ulrich Fahrenberg (IRISA)





SM20718 Modèles formels pour les essaims de robots


Description


Robot Networks are swarms of autonomous mobile entities that have to accomplish some task in cooperation.
We are interested in obtaining formal and moreover mechanical guarantees of properties for certain protocols, using the Coq proof assistant. A Coq framework for robot networks has been recently proposed (Pactole); being of higher order and based on coinductive types, it allows one to prove correctness of protocols or to establish impossibility of a task in quite a few variants of models for such networks, in particular regarding topology or synchronicity properties.
An objective of this proposal is to extend the formal framework to robot behaviours that are probabilistic. Randomised algorithms are indeed most convenient to handle situations where symmetricity makes any deterministic protocol useless.
The Coq library alea will be an interesting starting point to this goal. URL sujet détaillé : http://pactole.lri.fr/pub/1516_probas.pdf
Remarques : coencadrement P. Courtieu (lab. Cédric), S. Tixeuil (Lip6)





SM20719 Modèles formels pour les essaims de robots


Description


Robot Networks are swarms of autonomous mobile entities that have to accomplish some task in cooperation.
We are interested in obtaining formal and moreover mechanical guarantees of properties for certain protocols, using the Coq proof assistant. A Coq framework for robot networks has been recently proposed (Pactole); being of higher order and based on coinductive types, it allows one to prove correctness of protocols or to establish impossibility of a task in quite a few variants of models for such networks, in particular regarding topology or synchronicity properties.
An objective of this proposal is to extend this framework to some particular topologies (discrete spaces, and higher dimensions) so as to propose a framework in which some new results could be formally established, in particular regarding exploration, the election of a leader, or the formation of geometrical patterns. Those two problems have been shown equivalent (meaning that there is an algorithm that solves one of them if one has as an input a solution for the other one) under some particular assumptions. Extending this equivalence to less constrained models is an interesting open problem. URL sujet détaillé : http://pactole.lri.fr/pub/1516_discret.pdf
Remarques : coencadrement P. Courtieu (lab. Cédric), S. Tixeuil (Lip6)





SM20720 Encodings for Location Graphs


Description


Scientific context
Location Graphs are a new formal model for dynamic componentbased systems currently developed in the INRIA SPADES team, that combines concurrent composition, dynamic reconfiguration and graphlike component structures [1]. The Location Graphs model subsumes previous computational models for mobile and distributed systems including distributed or located variants of Milner's picalculus [2], and more abstract computational models such as Milner's bigraphs [3]. The psicalculi is a process calculus framework recently developed by J. Parrow et al. [4] to facilitate the development of applied and extended variants of the picalculus. The Synchronized Hyperedge Replacement (SHR) [5] is an hypergraphbased model of computation introduced to model dynamic distributed services.
Subject of the internship
The goal of this internship is to contribute to the study of the expressive power of Location Graphs model, via the study of encodings. Specifically, the intership will study whether faithful and nontrivial encodings are possible: (1) of the Location Graphs model in the psicalculi framework, and (2) of the SHR model in a simple instance of the Location Graphs model. By 'faithful' we mean here an encoding which establishes some operational correspondence or simulation between a term and its encoding , and by 'nontrivial' we mean here an encoding which respects certain wellformedness criteria as studied e.g. by Gorla [6]. The internship will contribute to a more precise understanding of the Location Graphs semantics, and can lead to a publication and/or a PhD on the topic.
References
[1] J.B. Stefani : "Component as location graphs", in Proc. 11th Int. Symp. Formal Aspects of Component Software (FACS), LNCS 8997, Springer, 2015.
[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.
[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.
[4] J. Bengtson, M. Johansson, J. Parrow, B. Victor: "Psicalculi: a framework for mobile processes with nominal data and logic", Logical Methods in Computer Science 7(1), 2011.
[5] G.L. Ferrari, D. Hirsch, I. Lanese, U. Montanari, and E. Tuosto : "Synchronized Hyperedge Replacement as a model for Service Oriented Computing", in Proc. 4th Int. Symp. Formal Methods for Components and Objects (FMCO), LNCS 2006.
[6] D. Gorla : "Towards a Unified Approach to Encodability and Separation Results for Process Calculi", in Proc. CONCUR 2008. URL sujet détaillé :
:
Remarques :





SM20721 I/O Types for Location Graphs


Description


Scientific context
Location Graphs are a new formal model for dynamic componentbased systems currently developed in the INRIA SPADES team, that combines concurrent composition, dynamic reconfiguration and graphlike component structures [1]. The model subsumes located variants of the picalculus [2], and more abstract computational models such as Milner's bigraphs [3].
Subject of the internship
The original Location Graphs model has only a crude notion of types and does not exploit in full to constrain the behavior of component graphs and capture architectural constraints that may occur in dynamic software systems. The goal of this internship is to develop a simple type system for Location Graphs, inspired by notions of i/o types for the picalculus [4] and by mobility types in Boxed Ambients [5], that takes advantage of operations available in a location graph (communication, passivation, and graph modification) to prevent simple forms of errors, including classical communicaton errors, but also reconfiguration errors such as unlawful node update, binding and movement. If sucessful, the internship can begin to investigate more complex forms of types dealing with graphbased architectural conditions such as encapsulation and dependency invariants. The internship can lead to publication and/or a PhD on the topic of contracts and type systems for dynamic software architectures.
References
[1] J.B. Stefani : "Component as location graphs", in Proc. 11th Int. Symp. Formal Aspects of Component Software (FACS), LNCS 8997, Springer, 2015.
[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.
[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.
[4] D. Sangiorgi, D. Walker : "The picalculus: A Theory of Mobile Processes", Cambridge University Press, 2001.
[5] M. Bugliesi, S. Crafa, M. Merro, V. Sassone: "Communication and mobility control in boxed ambients", Information and Computation 202(1), 2005.
URL sujet détaillé :
:
Remarques :





SM20722 Reversibility and Compensation in Location Graphs


Description


Scientific context
Location Graphs are a new formal model for componentbased systems currently developed in the INRIA SPADES team, that combines concurrent composition, dynamic reconfiguration and graphlike component structures [1]. The Location Graphs model subsumes previous computational models for mobile and distributed systems including distributed or located variants of Milner's picalculus [2], and more abstract computational models such as Milner's bigraphs [3].
Subject of the internship
Compensations are a basic mechanism for fault recovery, which have been the subject of several recent works, e.g. [6,7]. The goal of this internship is to study the introduction of compensation mechanisms in the Location Graphs model, allowing the possibility for compensations to rollback to welldefined previous states. To progress towards that goal, the internship will first study the relationship between the primitives for controlled rollback introduced in [5] and the primitives for dynamic compensation studied in [7]. In a second phase, it will study introducing causally consistent reversibility and compensation capabilities in Location Graphs. The internship can lead to publication and/or a PhD on the topic of faultrecovery in dynamic software architectures.
References
[1] J.B. Stefani : "Component as location graphs", in Proc. 11th Int. Symp. Formal Aspects of Component Software (FACS), LNCS 8997, Springer, 2015.
[2] R. Milner : "Communicating and Mobile Systems: The Pi Calculus", Cambridge University Press, 1999.
[3] R. Milner : "The Space and Motion of Communicating Agents", Cambridge University Press, 2009.
[4] I. Lanese, C.A. Mezzina, J.B. Stefani: "Reversing HigherOrder Pi", in Proc. CONCUR, LNCS 6269, Springer, 2010.
[5] I. Lanese, C.A. Mezzina, A. Schmitt, J.B. Stefani: "Concurrent Flexible Reversibility", in Proc. ESOP, LNCS 7792, Springer, 2013.
[6] R. Bruni, H. C. Melgratti, U. Montanari: "Theoretical foundations for compensations in flow composition languages", in proc. 32nd POPL, 2005.
[7] I. Lanese, C. Vaz, C. Ferreira: "On the Expressive Power of Primitives for Compensation Handling", in proc. ESOP, LNCS 6012, Springer, 2010.
URL sujet détaillé :
:
Remarques :





SM20723 Resilient and energyaware scheduling algorithms for largescale distributed systems


Description


(1) Motivation
Largescale distributed systems correspond to wide range of platforms, such as highperformance supercomputers, sensor networks, volunteer computing grids, etc. All these systems have a very large scale, and include millions of components. Such a large scale induces two major problems: failures and energy consumption. Resilience is (loosely) defined as surviving to failures. For instance, a failure will occur every 50 minutes in a system with one million components, even if the MTBF (mean time between failures) of a single component is as large as 100 years. Obviously, failure handling is critical for highly parallel applications that use a large number of components for a significant amount of time, because such applications are likely to experience at least one failure during execution. Failures are usually handled by adding redundancy, either continuously (replication) or at periodic intervals (migration from faulty node to spare node, rollback and recovery). In the latter case, the state of an application must be preserved (checkpointing), and the system must roll back to the last saved checkpoint. However, the amount of replication and/or the frequency of checkpointing must be optimized carefully.
Largescale distributed systems face a second important challenge: power consumption. Power management is necessary due to both monetary and environmental constraints. Presently, large computing centers are among the largest consumers of energy. Energy is needed to provide power to the individual cores and also to provide cooling for the system. In future distributed systems, it is anticipated that the power dissipated to perform communications and I/O transfers will make up a much larger share of the overall power consumption. In fact, the relative cost of communication is expected to increase dramatically, both in terms of latency/overhead and of consumed energy. Using dynamic voltage and frequency scaling (DVFS) is a widely used technique to decrease energy consumption, but it can severely degrade performance and increase execution time.
Redesigning scheduling algorithms to ensure resilience and to minimize energy consumption is a difficult proposition, especially because higher resilience often calls for redundant computations and/or redundant communications, which in turn consume extra energy. In addition, reducing the energy consumption with DVFS increases execution time, hence the expected number of failures during execution.
The main goal of this M2 training period is to investigate tradeoffs between performance, resilience and energy consumption. This is a challenging but unavoidable multicriteria optimization problem, whose solution is critical for many applications and largescale systems.
(2) Technical setting
Initial studies will be focused around task graph scheduling and checkpointing, starting with an application task graph deployed on a largescale platform. In term of traditional scheduling, this corresponds to scheduling with unlimited resources and has polynomial complexity. With resilience into play, what is the complexity of the problem? Which tasks should we checkpoint? Then, we will consider several applications running on the platform, which means that tasks are already assigned to processors. The same problems should be addressed. Subsequent studies will include another technique (replication) and another constraint (management of a critical resource, such as I/O bandwidth or main memory capacity).
(3) Background
We have extended expertise in the area of parallel algorithms, scheduling techniques, checkpointing strategies and multicriteria optimization problems. Relevant literature will be provided to the student for assimilation and synthesis during the first part of the training period. Continuation into a PhD program is expected. URL sujet détaillé :
:
Remarques : Coadvising with Yves Robert, Yves.Robertlyon.fr





SM20724 Introspection mémoire en OCaml


Description


Contexte:
The OCamlPro company has developed a tool, to analyse images of the memory of OCaml programs (heap snapshots), and to generate reports explaining how the memory is used to help the programmer optimize his application. This analysis is made possible by adding, in every OCaml block, some runtime information on the type of the block and its allocation site, without increasing the size of the block. For now, this information is only used when stored in snapshot files by the analysis tool. However, it might be interesting, for the programmer, to use the extra information directly at runtime, to interactively explore the memory of his application.
Objectifs:
The global objective of this internship is to develop a library, that can be linked with an OCaml program compiled with memory profiling support, that can use the runtime information stored in every OCaml block. A first objective, to gain understanding of the system, would be to develop a prettyprinter for any OCaml value, using the runtime type information stored in the block to display its content. A second objective, more ambitious, would be to develop graph algorithms to explore the graph of pointers interactively. A typical user request would be to find the roots retaining a value, or the total memory size retained by a given root. Of course, more complex analysis can be performed, to finally be able to perform all then analysis that are currently performed postmortem. URL sujet détaillé :
:
Remarques : Possibilité de négocier la rémunération en fonction des besoins du stagiaire.





SM20725 Semantic Patches for OCaml


Description


Contexte:
When comparing branches or versions of the sources of a project, developers often use textual diffs tools to generate textual patches, that can later be propagated to other branches or versions. However, such patches are very sensible to any textual modification of the sources, making them unusable on the long term. To be able to maintain its own version of the OCaml compiler, the OCamlPro company has developed a technique based on "semantic patches": the differences are not expressed as textual changes, but as operations over the syntax tree. As a consequence, these patches are much more robust to the changes made over time in the code. They are applied directly by the compiler when compiling the sources.
Objectifs:
The main objective of this internship is to generalize the idea of semantic patches over OCaml, in several directions: (1) currently, only a few kinds of modifications/refactoring can be expressed using semantic patches; the idea is to have a more exhaustive set of transformations that can be expressed by semantic patches; (2) currently, semantic patches are applied by the compiled; it would be useful to have a tool to apply these semantic patches on the source to obtain new sources; as a consequence, generic refactoring could be applied by expressing them into semantic patches, and then using the tool to apply them; (3) when applying patches on sources, it can be interesting to inverse them, i.e. to get the semantic patches that reverts a semantic patch, when possible; patches should be sorted into reversible and nonreversible patches; (4) some error messages can easily be fixed automatically; here, the idea would be for an automatic tool to propose a semantic patch after an error, to be applied if accepted by the user. URL sujet détaillé :
:
Remarques : Possibilité de négocier la rémunération en fonction des besoins du stagiaire.





SM20726 Performance Models for Dataflow Programs on NUMA Processors


Description


Today's new architectures are more and more parallel. In order to face the frequency wall, chip manufacturers are putting more and more cores on the same die. Processors for personnal devices are moving fast towards 4, 6 or 12 cores on the same chip, those chips being assembled together to integrate even more parallelism, as with NUMA architectures.
With so many cores at hand, one may wonder how programmers should write their applications so as to benefit from such massive parallelism. Dataflow, first introduced in the seventies (see seminal papers by Kahn [5] and Dennis [3]) is a programming model that is thought to help solve this programmability problem. In dataflow programs, the programmer writes independant actors that only communicate through firstinfirstout queues. Programs thus explicitly describe the program's potential for parallelism. This can then be to profit for executing dataflow programs on parallel architectures [4].
Specifically, dataflow suits well the design of, eg, signal processing applications (video codecs, image processing, softwaredefined radio, etc.). With the everincreasing complexity of platforms, a crucial question is to provide the programmer with tools to understand the performance of his/her application and helping him/her identifying and ultimately leveraging perfor mance bottlenecks. In particular, understanding these bottlenecks is quite challenging in the presence of socalled NonUniform Memory Access architectures, which are now very common with desktop and HPC systems. In our approach, this is done by relying on hardwarelevel components that are able to monitor the various components in or outside the processor. The information measured by such performancemonitoring units (PMU) can then be linked to the more abstract programming models. It can also be used to build abstract performance models, among which we will focus on the roofline proposed recently [10, 7].
A roofline model (see figure 1, in the pdf document, taken from [7]) is a two dimensional plot with the horizontal axis representing the operational intensity I and the vertical axis representing the performance P . The roofline plot is defined by the physical limits imposed by the hardware : π is the peak performance (π = 2 flops/cycle in the figure) and β is the memory bandwidth β = 1 byte/cycle in the figure). For each considered applications, we can measure points (P, I) that are placed somewhere below the roofline. The closer the point is to the bound based on β, the more memoryintensive the application is. The closer the point is to π, the more computeintensive the application is. This research project studies the performance bottlenecks of dataflow applications and the scalability of such applications. This work will build on top of PMUrelated monitoring facilities proposed by previous contributions [9, 8, 1].
The goal of the project is to extend roofline models mentionned earlier as follows. In the dataflow model we investigate, one application is composed of various processes communicating through FIFO channels. However, a roofline model plots the performance of the application as a whole. Consequently, it is of little help to understand which actor or which FIFO channel may be the bottleneck of the whole application. We shall thus explore how the roofline model can be adapted to provide insights on each actor, as well as compositions of certain actors in the application. The following ideas will be experimented : 1. One roofline per actor, and comparison of the rooflines to exhibit differences in the performance model of each actor ; 2. One roofline per core, and comparison of the rooflines of the different cores to exhibits load balancing differences. Roofline models also concentrate on describing the performance behavior of applications on a given architecture (say with a given number of cores). It is of little help, as is, to understand the performance problems that can occur when trying to scale programs on different architectures, eg when increasing the number of cores. We shall also study extensions of the roofline model to better understand scalability. Recent extensions of rooflines (eg to enable performance prediction of notyetimplemented algorithms, using algorithm classifications [6] or for taking into account energyconsumption [2]) have been proposed. For both the aforementioned contributions, we shall follow comparable methodologies to revisit the formal definition of rooflines. Experiments will be conducted with real applications (in particular video decoders) implemented with the RVCCal programming language and compiler [11].
Bibliographie [1] numap : memory profiling for numa systems. https://github.com/numaplibrary/numap. [2] Jee Whan Choi, D. Bedard, R. Fowler, and R. Vuduc. A roofline model of energy. In Parallel Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on, pages 661=96672, May 2013. [3] Jack B. Dennis. First version of a data flow procedure language. In Symposium on Programming, pages 362=96376, 1974. [4] Michael I Gordon. Compiler techniques for scalable performance of stream programs on multicore architectures. PhD thesis, Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science, 2010. [5] G. Kahn. The semantics of a simple language for parallel programming. In Information processing. NorthHolland, 1974. [6] Cedric Nugteren and Henk Corporaal. The boat hull model : Adapting the roofline model to enable performance prediction for parallel computing. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12, pages 291=96292, New York, NY, USA, 2012. ACM. [7] Georg Ofenbeck, Ruedi Steinmann, Victoria Caparros, Daniele G. Spampinato, and Markus Püschel. Applying the roofline model. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2014. [8] M. Selva, L. Morel, and K. Marquet. Numa profiling for dynamic dataflow applications. Submitted for evaluation at ACM TACO (Transactions on Architecture and Code Optimization. September 2015. [9] M. Selva, L. Morel, K. Marquet, and Frenot S. A monitoring system for runtime adaptations of streaming applications. In PDP, the 23rd Euromicro Conference on Parallel, Distributed and Networkbased Processing, 2015. [10] S.Williams,A.Waterman,andD.Patterson.Roofline:Aninsightfulvisualperformancemodelformulticorearchitectures.Commun.ACM, 52(4) :65=9676, April 2009. [11] HerveYviquel,AntoineLorence,KhaledJerbi,GildasCocherel,AlexandreSanchez,andMickaelRaulet.Orcc:Multimediadevelopment made easy. In Proceedings of the 21st ACM International Conference on Multimedia, MM '13, pages 863=96866. ACM, 2013. URL sujet détaillé : http://lionel.morel.ouvaton.org/images/Sujets/perfEvalddf.pdf
Remarques :





SM20727 Structural properties of Dynamic Epistemic Arenas


Description


This internship subject is purely theoretical to provide a crucial understanding of phenomena arising in multiagent systems analysis. Multiagent systems behaviours can be modelled by combining two ingredients (as proposed in Dynamic Epistemic Logic [vDvHK2007]):
1. Epistemic models, that we call "situations", represent a static situation: these are graphs whose vertices are possible worlds and whose edges represent 2. Action models represent which actions can be performed in the current situation and how actions are perceived by agents.
On this basis, the resulting of applying an action model is obtained by computing an update of the current situation by a fairly simple operation.
Now, in order to reason about strategic abilities of agents, one has to analyse a multiplayer game arena arising from an initial situation and arbitrary iterations of action models updates. Mathematically, this arena is a forest of infinite trees: a branch in a tree represents the sequence of executed actions and the crossways relations represent the imperfect information of agents about the history that has effectively taken place in the game. With reference to the approach of [vDvHK2007], we call this arena the "Dynamic Epistemic Arena" (DEA).
As shown in [Maubert1014] for a restricted case, DEAs possess noticeable structural properties that fall into classic ones, widely studied in the literature, namely DEAs are "automatic structures" [BG2000,KM2007,2007,R2008]. Automatic structures are infinite graphs with a finite presentation: the infinite set of vertices is provided by a regular language (hence a finitestate automaton) and the binary relations between vertices are described by synchronous transducers (a subclass of twotape finite state automata). Form a logical perspective, automatic structures have a decidable firstorder logic theory. Moreover, according to [Maubert2014], finer analysis of the DEAs' structures yields decidability of the existence of a sequence of actions (a strategy) to achieve a winning objective in the game. As a consequence, epistemic planning and epistemic protocol synthesis can be applied, with impacts for many relevant real life applications (robotics, web services, etc.).
The pioneer results of [Maubert2014] are however subject to severe restrictions on the kind of action models one is allowed to consider: preconditions of actions must be fully propositional, preventing the setting from considering some realistic pictures where agents act on the basis of their knowledge, acting only if e.g. knowing something, or knowing that another agent knows something, etc.
The subject of this internship is to study the tradeoff between relaxing these restrictions and the ability to perform strategic. The roadmap for this task is to have a clear picture of the DEAs' structural properties in the landscape of finitelypresentable structures and by determining which logical fragment of firstorder logic but also secondorder logic can be decided.
This internship will be supervised by Sophie Pinchinat and François Schwarzentruber
Bibliography [vDvHK2007] Dynamic epistemic logic. H van Ditmarsch, W Van Der Hoek, B Kooi. Springer Verlag, 2007. [Maubert2014] Chapter 7 of Logical foundations of games with imperfect information : uniform strategies. Bastien Maubert. PhD Université de Rennes 1, 2014. [BG2000] Automatic Structures. Achim Blumensath, Erich Grädel. LICS 2000: 5162. [KM2007] Three lectures on automatic structures. Khoussainov and M. Minnes. Tutorial lectures on automatic structures given at the Logic Colloquium 2007, Wroclaw, Poland. [B2007] Automatic presentations of infinite structures. Vince Barany. RWTH Aachen University 2007, pp. 1146. [R2008] Automata presenting structures: A survey of the finite string case. Sasha Rubin. Bull. Symbolic Logic Volume 14, Issue 2 (2008), 169209. URL sujet détaillé : http://people.irisa.fr/Sophie.Pinchinat/2015sujetDELstructure.html
Remarques : This internuship will be cosupervised by François Schwazentruber (ENS Rennes, see http://people.irisa.fr/Francois.Schwarzentruber/)





SM20728 GNUradio extension for dynamic reconfiguration


Admin


Encadrant : Tanguy RISSET 
Labo/Organisme : The internship will take place in Citi Lab, InsaLYon with standard internship payment (~417=80/month). 
URL : http://www.citilab.fr/team/socrate/ 
Ville : The internship will take place in Citi Lab, InsaLYon with standard internship payment (~417=80/month). 



Description


New wireless protocols regularly emerge, leading to a increase of the number of dedicated chips in mobile wireless devices. Next generation of the wireless chips should contain a reconfigurable baseband chip able to adapt to different wireless protocols (Wifi, 3G, 4G etc..) and dedicated to physical protocol layer (modulation, channel coding etc..).
The associated concept is the concept of "Software defined radio", a software defined radio platform needs to be programmed with a high level langage, sometimes called "waveform langage". This langage offers the possibility to describe différent modulation schemes in a way that is independent of the target platform. In practice, it describes a dataflow treatment composed of FFT, convolution filters, coder/decoder, synchronisation modules etc..
One of the emerging programming environment for waveform langage is GNUradio. GNUradio enables the user to program a complete baseband process in software (CF++ and Python) and, associated to a radio platform (Ettus USRP platform for instance), it can provide a software defined radio transceiver.
GNUradio programs are witten in a dataflow computation model. Currently GNUradio can only handle static dataflow and cannot express reconfiguration of the dataflow graph. Software defined radio requires reconfiguration capabilities.
The goal of the intership is two propose an extension of the GNUradio scheduler to handle fast reconfiguration of dataflow graph without losing any sample.
The Socrate team of Citi is responsible for the The FIT/CorteXlab testbed (http://www.cortexlab.fr/). CorteXlab is dedicated to the study of these emerging software defined radio system. It contains 22 USRP and is a natural experimentation platform for this intership.
URL sujet détaillé :
:
Remarques : The internship will take place in Citi Lab, InsaLYon with standard internship payment (~417=80/month).
It requires good linux programming and administration skills.C++ and python programming. Wireless communication protocol knowledge is appreciated as well as dataflow programming experience (but not mandatory).
The internship will also be advised by Lionel Morel (embedded systems & data flow) and leonardo cardoso (signal processing & wireless).





SM20729 CrowdSenSim


Description


CrowdSenSim: Mobile Crowd Sensing Simulator
Context Smartphones have become essential for our day life. They are normally equipped with a rich set of sensors, including GPS, microphone, camera, accellerometer and gyroscope among the others. As a consequence, everyone can easy collect and share sensed information. Mobile crowd sensing emerged recently as a promising largescale sensing paradigm where data collection is typically performed by smartphones. Mobile crowd sensing is projected to become one of the most important technologies contributing to healthcare, monitoring, logistic and organization in future smart cities. CrowdSenSim is the first simulator designed for research use in mobile crowd sensing. It allows simulations of largescale crowd sensing activities in urban scenarios and can be used to develop novel solutions in data collection, task assignment, monitoring and resource management.
Objectives Being a large scale simulator, the number of users scales in the order of hundred of thousands. As a result, finedetailed communication models used in traditional networking simulators like NS3 or Omnet++ become not suitable for scalability purposes. Objectives of the project consists in enhancing CrowdSenSim communication module to support both LTE and WiFi connectivity and in implementing a graphical interface for data collection results.
Description  Contribute to the scientific research in the area of mobile crowd sensing  Extend CrowdSenSim communication module capabilities and implement graphical interface  Participate in writing of scientific articles that will be presented in international conferences and published in major scientific journals
Requirements  Very strong programming skills (C++, Java, Javascript)  Good command of English language is essential  Ability to work independently as well as in a team
Contact For inquiries please contact Dr. Dzmitry Kliazovich (dzmitry.kliazovich.lu) or Prof. Pascal Bouvry (pascal.bouvry.lu) URL sujet détaillé : http://claudiofiandrino.altervista.org/PhD/crowdsensimproject.pdf
Remarques :





SM20730 Mobile Cloud Computing and Wearables


Description


Context Mobile cloud computing enables delivery of a large variety of cloud application to billions of smartphones and wearable devices. Many of these devices have constrained computing, battery and storage resources, which requires offloading the execution of heavy tasks to the cloud. Mobile cloud computing in turn enabled mobile devices to run applications providing rich user experience, including commerce, healthcare, social networks and gaming among the others. As a result, the mobile devices market is continuously growing. In the context of wearable devices, sales are expected to grow from billion in 2013 up to around billion in 2018.
Objectives Traditionally, offloading task execution to the cloud focused on the impact of tasks on energy consumption of the devices. However, in mobile cloud computing environment communications play an important role. Despite of this, few works jointly investigated the impact of computing and communication on battery drain. The objective of the project is to verify experimentally whether it is convenient for wearable devices to offload task execution on the cloud taking into account computing, networking and storage resources. Particular emphasis will be put in application partitioning models. Experiments will be carried on a rich set of wearable devices, including Google Glasses, smart watches and smart rings.
Description  Contribute to the scientific research in the area of mobile cloud computing  Conduct experiments using a rich set of available wearable devices (Google Glasses, smart watches and smart rings)  Participate in writing of scientific articles that will be presented in international conferences and published in major scientific journals
Requirements  Background in the area of communications and optimization  Good programming skills (C++)  Good command of English language is essential  Ability to work independently as well as in a team
Contact For inquiries please contact Dr. Dzmitry Kliazovich (dzmitry.kliazovich.lu) or Prof. Pascal Bouvry (pascal.bouvry.lu). URL sujet détaillé : http://claudiofiandrino.altervista.org/PhD/mobilecloudcomputing.pdf
Remarques :





SM20731 Augmenting Remote Control Devices Experience with Mobile Cloud Computing


Description


Context Smartphones have become essential for our day life. They are normally equipped with a rich set of sensors, including GPS, microphone, camera, accellerometer and gyroscope among the others. Accelerometer and gyroscope enable human hand gestures to become control commands for other devices such as drones or robots. Despite other control devices, smarthphones are very popular and have a much simpler interface, the touch screen. Being battery equipped, smartphones have constrained computing, battery and storage resources, which requires offloading the execution of heavy tasks to the cloud to enrich quality of experience and prolong battery lifetime.
Objectives Because of the need of synchronization, controlling devices impose severe constraints to the network, being latency the most crucial factor. For such a reason, applications developed so far for remote control are very simple. Mobile cloud computing can enrich user Quality of Experience (QoE). For example, it can enable object detection and recognition, that are typically heavy computational applications. The objective of the project is to verify experimentally how to enrich control applications through mobile cloud computing. Experiments will be carried on real devices.
Description  Contribute to the scientific research in the area of mobile crowd sensing  Conduct experiments and measurements with real devices (drones)  Participate in writing of scientific articles that will be presented in international conferences and published in major scientific journals
Requirements  Background in the area of communications  Good programming skills (C++,Java,Javascript)  Good command of English language is essential  Ability to work independently as well as in a team
Contact For inquiries please contact Dr. Dzmitry Kliazovich (dzmitry.kliazovich.lu) or Prof. Pascal Bouvry (pascal.bouvry.lu). URL sujet détaillé : http://claudiofiandrino.altervista.org/PhD/remotecontrol.pdf
Remarques :





SM20732 Cloud Computing


Description


Context Cloud computing is entering our lives and changing the way people consume information dramatically. Clouds transform IT infrastructures with an emphasis on making them flexible, affordable, and capable of serving millions of users, satisfying their computing or storage demands. The optimization of cloud computing and cloud networking can significantly increase system performance, reducing energy consumption and save costs not only inside individual data centers, but also globally, on the Internet scale. Advancements in internetworking become key enabler for building hybrid clouds and federations of clouds. Efficient resource management and scheduling in data centers and cloud infrastructures is open research challenge that has to be addressed and novel architectures, telecommunication technologies, and protocols must be developed to ensure efficiency of future cloud computing systems. Tasks  Implement and validate novel resource allocation and scheduling approaches using GreenCloud (http://greencloud.gforge.uni.lu/) .  Participate in writing of scientific articles that will be presented in international conferences and published in leading scientific journals  Help developing models and simulation tools for cloud computing environments Required Qualification  Strong analytical and programming skills (C++)  Fluent written and verbal communication skills in English  Ability to work independently as well as in a team
Contact For inquiries please contact Dr. Dzmitry Kliazovich (dzmitry.kliazovich.lu) or Prof. Pascal Bouvry (pascal.bouvry.lu). URL sujet détaillé : http://wwwen.uni.lu/content/download/84148/1033215/file/cloudcomputing.pdf
Remarques :





SM20733 Topological analysis of 3D data for scientific visualization


Description


We have 3 master2level internship openings (in the perspective of a Ph.D. followup) on the topic of the topological analysis of 3D data for scientific visualization:
http://lip6.fr/Julien.Tierny/stuff/openPositions/internship2016a.html
http://lip6.fr/Julien.Tierny/stuff/openPositions/internship2016b.html
http://lip6.fr/Julien.Tierny/stuff/openPositions/internship2016c.html URL sujet détaillé :
:
Remarques :





SM20734 Efficient multisensor fusion on Instrumented Vehicle


Description


La thématique générale du projet de recherche est autour des technologies pour véhicules autonomes. La compétence du CEA Grenoble au sein de son institut LETI couvre toute la conception/production de système micronano électroniques (Technologies silicium, MEMS, conception de circuit, architecture matérielle, logiciel embarqué, application). Pour ce projet, ce sont davantage les " couches hautes " (système embarqués et algorithmique) qui sont visées, bien que les connaissances des principes physiques de mesures soit un plus.
Le projet général vise l'intégration de multiples capteurs hétérogènes dans un véhicules, pour modéliser l'environnement en temps réel. Nous disposons pour cela d'un véhicule déjà équipé dans lequel vous devrez proposer/implémenter/valider vos développements. Le détail du sujet se trouve sur le site du CEA
Les fondements scientifiques que nous exploitons sont : la programmation bayésienne et les réseaux bayésiens, la modélisation de flux optique, la théorie du contrôle, et divers principes physiques de " range sensing " (LIDAR/ RADAR / SONAR/ Stéreovision) + une grande expertise pour la programmation embarquée. La thématique est très forte en ce moment, et nous avons un écosystème de partenaires de recherche grandissant dans le monde du véhicule autonome et du drone.
A l'issu du stage, une poursuite en thèse est envisagée.
Quelques références:
[1] A. Elfes, =93Occupancy grids: A probabilistic framework for robot perception and navigation,=94 Ph.D. dissertation, Carnegie Mellon University, Pittsburgh, PA, USA, 1989 [2] A. Negre, L. Rummelhard, and C. Laugier, =93Hybrid sampling Bayesian Occupancy Filter,=94 in Intelligent Vehicles Symposium Proc., 2014. [3] T. Rakotovao, D. Puschini, J. Mottin L. Rummelhard, A. Lukas, and C. Laugier =93Intelligent Vehicle Perception: Toward the Integration on Embedded Manycore=94, n Proceedings of the 6th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures (PARMADITAM '15) [4] T. Rakotovao, J. Mottin, D. Puschini, and C. Laugier, =93Realtime powerefficient integration of multisensor occupancy grid on manycore,=94 in The 2015 IEEE ARSO, 2015 URL sujet détaillé : http://portail.cea.fr/emploi/Lists/Stages/StructuredDisplayForm.aspx?ID=20893&Source=http%3A%2F%2Fportail%2Ecea%2Efr%2Femploi%2FPages%2Fstages%2Foffres%2Dstage%2Easpx%23InplviewHasheb25e04c%2D726f%2D493b%2D872a%2D3677d7bd9278%3DSortField%253DLieu%5Fx0020%5Fde%5Fx0020%5Fstage%2DSortDir%253DAsc%2DFilterField1%253DLieu%25255Fx0020%25255Fde%25255Fx0020%25255Fstage%2DFilterValue1%253DGrenoble%2DFilterFields2%253DDomaine%25255Fx0020%25255Fscientifique%2DFilterValues2%253DInformatique%25253B%252523Math%2525C3%2525A9matiques&ContentTypeId=0x010010995605D5DC4A16AA9AA61FBA54D1B20016019053C4495143AC79FFD9B31F9981
Remarques :





SM20735 Component Model Formalization and its Algorithms


Description


Componentbased Software Engineering has proved many times good properties for code reuse, separation of concerns, maintainability and productivity of codes. As a matter of fact, those same properties are also needed in high performance parallel programming, where codes are difficult to produce for one given parallel hardware architecture (productivity problems), but even more difficult to maintain, to make generic or less hardware specific. Componentbased runtime models have already been proposed without performance overhead on high performance codes. Those kind of models proposes very low level and simple concepts, as for example primitive components and method calls or MPI connections between them. However if lowlevel component models bring improvment in high performance codes, a higherlevel language is needed to increase productivity and codereuse. In this context the High Level Component Model (HLCM) has been proposed. In this model higher level concepts as composite components, genericity and connectors are proposed. From this higher level language a compilation phase produces a lowlevel high performance componentbased program. This compilation phase is based on search, backtrack and reduction to primitive components. A specialization calculus, called SpecMod, has recently been proposed. HLCM could be formalized using this calculus. This internship takes place in this context. First, a formalization of HLCM using SpecMod will be asked to the student, as well as an evaluation of this formalization through the definition of parallel algorithmic skeletons (parallel patterns). Second, the student will have to write an algorithm for the HLCM compilation, which is linked to the formalization previously done, probably based on dynamic programming with search and backtrack. URL sujet détaillé : http://graal.enslyon.fr/~hcoullon/m2_hlcm.pdf
Remarques : Stage coencadré avec Christian Perez : christian.perez.fr





SM20736 Linear graph rewriting for Boolean logic


Description


This project is about 'reducing' Boolean expressions in a correct way, simulating logical reasoning. For example:
x AND (y OR z) > (x AND y) OR (x AND z) (*)
Given a set of such reductions, one can chain together various instances to construct 'proofs'. In particular, one can construct sets of such reductions that are 'complete', i.e. from which we can prove every correct reduction.
A striking feature of all known complete sets is that they all involve *duplication of variables*. For example, in the above reduction the variable x is duplicated. What happens when we do not allow such behaviour? For example:
x AND (y OR z) > (x AND y) OR z
This rule lies at the heart of Girard's 'Linear Logic', which successfully demonstrates the advantages of considering duplicationfree fragments of Boolean logic. In fact this 'linear' fragment of Boolean logic is rich enough to encode the entire logic: it is already coNPcomplete. Hence arrives our motivating question:
(Q) Can we construct reduction systems that are correct and complete for *linear* Boolean logic?
In particular, this requires us to remain in the world of linear Boolean expressions, for instance rejecting (*) and even intermediate nonlinearity, e.g.:
x AND (y OR z) > (x AND y) OR (x AND z) > (x AND y) OR z
A complete linear system, unless coNP = NP, must admit reductions that are *exponentially long* but only *linearly wide*, a property consistent with but thusfar unobserved in proof complexity. There would also be farreaching implications on the space complexity of various wellknown proof systems.
In this project we propose to work with certain graphs extracted from Boolean expressions, called 'cographs'. The world of graphs is far more general than the world of logical expressions, *strictly extending* the class of Boolean functions, and so exhibits far richer and more expressive reduction behaviour. We believe that this setting admits reduction systems with the improved complexity properties that we desire.
Inspired by recent work, we propose to develop this idea as follows:
* Develop a comprehensive framework for (co)graphs that models Boolean logic. * Construct complete reduction systems based on (co)graphs. * Obtain *separation results* distinguishing the world of graphs from that of Boolean expressions.
Useful references
1. Rewriting with linear inferences in propositional logic. Anupam Das. http://drops.dagstuhl.de/opus/volltexte/2013/4060/ 2. No complete linear term rewriting system for propositional logic. Anupam Das and Lutz Stra=DFburger. http://drops.dagstuhl.de/opus/volltexte/2015/5193/ URL sujet détaillé :
:
Remarques : Coadvised by:
Anupam Das.
anupam.daslyon.fr
http://www.anupamdas.com/
(also the contact for informal inquiries)





SM20737 Causality Analysis in SafetyCritical Systems


Description


Today's IT systems, and the interactions among them, become more and more complex. Power grid blackouts, airplane crashes, failures of medical devices, cruise control devices out of control are just a few examples of incidents due to component failures and unexpected interactions of subsystems under conditions that have not been anticipated during system design and testing. Determining the root cause(s) of a systemlevel failure and elucidating the exact scenario that led to the failure, e.g., to establish the contractual liability of component providers, is today a complex and tedious task that requires significant expertise.
The diagnostic usually relies on counterfactual reasoning ("what would have been the outcome if component C had behaved correctly?") to distinguish component failures that actually contributed to the outcome from failures that had little or no impact on the systemlevel failure [1]. Informally, an event e is considered as a cause of e' if both e and e' have occurred, and in an execution that is as close as possible to the observed execution but where e does not occur, e' does not occur either.
In this project the intern(s) will
 improve an existing approach for causality analysis based on event structures [2]
 propose a symbolic encoding enabling effective analysis
 implement the results and apply them to the case study
 optionally, develop a GUI based on the developed analysis to visualize the results and assist the user in exploring alternative scenarios ("what would have happened if ...").
This project can be followedup by a PhD thesis.
The intern(s) will be part of the SPADES team at INRIA Grenoble  RhôneAlpes. SPADES is committed to developing techniques and tools for the design of safe embedded systems.
References:
[1] J.Y. Halpern and J. Pearl. Causes and Explanations: A StructuralModel Approach. Part I: Causes. British Journal for the Philosophy of Science 56(4), 2005. Available at ftp://ftp.cs.ucla.edu/pub/stat_ser/R266part1.pdf
[2] G. G=F6ssler and J.B. Stefani. Fault Ascription in Concurrent Systems. TGC 2015, available at https://hal.inria.fr/hal01197486/document URL sujet détaillé :
:
Remarques : The project is coadvised by JeanBernard Stefani.





SM20738 FebruaryJuly


Description


COUPLED SIMULATION, COALGEBRAS AND UPTO TECHNIQUES
Upto techniques are enhancement of conduction, a mathematical principle which allows to prove properties of infinite structures such as streams, infinite trees or the behaviour of systems with feedbacks.
In this internship we propose to study upto techniques for coupled simulation, a behavioural equivalence for concurrent systems that is often used in practice.
In order to achieve an high level of generality, we will use the theory of coalgebras and other categorical concepts, such as those of monad, distributive law and fibration.
The expected results are worth to be published but, the work is intended mainly as a training to master coinduction, coalgebras and, more generally, category theory. As a consequence, we are open to discuss and improve the research programme with the interested student, according to his concerns and needs.
The internship also offers the opportunity to spend four months (from April to July) in Pisa.
Further updates may be available at http://perso.enslyon.fr/filippo.bonchi/internship.rtf URL sujet détaillé : http://perso.enslyon.fr/filippo.bonchi/internship.rtf
Remarques : The student will be supervised also by Damien Pous (http://perso.enslyon.fr/damien.pous/)





SM20739 Causality in rulebased modeling


Description


The general theme of this internship is the further development of the categorical theory of causality in rulebased modeling. A number of possible directions could be taken including (i) the refinement of the implicit state simulation algorithm [this project would ideally also involve some implementation work]; and (ii) the categorical formalization of signaling pathways as causal traces of a trace of events and their compression [this would be a theoretical project with links to previous, and ongoing, work of some of my colleagues in Paris]. Further possibilities could be discussed on a casebycase basis. URL sujet détaillé :
:
Remarques :





SM20740 Multiple Internship offers for Technicolor R&D France


Description


Présentation de Technicolor R&D France With more than 100 years of experience and innovation, Technicolor is a major actor in entertainment, software, and gaming worldwide. With strong historical ties to the largest Hollywood studios, the company is a leading provider of production, postproduction and distribution for creators and distributors of content. Technicolor is a world leader in film processing, a leading supplier in the field of decoders and gateways and one of the largest manufacturer and provider of DVD & Bluray discs. The recent acquisition of The Mill, a leading visual effects and content creation studio for the advertising, gaming and music industries adds one more firstrate brand to our already robust portfolio (MPC, Mikros Image and Mr. X), effectively making Technicolor the leader in VFX and post production services to the advertising production segment The company also has a strong activity in Intellectual property and Licensing business. This unique environment presents an exceptional opportunity to explore cuttingedge video technologies that will reach homes in the years to come. In Rennes research and development center, our research engineers will make your internship a fascinating and rewarding experience. For more information, please visit our website: www.technicolor.com
Our interns' mission is to help researchers identify new approaches, implement new algorithms, perform tests and benchmarks and contribute to the implementation of demos and prototypes within research teams. We offer research internship opportunities in multiple fields:
Objectives /missions  Computer Graphics (CG)  Computer Vision (CV),  Data Mining (DM)  Distributed Computing (DC),  Film Making (FM)  Human Computer Interaction (HCI)  Human Factors (HF)  Internet of Things (IOT)  Security and Privacy (SP)  Video Processing (VP)
Details of the internships offered within these thematics at http://www.technicolor.com/en/innovation/studentday/jobinternshipopportunitiesrilabs Apply at stage.rennes.com Monthly allowance: =801200 gross
Profile  Student in Master 2 in an engineering school or university  Specialization in one of the fields above  Inquiring mind, inventive, passionate .  Skills in prototyping applications on PC or embedded platforms  Good English level URL sujet détaillé : http://www.technicolor.com/en/innovation/studentday/jobinternshipopportunitiesrilabs
Remarques : Monthly allowance: =801200 gross





SM20741 Multiobjective SearchBased Software Testing


Description


SearchBased Software Engineering (SBSE) [1] and SearchBased Software Testing (SBST) [2] are fields of Software Engineering (SE) that involve the application of Operations Research (OR) techniques to a variety of problems in the SE and ST domains. A classical example where SBST has proven successful is coverage testing. Say you have several thousand test cases for a software system under development. Running all of them takes up too much time, and you know that they are not all required as there is much intersection between what code they cover. How do you decide on a good set of test cases to use, i.e., the smallest set that provides the best code coverage? Trying every possible combination would take too long and is in any case unnecessary  you don't need the optimum solution, only a good one. In this project we want to address the overlooked [3] problem of multiobjective searchbased testing for test management. It is indeed grossly unrealistic to assume that test managers want to optimise only one metric, in one direction of the search space. For instance, non functional properties, such as, security, execution time, energy consumption etc. are all as relevant as coverage. However this problem is complex and large, and there is a variety of techniques to tackle this sort of challenges that need to be evaluated. Requirements: Good knowledge of OR, and especially multiobjective resolution. Good understanding of software testing and software engineering. References: [1] Harman, Mark, and Bryan F. Jones. "Searchbased software engineering." Information and Software Technology 43.14 (2001): 833839. [2] McMinn, Phil. "Searchbased software testing: Past, present and future." International Conference on Software Testing, Verification and Validation (ICST), 2011. [3] Harman, Mark, Yue Jia, and Yuanyuan Zhang. "Achievements, open problems and challenges for search based software testing." International Conference on Software Testing, Verification and Validation (ICST), 2015. URL sujet détaillé :
:
Remarques : Environment: The interns will be hosted in the School of Computer Science at UCD, Dublin, Ireland (http://www.ucd.ie/). UCD is the leading Irish University.
Conditions and Benefits:
The candidate will be remunerated based on experience, in the range =801,000=96=801,500 per month (exempt from tax). The student will also receive UCD student status for the duration of the internship, which gives them access to UCD student facilities and services (such as, clubs, concerts, sports centre, campus pubs, etc.).
Dates:
Internship will commence in first semester of 2016 depending on candidates' availability.





SM20742 Computer Science and Archaeology for the Collection and Analysis of Petroglyphs


Description


Neolithic rock art is a form of art that was extremely popular around the Irish sea and beyond in Europe during the 40002500 BC period. It consists in symbols pecked, scratched and carved on rock. New digital photography and more recently various 3D scanning methods opened a new stage for the documentation and analysis of petroglyphs. In this project, the interns will study the use of various types of cameras (e.g., IR, depth and lidar cameras) for the task of collecting and documenting petroglyphs. The interns will then study different classification techniques in the context of rock art. The final product of this internship would be a tool to assist archaeologists in their everyday work: from collection and classification, to comparison and visualisation. This is a broad project that will be done in collaboration with real practitioners (from UCD School of Archaeology), with potentially some time spent in the field. We expect the interns to be autonomous and selfdriven. Requirements: Python and/or C++. Knowledge of visualisation tools and library (e.g., OpenCV) would be a plus. References: Seidl, Markus, Ewald Wieser, and Craig Alexander. "Automated classification of petroglyphs." Digital Applications in Archaeology and Cultural Heritage (2015). Deufemia, Vincenzo, and Luca Paolino. "Combining unsupervised clustering with a nonlinear deformation model for efficient petroglyph recognition." Advances in Visual Computing. Springer Berlin Heidelberg, 2013. 128137. URL sujet détaillé :
:
Remarques : Environment: The interns will be hosted in the School of Computer Science at UCD, Dublin, Ireland (http://www.ucd.ie/). UCD is the leading Irish University.
Conditions and Benefits:
The candidate will be remunerated based on experience, in the range =801,000=96=801,500 per month (exempt from tax). The student will also receive UCD student status for the duration of the internship, which gives them access to UCD student facilities and services (such as, clubs, concerts, sports centre, campus pubs, etc.).
Dates:
Internship will commence in first semester of 2016 depending on candidates' availability.





SM20743 Distributed Hybrid Simulator for Fast and Accurate Urban Traffic Prediction


Description


Prediction of urban road traffic is critical for transportation operators. It is traditionally computed using mathematical models which lack accuracy in complex contexts, such as, dense urban areas. Computational models, i.e., simulations, are known to have a better accuracy but they are computationally expensive and the use of distributed systems to increase the computational power comes with a cost: partitioning the simulated environment is challenging, as is the synchronisation between computing nodes. This project will explore the tradeoffs between mathematical and computational (simulations) models. Requirements: Java and Python. References: Ventresque, Anthony, et al. "SParTSim: a space partitioning guided by road network for distributed traffic simulations." Proceedings of the 2012 IEEE/ACM 16th international symposium on distributed simulation and real time applications. IEEE Computer Society, 2012. Lin, WeiHua. "A Gaussian maximum likelihood formulation for shortterm forecasting of traffic flow." Intelligent Transportation Systems, 2001. Proceedings. 2001 IEEE. IEEE, 2001. Min, Wanli, and Laura Wynter. "Realtime road traffic prediction with spatiotemporal correlations." Transportation Research Part C: Emerging Technologies 19.4 (2011): 606616. URL sujet détaillé :
:
Remarques : Environment: The interns will be hosted in the School of Computer Science at UCD, Dublin, Ireland (http://www.ucd.ie/). UCD is the leading Irish University.
Conditions and Benefits:
The candidate will be remunerated based on experience, in the range =801,000=96=801,500 per month (exempt from tax). The student will also receive UCD student status for the duration of the internship, which gives them access to UCD student facilities and services (such as, clubs, concerts, sports centre, campus pubs, etc.).
Dates:
Internship will commence in first semester of 2016 depending on candidates' availability.





SM20744 Performance Testing of Largescale Distributed Systems


Description


Largescale systems, such as the Cloud, are becoming commonplace. However, testing their quality is difficult, especially for global properties such as scalability, connectivity, selforganisation, etc. which cannot be evaluated on a small number of nodes. It is notably the case for performance, a property which is increasingly critical for largescale distributed systems. The objective of this project is to check global properties of largescale systems (P2P) using a distributed testing framework (e.g., Peerunit) and a supercomputer (e.g., Fionn, the Irish supercomputer). More specifically, we want (i) to see whether basic features are ensured in all cases; (ii) to study the performance of the system in some predefined scenarios. To this end, we want to run a large number of nodes (several hundreds) and apply extreme load, or disconnect a significant number of nodes, etc. and evaluate the response of the system under test (SUT). Requirements: Very good knowledge of Java, software testing (JUnit). References: De Almeida, Eduardo Cunha, et al. "A framework for testing peertopeer systems." Software Reliability Engineering, 2008. ISSRE 2008. 19th International Symposium on. IEEE, 2008. URL sujet détaillé :
:
Remarques : Environment: The interns will be hosted in the School of Computer Science at UCD, Dublin, Ireland (http://www.ucd.ie/). UCD is the leading Irish University.
Conditions and Benefits:
The candidate will be remunerated based on experience, in the range =801,000=96=801,500 per month (exempt from tax). The student will also receive UCD student status for the duration of the internship, which gives them access to UCD student facilities and services (such as, clubs, concerts, sports centre, campus pubs, etc.).
Dates:
Internship will commence in first semester of 2016 depending on candidates' availability.





SM20745 On the theory of the Probabilistic mucalculus


Description


The probabilistic mucalculus is a quantitative variant of Kozen's modal mucalculus, a wellknown temporal logic, which is designed to express properties of probabilistic concurrent systems. The general theme of this internship is the further development of the theory of this logic. URL sujet détaillé : http://perso.enslyon.fr/matteo.mio/doku.php?id=m2
Remarques :





SM20746 On extensions of Monadic Second Order logic


Description


The general theme of this internship is to further investigate recently introduced extensions of Monadic Second Order logic (MSO) with new quantifiers expressing a weaker form of universal quantification.
See the attached link for further details and references to the literature. URL sujet détaillé : http://perso.enslyon.fr/matteo.mio/doku.php?id=m2
Remarques :





SM20747 Certified roundoff error bounds using polynomial optimization


Description


Roundoff errors cannot be avoided when implementing numerical programs with finite precision. For each computation step, the roundoff error is very small but the accumulation of such errors can grow very quickly.
A workaround to this issue consists of relying on other software tools to verify programs and proofs, allowing to certify the roundoff error bounds. This verification ability is especially important if one wants to explore a range of potential representations, for instance in the world of FPGAs. It becomes challenging when the program does not employ solely linear operations as nonlinearities are inherent to many interesting computational problems in realworld applications.
Here, we are interested in certified computation of lower bounds for roundoff errors. The aim of this research work is twofold:
1) reformulating the roundoff error problem as a generalized eigenvalue problem. The idea of this reformulation shall be based on the theoretical results from polynomial optimization (semidefinite programming, polynomial sums of squares). 2) implementing a tool solving this eigenvalue problem and comparing with recent competitive methods. The tool could possibly be included in the Real2Float verification framework (http://nlcertify.forge.ocamlcore.org/real2float.html). URL sujet détaillé : http://wwwverimag.imag.fr/~magron/sujets/lowerroundsdp.pdf
Remarques : Local funding is available.





SM20748 Optimisation de centres de calculs avec sources d'énergie renouvelables


Description


Contexte : Dans le cadre du projet DATAZERO (projet de l'Agence National de la Recherche) en partenariat avec les laboratoires LAPLACE (Toulouse) et FEMTO (Belfort), l'IRIT propose l'étude de l'optimisation de la gestion de centres de calcul alimentés par des sources d'énergies renouvelables.
Sujet : L'équipe SEPIA travaille depuis plusieurs années pour l'optimisation de la consommation d'énergie des centres de calcul. Les niveaux électrique (sources d'énergie) et planification des applications sont souvent décorrélés. Le sujet de stage de fin d'étude à caractère recherche, permettra de démarrer le travail sur le sujet qui pourra se poursuivre en thèse 20162019 (un financement est acquis dans le cadre du projet DATAZERO).
Pour cela le sujet propose trois étapes : 1) Etat de l'art :  Faire l'état de l'art sur les centres de calcul alimentés par sources renouvelables (photovolotaique, éolien, pile à combustible, supercap)  Faire l'état de l'art sur les algorithmes de gestion des centres de calcul alimentés par sources renouvelables
2) Contribution recherche attendue : Proposer un/des algorithmes de placement visant à optimiser l'énergie provenant des sources renouvelables en utilisant leur caractérisation.
3) Implémentation et validation : Intégration dans le simulateur DCWorms (DataCenter Workload Management System) de :  la caractérisation des sources d'énergie renouvelables  des algorithmes de placement  évaluation par rapport aux algorithmes de l'état de l'art
Le stagiaire sera intégré à l'équipe du projet et sera amené à collaborer avec les partenaires.
Compétences attendues :  optimisation et évaluation de performances  programmation Java  bon niveau d'anglais : lu, écrit, parlé  curiosité et engagement pour la recherche  une connaissance du génie électrique et/ou du cloud computing sont un plus
Candidature : Envoyer CV complet et relevés de notes avec classement de chaque année après le L2 URL sujet détaillé : http://www.irit.fr/~JeanMarc.Pierson/crbst_9.html
Remarques : Cosupervised by : Stéphane Caux, LAPLACE
Georges Da Costa, IRIT
Patricia Stolf, IRIT





SM20749 Positivity certificates for polynomials using amoebas


Description


Context:
Finding positivity certificates for a multivariate polynomial is a central problem in optimization. The method providing more accurate certificates consists of finding a sum of squares decomposition of this polynomial, which boils down to solving a convex optimization problem (called a semidefinite program). Even though such certificates can be computed in polynomial time (up to an additive error), this framework only provides sufficient conditions: a bivariate nonnegative polynomial is not necessarily a polynomial sum of squares.
Iliman et al. [2] recently introduced a new type of positivity certificates, relying on Archimedean amoebas. The latter mathematical objects have been studied for fifteen years and are the images of polynomial system solutions under the logmodule map. Amoebas have various combinatorial properties related to polytopes. The framework proposed in [2]combines these properties with convex optimization methods to compute certificates in a more efficient way. However, this approach only applies to a restricted class of polynomials, whose supports are simplex Newton polytopes.
Objectives:
We believe that this approach could be extended to more general polynomial optimization problems, leading to the following twofold research investigation:
1) identifying polynomial systems for which the framework from [2] can be applied 2) implementing a tool to perform numerical experiments relying on geometric programming (e.g. with the cvx [1] Matlab toolbox)
Required Skills:
1) Convex optimisation, geometric and semidefinite programming 2) Linear algebra basics 3) Programming with OCaml/C/C++/Matlab according to preference of the candidate
A related PhD topic can be foreseen. URL sujet détaillé : http://wwwverimag.imag.fr/~magron/sujets/gpcircuits.pdf
Remarques : Cosupervision with Xavier Allamigeon and Stephane Gaubert (INRIA Maxplus/CMAP Ecole Polytechnique)
Local fundings is available





SM20750 Generating Texts using Deep Learning


Description


Given some input data, the objective of Natural Language Generation (NLG) is to find the best text verbalising that input data. The problem has numerous applications such as verbalising databases, numerical and semantic web (OWL, RDF) data.
An important aspect of statistical Natural Language Generation is related to the availability of a parallel corpus of text and data from which a statistical model can be learned to predict new text from unseen data. In a large number of approaches, these parallel corpora are constructed manually or semiautomatically from existing text and data sources. Although the statistical models that are learned from such artificial datatotext corpora are accurate enough to generate wellformed text, the creation of a parallel datatotext corpus is time consuming and the resulting models are non generic in that a new corpus must be constructed each time a new domain is considered.
By exploiting a parallel corpus aligning the input data with abstract grammatical information rather than words, [GP15] provide a generic approach that can generate text from arbitrary knowledge bases. The approach uses predefined feature templates and a Conditional Random Field (CRF) model to predict, given the input data, the sequence of grammatical categories that underly the most fluent, grammatical sentences generated by the grammar for that data. Recently, though, deep learning approaches have been proposed which have proved to be very effective in learning sequences and for which no feature engineering is required. In particular Long ShortTerm Memory (LTSM) recurrent networks, have recently been shown [WGMSVY15] to support text generation in the context of a dialog system.
In this project, we intend to apply the more advanced tools of deep learning to the hybrid grammarbased, statistical approach proposed in [GP15]. The goal of the internship will be to combine this grammarbased approach with an LTSM model predicting the best sequence of grammatical categories underlying the most fluent grammatical sentences generated by the grammar for the input data.
More specifically, the internship will involve:
* Studying deep learning in particular, recurrent and recursive nets for sequence modelling * Specifying and implementing an LTSM for predicting a sequence of grammar categories given some input data. * Applying the framework to the available data, namely the Groningen Meaning Bank, a corpus relating a semantic representation (the input data) to a sentence and a syntactic derivation * Evaluating the results
References:
[GP15] Gardent, Claire and Laura PerezBeltrachini. A Statistical, GrammarBased Approach to MicroPlanning. Submitted to Journal of Computational Linguistics. 2015.
[WGMSVY15] TsungHsien Wen, Milica Gasic, Niola Mrkshic, PeiHao Su, David Vandyke and Steve Young. Proceedings of EMNLP 2015, pages 17111721 URL sujet détaillé :
:
Remarques :





SM20751 Complexité et approximation sur le problème de l'échafaudage de génomes dans le cas des (r,l)graphes


Admin


Encadrant : Annie CHATEAU 
Labo/Organisme : Annie Chateau (annie.chateau.fr) et Rodolphe Giroudeau (rgirou.fr) sont maîtres de conférences au LIRMM (Laboratoire d'Informatique, Robotique et Microélectronique de Montpellier), respectivement dans les équipes MAB (Méthodes et Algorithmes pour la Bioinformatique) et MAORE (Méthodes et Algorithmes pour l'Ordonnancement et les Réseaux). Le stagiaire sera également amené à travailler en interaction avec Julien Baste, doctorant dans l'équipe ALGCO (ALgorithmes, Graphes et COmbinatoire), et Mathias Weller, postdoctorant dans l'équipe MAB. 
URL : www.lirmm.fr
: 
Ville : Montpellier 



Description


Genome scaffolding is a major step when it comes to produce new genomic sequences. It consists in ordering and orienting previously produced pieces of sequences, named contigs, using external information from various origins. The problem is modeled as an optimisation problem in a special class of graphs, called scaffold graphs. Of course, it is NPcomplete... Previous work has shown that the problem is approximable in complete graphs, meaning that there exists a polynomial heuristic whose performance is guaranteed not being too far from the optimal. Again, of course real graphs are far from being complete... We intend here to refine those results, and extend them in other classes of graphs, like cobipartite graphs, split graphs and (r,l)graphs, which are decomposable into r independant sets and l cliques. Work program is the following:  study the complexity and approximability of the problem in those graphs;  search for possible kernels for a FPT approach;  design algorithms, implement them (C++ welcome), test them... URL sujet détaillé : www.lirmm.fr/~chateau/stagenormalien15.pdf
:
Remarques : Annie Chateau (annie.chateau.fr) et Rodolphe Giroudeau (rgirou.fr) sont maîtres de conférences au LIRMM (Laboratoire d'Informatique, Robotique et Microélectronique de Montpellier), respectivement dans les équipes MAB (Méthodes et Algorithmes pour la Bioinformatique) et MAORE (Méthodes et Algorithmes pour l'Ordonnancement et les Réseaux). Le stagiaire sera également amené à travailler en interaction avec Julien Baste, doctorant dans l'équipe ALGCO (ALgorithmes, Graphes et COmbinatoire), et Mathias Weller, postdoctorant dans l'équipe MAB.
Un financement de thèse est envisageable à l'issu du stage.





SM20752 Certified code generation for geometric predicates


Description


*Context* This master's project deals with the robustness of geometric programs w.r.t. numerical degeneracies: Some classical tasks in geometric modeling, including computing the Delaunay triangulation, computing the intersection between surfacic and/or volumetric meshes, and meshing using Voronoi diagrams are examples involving such geometric programs. A geometric programs take as an input a set of objects (points, lines, segments, triangles ...) and return a combinatorial structure (e.g. a mesh) that interconnects the elements of the input. "Geometric predicates" are central components of geometric algorithms. They are functions that take as input a small set of geometric objects and return a binary e.g., left/right, above/below (or ternary, e.g. left/on/right) answer. Using standard floating point arithmetics for these predicate sometimes lead to inacurate  and more importantly unconsistent  results. Several techniques replace standard floating points with arbitraryprecision number types, thus overcoming the lack of precision encountered with floating points (see e.g. my survey in [1]).
*Goals* The implementation of such arbitrary precision arithmetics is quite involved, and errorprone. The goal of this project is to contribute to the ongoing development of a compiler, that transforms the mathematical expression of a predicate into a C++ program that evaluates the predicates. In particular, it would be interesting to strengthen our existing "Predicate Construction Kit" compiler with some means of certification, to prove that the generated code computes the exact sign of the mathematical expression. The Master student will study different means of proving the bounds for the computed expression, including using several existing Coq packages (Gasper) or Haskell.
*Skills* Ideally, the student will have a good knowledge of both computer arithmetics and software certification techniques. A taste for efficient computer implementation is also appreciated.
[1] http://www.loria.fr/~levy/PCK/PCK_paper.pdf http://www.loria.fr/~levy/PCK/PCK_slides.pdf URL sujet détaillé :
:
Remarques : The team has several possibilities for funding a Ph.D. thesis.





SM20753 A geometric algorithm for implementing Yann Brenier's transportcollapse scheme in 3D


Description


*Context* The goal of this project is to develop an efficient algorithm for implementing some nonconventional numerical solution mechanisms for Partial Derivative Equations with a strong geometric component [1],[2]. Such PDEs are the models of some physics subject to conservation laws. The goal is then to derive numerical solution mechanisms that satisfy these conservation laws. Yann Brenier created the theory that expresses this point of view and invented several original methods, leading to unconventional numerical solution mechanisms. The most well known one is the fluid mechanics formulation of optimal transport that he developed with Benamou [3] (see also [4] for a gentle introduction). It is also possible to derive some semidiscrete solvers [5,6].
*Subject* For conservation laws, Brenier also developed a theory and an algorithm, leading to a nonconventional numerical solution mechanism, that he calls "transportcollapse" ("transportécroulement" in French) [1,2]. This algorithm was not implemented yet, because it requires some nontrivial geometric components (i.e. computing the intersection of a 3D mesh embedded in 4D). The work consists in studying and understanding the algorithm, deriving an efficient geometric algorithm to compute the intersections and implementing it. The work already realized by the team for semidiscrete optimal tranport [6] can be used as a starting point, since it involves similar computations of geometric intersections.
*Skills* Knowledge of basic computational geometry is needed. Taste for C++ implementation is appreciated.
[1] Y. Brenier, Calcul des lois de conservation scalaires par la méthode de transportécoulement, 1981, https://hal.inria.fr/inria00076508
[2] Y. Brenier, 2015, Méthode du transportécroulement, personnal communication
[3] A computational fluid mechanics solution to the MongeKantorovich mass transfer problem JD Benamou, Y Brenier Numerische Mathematik 84 (3), 375393
[4] http://www.futurasciences.com/magazines/mathematiques/infos/dossiers/d/mathematiquesmathstransportmoindrecout849/page/4/
[5] A multiscale approach to optimal transport. Q. Mérigot, Computer Graphics Forum 30 (5) 1583=961592, 2011 (also Proc SGP 2011).
[6] Bruno Levy, A numerical algorithm for L2 semidiscrete optimal transport in 3d, 2015, ESAIM Mathematical Modeling and Numerical Analysis, http://arxiv.org/abs/1409.1279
URL sujet détaillé :
:
Remarques : The team has several possibilities for funding a Ph.D. thesis.





SM20754 The SAPIC/Tamarin Security Protocol Verification Toolchain


Description


General Context. Security protocols are distributed programs that aim at ensuring security properties, such as confidentiality, authentication or anonymity, by the means of cryptography. Such protocols are widely deployed, e.g., for electronic commerce on the Internet, in banking networks, mobile phones and more recently electronic elections. As properties need to be ensured, even if the protocol is executed over untrusted networks (such as the Internet), these protocols have shown extremely difficult to get right. Formal methods have shown very useful to detect errors and ensure their correctness.
We generally distinguish two families of security properties : trace properties and observational equivalence properties. Trace properties verify a predicate on a given trace and are typically used to express authentication properties. Observational equivalence expresses that an adversary cannot distinguish two situations and is used to model anonymity and strong confidentiality properties.
The Tamarin prover is a stateofthe art protocol verification tool which has recently been extended to verify equivalence properties in addition to trace properties. SAPIC allows protocols to be specified in a highlevel protocol specification language, an extension of the applied picalculus, and uses the Tamarin prover as a backend by compiling the language into multiset rewrite rules, the input format of Tamarin.
We propose several internships on extensions and usage of SAPIC/Tamarin.
* Support for exclusive or in tamarin. Many lowlevel protocols, such as those implemented in RFID tokens, rely on XOR because of its low computational ressources. The goal of this internship is to extend the Tamarin prover [SMCB12, BDS15], with support for XOR operators. Tamarin already deals with DiffieHellmantype equational theories, which have similar properties to XOR (notably the finite variant property and an associative commutative operator). The idea is to leverage as much of the existing theory and implementation as possible, in particular the computation of the variants and the intruder's computations. Then, suitable heuristics need to be developed to make the approach efficient in practice.
* Observational equivalence properties in SAPIC. As mentioned above, SAPIC compiles an extension of the applied pi calculus into multiset rewrite rules which can be analysed using Tamarin. This compilation has been proven correct for any property written in a firstorder logic, allowing to express trace properties. The goal of the internship is to extend SAPIC to observational equivalences in order to take advantage of the recent extensions of Tamarin. The work will require to adapt the current translation and the correctness proof to show the translation preserves observational equivalence properties.
* Ifthenelse terms. i.e. multiset rewrite systems that only differ in certain terms, but otherwise keep the same execution structure. This has some limitations for protocols with branching, notably when one branch needs to be simulated by a different one to prove observational equivalence. One solution to improve handling of branches is to extend terms with an ``ifthenelse'' operator, as done by ProVerif. This operator can express multiple branches inside a single branch by pushing differing terms inside the ``ifthenelse'' operator, and thus helps to prove more complex equivalences.
The goal of this internship is to adapt the theory developed for ProVerif to Tamarin and implement a correct treatment of ``ifthenelse terms''. In a first step, only the treatment of the terms will be implemented. Automated identification and merging of suitable branches similar to ProVerif can be done in a second step, but is not required.
* Analysis of the TPM 2.0 specifications. The Trusted Platform Module (TPM) is a hardware chip designed to enable commodity computers to achieve greater levels of security than was previously possible. The TPM stores cryptographic keys and other sensitive data in its shielded memory, and provides ways for platform software to use those keys to achieve security goals. Application software such as Microsoft's BitLocker and HP's ProtectTools use the TPM in order to guarantee security properties. The successor TPM 2.0 specification has been released for public review.
The aim of the internship is to use SAPIC/Tamarin to build a formal model of (part of) the TPM 2.0 and analyse its security in SAPIC/Tamarin.
URL sujet détaillé : http://www.loria.fr/~skremer/Sujets/tamarin.pdf
Remarques : The internship will be coadvised by Jannik Dreier (http://www.loria.fr/~jdreier/).
The internship is supported by the ERC Consolidator grant SPOOC offering the possibility for payment and funding for continuing towards a PhD.





SM20755 Verification of equivalence properties in cryptographic protocols


Description


With the ubiquity of smartphones and other portable, connected devices, the Internet has become the main communication medium. In many cases, e.g., ecommerce, home banking, electronic voting, etc., communications need to be secured. This can be achieved using cryptographic protocols. Cryptographic protocols, such as TLS, are distributed programs which use cryptographic primitives, such as encryption or digital signatures. However, even if the underlying primitives are perfectly secure, protocols may be attacked. Automated detection of such "logical flaws" is known to be undecidable in general, i.e., when the number of protocol sessions is unbounded or arbitrary cryptographic primitives are allowed. However, many algorithms for verifying protocols have been proposed in the case where the number of sessions is bounded and the algebraic properties of cryptographic primitives are rather simple.
The verification problem of cryptographic protocols also depends on the precise property to be verified. We generally distinguish two families of properties: trace (or reachability) properties and equivalence properties. Trace properties, such as (weak) secrecy and authentication, have been extensively studied and automated verification tools for these properties exist. More recently, equivalence properties, such as strong notions of secrecy, anonymity and unlinkability, have gained increasing interest. Verifying these properties is however more complicated and only few tools for their automated verification exist.
We propose 2 possible topics in the area of automated verification of equivalence properties for cryptographic protocols.
* Decidability and complexity of equivalence properties
Even though we know that the general problem of verifying equivalence properties is undecidable, our understanding in terms of complexity of decidable subclasses is very partial. A few, first complexity results exist for particular classes of protocols, as well as a decidability result (without any complexity bound) in the case the number of sessions is bounded and for a small number of particular cryptographic primitives. The objective of the internship is to study the complexity of the verification problem of equivalence properties for various families of cryptographic primitives.
* Efficient verification algorithms
Most tools for verifying equivalence properties are limited to a bounded number of sessions. This allows the tools to be sound, complete and guarantee termination. However, their efficiency is rather poor (sometimes running for a few days to decide the security of a protocol). Other tools do not limit the number of sessions, e.g., ProVerif, but do not guarantee completeness (allowing false attacks), nor termination. However, these tools use abstractions and are extremely efficient in practice. The goal of the internship will be to adapt some of the techniques used in tools such as ProVerif to increase the efficiency of tools working on a bounded number of sessions.
URL sujet détaillé : http://www.loria.fr/~skremer/Sujets/equivalenceen.pdf
Remarques : The internship will be coadvised by Vincent Cheval (http://www.loria.fr/~chevalvi/).
The internship is supported by the ERC Consolidator grant SPOOC offering the possibility for payment and funding for continuing towards a PhD.





SM20756 Computational Complexity of Graph Data Exchange


Description


Data exchange is the task of transforming data structured under a source schema into data structured under a target schema in such a way that all constraints in a fixed set of sourcetotarget constraints and in a fixed set of target constraints are satisfied. A multifaceted investigation of data exchange has been carried out in the past decade, addressing the cases of relational and semistructured data. Recently, the problem of graph data exchange has been considered when only a fixed set of sourcetotarget constraints is available. In our previous work, we have investigated the two problems of interest of data exchange (namely, the existence of solutions problem and query answering) in the case of relationaltograph data exchange by adding a fixed set of target constraints. We have focused on the query complexity existence of solutions and query answering instead of considering the classical data/combined complexity, as in the data exchange literature. We have shown the intractability of the problems of interest and we proved NP lower bounds for the problems studied under particular cases.
In this master work, we would like to pursue the complexity analysis of the two problems of interest by envisioning the following objectives:
(1) can upper bounds be found for the query complexity of the above problems of interest in the case of graph data exchange?
(2) we also would like to study the computational complexity of the problems of interest in terms of data/combined complexity;
(3) we need a new definition of universal representatives that takes into account the target constraints and, correspondingly, a new chase variant to produce such universal representatives;
(4) we would like to further investigate the complexity of the above problems for novel classes of constraints, such as unnested regular path queries and sameas constraints.
Please see the extended version of the master subject for further details:
http://liris.cnrs.fr/angela.bonifati/stages/sujetM2ens.pdf URL sujet détaillé : http://liris.cnrs.fr/angela.bonifati/stages/sujetM2ens.pdf
Remarques : Coencadrement par Radu Ciucanu (University of Oxford), Emmanuel Coquery (Liris) et Romuald Thion (Liris).





SM20757 Graph polynomials


Description


Tutte polynomial is, without any doubt, the most studied graph invariant in combina torics and graph theory. Among other reasons, this huge interest comes from Tutte poly nomial's relation to the chromatic polynomial and from its universality property. Recently, the Tutte polynomial universality property has been proved using algebraic techniques. More precisely, one can use the properties of certain linear maps canonically associated to one edge graphs (with one or two vertices). From an algebraic point of view, these maps are infinitesimal characters of a certain Hopf algebra of graphs. The first objective of the internship is the familiarization of the intern with the Tutte polynomial universality property and with its various proofs: the classical proof (using induction arguments on the set of edges of the graph) and the algebraic proof mentioned above. The second objective of the internship is the utilization of this algebraic technique to define new graph polynomials. If the Tutte polynomial is obtained using linear maps canonically associated to one edge graphs, we first intend to investigate what are the polynomials obtained if one uses linear maps canonically associated to two edges graphs. Moreover, eventual universality properties of polynomials obtained in this way will be studied through various methods. URL sujet détaillé : http://www.labri.fr/perso/atanasa/propositionstageBdxAvalTanasa.pdf
Remarques : the internship will be cosupervised by JeanChristophe Val (LaBRI).The internship will be payed. The internship may be followed by a PhD proposal in the same area. More precisely, we intend to apply this new algebraic technique in order to define polynomials characterizing various combinatorial Hopf algebras. Finally, an extension of this technique to knot theory polynomials and the corresponding Hopf algebras may also be considered. During the internship/PhD, regular visits to work on these topics with Iain Moffatt (London Univ., UK), Thomas Krajewski (AixMarseille Univ., France) or with various members
of the Montreal combinatorics group may be considered.





SM20758 Stages Technicolor Connected Home Rennes (STB, gateways, etc.)


Description


Présentation de Technicolor Connected Home With more than 100 years of experience and innovation, Technicolor is a major actor in entertainment, software, and gaming worldwide. With strong historical ties to the largest Hollywood studios, the company is a leading provider of production, postproduction and distribution for creators and distributors of content. Technicolor is a world leader in film processing, a leading supplier in the field of decoders and gateways and one of the largest manufacturer and provider of DVD & Bluray discs. The recent acquisition of The Mill, a leading visual effects and content creation studio for the advertising, gaming and music industries adds one more firstrate brand to our already robust portfolio (MPC, Mikros Image and Mr. X), effectively making Technicolor the leader in VFX and post production services to the advertising production segment The company also has a strong activity in Intellectual property and Licensing business. This unique environment presents an exceptional opportunity to explore cuttingedge video technologies that will reach homes in the years to come. In Rennes research and development center, our engineers will make your internship a fascinating and rewarding experience. For more information, please visit our website: www.technicolor.com
Objectives /Missions Our interns' mission is to help R&D engineers enabling Multiple Play Video and Communications through:  The design / supply of settop box platforms to satellite, cable and telecom operators  The design / supply of access devices to deliver multiplay services to subscribers
We offer internship opportunities in the multiple fields around: o integrated solutions:  Ultra HD  Multiscreen  Smart Home  Smart Car  OTT solutions
o Products & Software  Video gateways  Settop boxes  Modems & gateways  Connected Home Devices  Software  Professional services
If you are interested in joining us, please visit our website to see the detailed internship open positions at : http://www.technicolor.com/en/innovation/studentday/internshipopportunitiesconnectedhome/connectedhomeinternships
Apply at stage.rennes.com Monthly allowance: =801200 gross
Profile  Preferably Master 2 Student in engineering school / university  Specialization in computing, software, system architecture, embedded realtime software development, testing, UI, etc.  User =96Oriented mindset  Inquiring mind, inventive, passionate.  Good English level URL sujet détaillé : http://www.technicolor.com/en/innovation/studentday/jobinternshipopportunitiesrilabs
Remarques : Monthly allowance: =801200 gross





SM20759 External memory algorithms for large graph decomposition


Description


We are interested by large graphs that are too massive to fit completely inside the computer's internal memory. Processing these graphs needs appropriate data structures and I/O efficient algorithms to reduce time processing. We focus mainly on particular graph decomposition problems for which we can exploit locality to reduce I/O costs. The aim of this work is to survey external memory algorithms for graphs in a first time. Then, work on specific external memory algorithms for some graph decomposition issues. The student will work at LIRIS within the Graph algorithm and AppLications research team GOAL. URL sujet détaillé :
:
Remarques :





SM20760 Calcul numérique et intensif pour l'évolution artificielle


Description


La simulation d'organismes artificiels est maintenant reconnu comme un outil précieux pour étudier l'évolution. Mais pour émuler le processus d'évolution, des milliers d'organismes simulés doivent se reproduire pendant des millards de générations. Par conséquence, le temps d'exécution d'une simulation est de l'ordre du mois et chacune produit des dizaines de Go de données. Pour étudier l'évolution, le nombre de paramêtres à explorer (et le nombre de simulations) est très élevé (des milliers). Par conséquence, pour généraliser l'utilisation de ces techniques, il est critique d'optimiser leurs méthodes numériques.
Le but de ce stage est de proposer des nouvelles approches de calcul numérique et intensif pour optimiser le logiciel d'évolution artificielle Aevol (http://www.aevol.fr). Par conséquence (suivant le candidat), un des 2 axes sera choisi.
Dans le cas des méthodes numériques, le but est de s'intéresser aux 2 principaux méthodes numériques utilisées dans Aevol: les ensembles flous (Fuzzy Set) et les équations différentielles ordinaires (Ordinary Differential Equation). Leurs utilisations ne correspondent pas aux cas classiques de la littératures, il est donc important de proposer des adaptations et optimisations qui correspondent aux besoins de l'évolution artificielle. L'adaptabilité de ces méthodes à des architectures de calcul parallèle sera un critère très important. Cela peut aller jusqu'à la participation au développement de noyaux de calcul CPU/GPU.
Dans le cas du calcul intensif, le but est de proposer une nouveau modèle d'exécution pour Aevol. En effet, actuellement, le code d'Aevol est séquentiel. Hors, il pourrait tirer parti du parallélisme aussi bien sur CPU que sur GPU. Afin de s'orienter vers une méthode générique, le stagiaire devra étudier les systèmes d'exécution (runtime tel que Kaapi ou StarPU) permettant une exécution hybride (CPU+GPU). De plus, Aevol génére énormément de données (des dizaines de Go), le stagiaire devra s'intéresser à des systèmes d'analyse et visualisation insitu des données (FlowVR ou Damaris). Le but du stage est de proposer un état de l'art de ces technologies et leur intégration dans Aevol mais aussi de proposer des améliorations à ces systèmes et Aevol pour permettre une meilleure optimisation. Cela pourra aller jusqu'à l'intégration d'un (ou des deux) systèmes dans Aevol.
Contexte du stage (équipe Beagle Inria): La recherche de Beagle concerne la biologie computationnelle et l'évolution artificielle (génétique numérique). Nous nous positionnons à l'interface entre l'informatique et les sciences du vivant afin de produire de nouvelles connaissances en biologie par le biais de la modélisation et la simulation. En d'autre termes, nous réalisons des artéfacts  du Latin artis factum (une entité créée par l'homme plutôt que par la nature)  et nous les explorons de façon à comprendre la nature. Notre recherche est donc basée sur une stratégie interdisciplinaire : nous développons des formalismes informatiques et des outils logiciels pour la modélisation de systèmes complexes en synergie avec différentes équipes de biologie avec lesquelles nous entretenons des liens étroits. Cette approche, relevant des "sciences numériques" (ou sciences computationnelles) nous permet d'étudier des abstractions de systèmes ou processus biologiques afin de mettre au jour les principes organisationnels des systèmes cellulaires dans une logique de biologie des systèmes. URL sujet détaillé :
:
Remarques :





SM20761 Calcul des coefficients d'une table de Butcher par optimisation sous contraintes


Description


Physical systems are often described by differential equations (ODE or DAE). In order to work with these models (control, identification, design...), it is necessary to simulate these models with the help of integration schemes. We develop a tool for guaranteed integration of differential equations with RungeKutta methods. We are able to consider any RK method, even with interval coefficients (Butcher tableau). Suddenly, we would like to define new RK method, with strong properties. Defining a new RK method requires the solving of an optimization problem under a large system of constraints. The schedule could be:  identify RK methods with floating coefficients and compute them with validated coefficients  define new algorithm to obtain any new RK method (constraints generation, solving) URL sujet détaillé : http://perso.enstaparistech.fr/~alexandre/contents/subjectM2_rk.pdf
Remarques : Coencardé par Julien Alexandre dit Sandretto Informations supplémentaires dans le sujet détaillé.





SM20762 Distance geometry problems


Description


Some systems are sometimes defined only by distances, such as in chemistry, localization, sensor network, robotics. Distance geometry problems are interesting mathematical problems with important applications inrobotics for solving localization problems, UAV swarm organization, etc. If the distances given by sensors are always uncertain, in these special cases, the measures can be obtained with delay (conditionned by dynamics of UAVs and communication) or even completly wrong (due to outliers introduced by sensor fault). More than a distance solver able to manage with uncertainties, some more complicated issues can thus be addressed.
Make a state of the art; Develop a constraint generator; Define some contractors and optimizers to solve the distance problem; Improve the approach toward a robust tool against delay and outliers. URL sujet détaillé : http://perso.enstaparistech.fr/~alexandre/contents/distance.pdf
Remarques : Coencardé par Julien Alexandre dit Sandretto Informations supplémentaires dans le sujet détaillé.





SM20763 Analyse fonctionnelle en Coq


Description


La méthode des éléments finis pour la résolution numérique d'équations aux dérivées partielles est au cóur de nombreux programmes de simulation dans l'industrie. Sa vérification formelle augmentera la confiance dans tous les codes qui l'utilisent. Les méthodes de vérification développées dans ce projet représenteront une avancée à la fois pour la méthode des éléments finis ellemême, mais aussi pour la sûreté de logiciels critiques utilisant des algorithmes numériques complexes.
Ce stage a pour but de prouver formellement en Coq un certain nombre de résultats mathématiques bien connus. Cela inclut une grande partie de la théorie de l'intégration et de l'analyse fonctionnelle, tel que les espaces de Sobolev. Le but est d'aller jusqu'au théorème de LaxMilgram et à la convergence de la méthode des éléments finis.
Ce n'est évidemment pas seulement une "~traduction~" de livres de mathématiques vers un assistant de preuves. Cela nécessitera de bien comprendre les preuves existantes de façon à les factoriser, les simplifier voire à les généraliser. Il faudra également choisir les bonnes formalisations pour simplifier les preuves ultérieures. URL sujet détaillé : https://www.lri.fr/~sboldo/files/stageM2_elfic1.pdf
Remarques : Coencadrants : François Clément (Inria Paris  Rocquencourt) et Micaela Mayero (Université Paris 13, LIPN)
Rémunération: oui.
Il faut noter que le stagiaire pourra se baser sur des travaux antérieurs concernant la preuve formelle d'un schéma numérique résolvant l'équation des ondes et concernant l'amélioration de la bibliothèque standard des réels de Coq.





SM20764 Erreurs d'arrondi en analyse numérique


Description


L'analyse numérique consiste en partie à trouver et à prouver des schémas numériques adaptés à des équations aux dérivées ordinaires ou à des équations aux dérivées partielles. La preuve consiste généralement en une preuve de convergence du schéma et de son ordre de convergence. Mais il existe un problème dont on ne se préoccupe en général pas, c'est celui des erreurs d'arrondi.
Ce stage a pour but d'éclaircir, pour les équations aux dérivées ordinaires seulement, le lien entre deux notions : la stabilité au sens de l'analyse numérique et la stabilité au sens de l'arithmétique flottante : le fait que le calcul se passe "bien", i.e. que les erreurs se compensent à cause de la forme particulière qu'a un schéma numérique stable et convergent. URL sujet détaillé : https://www.lri.fr/~sboldo/files/stageM2_an.pdf
Remarques : Rémunération: oui.





SM20765 Secure and privacy preserving computation using the blockchain protocol


Description


Context:
The problem studied in the field of Secure MultiParty Computation (SMPC) is as follows: There are n parties or nodes in a network. Each node has an input which is private. The node does not wish to reveal the input to any other node in the network. A secure multiparty computation protocol enables the nodes to collaboratively compute a function over their inputs such that all inputs remain private, i.e., known only to their owners. An example of such a protocol is private maximum, i.e. given n nodes each with a private input, how to determine the maximum without revealing who has the maximum or any of the other private inputs. Other examples of possible private computations may include geometric functions, set operations, reputation [1,2], etc.
Objectives of the internship project:
The objective of this project would be to study secure multiparty computation protocols. Such protocols are often resourceheavy, which limits their practical application. The specific objective of the project would be to analyze and propose methods for rendering these protocols, in particular reputation computation protocols, resourceefficient and thus practical. A study of cryptographic techniques and privacy enhancing technologies such as zeroknowledge proofs, homomorphic encryption, electronic cash, etc. will be part of the project.
Recent advancements in cryptographic electronic currencies (bitcoin, blockchain) are also impacting how secure multiparty computation protocols are constructed. In particular, the blockchain protocol may be used for building fully decentralized secure multiparty protocols that are fully trustless and do not need to rely on trusted third parties. The project will thus also include a study of the blockchain protocol and its applications.
References:
[1] A trustless privacypreserving reputation system. Alexander Schaub, Rémi Bazin, Omar Hasan, Lionel Brunie. Technical Report, LIRIS. November 2015.
[2] A decentralized privacy preserving reputation protocol for the malicious adversarial model. Omar Hasan, Lionel Brunie, Elisa Bertino, and Ning Shang. IEEE Transactions on Information Forensics and Security (TIFS). April 2013. URL sujet détaillé :
:
Remarques :





SM20766 Large Scale Deployment of a Replicated Public Key Infrastructure (PKI)


Description


Large Scale Deployment of a Replicated Public Key Infrastructure (PKI)
IRT SYSTEMX CONTEXT Located within a global scientific campus, the Research Institute of Technology SystemX is dedicated to the future system digital engineering. Working closely with the best French research organizations (CEA, INRIA, Institut Telecom ...) and bringing together academic & industrial research teams. SystemX has the ambition to generate major technologies and solutions by using digital engineering breakthroughs and to spread skills in all economic sectors. You will take part to a research collaborative project called ITS Security (ISE). The ISE project is positioned to provide operational solutions to respond to new technological and economic challenges of automotive environment. The project partners are: OpenTrust, PSA Peugeot Citroën, Renault, Trialog, Oppida, Valeo and Telecom ParisTech.
INTERNSHIP CONTEXT The main objective of the ISE project is to implement the security management infrastructure of Cooperative Intelligent Transport Systems (CITS). The stakes are high because these ITS systems must be able to securely process thousands of messages exchanged per second and provide strong guarantees for the protection of personal data in accordance with national legislation and EU directives. The reliable and secure infrastructure (PKI) developed in the ISE project will therefore needs to respond to the issue of largescale design in order to be able to distribute billions of digital identities to the ITS embedded stations. The internship focuses on two main topics: (i) a research axis on large scale replication strategies and (ii) a practical axis on the PKI deployment to the cloud (in particular Amazon EC2). The selected candidate will define PKI replication objectives and requirements, draw up a replication strategy and measure its impact on performances. The internship involves a research work on distributed systems, geographic replication and cooperative ITS. This research work will be submitted for publication in an international conference or workshop.
YOUR MISSIONS  Definition of scaleup objectives/requirements  Definition of the replication strategy  PKI large scale replication including PKI cloud deployment, distributed PKI tests and deployment automation  Measurements campaign  Results analysis and benchmarking  Results submission to an international conference or workshop
The ideal candidate should meet the following criteria: Master degree (ongoing), experience on software development and distributed systems, interested in research. The internship last for six months and is located at IRTSYSTEMX headquarter in ParisSaclay campus (France).
YOUR SKILLS  Good knowledge on software engineering: Java, script languages (e.g., bash, Perl, Ruby, Python)  The knowledge of Amazon EC2 / Amazon EC2 API is an advantage  Teamwork and cooperation: progress report, results documentation  A liking for team work  English or French mandatory Reference: Stage_2015_ISE_08_001 Submissions: stagessystemx.fr pierpaolo.cincillasystemx.fr URL sujet détaillé : http://www.irtsystemx.fr/v2/wpcontent/uploads/2013/04/STAGE_2015_ISE_08_001_050615.pdf
Remarques : Remuneration: ~1100 euro/month





SM20767 Parameter synthesis using probabilistic methods


Description


When modeling reallife, potentially complex, systems, it is now usual to use parameters in order to represent unknown or variable caracteristics of these systems. These parameters can appear in several types of systems and can model several kinds of features (number of processes, timing constants, speed, energy consumption, etc.).
One of the main problem that arises when studying this type of system is to synthesize the optimal value of (some of the) parameters with respect to some given properties. As for model checking, most existing (exact) formal methods for solving this problem do not scale when considering reallife applications. Nevertheless, there exists some other very efficient methods, based on probabilities and statistics, that allow approximating the solution to the modelchecking problem. Our aim is therefore to adapt existing methods such as stochastic abstraction and statistical model checking in order to solve the parameter synthesis problem in several types of systems.
In a first step, we will limit ourselves to simple (discrete) parametric models. Depending on the results, we will move to more complex (probabilistic, timed, hybrid...) models. In a second step, the techniques that are developped during this training period will be applied to reallife models in collaboration with specialists from other domains  transmission of disease between bovine cattles with epidemiologists from INRA; dependability and reliability analysis for civil drones with PIXIEL (www.pixiel.com). URL sujet détaillé : http://pagesperso.lina.univnantes.fr/~delahayeb/sujets/param_synth.pdf
Remarques : Le stage est financé. Possibilité de poursuite en thèse.





SM20768 Discrete Voronoi Diagrams and Applications to Surface Offsetting


Description


Voronoi diagrams are a fundamental concept in computational geometry, with applications ranging from surface remeshing to collision detections and path planning. We propose to study Voronoi diagrams in the context of additive manufacturing, where they can be used to compute offset surfaces of discretized 3D models.
The student will need to understand a 2D version of the algorithm we propose. Then, the goal of the internship is to devise and implement a 3D version that can run on the GPU (using OpenCL), and compare its performances with previous stateofthe art solutions. We hope to achieve realtime offseting of arbitrary radius, which is especially important for interactive modeling. URL sujet détaillé : http://jdumas.org/misc/internshipm2voronoioffsets.pdf
Remarques : The internship will be coadvised by Sylvain Lefebvre (http://www.antexel.com/sylefeb/research), and Jonas Martinez (https://sites.google.com/site/jonasmartinezbayona/).





SM20769 Belief revision in the propositional closure of an attributeconstraint pair formalism


Description


Belief revision consists in the modification of an agent's belief in order to be consistent with new beliefs. It has been studied in several formalisms and has been applied to casebased reasoning (CBR, which consists in the adaptation of pieces of experience in order to solve a problem).
This research subject consists in studying belief revision in the propositional closure of a formalism of attributeconstraint pairs: definition, properties, algorithm, extensions, etc. This will be based on a preliminary work about belief revision in the propositional closure of linear constraints that extends both propositional logic and the language of linear constraints conjunction (with real and integer variables). URL sujet détaillé : http://www.loria.fr/~lieber/belief_revision_attribute_constraint.pdf
Remarques : For any additional information, do not hesitate to write me (lieber.fr). It is then possible
to find a time for a phone discussion.
The payment will be based on the standard allowance.





SM20770 Coinductive and symbolic algorithms for challenging automata models


Description


We recently proposed algorithms for checking language equivalence of nondeterministic finite automata (NFA), by relying: 1/ on techniques from concurrency and process algebra theory: ``bisimulations up to'' [1], and 2/ on a symbolic representation of automata, using BDD [2] .
Thanks to these techniques, our algorithms seem to be more efficient than all algorithms known up to now (Hopcroft&Karp, partition refinment, antichains). We propose to study the possibility of extending these algorithms to other kinds of automata, that are widely used in practice: B=FCchi automata, alternating automata, tree automata, etc...
F. Bonchi and D. Pous, Checking NFA equivalence with bisimulations up to congruence, in Proc. POPL 2013. D. Pous, Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests, in Proc. POPL 2015.
More info on http://perso.enslyon.fr/damien.pous/stages.html URL sujet détaillé :
:
Remarques :





SM20771 Relation algebra without complement


Description


We consider the equational theory generated by binary relations together with standard operations (e.g., union, intersection, composition, converse, reflexive transitive closure). If we also take the operation of complement, this equational theory is highly undecidable, and any axiomatisation must be incomplete, by G=F6del theorem. So we focus on fragments without complement, and we ask several questions: how to caracterise the corresponding equational theory? Can we decide it? What is its complexity? (This kind of questions were already asked by Tarski in the 40's and many are still unanswered today.) This is a fairly openended topic, requiring a strong interest in logic, universal algebra, and automata theory.
Some course notes on this topic here: http://perso.enslyon.fr/damien.pous/ejcim2016.pdf
Other references: 1. P. Brunet and D. Pous, Petri automata for Kleene allegories, in Proc. LICS 2015. 2. D. Kozen, A Completeness Theorem for Kleene Algebras and the Algebra of Regular Events, Information and Computation, 452:366390, 1994. 3. P. Freyd and A. Scedrov, Categories, Allegories, North Holland, 1990. 4. S Givant and A. Tarski, A Formalization of Set Theory without Variables, AMS Colloquium Publications, 1987 URL sujet détaillé :
:
Remarques :





SM20772 Types for complexity analysis, with applications to security


Description


This internship deals with the problem of automatically analysing the time complexity of higherorder programs. A particular motivation for that is the analysis of cryptographic schemes. Indeed some tools like Easycrypt allow to analyse formally a cryptographic scheme, to establish that it is secure, assuming that a certain computational problem is hard. However such tools currently do not handle in a satisfactory way the complexity analysis of the constructed adversaries. There is thus a need to develop analysis techniques that would fit well in this setting. A recent approach [BBDL15] proposes for that a typebased complexity analysis using types with parameters, applied to a lambdacalculus with references and primitive recursion. We propose to bring this line of work further, by either enhancing the programming language, or by studying overapproximation techniques for making the types more usable. URL sujet détaillé : http://perso.enslyon.fr/patrick.baillot/STAGES/2016/sujet_typesforcomplexity.pdf
Remarques :





SM20773 On Interconnecting logics, proof theories, and proof assistants


Description


We have recently introduced a new proposal for interconnecting interactive proof assistants, namely, we logically extend the language and the dependent type theory with a monadic structure which allows one to call external oracles and also use the result of call in the proof construction process.
More precisely, we have designed a typedλcalculus with dependent type theory and logical primitives of =93Locking=94 and =93Unlocking=94 the proof interaction process. Once the process is locked, the type checker compiles the subtheorem we want to handle externally and, when the oracle returns an output, it decompiles the proof and unlocks it resuming the normal interaction.
The objective of the stage work falls in this research stream. Our aim is to maximize the interaction between logical systems, using a, so called, =93black box technique=94 borrowed by software engineering. This should allow for a sort of =93backward compatibility=94 between systems in the sense that the =93callee system=94 must be aware of the =93caller system=94, and the caller needs only to know the syntax of the callee in order to call it with a suitable input and receive the returned output results in the caller's syntax. The callee could be considered as an =93oracle=94 (or a black box function).
The caller passes the control to the oracle providing as input a suitable translation of the statement whose proof we want to delegate. In case of oracle/function termination, the proof witness (the output) can be translated back to the proof language of the caller and the proof continues, resuming the exact point in which the oracle was called.
We expect the candidate to be able to deal with type systems and typedλcalculus, especially with dependenttypes (à la LF, Logical Framework) and to be fluent in the OCaml programming language. Output of this stage are either/both i) advances in the type/proof theory/logics, and/or a ii) proof of work of an interactive proof assistant, software prototype fork based on an existing interactive proof assistant (like Coq, or Twelf, or Beluga) and an existing oracle (could be either an Automatic Proof Assistant, or a decidable pattern matching algorithm, or even a cellular automata ...).
BIBLIOGRAPHY
hal01170029v1 Conference papers Furio Honsell, Luigi Liquori, Petar Maksimoviç, Ivan Scagnetto. Gluing together Proof Environments: Canonical extensions of LF Type Theories featuring Locks * LFMTP'15. 9th International Workshop on Logical Frameworks and Metalanguages, Berlin, Germany, Aug 2015, Berlin, Germany. Electronic Proceedings in Theoretical Computer Science (EPTCS), 2015 =09 hal01146059v1 Preprints, Working Papers, ... Furio Honsell, Luigi Liquori, Petar Maksimović, Ivan Scagnetto. LLFP : A Logical Framework for modeling External Evidence, Side Conditions, and Proof Irrelevance using Monads 2015 =09 hal01146023v1 Conference papers Furio Honsell, Luigi Liquori, Ivan Scagnetto. LaxLF: Side Conditions and External Evidence as Monads Mathematical Foundations of Computer Science 2014  39th International Symposium, MFCS 2014, Budapest, Hungary, August 2529, 2014. Proceedings, Part I., Aug 2014, Budapest, Hungary. Springer Verlag, 8634, pp.327339, Lecture Notes in Computer Science. <10.1007/9783662445228_28> =09 hal00906807v2 Conference papers Furio Honsell, Marina Lenisa, Luigi Liquori, Petar Maksimovic, Ivan Scagnetto. LFP =96 A Logical Framework with External Predicates (Announcement and survey) Logic: Between Semantics and Proof Theory A Workshop in Honor of Prof. Arnon Avron's 60th Birthday, Nov 2013, Tel Aviv, Israel. 2013 =09 hal00906391v1 Journal articles Furio Honsell, Marina Lenisa, Luigi Liquori, Petar Maksimovic, Ivan Scagnetto. An Open Logical Framework Journal of Logic and Computation, Oxford University Press (OUP), 2013, 25 (2), 43 p. <10.1093/logcom/ext028> =09 hal00909455v1 Conference papers Furio Honsell, Marina Lenisa, Luigi Liquori, Petar Maksimovic, Ivan Scagnetto. LFP  A Logical Framework with External Predicates LFMTP  7th International Workshop on Logical Frameworks and Metalanguages: Theory and Practice  2012, Sep 2012, Copenhagen, Denmark. ACM, pp.1322, 2012, <10.1145/2364406.2364409> =09 hal01146691v1 Conference papers Furio Honsell, Marina Lenisa, Luigi Liquori, Petar Maksimovic, Ivan Scagnetto. LFP =96 A Logical Framework with External Predicates 7th International Workshop on Logical Frameworks and Metalanguages: Theory and Practice  LFMTP 2012. Copenhagen, Denmark  September 9, 2012, Sep 2012, Copenhagen, Denmark. ACM, pp.1322, 2013, <10.1145/2364406.2364409> =09 hal00909574v1 Conference papers Luigi Liquori, Furio Honsell, Marina Lenisa, Ivan Scagnetto. A Conditional Logical Framework Logic for Programming, Artificial Intelligence, and Reasoning 15th International Conference, LPAR 2008, Doha, Qatar, November 2227, 2008. Proceedings, Nov 2008, Doha, Qatar. Springer Verlag, pp.143157, 2008, Lecture Notes in Computer Science. <10.1007/9783540894391_10> =09 hal01148312v1 Journal articles Furio Honsell, Marina Lenisa, Luigi Liquori. A Framework for Defining Logical Frameworks Electronic Notes in Theoretical Computer Science, Elsevier, 2007, In Computation, Meaning and Logic. Special issue, dedicated to Gordon D. Plotkin, on the occasion of his 60th birthday, 172, pp.399436 =09 inria00088809v2 Reports Luigi Liquori, Furio Honsell, Marina Lenisa. A Framework for Defining Logical Frameworks [Research Report] RR5963, INRIA. 2006, pp.56 URL sujet détaillé : https://www.dropbox.com/s/kzg7a38hddvxur1/Stage%20%20oracleproofassistants.pdf?dl=0
Remarques : Note:
Possibility of a coadvising and small/medium stays with Profs Furio Honsell and Ivan Scagnetto, Univ. of Udine, Italy
https://it.wikipedia.org/wiki/Furio_Honsell and https://users.dimi.uniud.it/~ivan.scagnetto/ http://www.uniud.it/
https://en.wikipedia.org/wiki/Udine
MidLong term project, with concrete possibility to pursue with a Ph.D.
Payment: ~ 500 euros / month





SM20774 Linear graph rewriting for Boolean logic


Description


This project is about 'reducing' Boolean expressions in a correct way, simulating logical reasoning. For example:
x AND (y OR z) > (x AND y) OR (x AND z) (*)
Given a set of such reductions, one can chain together various instances to construct 'proofs'. In particular, one can construct sets of such reductions that are 'complete', i.e. from which we can prove every correct reduction.
A striking feature of all known complete sets is that they all involve *duplication of variables*. For example, in the above reduction the variable x is duplicated. What happens when we do not allow such behaviour? For example:
x AND (y OR z) > (x AND y) OR z
This rule lies at the heart of Girard's 'Linear Logic', which successfully demonstrates the advantages of considering duplicationfree fragments of Boolean logic. In fact this 'linear' fragment of Boolean logic is rich enough to encode the entire logic: it is already coNPcomplete. Hence arrives our motivating question:
(Q) Can we construct reduction systems that are correct and complete for *linear* Boolean logic?
In particular, this requires us to remain in the world of linear Boolean expressions, for instance rejecting (*) and even intermediate nonlinearity, e.g.:
x AND (y OR z) > (x AND y) OR (x AND z) > (x AND y) OR z
A complete linear system, unless coNP = NP, must admit reductions that are *exponentially long* but only *linearly wide*, a property consistent with but thusfar unobserved in proof complexity. There would also be farreaching implications on the space complexity of various wellknown proof systems.
In this project we propose to work with certain graphs extracted from Boolean expressions, called 'cographs'. The world of graphs is far more general than the world of logical expressions, *strictly extending* the class of Boolean functions, and so exhibits far richer and more expressive reduction behaviour. We believe that this setting admits reduction systems with the improved complexity properties that we desire.
Inspired by recent work, we propose to develop this idea as follows:
* Develop a comprehensive framework for (co)graphs that models Boolean logic. * Construct complete reduction systems based on (co)graphs. * Obtain *separation results* distinguishing the world of graphs from that of Boolean expressions.
Useful references
1. Rewriting with linear inferences in propositional logic. Anupam Das. http://drops.dagstuhl.de/opus/volltexte/2013/4060/ 2. No complete linear term rewriting system for propositional logic. Anupam Das and Lutz Strassburger. http://drops.dagstuhl.de/opus/volltexte/2015/5193/ URL sujet détaillé : http://www.anupamdas.com/linearrewriting.pdf
Remarques : Coadvised by Anupam Das.
anupam.das AT enslyon.fr
http://www.anupamdas.com/
(also the contact for informal inquiries)





SM20775 Hardware prototyping of a UNUM arithmetic calculation unit and application to advanced signal processing


Description


Hardware prototyping of a UNUM arithmetic calculation unit and application to advanced signal processing UNUM is a very recent representation for floating point numbers, which presents many advantages over the classical IEEE 754 representation. It variable format helps to improve accuracy, avoid calculation pitfalls and accelerate convergence for iterative calculations. The aim of this internship is to realize a hardware prototype of a UNUM calculation unit, using available FPGA boards or hardware emulators. A dedicated operative unit will be developed in VHDL and integrated with a simple processor for sequencing and memory management. The realization will focus on the specific challenges of UNUM: variable data format, unification, format conversion. A minimal set of computation kernels for the basic functions (trigonometric, etc.) will be developed in C for the unit and ported to the prototype in an ad hoc fashion. Finally, a signal processing algorithm will be selected and adapted for running on the platform. The candidate will be supported by skilled system architects for the hardware realization, and interact with signal processing specialists for the algorithmic part. The work will take place within the LETI LISAN laboratory, which has strong experience on low power circuits, high energy efficiency, MPSoC design, and with the support of LETI LSP laboratory for expertise in prototyping of advanced telecom systems. References  2015 The End of Error: Unum Computing published by Chapman & Hall/CRC Computational Science  2015 " THE END OF NUMERICAL ERROR " presentation during ARITH 22 workshop available at http://arith22.gforge.inria.fr/slides/06gustafson.pdf URL sujet détaillé : http://portail.cea.fr/emploi/Lists/Stages/StructuredDisplayForm.aspx?ID=25685&Source=http%3A%2F%2Fportail%2Ecea%2Efr%2Femploi%2FPages%2Fstages%2Foffresstage%2Easpx&ContentTypeId=0x010010995605D5DC4A16AA9AA61FBA54D1B20016019053C4495143AC79FFD9B31F9981
Remarques : coencadrement : Florent Dinechin partenariat envisageable avec CITIlab à Lyon
continuation en thèse avec financement possible (et souhaitée)





SM20776 Quantum random access codes


Description


Random access codes provide a fruitful avenue for studying the power of quantum memory (http://arxiv.org/abs/0810.2937). The objective of this internship is to study generalizations of random access codes (larger alphabets, more general retrieval scenarios,...) and the advantage that is gained by using quantum memory. An interesting direction is to use semidefinite programming techniques to obtain bounds on the best quantum random access codes. URL sujet détaillé :
:
Remarques :





SM20777 Virtualization boundary for optimized clouddelivered security


Description


Description In cloud virtualized platform, the primary role of virtualization layer is to present carefully isolated and elastic virtual resource quantities to multiple tenant software stacks, with two dominant approaches the hardwarelevel virtualization and more recently the OSlevel virtualization (container). Some research works propose to rethink what is called the virtualization boundary that is, find an optimal boundary between the virtualizing part and the virtualized domains [1]. Other works are in favor of revisiting OSes to optimize their execution on virtualized platform [2, 3]. Finally, promising introspection technics  a concept to monitor Virtual Machines from the hypervisor level in order to inspect states and activities of Operating Systems running inside them =96 are highly dependent on the capacity of the virtualization layer to interpret virtualized domains activities [4]. This project has the objective to find out potential benefits on a security point of view of reconsidering the virtualization boundary, with two main aspects: better infrastructure security and enabling security services based on the virtualization layer to be delivered to the tenants (while benefiting from isolation, high privileges and a crossview of VMs at that layer). In this project, the intern will conduct a preliminary study of this research topic by making an extensive state of the art, in the aim of identifying =96 as part of future research  potential optimizations in virtualization boundary abstractions targeting virtualized platform security and low layers monitoring capabilities for delivering virtualizationbased security services. The intern will present his/her results in scientific articles.
The intern will be part of Orange Labs security team in Caen (France). References: [1] Van Moolenbroek, David C., Raja Appuswamy, and Andrew S. Tanenbaum. "Towards a flexible, lightweight virtualization alternative" In Proceedings of International Conference on Systems and Storage (SYSTOR '14), ACM, 2014 [2] Nikos Vasilakis, Ben Karel and Jonathan M. Smith, =93From Lone Dwarfs to Giant Superclusters: Rethinking Operating System Abstractions for the Cloud=94 , 15th Workshop on Hot Topics in Operating Systems (HotOS XV), Usenix, May 2015 [3] Kivity, Avi, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har'El, Don Marti, and Vlad Zolotarov. "OSv=97optimizing the operating system for virtual machines" In usenix annual technical conference (Usenix ATC 14), 2014 [4] Bauman, Erick and Ayoade, Gbadebo and Lin, Zhiqiang, A Survey on HypervisorBased Monitoring: Approaches, Applications, and Evolutions, ACM Comput. Surv., volume 48 1, September 2015 URL sujet détaillé :
:
Remarques :





SM20778 Designing experimentation tools for Software Defined Networking and Named Data Networking


Description


Software Defined Networking (NDN) and NamedData Networking (NDN) are two new paradigms that aim at changing the way we design and architecture networks. In a nutshell: SDN is to managing networks what Cloud infrastructures are to managing servers: by moving the control to software, it brings better scalability, elasticity, resilience, etc. NamedData Networking explores the idea of moving from the current hostcentric (IPaddresscentric) architecture to one where data and content are at the center of the design.
To evaluate algorithms and software targetting those architectures, experimentation tools are required: simulators, emulators, testbeds.
We are already involved in the design of two experimentation tools: first, the Grid'5000 testbed, which is a major testbed for search on HPC, Clouds, Big Data. Second, the Distem emulator, that relies on Linux technologies to emulate varying performance and arbitrary network topologies on top of clusters of homogeneous nodes (typically from Grid'5000).
The goal of this project is to design extensions to Grid'5000 and Distem to support experimentation on SDN and NDN.
Typically, the intern will: 1. Evaluate requirements for experiments on SDN and/or NDN, by doing a survey of existing experimentation tools and recent experimental studies. 2. Design extensions to Grid'5000 and/or Distem. 3. Evaluate those extensions by performing experiments on SDN and/or NDN.
Depending on opportunities for convergence (and on interests of the intern), the internship could focus first on Grid'5000 (without Distem) and SDN, or on Distem and CCN.
Pointers: Grid'5000: http://www.grid5000.fr/ Distem: http://distem.gforge.inria.fr/ URL sujet détaillé :
:
Remarques :





SM20779 Raffinements automatiques en Coq


Description


The purpose of this internship is to extend a Coq library (CoqEAL) for efficient computation in order in order to automate the process of refinement from prooforiented to computationoriented datastructures and algorithms. URL sujet détaillé : http://www.cyrilcohen.fr/coqeal.pdf
Remarques :





SM20780 Assemblage de tuiles et erreurs


Description


Le premier résultat important dans l'étude des pavages définis par interactions locales est l'existence de règles locales qui forment uniquement des pavages quasipériodiques mais non périodique. Tout naturellement, ces objets ont servi à modéliser les quasicristaux découverts en 1982 par D. Shechtman (Prix Nobel 2011). On s'intéresse à la structure de tels pavages et il est apparu des liens profonds avec des propriétés de calculabilité comme l'indécidabilité du problème du domino qui consiste à décider si un jeu de règles locales pave l'espace ou pas.
Une autre approche consiste à s'intéresser à leur formation, c'est à dire comment former des motifs arbitrairement grands en respectant les règles locales et en s'autorisant uniquement des transformations locales (i.e. la transformation n'a pas accès à la totalité du motif). Différents modèles de croissance ont été établis: autoassemblage (on part d'un motif correct et on rajoute des tuiles en respectant les règles locales) ou flipstochastiques (on part d'un motif quelconque et on effectue des perturbations locales pour rétablir les règles locales). Mais les différentes études établissent peu de liens entre les propriétés structurelles et les modèles de croissance.
Le but de ce stage est d'étudier la vitesses de formation d'un pavage en fonction des propriétés algorithmiques issues des règles locales qui le forment. Lorsqu'on prend un motif fini qui respecte les règles locales, compte tenu de l'indécidabilité du problème du domino, on ne peut pas savoir si se motif peut être prolongé. On mettra en place un invariant qui quantifie cette incertitude et cherchera à voir si cela influe sur les vitesse de croissance des modèles d'autoassemblage ou de flips stochastique. En fonction de la sensibilité de l'étudiant, ce stage pourra avoir une tournure théorique ou plus pratique en proposant une étude empirique des modèles sur des jeux de règles locales de différentes complexité algorithmique. URL sujet détaillé : http://www.i2m.univamu.fr/
Remarques :





SM20781 Mesures atteignables asymptotiquement par un automate cellulaire surjectif


Description


L'étude empirique d'un automate cellulaire est basée sur l'observation d'un diagramme espace temps initié par une configuration tirée au hasard suivant une mesure probabilité. C'est l'approche qu'a eu S. Wolfram lorsqu'il a souhaité classifier les comportements possibles par un automate cellulaire et c'est l'approche standard que l'on adopte lorsqu'on souhaite modéliser un phénomène concret. Si on souhaite caractériser le comportement statistique, c'estàdire la fréquence d'apparition asymptotique d'un motif, il est naturel de s'intéresser aux mesures observées asymptotiquement après itération d'une mesure par un automate cellulaire.
Un résultat récent décrit les mesures accessibles montrant la diversité des comportements possibles. Par exemple, étant donné une mesure cible limitecalculable (i.e. il existe une machine de Turing qui prend en argument un motif et donne une suite convergeant vers la fréquence d'apparition de ce motif), on peut construire un automate cellulaire qui partant de la mesure uniforme (qui donne la même fréquence d'apparition aux motifs de même taille) converge vers la mesure cible.
Dans le cadre de ce stage on va s'intéresser aux mesures accessibles pour les automates cellulaires surjectifs. Pour un automate cellulaire surjectif la mesure uniforme est invariante, il est donc impossible d'avoir un résultat similaire. En effet l'implémentation d'un calcul dans les automates cellulaires nécessite la non suriectivité. En fonction de la sensibilité de l'étudiant, ce stage pourra avoir une tournure théorique, en étudiant les propriétés combinatoires des automates cellulaires surjectifs pour déterminer quels type de calcul est implémentable avec la surjectivité, ou plus pratique en proposant une étude exhaustive de =91'petits'' automates cellulaires surjectifs. URL sujet détaillé : http://www.i2m.univamu.fr/
Remarques :





SM20782 Jeu de domination en version MakerBreaker


Description


Durant ce stage, nous nous int=13eresserons =12a des jeux combinatoires =12a deux joueurs dans les graphes. Une classe importante de jeux est ceux du type "MakerBreaker" o=12ù l'un des joueurs essaie de construire un objet tandis que l'autre l'en empeche. Nous nous concen trerons sur le problème=12eme de la domination. =13Etant donné un graphe G(V;E), un ensemble dominant de G est un ensemble de sommets S tel que chaque sommet du graphe est =12a dis tance au plus 1 d'un sommet de S. Dans jeu de domination introduit par Bre=14sar, Klav=14zar and Rall [1], les deux joueurs choisissent =12a tour de rôle des sommets pour S qui dominent au moins un nouveau sommet, le but pour le Maker =13étant de minimiser la taille de S et pour le Breaker de la maximiser. Nous proposons d'=13étudier un nouveau jeu de domination o=12u le Maker choisit des sommets pour S tandis que le Breaker interdit des sommets pour S. Le Maker gagne s'il arrive =12a cr=13éer un ensemble dominant. Autrement dit, le Breaker gagne lorsqu'il a r=13eussi =12a interdire un sommet et tous ses voisins (ce sommet ne pouvant plus être domin=13é). Les questions qui se posent sont de nature di=0B=13fférente : { =13Etant donn=13é un graphe G et la connaissance du premier joueur, peuton d=13écider qui est le gagnant ? (en supposant que les joueurs jouent parfaitement) { Si le Maker gagne sur G, combien de tours au minimum atil besoin pour gagner ? Un jeu similaire a =13et=13e r=13écemment introduit par Bujt=13as et Tuza [2] o=12u le Maker gagne si l'ensemble des sommets choisis par l'un des deux joueurs forme un ensemble dominant mais en imposant qu'=12à chaque coup, le sommet choisi par le joueur doit dominer un nouveau sommet. Dans cette variante, le Maker gagne =12a tous les coups. La situation semble plus complexe dans notre cas. En eff=0Bet, on peut montrer que d=13écider le gagnant est un probl=12ème PSPACEcomplet sur les graphes bipartis (en r=13éduisant depuis le probl=12ème POSCNF [3]. Cependant, dans un certain nombre de cas, il est assez facile pour le Maker de gagner, par exemple si le graphe peut être partionn=13é en cliques de taille au moins 2 (alors le Maker peut facilement choisir un sommet par clique). Cette condition n'est pas su=0Esante et nous essaierons de r=13ésoudre ce probl=12ème dans des sousclasses de graphes comme les graphes triangul=13és. Une autre piste de recherche sera l'=13étude de l'issu du jeu par passage =12a di=0B=13fférents produits de graphes. Lors de ce stage, l'=13étudiante sera amen=13e =12a collaborer avec des membres de l'ANR GAG (Games and Graphs) ainsi que s'il le souhaite, =12a participer aux activit=13és de vulgarisation de Maths =12a Modeler.
[1] B. Bre=14sar, S. Klav=14zar and D. F. Rall. Domination game played on trees and spanning subgraphs, Discrete Mathematics, 313, (2013), 915923. [2] C. Bujt=13as et Z. Tuza. The Disjoint Domination Game. Submitted, 2014. [3] Thomas J. Schaefer. On the Complexity of Some TwoPerson PerfectInformation Games. J. Comput. System Sci. 16 (1978) 185225. URL sujet détaillé :
:
Remarques : Coendacrement avec Aline Parreau (CR CNRS, LIRIS)





SM20783 Anisometric texture synthesis optimization


Description


The goal of *texture synthesis* can be stated as follows: Given and input texture sample, synthesize a new output texture that, when perceived by a human observer, appears to be generated by the same underlying process.
Anisometric texture synthesis generalizes texture synthesis to allow local rotation and scaling of the texture according to a Jacobian field.
We are interested in gradientbased texture optimization methods [Kwatra 2005]. These methods view the texture synthesis as a minimization problem between the input texture sample, and the output texture.
The goal of the master internship is to formulate an effective texture optimization method for anisometric synthesis, and implement it. URL sujet détaillé : http://www.jdumas.org/misc/anisometric_texture_synthesis.pdf
Remarques : The internship will be coadvised by Sylvain Lefebvre (http://www.antexel.com/sylefeb/research), and Jérémie Dumas (http://www.jdumas.org).





SM20784 Topic centered social media collecting


Admin


Encadrant : El=F6d EGYEDZSIGMOND 
Labo/Organisme : IDENUM (Intelligent cities). This project aims to study in a bottomup way, the echo of an event in the social media. The project is a collaboration between computer science (LIRIS Lab) and communication sciences (ELICO Lab). 
URL : http://liris.cnrs.fr/membres?idn=eegyedzs 
Ville : Villeurbanne 



Description


The master project is part a new project: IDENUM (Intelligent cities). This project aims to study in a bottomup way, the echo of an event in the social media. The project is a collaboration between computer science (LIRIS Lab) and communication sciences (ELICO Lab).
The scientific challenges are twofold:
The first challenge is the semiautomatic specification of the event description. A first textual description is given by a human user. This description has to be extended and clarified through semantic enrichment and statistical methods based on semantic resources and social network based resources. The extended description has to be translated into queries and filters in order to retrieve and select information from social networks.
The second challenge concerns the long term topic centered collect. It is to make evolve the description according to the evolution of the topic during the collection period. Clustering and machine learning methods have to be combined in a novelty detection algorithm to extend and modify the queries.
The proposed methods has to be validated on the TREC challenge datasets but also new evaluation and validation methods have to be created. URL sujet détaillé : http://fex.insalyon.fr/get?k=Ki4hGpJOJPYonvwbDQt
Remarques : Stage pouvant être rémunéré. En collaboration avec l'entreprise WebCastor et la Bibliothèque Municipale de Lyon, possibilité de suite en thèse CIFRE.





SM20785 The Strength of Safra's Construction


Description


This internship subject aims at calibrating the mathematical strength required for some reasoning on automata on infinite words. Automata running on infinite words (such as nondeterministic B=FCchi automata) provide an established framework for the specification and verification of nonterminating programs (in particular via the model checking technique). However, some of their basic properties are known to require nontrivial reason ing principles. This in particular the case of Safra's construction which translates a nondeterministic B=FCchi automaton to a deterministic =93Rabin=94 automaton (this construction is in particular used for the nontrivial task of complementing B=FCchi automata). The goal of this internship is to calibrate the logical strength required to prove the correctness of Safra's construction. We target a subsystem of secondorder arithmetic called WKL0 (for Weak K=F6nig Lemma) which asserts, for every infinite tree T on a finite alphabet, the existence of an infinite set representing an infinite path in T. URL sujet détaillé : http://perso.enslyon.fr/colin.riba/sujets/wkl.pdf
Remarques : The internship will be cosupervised with Henryk Michalewski (http://duch.mimuw.edu.pl/~henrykm/doku.php),
MIMUW, University of Warsaw





SM20786 Witness Extraction for MSO on Infinite Words


Description


The goal of this internship is to propose a constructive variant of Monadic Second Order Logic (MSO) on infinite words, with witness extraction abilities. MSO on infinite words is a yardstick language for expressing properties on nonterminating systems. It is a decidable theory thanks to effective translations to finite state automata. However, these translations involve nontrivial algorithms, with nonconstructive correctness proofs. On the other hand, MSO on infinite words can also be axiomatized as a secondorder nonconstructive arithmetic. We propose to work on an intuitionistic variant of MSO based on a restriction of MSO to particular infinite words called ultimately priodic words. The ultimately periodic words are exactly the infinite words which can be generated by finite state deterministic automata; they are finitely representable and computable, and an MSOformula is statisfiable on infinite words if and only if it is satisfiable on ultimately periodic words. URL sujet détaillé : http://perso.enslyon.fr/colin.riba/sujets/reg.pdf
Remarques : The internship will be cosupervised with Henryk Michalewski (http://duch.mimuw.edu.pl/~henrykm/doku.php),
MIMUW, University of Warsaw.





SM20787 Indicateurs de diversité dans les graphes de recommandation


Description


Au cours de la dernière décennie, les systèmes de personnalisation de recherche sont deve nus omniprésents sur le web, qu'il s'agisse des moteurs de recherche mêmes ou des pratiques de recommandations (vidéos, sites, contacts, achats de produits etc). La recherche de contenus en ligne est désormais assistée par des algorithmes qui conditionnent l'expérience de navi gation des utilisateurs. Cette évolution soulève des questions capitales aussi bien du point de vue informatique que sociologique. En particulier, un effet désormais bien identifié est le phénomène de bulle de filtres (ou filter bubble), correspondant à la restriction de la diver sité des sources d'information imposée par les systèmes de recommandation. Comprendre et mesurer ce phénomène nécessite dans un premier temps de définir des outils pour mesurer la diversité des contenus recommandés. L'objectif de ce stage est de proposer des métriques qui s'appuient sur la structure du graphe entre agents et objets recommandés pour définir la diversité dans ce contexte. La représentation des données sous forme de graphes bipartis (agent/objet) et valués per met en effet de rendre compte des évaluations faites par les agents des différents objets. Par ailleurs, l'utilisation d'une représentation en graphe permet d'introduire naturellement, par l'analyse de la structure, des concepts tels que la proximité de nóuds ou la ressemblance des environnements. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2015/10/sujet_recomm.pdf
Remarques :





SM20788 Dynamique de l'Internet


Description


Le but du stage est de mieux comprendre la dy namique de l'Internet, à la fois en étudiant des données issues de mesures déjà effectuées par l'équipe et par des simulations du modèle. Plusieurs aspects peuvent être développés (suivant les compétences spécifiques du stagiaire et ses centres d'intérêt, l'accent pourra être mis sur l'un ou l'autre de ces aspects) URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2014/10/dyn_inet.pdf
Remarques :





SM20789 Dynamiques de graphes


Description


L'étude des grands graphes apparaissant en pratique a connu ces dernières années un essor hors du commun. Les objets étudiés sont d'origines diverses, comme par exemple la topologie de l'internet (machines et câbles), le graphe du web (pages et liens), les échanges pairàpair (qui échange des données avec qui), mais aussi les réseaux sociaux, les réseaux biologiques ou les réseaux linguistiques. La plupart de ces graphes ne sont pas fixes. Au contraire, ils évoluent au cours du temps : des sommets et/ou des arêtes apparaissent et disparaissent. Cette dynamique joue un rôle essentiel dans de nombreux cas. Par exemple, la dynamique du web peut permettre d'identifier des thèmes émergents, celle des échanges permet d'étudier les comportements des utilisateurs (et d'utiliser ces résultats pour optimiser les protocoles), la dynamique de l'internet permet d'étudier sa fiabilité, etc. Or, il est en général délicat de capturer ces dynamiques et, même lorsque des données sont disponibles, il est nontrivial de les décrire et de les analyser. Cette problématique, bien qu'identifiée comme essentielle, est encore aujourd'hui largement à défricher. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2015/09/dynamique.pdf
Remarques :





SM20790 Conversations, Groupes et Communautés dans les Flots de Liens


Description


Un flot de liens est une suite de triplets (t,u,v) indiquant que les entités u et v ont interagi à l'instant t.} Il peut s'agir d'échanges de messages (emails}par exemple) entre individus, de transferts de paquets entre machines sur un réseau, d'achats en ligne par des clients, d'appels téléphoniques, ou encore de contacts entre individus observés par des capteurs. Les contextes pratiques où les données se modélisent naturellement comme des flots de liens sont extrêmement nombreux. Ces objets sont donc cruciaux pour de nombreuses applications, notamment pour un large spectre de questions de sécurité (attaques réseaux, connivences, fraudes, comportements malicieux, etc).
Dans tous ces contextes, des sousflots jouent des rôles particuliers. Par exemple, dans des échanges de messages, des sousstructures de discussions émergent naturellement (comme des fils de discussion sur des listes de diffusion). Dans les contacts entre individus observés par des capteurs, on retrouvera des réunions d'amis ou collègues. Dans des appels téléphoniques ou des transferts de fichiers, on peut identifier une diffusion de rumeur ou d'information. Enfin, dans du trafic réseau, les échanges entre machines participant à une application distribué e (comme un système pairàpair ou un botnet par exemple) forment de tels sousflots. La figure cidessus illustre la modélisation de ces diverses réalités par des flots de liens.
L'objectif central de ce projet est d'étudier les structures de sousflots dans les flots de liens, et ce selon plusieurs axes. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2014/02/sujet_court.pdf
Remarques :





SM20791 Détection d'événements et/ou d'anomalies dans les dynamiques de graphe


Description


Dans de nombreux contextes (trafic réseau, transactions financières, communications entre individus, etc), on est confrontés à des données sous forme de flux de liens : la donnée est essentiellement composée d'une série de couples (A, B) avec un horodatage t indiquant que A a interagi avec B à l'instant t (par exemple, la machine A a envoyé un paquet à la machine B, ou le compte bancaire A a transféré de l'argent au compte B, la personne A a envoyé un message à la personne B, etc).
Dans tous ces contextes, l'analyse de la dynamique des interactions, et notamment la détection d'anomalies et/ou d'événements dans cette dynamique, est un enjeu crucial tant pour les applications que d'un point de vue fondamental. On peut penser par exemple à la détection d'attaques sur un réseau, de fraudes bancaires ou de corruption, ou encore de changements significatifs dans un réseau social. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/evenements.pdf
Remarques :





SM20792 Modélisation des graphes de terrain


Description


Dans de très nombreux contextes applicatifs, on rencontre de grands graphes n'ayant aucune structure simple apparente, que nous appellerons graphes de terrain (par opposition aux graphes explicitement construits par un modèle ou une théorie). Citons par exemple la topologie de l'internet (routeurs et câbles entre eux), les graphes du web (pages web et liens hypertextes entre elles), les échanges divers (pairàpair, email, etc), mais aussi les réseaux sociaux, biologiques ou linquistiques. Il est apparu récemment que la plupart de ces grands graphes ont des propriétés statistiques en commun, et que ces propriétés les différencient fortement des graphes aléatoires 1 utilisés jusqu'alors pour les modéliser. Notamment, ils ont une densité 2 très faible, une distance moyenne faible, une distribution de degrés 3 hétérogène, et une densité locale 4 forte. Depuis lors, de nombreux travaux ont été menés visant à capturer ces propriétés dans des modèles, nécessaires tant pour effectuer des simulations que pour étudier formellement ces objets, et bien sûr pour en comprendre la nature. Les graphes aléatoires classiques capturent la densité faible (qui est en fait un paramètre du modèle) et la distance moyenne faible. Par contre, ils ont une distribution des degrés homogène et une densité locale faible. Nous sommes également en mesure de générer un graphe aléatoire à distribution de degrés donnée. On capture ainsi toutes les propriétés citées cidessus sauf la densité locale forte. Malgré de nombreuses tentatives, générer des graphes ayant également une densité locale forte tout en gardant leur caractère aléatoire reste un problème ouvert. Les attentes sont pourtant extrêmement fortes. URL sujet détaillé : http://www.complexnetworks.fr/wpcontent/uploads/2013/10/modelisation.pdf
Remarques :





SM20793 Continuations et logique pour processus mobiles


Description


In the literature on semantics of concurrency, many transformations between calculi have been studied and many of them appear as related to the continuation passing style (CPS) translations that are wellknown in the context of imperative or functional computation: synchronous to asynchronous, polyadic to monadic, functional to concurrent, higherorder to firstorder, etc. In the functional setting, these techniques have a strong logical counterpart (related to G=F6del translations by doublenegation) that allowed for significant advances in proof theory and semantics of programming languages.
The aim of this internship is to study this notion of continuation passing in the context of process calculi with mobility (like the πcalculus). Recent work on operational semantics and type systems suggest that such techniques are the proper way to exploit the power of proof theory in the study of process calculi. URL sujet détaillé : http://iml.univmrs.fr/~beffara/stagem2if2015.pdf
Remarques :





SM20794 Aggregation of complex graphs  Application to biomedical data


Description


This project is related to the aggregation or consensus theory. This theory is about any process of merging several objects (numbers, symbols, structures...) in one object which represents them well. Such processes are formalized as aggregation functions aka consensus functions. There is a growing need for such aggregation procedures in a great variety of domains (mathematics, statistics, physics) but also in many application areas (biology, medicine, humanities...).
For instance in biology, the aggregation is used to identify consensus motifs in DNA sequences or in taxonomic trees (phylogenetics). Cluster analysis applied to graphs aims at grouping objects which are classifications (partitions, trees, hierarchies) based on a similarity/dissimilarity measure. In case of large volumes of data, aggregation approaches often rely on heuristic procedures and rarely produce the same consensual output structures. The difficulty is then to define robust aggregation rules relevant to complex graph data. A starting biomedical project (FRHU project "Fight Heart Failure") will provide us with such complex data relative to patients' history in connection with biological background(data on genes, diseases, biological ontologies...). These patient data can be represented as labeled property graphs (LPG). Indeed, LPG data model offers a good support for data visualization by end users but also for analyzing and mining these data according to user's constraints.
The first objective of the internship is to make a state of the art covering the main relational data models (including graph shaping data models) and existing aggregation operators on such models (at least SQLbased, SPARQLbased, OLAPbased). The second objective is to define aggregation operators on labeled property graphs including the temporal dimension and domain specific ontologies. The final objective is to allow a good consensusbased clustering of a group of patients.
References : Peter C. Fishburn, Bernard Monjardet: Concordance Graphs. Eur. J. Comb. 21(4): 461471 (2000)
M. Grabisch, J.L. Marichal, R. Mesiar, E. Pap : Aggregation Functions. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2009.
O. Hartig: Reconciliation of RDF* and Property Graphs. CoRR abs/1409.3288 (2014).
O. Hudry, B. Monjardet : Consensus Theories. An oriented survey. Math. Sci. hum / Mathematics and Social Sciences 190(2) : 139167 (2010)
Y. Tian, R. A. Hankins, J. M. Patel: Efficient aggregation for graph summarization. In proc. SIGMOD Conference 2008: 567580.
R. De Virgilio, A. Maccioni, R. Torlone: Converting Relational to Graph Databases. In proc. 2nd International Workshop on Graph Data Management Experiences and Systems, GRADES 2014. URL sujet détaillé :
:
Remarques : Coencdarants : Chedy RAISSI (CR Inria) et Miguel COUCEIRO (Professeur Université de Lorraine) Rémunération et possibilité de poursuite en thèse.





SM20795 Spécification formelle et vérification par model checking de modèles d'exigences


Description


Since the early 2000s, various approaches have been proposed to support the step where informal requirements (in a software specification) are transformed into formal ones. Formal requirements can then be verified (coherence, refinement, ...) before starting the design of the software (or system) under development. In earlier work, ONERA proposed an abstract framework, Khi, which allows to formally specify requirements related to the expected behaviours of the system and also the agents (active entities of the system : subsystems, software components, ...) and the assignment of requirements to groups of agents. In order to give a formal semantics to Khi, a temporal multiagent logic has been proposed, USL (Updatable Strategy Logic), which allows to express that a group of agents has a strategy to ensure a property. USL is strictly more expressive than the main other multiagent logics (ATL, SL) and keeps a decidable model checking problem. USL allows to give a semantics to Khi and to translate requirement verification problems into USL model checking instances.
The goal of the proposed work is to enrich and refine these earlier works on Khi and USL. Two main tasks have been considered, and are not meant to be both completely fulfilled during the internship.
The first task consists in studying the logic USL and developing a model checker (in a functional programming language), which will allow different theoretical and practical experimentations (identification of a tractable fragment of USL, development of a case study). The second task consists in improving the language Khi in order to get a true specification language, for which the verification would be sent to the model checker prototype developed in the first task. URL sujet détaillé : http://www.onera.fr/sites/default/files/u494/stageexigencemodelchecking.pdf
Remarques : coencadrement avec David Chemouil (ONERA Toulouse) http://www.onera.fr/fr/staff/davidchemouil





SM20796 Identification dans les structures discrètes


Description


Dans ce sujet, nous nous intéressons au problème d'identification d'éléments dans un ensemble fini structuré. Etant donné un ensemble d'éléments, comment peuton retrouver un élément particulier défectueux en effectuant des tests sur des sousensembles d'élément ? Ces tests sont des questions du type : l'élément recherché estil dans ce sousensemble ?
Plus formellement, étant donné un ensemble V, nous cherchons une collection C de sousensembles de V de telle sorte que les ensembles I(x) = {c in C; x in C} avec x in V , soient distincts deux à deux et non vides. L'objectif est de minimiser la taille de C. Sans contraintes sur les sousensembles que l'on peut choisir pour C, le problème est trivial et l'on peut trouver un code de taille log V. Ce n'est plus le cas dès que l'on ajoute des contraintes sur les sousensembles choisis.
Ici, nous considérerons pour V un ensemble dénombrable de points d'un espace Euclidien et pour C un ensemble de " formes ". Le terme forme pourra désigner par exemple l'ensemble des disques de l'espace considéré, ou alors les disques avec un rayon fixé. Ce problème très général englobe de nombreux sousproblèmes comme celui des codes identifiants dans les graphes d'intersection de disques unitaires (NPcomplet). L'étudiant considèrera ici des problèmes un peu plus accessibles comme celui où les points de V sont alignés et les disques sont tous de rayon 1, ou alors lorsque tous les disques de tous les rayons sont possibles. URL sujet détaillé :
:
Remarques : Coendacrement avec Aline Parreau (CR CNRS, LIRIS)





SM20797 Emulation of Arbitrary Applications with SimGrid


Description


The Simterpose project, which is the core of this proposal, tries to intercept the actions of real applications to mediate them through the simulator. This would allow to run unmodified arbitrary applications on top of SimGrid. For that, Simterpose intercepts all system calls and mediate the communications according to the results computed by the simulator while the computations are only benchmarked so that their timings can be injected into the simulator.
Please refer to the pdf file, here: http://people.irisa.fr/Martin.Quinson/Research/Students/2016mastersimterpose.pdf URL sujet détaillé : http://people.irisa.fr/Martin.Quinson/Research/Students/2016mastersimterpose.pdf
Remarques : Gratification; Suite en thèse souhaitée.





SM20798 Handling CPU affinity of Code in the Simulation of Parallel Applications


Description


Analyzing and understanding the performance behavior of parallel applications on various compute infrastructures is a longstanding concern in the High Performance Computing community. When the targeted execution environments are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In the context of applications implemented with the Message Passing Interface (MPI), offline simulation consists in getting a trace of an execution of the application that is then ``replayed'' on a simulated platform.
The main objective of this internship is to leverage the block structure of an existing framework proposed in to better take their CPU affinity in the calibration of the CPU processing rate for the offline simulation of MPI applications. To achieve this, the candidate will have to study and seize earlier work on that topic. Then s/he will analyze existing execution traces, and maybe acquire new ones, to propose a characterization of the respective CPU affinity. Depending on the first obtained results, the candidate may have to propose a way to extrapolate information gathered on calibration instances to target, i.e., large instances of the studied application(s). Finally the candidate will conduct a thorough evaluation of the impact of the CPU affinity on the accuracy of simulation results obtained thanks to the SimGrid toolkit with regard to actual experimentations. URL sujet détaillé : http://graal.enslyon.fr/~fsuter/affinity.pdf
Remarques :





SM20799 Chromatic properties of subdiviedline graphs


Description


The subdiviedline graph operation consists in transforming a graph by first subdividing each edge once (replacing each edge by a path of length two) and then taking the line graph (the line graph of a graph is the graph representing incidences between the edges, i.e. the vertices are the edge of the original graph and two vertices are neighbor in the line graph if the corresponding edges have a common endpoint in the original graph). This operation is also called the cliqueinflation [2] since each vertex is replaced by a complete subgraph.
Among graphs that can be obtained by subdividedline operations are the socalled Sierpiński graphs that correspond to the configurations of the towers of Hanoi game [3].
The aim of this internship is to study colorings of subdividedline graphs. Determining the chromatic number (least number of colors used in a proper coloring of the vertices of the graph) is an easy exercise. The minimum number to color properly the edges is a (less easy) exercise too. For total coloring (both vertices and edges), determining the total chromatic number is much harder and only partial results are known [1]. The student will have first to perform a state of the art about colorings of subdividedline graphs and second to find some new results for some coloring variations like fractional/circular, acyclic, list, equitable, bounded,... colorings.
[1] T. Hasunuma, Structural Properties of SubdividedLine Graphs,Volume 8288 of the series Lecture Notes in Computer Science pp 216229,2013.
[2] O. Togni, Optical alltoall communication in inflated networks. Proc. of the 5th annual International Workshop on GraphTheoretic Concepts in Computer Science, WG'98. Lecture Notes in Computer Science Vol. 1517.:79 =96 87, Springer Verlag, 1998.
[3] A. M. Hinz, D. Parisse, Coloring Hanoi and Sierpiński graphs, Discrete Mathematics, Volume 312, Issue 9, 6 May 2012, Pages 15211535. URL sujet détaillé :
:
Remarques :





SM207100 Parallel and Distributed Simulation of LargeScale Distributed Applications


Description


Executive summary: The context of this project is to allow the efficient parallel and distributed simulation of large systems within the SimGrid framework. The proposed work will improve the existing parallel simulation mode, and propose a novel distributed simulation mode. We target a simulation comprising millions of heavy computational nodes on a much smaller cluster.
Key skill required: System programming and networking programming in C on Linux.
Context ======SimGrid (developed by the Myriads team in collaboration) is a toolkit providing core functionalities for the simulation of distributed applications in heterogeneous distributed environments. The specific goal of the project is to facilitate research in the area of distributed and parallel application scheduling on distributed computing platforms ranging from simple network of workstations to Computational Grids.
This framework was shown orders of magnitude faster than concurrent simulators such as GridSim or PeerSim, and can simulate up to a few million lightweighted P2P processes on a single node [1]. This falls however short to simulate ExaScale systems, as these systems are expected to count dozen of millions of heavy processes. Both CPU and memory limitations must be overtaken to scale the simulation further. In a previous work, we shown that parallel simulation can improve the computational performance in some cases [2], but the memory limitation claim for the distribution of the simulation to leverage the memory of several computational nodes.
Precise Work Description =======================First, we would like to improve the parallel execution mode described in [2]. Currently, only events occuring at the exact same timestamp are executed in parallel. This criteria must be relaxed to increase the amount of potential parallelism exhibited by the application. This require to determine the future events that are independent of the events executed in the current timestamp. Such future events can safely be added to the current timestamp. This detection must rely on an independence theorem, very similar (but still novel) to the ones used when applying Dynamic Partial Order Reduction on model checking.
Then, we would like to propose a distributed execution mode allowing to overcome memory limitations in very large scenarios. Several designs are possible to that extend. The intern is expected to develop several proof of concepts to understand their relative advantages. S/he will then select the best design through a careful evaluation before implementing the selected design. The ultimate goal is to run a typical HPC application (such as linpack, used for the Top 500 ranking) using a sizable portion of the Grid'5000 experimental facility.
Bibliography ===========[1] Laurent Bobelin, Arnaud Legrand, David Marquez, Pierre Navarro, Martin Quinson, Frédéric Suter, Christophe Thiéry. Scalable MultiPurpose Network Representation for Large Scale Distributed System Simulation. 12th Intl Symposium on Cluster Computing and the Grid (CCGrid'12), 2012. http://hal.inria.fr/hal00650233
[2] Martin Quinson, Cristian Rosa, Christophe Thiéry. Parallel Simulation of PeertoPeer Systems. 12th ACM/IEEE Intl Symposium on Cluster Computing and the Grid (CCGrid'12), Canada, May 2012. http://hal.inria.fr/inria00602216 URL sujet détaillé : http://people.irisa.fr/Martin.Quinson/Research/Students/2016mastersimpar.pdf
Remarques : Gratification; Suite en thèse souhaitée.





SM207101 Bisimulations for the rhocalculus


Description


The rhocalculus [1] extends lambdacalculus with rewriting notions, such as firstorder patterns, which can be passed to functions as arguments and manipulated by terms. The calculus is useful for studying the theory of rulebased programming languages such as Tom[2]. The goal of this internship is to study the behavioral theory of the rhocalculus, i.e., to find how to prove formally that two rhocalculus terms behave in the same way. To do so, one has to define relations called bisimulations and prove their soundness. More precisely, we want to extend the already existing bisimulations for the lambdacalculus to the rhocalculus. We aim at proving equivalences between patterns or startegies in the rhocalculus, or maybe between programs in Tom.
[1] rho.loria.fr [2] tom.loria.fr URL sujet détaillé :
:
Remarques : Stage coencadré avec Horatiu Cirstea





SM207102 On conjunctivedisjunctivepredicative logics and their role in type theory


Description


Background. In recent years a large number of logical systems, corresponding to a broad range of proofcultures, have been introduced in Computer Science. Often these are associated with with software systems serving Interactive Theorem Provers (ITP), such as Coq, Isabelle, Agda, and Epigram.
Research Goals. Our aim is to make a study of the unionintersection type discipline, injected into the Logical Frameworks. Specifically, our project aims to: (i) develop a type theory incorporating both dependenttypes and intersection+union types, ii) revalidate the metatheory for the resulting calculus, iii) implement a proofofconcept of the type checking, type reconstruction and type inhabitation algorithms. The last will be the most challenging, requiring an humanmachine interaction, due to the well known undecidability of the type inhabitation problem in the Logical Framework. We expect the candidate to be able to deal with type systems and typedλcalculus, especially with dependenttypes and to be fluent in the OCaml programming language. The output of this stage will be (either or both of) i) advances in the type/proof theory/logic foundations, such as for example a decidable pattern matching algorithm; ii) an implementation of a prototype interactive proof assistant, build from scratch, or forked from an existing ITP based on LF typetheory (like Twelf or Beluga). URL sujet détaillé : https://www.dropbox.com/s/5mybbqqat1g0lzd/Stage%20%20conjunctivedisjunctivepredicativelogicsatwork.pdf?dl=0
Remarques : Note:
Possibility of a coadvising with Prof. Daniel Dougherty, and small/medium stays in WPI, MA USA.
http://web.cs.wpi.edu/~dd/
http://web.cs.wpi.edu/
MidLong term project, with concrete possibility to pursue with a Ph.D.
Payment: ~ 500 euros / month





SM207103 Privacy Control for Online Social Networks


Description


The majority of social networks (like Facebook, LinkedIn, etc) provide control functions to limit the visibility of certain data (such as friend list, wall posts and images) to a specific user group. However, most of users are unaware of the risks associated with the publication and exchange of personal data on social networks. For example, publishing and sharing location information on Facebook could easily lead to a burglary. Accordingly, the risks due to data sharing are constantly increasing, allowing privacy attacks with unfortunate consequences, and making people very reticent to remain socially active (e.g. staying connected with friends and expanding friendship circle). To get online social activities with greater confidence and less risk, it is imperative to devise tools that allow users to control themselves the usages that their data can be destined to. These tools assist users to detect and minimize the dissemination and use of personal information.
The objective of this internship is devising an environment for monitoring interactions in social networks based on user defined privacy requirements. This environment is expected to have the following functionalities:
1. Detection of privacy vulnerabilities: Privacy risks may appear either directly after online data publication (e.g. finding a user's phone number within a wall post) or indirectly through an inference of private information (e.g. deducing sexual orientation from some friendship relations). In this part, we will propose a methodology for characterizing and building direct and indirect attacks. For direct attacks, given a user target, we will provide efficient algorithms for crawling a social subgraph in order to fix a subset of attributes. Since indirect attacks allow extracting information not explicitly stated in user profiles, we will combine algorithmic and statistical approaches to infer data with high probability.
2. Protection against privacy vulnerabilities: When privacy vulnerability is detected, it may arise from one or several users linked by friendship relations. To eliminate or minimize privacy vulnerabilities, we plan to explore two tradeoff techniques. The first one must combine optimally two possible actions: (a) hiding sensitive attributes (such as home address, email address and phone number) and (b) not disclosing some friends to others. Besides a binary logic (publish or hide), the second technique enables us to change the semantics of the published information in such a way it becomes less accurate (or noised). This technique has to adapt some anonymization methods [1, 2, 3] (used for offline publication) for online user interactions.
References: [1] H. H. Nguyen, A. Imine and M. Rusinowitch. =93Anonymizing Social Graphs via Uncertainty Semantics=94. In ACM Symposium on Information, Computer and Communications Security (ASIACCS), 2015, pp. 495506.
[2] H. H. Nguyen, A. Imine and M. Rusinowitch. =93Differentially Private Publication of Social Graphs at Linear Cost=94. To appear in IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2015.
[3] H. H. Nguyen, A. Imine and M. Rusinowitch. =93A Maximum Variance Approach for Graph Anonymization=94. In Symposium on Foundations and Practice of Security (FPS), Best paper, 2014, pp. 4964. URL sujet détaillé :
:
Remarques :





SM207104 Multilinear structured polynomial systems


Description


Polynomial systems appear in many applications of engineering and information sciences. Most of the time, the systems to be solved appear with a special structure. Exploiting these structures by tuning algorithms for obtaining dedicated strategies is an efficient way to handle the intrinsic hardness of polynomial system solving.
In the context of Gr"obner bases, several progresses have been achieved during these last years. They allowed to understand and improve algorithms for solving systems of multilinear equations (quadratic systems which have degree one in several blocks of variables), determinantal systems (systems of minors of matrices with polynomial entries) and unmixed sparse systems (those with few monomials each of them appearing in each equation).
At the same time, the use of Gr"obner bases algorithms has increased to solve systems coming from matricial problems, e.g. semidefinite programming in optimization, nonnegative matrix rank factorization, semidefinite positive rank factorization, etc. Various algorithms have been proposed. To solve these problems, a common basic building block is to compute points where those matrices have a small rank. However, this basic task is still not well understood, both from the modeling and the algorithmic viewpoint. The goal is to obtain practically fast Gr"obner bases algorithms for computing those points.
A starting point is to consider matrices whose entries are linear forms with possibly additional structures (symmetric matrices, Hankel matrices, etc.). Algebraic properties of zerodimensional ideals (with a finite number of solutions) defined by minors (of fixed size) of these matrices are well studied. In the positive dimensional case, much more remains to be done. The first step is of course to identify the modeling yielding the best practical results (considering incidence varieties and other natural definitions of rank deficiencies) depending on the various parameters of the problem (size of the matrix, number of variables, etc.). A second step is to exploit as much as possible the structure of the considered modeling. Most of the modelings that will be studied are characterized by structures that have not been studied yet in the context of Gr"obner bases. Hence, there is here some algorithmic work to do to understand the geometry of these systems (dimension, degree of the algebraic set), the complexity of computing Gr"obner bases of ideals generated by these systems and how to tune the algorithms to obtain better practical performances.
Finally, in the positive dimensional case, an important task is to compute critical points of some polynomial map restricted to the set where the considered matrices are rank defective. This will lead to systems which are characterized by even more algebraic structures that are to be studied.
The new strategy will be evaluated by the impact on some challenging applications which remain unsolved. URL sujet détaillé : http://wwwpolsys.lip6.fr/~safey/Stages/2015/stagematrixrank.pdf
Remarques : Coencadrement avec JeanCharles Faugère (http://wwwpolsys.lip6.fr/~jcf/)





SM207105 Algorithms in Real Algebraic Geometry


Description


This internship is dedicated to the development of algorithms and implementations for solving multivariate polynomial systems {em over the reals}. As in the univariate case, polynomial systems solving  when this means deciding the existence of roots  is usually more difficult over the reals than over the complex numbers.
However, polynomial system solving over the reals appears in many applications of engineering sciences and most of the time information on the set of real solutions are requested from the enduser. The requested information vary a lot depending on the application. Most of the time, they consist in begin{itemize} item deciding the existence of real solutions to polynomial systems of equations and inequalities (emptiness test decision); item computing a system defining the projection on a given linear space of the real solution set of a given system of equations and inequalities (quantifier elimination) item deciding if two given points lie in the same connected component of the real solution set of a given polynomial system of equations and inequalities (roadmap computation). end{itemize} Many progress have been achieved on all these topics these recent years towards the design of algorithms with the best known complexity and are practically fast in practice. However, many questions remain open.
At the heart of the aforementioned progress, topological properties of critical points and values play a crucial role. These properties hold under some special assumptions (properness assumptions). Recently, a notion of generalized critical values has been defined to get rid of these properness assumptions. One goal would be to design fast algorithms for computing generalized critical values.
These could be used to dramatically improve quantifier elimination algorithms, in particular the projection phase which consists in computing an algebraic set containing the frontier of the real set to be projected. Other steps of quantifier elimination are also subject to severe improvements. For instance, describing the connected components of real solution sets to polynomial systems of equations and inequalities is a problem that is still not solved efficiently in practice.
Finally, new roadmap algorithms have been developed recently with complexity improvements to the longstanding (25 years old) complexity bounds due to Canny. Developing this new family of algorithms would yield practically fast algorithms for deciding connectivity queries. URL sujet détaillé : http://wwwpolsys.lip6.fr/~safey/Stages/2015/rag.pdf
Remarques :





SM207106 Revisiting the complexity of easy problems


Description


Recently, many computer scientists have started revisiting very well known ``easy" problems in order to design algorithms that can be applied to very large graphs. As an example, recent progress have allowed to compute the (exact) diameter of facebook, wikipedia, etc. [BCH+15] using few Breadth First Search (BFS). These algorithms have quadratic worstcase complexity but appear to have much better (linear) practical complexity. On the other hand, the diameter cannot be computed (in general graphs) in subquadratic time unless the strong Exponential Time Hypothesis fails [RW13]. *OBJECTIVE 1: Hence, one objective of this project is to give theoretical explanation of this good practical behavior. An interesting research direction is the study of parameterized algorithms: that is, algorithms that perform in time O(f(k)n) in the class of graphs with some parameter bounded by k, where f is a function depending only on k that must be as small as possible. As far as we know, there are very few parameterized algorithms that address polynomialtime solvable problems [AWW15]. *OBJECTIVE 2: what is the complexity of computing the diameter in the class of graphs of maximum degree (or degeneracy) at most where a fixed parameter? On the other hand, experiments show that multisweep algorithms achieve good performances in graphs of bounded hyperbolicity (roughly hyperbolicity measures the ``space" between shortest paths [BCCM15]). Hence, we will address the question of the complexity of computing the diameter in the class of graphs of hyperbolicity at most k. In particular, [BCH14] presented a quasi linear Karp reduction that reduces the problem of deciding whether the diameter is 2 or 3 (which cannot be solved in time ^{2epsilon} ETH fails [RW13]) to the problem of computing the hyperbolicity of a given pair of vertices. *OBJECTIVE 3: A natural question is to know whether a practical algorithm for the latter problem may be turned into a practical algorithm for the first one. More generally, there exist subcubic, subquadratic or linear algorithms that link various problems [BCH14]. Can we use them to ``transfer" the practical efficiency of an algorithm from one problem to another one? Diameter is not the only graph parameter that we plan to consider in this context: many graph parameters are very important since they provide accurate information about the structure of the distances and of the shortest paths in a graph (important for routing). For instance: hyperbolicity, treelength [CDN14] and centrality (measuring the fraction of shortest paths in which a node is involved). *OBJECTIVE 4: A goal of this project concerns the design and analysis of efficient (in practice) algorithms to compute these properties. REFERENCES: [AWW15] A. Abboud, V.V. Williams, J.R. Wang: Approximation and Fixed Parameter Subquadratic Algorithms for Radius and Diameter. CoRR abs/1506.01799 (2015) [BCCM15] M. Borassi, D. Coudert, P. Crescenzi, A. Marino: On Computing the Hyperbolicity of RealWorld Graphs. ESA 2015: 215226 [BCH+15] M. Borassi, P. Crescenzi, M. Habib, W.A. Kosters, A. Marino, F.W. Takes: Fast diameter and radius BFSbased computation in (weakly connected) realworld graphs: With an application to the six degrees of separation games. Theor. Comput. Sci. 586: 5980 (2015) [BCH14] M. Borassi, P. Crescenzi, M. Habib: Into the Square  On the Complexity of QuadraticTime Solvable Problems. CoRR (2014) [CDN14] D. Coudert, G. Ducoffe, N. Nisse. Diameter of Minimal Separators in Graphs. Research Report, INRIARR8639, 2014. URL sujet détaillé :
:
Remarques : coencadrant: David Coudert (Inria) The student must have excellent knowledge about graph theory, graph algorithms and complexity. Programming abilities will be a plus (Python, Java, etc.)





SM207107 Efficient algorithms for computing structured polynomial systems and applications


Description


Recently~cite{faugere:hal00953501}, we have proposed variants of solve efficiently Sparse Polynomial systems whose support lie in the same support (this case is known as the unmixed case); we have also obtained sharp complexity results when the monomials in the support are integer points in a lattice polytope. We want to investigate two new research directions. A natural possible extension of this work would be the generalization to mixed systems (where the algorithms and the complexity would depend on the Newton polytope of {em each} of the polynomials of the system). Some preliminary results seem to indicate that such a generalization may be possible.
Another restriction of the result, is that the current complexity analysis (that is bounding the maximal degree occurring in the Gr"obner basis computation) is restricted to the polytopal case. However, a classical question is to bound the number of solutions in the algebraic closure of a polynomial system where all polynomials have generic coefficients. When the exponent vectors of the monomials are the points with integer coordinates in a lattice polytope, Kushnirenko's theorem shows that the number of solutions is bounded by the normalized volume of the polytope.
A natural question that arises is then to extend this work from the polytopal case to the case where only a small subset of monomials appear in the equations (emph{fewnomial systems}). A noticeable result would be to compute a sparse Gr"obner basis of such a system in polynomial time when the number of monomials in the support is close to the number of variables.
From an implementation point of view, implementing efficiently this new generation of algorithms in a general framework is also a new research area. For instance, several algorithms from convex geometry or the combinatorial world would be necessary components of an efficient implementation. Also, merging the existing approach relying on the sparseMatrix F5 with a Buchberger's type approach~cite{St96} could lead to a termination criterion of the algorithm in the nonregular cases and for positive dimensional systems.
From the application point of view applications of structured systems are also numerous. For instance, in Cryptography, the algebraic systems arising in algebraic cryptanalysis are often structured. Hence efficient computation of the discrete logarithms in finite finite fields rely on solving efficiently bilinear systems. Also the security of McEliece, one of the oldest public key cryptosystem, can be reduced to the hardness of solving a set of multihomogeneous equations of bidegree 100 24 25 29 44 46 100 110 118 200 201 2591,d) Consequently, a possible application is to develop the fastest asymptotic keyrecovery attack against codebased schemes (for some parameters). For structured systems over finite fields, investigating whether the information on the structure could be used to improve a hybrid approach is also as new research direction. URL sujet détaillé : http://wwwpolsys.lip6.fr/~jcf/Teaching/M2GB.pdf
Remarques :





SM207108 Datadriven characterisation of social tie heterogeneities in real information cascades


Description


The recent and ongoing collection of large digital datasets opens up the possibility to follow the evolution of realworld social systems, which were impossible to study before due to their size and complexity. However, at the same time these studies highlight the limitations of our understanding on the conceptual, theoretical, and modeling aspects of complex systems. The goal of the proposed internship is to address these challenges and advance the general knowledge about the emergence and evolution of social contagion phenomena. These processes define a general metaphor for a wide range of phenomena from information diffusion and innovation spreading to the emergence of collective behaviour and grassroots movements.
We are currently building a corpus of Twitter with 10% of the French tweets posted by users residing in the GMT and GMT+1 time zones. The collection of this corpus began in late June 2014 and has currently more than 60 million tweets, of which approximately 6.5% are geolocalised. In parallel, we collect the list of all users followed by the authors of these tweets. We have so far a database of nearly 1.3 million users, which information is refined and enhanced to build a social network between them. In addition, the combined data corpuses of the social network and individual twitter posts allow us to directly follow the spread of adoption of hashtags, ideas, memes, etc.
In this internship out aim is to detect and quantify emerging adoption cascades disseminated via social influence. Beyond the mere observations of adoption cascades our second aim is to identify whether the underlying process behind the social spreading phenomena can be interpreted as a complex or as a simple contagion phenomena (Centola 2007). Finally, our aim is to understand how social influence, and thus the emergence of adoption cascades, depends on the strength of social ties.
The internship will employ a single candidate  M2 student coming from the fields of Computer Science or Physics and preferably enrolled in the modelling complex systems master program. The candidate should present programming skills and advanced understanding on network theory and statistical analysis, with an interest in datadriven research. The internship will be carried out at ENS LyonLIP in the DANTE Inria team, and will take place at IXXI RhoneAlpes Complex System Institute under the supervision of M=E1rton Karsai and Eric Fleury (ENS LyonLIP, Inria). URL sujet détaillé :
:
Remarques : Cosupervisor: Pr. Eric Fleury





SM207109 GreenBigData


Description


Avalon team works on energy efficiency for infrastructures, services and large scale applications. This internship will focus more on data management aspects and impact on energy monitoring and efficiency. URL sujet détaillé :
:
Remarques : This internship will be co advised by Laurent Lefevre and Gilles Fedak from the Avalon team.





SM207110 RDF querying tool for an advanced reasoning engine


Description


We are looking for an ambitious and autonomous intern to work on novel RDF query mechanisms. The goal of the project is to define and implement a software component that will expose functionalities offered by a proprietary innovative reasoning engine as SPARQLcompatible services. Willingness to explore and research potential limitations of such integration is expected. This is a great opportunity for a student with experience in semantic web technologies (e.g. RDF, SPARQL) and/or knowledge of formal reasoning techniques and corresponding languages and models. REQUIREMENTS:  PhD candidate or researchoriented master student  Interest in Semantic Web technologies (RDF, SPARQL, OWL) is required; experience with the tools used in the semantic web stack is desirable  Familiarity with programing language techniques (esp. parsing techniques & grammar definition)  Good knowledge of Python or D (Java could also be considered). DURATION: 6 months START DATE: ASAP APPLICATION: The intern will work closely with researchers in an international environment, and will be strongly encouraged to produce scientific publications. Inquiries are welcomed (to be made at jeanyves.viondury at xrce.xerox.com and nikolaos.lagos at xrce.xerox.com). To submit an application, send your cover letter and CV to the same addresses, with copy to xrcecandidates at xrce.xerox.com . Please include in your CV reference(s) we can contact for letters of recommendation. URL sujet détaillé :
:
Remarques : http://www.xrce.xerox.com/AboutXRCE/Jobs/WorkingatXRCE





SM207111 Information Extraction for Real Business


Description


Information Extraction (IE) is the task of automatically extracting structured information from unstructured and/or semistructured machinereadable documents [Wikipedia].
Xerox is interested in automating complex IE tasks on real customer usecases, which provides challenges of various types: adaption of existing tools, generation of specific lexical resources, use of various technologies (tagging, categorization, inference).
In this context, this internship will focus on:  the design of a methodology (sketch) for this endtoend problem using Xerox usecases.  the implementation of a solution (prototype). The candidate will have to use Xerox internal and stateoftheart tools.  a comparison of existing stateoftheart methods/algorithms (rulebased approach, CRF, supervised, unsupervised,...) We will provide data from real usecases (documents and groundtruth) Experience in Information Extraction and Natural Language Processing is recommended.
Master Student in Computer Science, or preferably Ph.D. student
Duration: 69 month Start ASP URL sujet détaillé :
:
Remarques : http://www.xrce.xerox.com/AboutXRCE/Jobs/WorkingatXRCE





SM207112 Flexible Car Pooling Tool


Description


Duration: 6 months  Dates will be confirmed after the interview, also based on the availability of the candidate Start Date: December 2015or later Description The Work Practice Technologies team at the Xerox Research Centre Europe (XRCE) is conducting a research project on Sustainable Commuting. In this project, we are investigating best practices by workers and companies that reduce the use of singleoccupancy vehicles. The aim is to develop technologies that can support those practices. In particular we are working on tools to support flexible carpooling among colleagues in a workplace. This internship is for one or two student(s) who will work with the team on the construction of a proofofconcept of a tool to permit colleagues to share rides from their homes to their workplace and viceversa. The idea is to allow users to organize work related carpooling with colleagues in a flexible manner with respect to both location and time. The successful intern candidate(s) will create such proofofconcept building upon preliminary work done by the internship proposers. The desired job profile includes very good SW development skills (i.e., can develop a SW prototype independently while defining the requirements with the team) and background in HCI (Human Computer Interaction)/CSCW (Computer Supported Collaborative Work). To submit an application, please send your CV and cover letter to both xrcecandidates.xerox.com and Stefania Castellani (stefania.castellani.xerox.com) , Jutta Willamowski (jutta.willamowski.xerox.com). Please specify "Flexible Car Pooling Tool" in your subject line. URL sujet détaillé :
:
Remarques : http://www.xrce.xerox.com/AboutXRCE/Jobs/WorkingatXRCE





SM207113 Spectral Learning Methods for Knowledge Base Expansion


Description


Start Date Around Feb 2016 Duration 45 months Description Machinereading is the task of extracting structured data from unstructured text and generating a representation of the data that enables efficient interactions between a user seeking for information and a large textual corpora. Xerox is interested in automating complex machinereading tasks on real customer usecases. In this context, the objective of the internship is to develop novel machine reading methods that take as input a large textual corpora and a knowledge base and output new domain relevant concepts to expand the KB. At the technical side we will focus on spectral learning algorithms for learning functions over structured objects to achieve these goals. While there has been a considerable amount of research on developing spectral learning method in the context of supervised learning, there is little prior work on applying them for the concept discovery scenario. We expect as an outcome of the internship a publication in a top NLP conference. The ideal candidate will have knowledge of Machine Learning and Natural Language Processing and be a Master or preferably PhD student in Computer Science. URL sujet détaillé :
:
Remarques : http://www.xrce.xerox.com/AboutXRCE/Jobs/WorkingatXRCE





SM207114 Personal Language Analytics for Emotion, Sentiment and Personality Modelling


Description


Start Date Around FebMarch 2016 Duration 46 months Description Personal Language Analytics (PLA) is an approach to text mining whereby the focus is on the authors of texts rather than the texts themselves. It is a computational field which combines aspects drawn from natural language processing, data mining, linguistics, psychology and sociology.
At XRCE we are interested in understanding how language can be used across cultures to express mental states, like sentiment, emotion or mood; to convey a sense of an individual's personality. This internship will explore the intersection of these areas as part of a much larger project on customer modelling and personalisation.
We are particularly interested in the areas of:  textual expression of emotion in human dialogue;  grounding and focus of this emotion;  the relationship between the degree of emotional expression and personality traits of the interlocutors.
Tasks/Responsibilities
o Contribute to design of annotation schemes and corpora annotation for complex PLA tasks using dedicated annotation platforms. o Work with and extend existing linguistic processing and text analytics tools o Develop prototype text classification models and design experimental evaluation program.
Ideal candidates are Masters or PhD students with: a strong background in natural language processing, experience in linguistics along with text/data mining and machine learning; preferably an ability to script (i.e. python); and ideally an interest in human language use.
Application instructions
Informal inquiries are welcome and can be made to scott.nowson.xerox.com or caroline.brun.xerox.com .
To submit an application, please send your CV and cover letter to all of: xrcecandidates.xerox.com ; caroline.brun.xerox.com and; scott.nowson.xerox.com . URL sujet détaillé :
:
Remarques : http://www.xrce.xerox.com/AboutXRCE/Jobs/WorkingatXRCE





SM207115 Structural problems on excluded directed minors and graph immersions


Description


[Note: this internship position has been occupied already. Please don't contact us if you are looking for an internship.]
Start date: 25 Jan
The project looks at developing structural graph theoretic tools in the context of general labeled graphs.
Specific topics being studied under the project include structural problems on excluded directed minors as well as coloring problems and graph immersions.
References:
[1] Paul Wollan. The structure of graphs not admitting a fixed immersion. JCTB 110 (2015), pp 47=9666. URL sujet détaillé :
:
Remarques :





SM207116 Transformation affine d'image avec précision arbitraire


Description


Using affine or linear transformations such as rotations, symmetries, homotheties and shear mappings is commonplace in digital image processing. These transformations are usually modelled as matrices of real numbers. Because floatingpoint numbers can not represent all real numbers, but only a finite set of rational numbers, the implementations of affine transformations often have properties which differ from those of the real transformations they are supposed to represent. This may be the case for algebraic properties (such as bijectivity), topologic ones (such as connexity), and geometric ones (such as distance conservation).
QuasiAffine Applications (a.k.a. by QAA) are obtained by the discretization of affine applications. They can be represented as matrices of rational numbers and allow to describe linear or affine transformations such as discrete rotations. An ΩQAA can be viewed as a sequence of these QuasiAffine Applications whose successive values (rational numbers) are closer to the actual real application. Using this representation allows both capturing the discrete features of the plane (or space) as well as computing exactly with a discrete model of the continuum based on integers only : the HarthongReeb line.
In this internship, we shall state and prove the mathematical properties of the ΩQAA as well as those of their implementation. We shall describe these applications formally using the Coq proof assistant and then prove formally their basic properties in this setting. Several algorithms based on the ΩQAA and implementing digital images transformations are being currently developed using Python/Sage. The first step will consist in carrying out experiments to state some properties of these transformations as conjectures, especially with respect to its connections with the reference real transformation. Next, we shall describe these algorithms in Coq and formally prove their properties. URL sujet détaillé : https://dptinfo.ustrasbg.fr/~magaud/TrADiCont/SujetStageM2.pdf
Remarques : Coencadrement entre les équipes MIV et IGG du laboration ICube. Stage à la frontière des mathématiques et de l'informatique. Stage rémunéré.





SM207117 Performance study of making IDSs selfadaptable in IaaS clouds


Description


Infrastructure as a Service (IaaS) clouds use virtualisation to allow tenants to outsource their information systems in the operator's cloud. Key characteristics of IaaS clouds include scalability, multitenancy and resource sharing. Tenants can also automate the creation, destruction or reconfiguration of virtual machines (VM) and networks. However, the same characteristics that make IaaS clouds agile and dynamic, also affect the ability of a security monitoring framework to successfully detect attacks [1]. Traditional security monitoring frameworks are not designed to automatically cope with reconfigurations of the virtual environment, and the potentially high rate of such reconfigurations makes it impossible for a security administrator to reconfigure the security monitoring framework accordingly. For these reasons, a successful cloudtailored security monitoring framework should automatically adapt its components whenever the information system is reconfigured or relocated in the cloud.
A security monitoring framework is based on probes in the information system, like Intrusion Detection Systems (IDS). In particular, Networkbased IDSs (NIDS) analyse the network traffic to detect attack attempts and abnormal behaviour.
The purpose of this internship is to extend a selfadaptable cloud security framework called SAIDS [2], in order to understand the requirements of different kinds of IDSs, and study the performance impacts and benefits of using adaptation in the security monitoring framework. The student should:
 write SAIDS drivers for Suricata and Bro NIDSs (SAIDS already features a driver for Snort);
 study the performance impact of NIDS adaptation on cloud configuration changes (VM creation, deletion, migration as well as addition or removal of services in VMs);
 study how selfadaptation can help making NIDSs scale with the monitored network traffic.
References :
[1] N. Shirazi et al. Assessing the impact of intracloud live migration on anomaly detection. In Proc. CloudNet, 2014.
[2] A. Giannakou et al. Towards Self Adaptable Security Monitoring in IaaS Clouds. In Proc. CCGrid, 2015. URL sujet détaillé :
:
Remarques : Coadvised with Christine Morin





SM207118 Omegaregular real functions


Description


A notion of function from the real numbers to the real numbers, computable by finite automaton, was introduced in a 2013 article. It is shown that the continuity of a function can be decided given an automaton computing this function. The goal of this internship is to study the decidability of other properties, such as differentiability, and more generally to get a better understanding of this class of functions, through characterizations. An implementation is also possible, in order to visualize functions described by automata. URL sujet détaillé :
:
Remarques :





SM207119 Inria student internship


Description


Abstract: This project addresses unsupervised object localization in a general scenario where a given image collection contain multiple dominant object classes and even noisy images without any target objects.
Objective: 1) develop an efficient optimization algoritihm for the problem of object discovery and segmentation, building on our recent work (Cho et al., 2015). 2) develop better image feature and representation to improve the performance. 3) evaluating the algorithm on a challenging benchmark dataset (e.g. MS COCO dataset.) URL sujet détaillé :
:
Remarques : Coadvised by Prof. Jean Ponce, ENS Paris.





