SM2071 Performance Analysis of Congestion in Commodity Networks for Exascale Computing Platforms


Description


The Mont-Blanc project aims at building a supercomputer based on commodity low-power hardware such as ARM processors and Ethernet technology. It will not reach Exascale, but it is a first step in that direction.
Although there are numerous articles explaining how to tune or modify TCP to optimize end-to-end bandwidth in an HPC context, most of them completely ignore flow control and congestion management. Our initial experiments with TCP/Ethernet revealed non-trivial network congestion phenomena in the presence of intensive HPC workloads. More precisely, some MPI collective operations and the NAS Parallel Benchmarks conjugate gradient can lead, even at moderate scale, to critical congestion situations that force the whole application to freeze until a TCP timeout is triggered. Such blocking situations last only about 200 milliseconds, but this is unacceptable on a platform where typical communication times are on the order of a few microseconds. While such timeouts are harmless in a wide-area context over the Internet, they can occur systematically under such HPC workloads and slow down the whole execution by more than 30%. Beyond the mere application slowdown, it should be understood that the whole set of processors, even when forced into idleness, keeps consuming energy uselessly. Given the scale evolution of HPC platforms, this issue is likely to quickly become impossible to overlook.
There are several options to circumvent this issue:
1. Such timeouts seem to be linked to the TCP retransmission timeout (RTO), which could be decreased. The value of this timeout (200 ms) is unfortunately hard-coded in the kernel. Although having kernels specifically tuned for HPC workloads is perfectly legitimate, it is quite difficult to know how to set this value. Indeed, aggressively decreasing this timeout may trigger massive retransmissions that could make the network collapse even more.
2. Since most TCP variants are likely to run into the same kind of issues, another option could be to use a completely different protocol, such as SCTP, which manages congestion in a very different way.
3. Yet another option could be to have user-space daemons that smartly inject packets into the system to prevent such timeouts from occurring.
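A back-of-the-envelope calculation illustrates why these stalls matter. The collective count and stall probability below are illustrative assumptions, not measurements from the experiments:

```python
# Back-of-envelope estimate of the RTO problem described above.
# The collective count and stall probability are illustrative
# assumptions, not measurements.

def slowdown(base_time_s, n_collectives, stall_prob, rto_s=0.2):
    """Expected slowdown when each collective may stall for one RTO
    (200 ms by default) with probability `stall_prob`."""
    expected_stall_s = n_collectives * stall_prob * rto_s
    return (base_time_s + expected_stall_s) / base_time_s

# A 10 s run with 1000 collectives, 2% of which hit the 200 ms timeout:
print(slowdown(10.0, 1000, 0.02))   # about 1.4, i.e. a 40% slowdown
```

Even a small per-collective stall probability is enough to reproduce slowdowns of the magnitude observed above.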
The goal of this project is to evaluate these different options. Experiments will be conducted on the Grid'5000 platform (probably on Nancy's Graphene cluster). In a first step, the student will familiarize themselves with the experimental procedures and with the techniques to control their experimental environment (e.g., recompiling the kernel, preparing custom images, ...).
If time allows, similar experiments will be conducted on the Mont-Blanc prototype. Such a study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_tcp_analysis.pdf
Remarks: This internship is, of course, paid...





SM2072 Modeling and Simulation of Dynamic Applications for Exascale Computing Platforms


Description


Multicore architectures comprising several GPUs have become mainstream in the field of High-Performance Computing. However, obtaining the maximum performance of such heterogeneous machines is challenging, as it requires carefully offloading computations and managing data movements between the different processing units. The most promising and successful approaches so far rely on task-based runtimes that abstract the machine and use opportunistic scheduling algorithms. As a consequence, the problem shifts to choosing the task granularity and task graph structure, and to optimizing the scheduling strategies. Trying different combinations of these alternatives is itself a challenge. Indeed, getting accurate measurements requires reserving the target system for the whole duration of the experiments. Furthermore, observations are limited to the few systems at hand and may be difficult to generalize. Finally, since execution times on real machines exhibit variability, dynamic schedulers tend to make varying scheduling decisions, and the obtained performance is thus far from deterministic. This makes performance comparisons questionable and the debugging of non-deterministic deadlocks inside such runtimes even harder.
Simulation is a technique that has proven extremely useful for studying complex systems and would be a very powerful way to address these issues. Performance models can be collected for a wide range of target architectures and then used to simulate different executions, running on a single commodity platform. Since the execution can be made deterministic, experiments become /completely reproducible/, which also makes debugging a lot easier. Additionally, it is possible to extrapolate target architectures, for instance by increasing the available PCI bandwidth, the number of GPU devices, etc., and thus even estimate the performance that would be obtained on hypothetical platforms.
In a recent Euro-Par article, we showed how we crafted a coarse-grain hybrid simulation/emulation of StarPU, a dynamic runtime system for heterogeneous multicore architectures, on top of SimGrid, a simulation toolkit specifically designed for distributed system simulation. This approach obtains performance predictions accurate within a few percent on classical dense linear algebra kernels in a matter of seconds, which allows both runtime and application designers to quickly decide which optimizations to enable, or whether it is worth investing in higher-end GPUs.
It is thus currently possible to precisely evaluate the performance of *dynamic* HPC applications running on a single hybrid node, but evaluating such applications at larger scale remains quite challenging. StarPU was recently extended to exploit clusters of hybrid machines by relying on MPI [starpumpi]. Since SimGrid's ability to accurately simulate MPI applications has already been demonstrated (PMBS'13), combining both works should allow investigating performance predictions of complex applications on large-scale high-end HPC infrastructures.
The goal of this internship is thus to experiment with dynamic HPC applications based on StarPU-MPI and to evaluate their performance in simulation. In a first step, medium-size clusters of hybrid nodes will be used. If time allows, this validation could be conducted on the Mont-Blanc prototypes, which rely on ARM processors and GPUs similar to the ones that may be used in future Exascale platforms. Such a performance study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_starpu_mpi.pdf
Remarks: This internship is, of course, paid...





SM2073 Analysis and Modeling of Cache Performance for Exascale Computing Platforms


Description


The Mont-Blanc project aims at building a supercomputer based on ARM technology. It will not reach Exascale, but it is a first step in that direction. ARM processors have almost no memory hierarchy and may thus be easier to model than the more recent multicore architectures used in PetaScale machines.
The goal of this internship is thus to experiment with multicore ARM processors similar to the ones that may be used in future Exascale platforms, and to derive from such experimentation a performance and power consumption model. Measuring the parameters of such a model will require designing specific micro-benchmarks that reach the peak performance of these processors, which in turn requires exploring a wide parameter space of compilation options, loop unrolling, and vectorization options. To ease such code generation, we will rely on the [[https://forge.imag.fr/projects/boast/][BOAST]] framework, which allows metaprogramming computing kernels and their optimizations.
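As an illustration of such a parameter-space sweep, the sketch below enumerates loop-unrolling factors for a trivial summation kernel and keeps the fastest variant. It is written in Python for brevity and is not BOAST itself, which metaprograms real compiled kernels:

```python
import timeit

# Illustrative sketch: generate one kernel variant per unroll factor and
# time each one. BOAST does this for real compiled kernels; here the
# "kernels" are generated Python functions, which suffices to show the idea.

def make_kernel(unroll):
    """Build a summation kernel whose inner loop is unrolled `unroll` times."""
    lines = ["def kernel(data):",
             "    acc = 0",
             f"    for i in range(0, len(data), {unroll}):"]
    lines += [f"        acc += data[i + {k}]" for k in range(unroll)]
    lines.append("    return acc")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["kernel"]

data = list(range(1024))              # length divisible by every factor tried
timings = {u: timeit.timeit(lambda k=make_kernel(u): k(data), number=200)
           for u in (1, 2, 4, 8)}
print("best unroll factor:", min(timings, key=timings.get))
```

In the real setting, each point of the parameter space additionally spans compiler flags and vectorization options, and the measured quantity includes power consumption.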
Such a model will then be implemented in the [[http://simgrid.gforge.inria.fr/][SimGrid]] toolkit to evaluate the performance of simple benchmarks such as LINPACK on such platforms. Such a study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_cache_analysis.pdf
Remarks: This internship is, of course, paid...





SM2074 Response Time Optimization of Multi-User BOINC Projects


Description


Recent evolutions in internetworking technology and the decreasing cost-performance ratio of computing components have enabled Volunteer Computing (VC) systems. Among these platforms, BOINC ([[http://boinc.berkeley.edu/][Berkeley Open Infrastructure for Network Computing]]) is the most representative, with over 580,000 hosts that deliver over 2,300 TeraFLOP per day and many projects deployed. In such platforms, volunteers usually donate their machines' idle CPU time to scientific projects and indicate preferences on how to share their processing power among them. BOINC projects usually have hundreds of thousands of independent CPU-bound tasks and are interested in overall throughput, i.e., maximizing the total number of tasks completed per day. Each project has its own server, which is responsible for distributing work units to clients, collecting results, and validating them.
To handle this complexity, BOINC implements a simple and robust distributed protocol which ensures that local shares and volunteer priorities are respected. This protocol has been in use for many years, and its efficiency and fairness have been assessed in the context of throughput-oriented projects.
Lately, the BOINC workload has evolved in two noticeable ways.
First, to ease the burden of deploying and maintaining BOINC servers, multi-user projects (also called /umbrella/ projects) have appeared. Such BOINC projects host several scientific projects (in what follows, we will talk about users rather than scientific projects to avoid confusion with BOINC projects), each with their own application, which provides a much larger audience to every user than if they had deployed their own server. For example, the [[http://www.worldcommunitygrid.org/][World Community Grid]], funded by IBM, has been one of the first such BOINC projects, helping research on drug search for Leishmaniasis, clean water localization, childhood cancer, etc. More recently, [[http://casathome.ihep.ac.cn/][CAS]], a Chinese project, has been set up to help protein structure, nanotechnology, cancer genomics, and high-energy physics research. Likewise, [[http://www.ibercivis.es/][Ibercivis]], a Spanish-Portuguese initiative, has been set up to provide a huge BOINC umbrella project for scientists in those countries, and [[http://rechenkraft.net/yoyo/][Yoyo]] helps research on harmonious trees, elliptic curve factorization, and DNA evolution, among others. In such BOINC projects, demand for computing power generally exceeds supply, hence the need for a mechanism to divide the supply among users.
Second, although in the past most users had an unlimited supply of jobs to process (throughput-oriented users), the popularization of BOINC has attracted other types of users. In CAS, many users have campaigns (batches) made of fewer tasks and are interested in the response time of campaigns. For such users, throughput is not meaningful and response time should be minimized (response-time-oriented users). In such multi-user projects, each user has a numeric quota representing the fraction of the computing resource they should get. What are the precise semantics of quotas?
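One candidate semantics, sketched below purely as an illustration (defining the actual semantics is part of the internship question), is proportional sharing with water-filling: each round, the available capacity is split among users with pending work in proportion to their quotas, and any surplus is redistributed:

```python
# Hypothetical quota semantics, for illustration only: proportional
# sharing with water-filling. Users with little demand get what they
# ask for; the surplus is redistributed proportionally to quotas.

def allocate(capacity, quotas, demand):
    """Split `capacity` among users proportionally to `quotas`,
    capping each user at their demand and redistributing the surplus."""
    alloc = {u: 0.0 for u in quotas}
    active = {u for u in quotas if demand[u] > 0}
    remaining = capacity
    while active and remaining > 1e-12:
        total_q = sum(quotas[u] for u in active)
        for u in active:
            share = remaining * quotas[u] / total_q
            alloc[u] += min(share, demand[u] - alloc[u])
        remaining = capacity - sum(alloc.values())
        active = {u for u in active if alloc[u] < demand[u] - 1e-12}
    return alloc

print(allocate(100, {"a": 1, "b": 1, "c": 2}, {"a": 10, "b": 50, "c": 50}))
# -> {'a': 10.0, 'b': 40.0, 'c': 50.0}: user "a" is capped at its demand,
#    and the spare capacity goes to "b", whose demand is still unmet.
```

Even this simple semantics leaves open how response-time-oriented batches should be prioritized within a user's share, which is exactly the kind of question the internship addresses.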
Ideally, the mechanism should accommodate both classes of users. The GridBot project [gridbot, Silberstein11, Silberstein12] recently made use of a hybrid computing platform composed of grids, clusters, and VC to execute workloads resulting from a mix of throughput- and response-time-oriented applications. Such initiatives are very encouraging but somewhat /ad hoc/, and they consider neither the problem of streams of batches nor the resource-sharing issue. We think that regular BOINC servers can be improved to handle such workloads seamlessly.
In the last BOINC workshops ([[http://boinc.berkeley.edu/trac/wiki/WorkShop11][2011]], [[http://boinc.berkeley.edu/trac/wiki/WorkShop12][2012]], [[http://boinc.berkeley.edu/trac/wiki/WorkShop13][2013]]), members of the Mescal team, the BOINC team, the World Community Grid project, and the OurGrid project have brainstormed to design a new server architecture that would make it possible to fairly and efficiently handle complex workloads such as the ones previously described. Yet, many parts of this architecture (batch prioritization, batch completion estimation, client reliability estimation, etc.) remain to be specified.
The goal of this internship is thus to first implement a simulation of this prototype and to evaluate its performance under realistic workloads from the World Community Grid or from CAS. This evaluation will then guide the design of different scheduling and estimation mechanisms. The project could involve visits to Berkeley, California to present results, which will later be implemented in the BOINC server and deployed on the World Community Grid and in CAS. Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_boinc.pdf
Remarks: This internship is, of course, paid...





SM2075 Learning algorithms for optimal routing in large-scale systems


Description


Consider the following multi-flow routing problem: N flows (each given by a source, a destination, and a volume) share a communication network (with limited throughput on the links). We want to design a distributed algorithm that discovers the best routes for all flows with only partial knowledge of the network topology and of the routes chosen by other flows.
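To fix ideas, here is a toy instance with flows choosing among parallel links, each flow greedily switching to the least-loaded link it sees. This is a minimal sketch; real instances involve full topologies and multi-hop routes:

```python
# Toy routing problem: n_flows identical flows choose one of n_links
# parallel links; a link's cost grows with its load. Each flow in turn
# switches to a strictly cheaper link, until nobody wants to move.

def best_response_dynamics(n_flows=3, n_links=2, rounds=100):
    choice = [0] * n_flows                 # all flows start on link 0
    for _ in range(rounds):
        stable = True
        for f in range(n_flows):
            load = [choice.count(l) for l in range(n_links)]
            load[choice[f]] -= 1           # load flow f would see elsewhere
            best = min(range(n_links), key=lambda l: load[l])
            if load[best] < load[choice[f]]:
                choice[f] = best
                stable = False
        if stable:                         # no flow can improve: equilibrium
            break
    return choice

print(best_response_dynamics())            # loads end up balanced across links
```

The fixed point reached here is exactly a Nash equilibrium of the induced congestion game, which is the game-theoretic viewpoint discussed below.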
Classical algorithms are based on game theory and see the flows as players, with all the routes from source to destination as their possible strategies. The best algorithms are shown to scale well with the number of flows, but not so well with the number of possible routes per flow. The goal here is to explore alternative approaches where the nodes of the communication network become the active players, with strategies limited to their output links. These new approaches have the potential to solve very large routing problems very fast. Detailed subject URL:
Remarks:





SM2076 Learning and mean-field approximations


Description


Q-learning is an algorithmic technique that learns an optimal policy of a Markov decision process, without prior knowledge of its transition probabilities, while optimizing the reward over an infinite horizon. While this makes the approach particularly attractive from a conceptual point of view, it suffers acutely from the curse of dimensionality, which makes it inapplicable in practice. In this internship, we propose to explore the possibility of aggregating the state space of a large Markov decision process made of several identical objects, and of using Q-learning on the resulting mean field approximation so as to avoid the state-space explosion.
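As a toy illustration of this idea (the dynamics and parameters below are entirely invented for the example), one can run tabular Q-learning directly on the aggregated state "number of objects that are on", which has N + 1 values instead of the 2^N exact configurations:

```python
import random

# Toy illustration (all dynamics invented for the example): N identical
# two-mode objects. The exact state space has 2**N configurations, but
# the aggregated "mean-field" state only counts how many objects are on,
# giving N + 1 states on which tabular Q-learning remains tractable.

N = 10
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N + 1) for a in (0, 1)}  # a=1: turn one on

def step(s, a):
    """Invented transition: the action moves one object, then one random
    object flips on its own; the reward favors keeping half the objects on."""
    s = min(N, s + 1) if a == 1 else max(0, s - 1)
    s = min(N, max(0, s + random.choice((-1, 0, 1))))
    return s, -abs(s - N // 2)

random.seed(0)
s = 0
for _ in range(5000):
    a = random.choice((0, 1)) if random.random() < EPS \
        else max((0, 1), key=lambda act: Q[(s, act)])
    s2, r = step(s, a)
    # Standard Q-learning update, applied to the aggregated state.
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
    s = s2
```

The internship question is precisely when such an aggregation is faithful, i.e., when the mean-field approximation of the original process justifies learning on the small table above.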
This project involves both mathematical modeling and algorithmic techniques. No background knowledge is needed for this project, but some familiarity with probability theory (or a taste for it) will help. Detailed subject URL:
Remarks:





SM2077 Dynamic solution of the memory affinity problem


Description


One of the main problems faced by HPC applications running on modern multicore NUMA architectures is the memory affinity problem: place each task on a processor close to the memory where the data used by the task is stored. Considered statically, optimal placement is NP-complete. One way to approach the problem is to dynamically change the task allocation according to an adaptive algorithm based on the aggregate performance of all the allocations used since the start of the program, rather than on the current allocation alone, and thus learn from past mistakes. This approach has never been used in this context, but it should prove very efficient compared with the currently-used static heuristics based on graph partitioning. Detailed subject URL:
Remarks:





SM2078 Assembly and fitness estimation of Dengue virus genomes


Description


390 million people worldwide are infected yearly by Dengue virus, through mosquito bites. No therapeutic is currently available. To control the epidemics, the most promising strategies target the mosquito host to stop viral transmission. However, little is known about how the virus interacts with its mosquito host, which makes it very hard to predict whether those strategies could be successful. To understand how the virus interacts with the mosquito cell, a team of virologists headed by Marlène Dreux runs viral evolution experiments in mosquito cells and sequences the viral population throughout its evolution. The analysis of the resulting sequences will help us identify the sites of the viral genome that interact with the mosquito cell machinery. The master internship aims at developing bioinformatic and statistical methods to (i) assemble complete viral genomes from the sequences and (ii) predict the fitness of the assembled genomes to identify important sites. These tasks require parallel algorithms to handle massive amounts of sequence data, as well as handling of the quasispecies equation to infer genome fitness. The intern will have to implement the above-described algorithms. Good programming skills are most welcome, as well as a good working knowledge of probability and statistics. Moreover, an interest in evolution, virology, or public health issues is essential. The host laboratory (LBBE) is a stimulating and pleasant place to work, where one can meet biologists, physicians, computer scientists, mathematicians, and statisticians working on problems that range from ecology to medicine, through genomics and evolution. Detailed subject URL: https://lbbe.univlyon1.fr/IMG/pdf/project_m2_student_bastien_boussau.pdf?1229/4b82a603f0cc81c9cba9f30b14690da6e18d2747
Remarks:





SM2079 Formal Verification of Security Properties of E-voting Systems


Description


Designing and implementing cryptographic systems that offer effective security guarantees is known to be a complicated task. They are widely-deployed applications involving complex information flows, which makes code reviews and testing less effective. They use modern cryptographic primitives that must be carefully designed: the history of cryptography (including the last decade) is full of examples of flawed cryptographic systems that were long thought secure (and sometimes even proved secure). Furthermore, their security should hold in hostile environments.
Electronic voting schemes are one example of such systems. They are meant to offer at least the same security guarantees as traditional paper-based systems. They make use of cryptographic primitives and rely on protocols that are notoriously difficult to design and verify.
In this internship, we propose to formally define and verify security properties of Helios, one of the best-established academic protocols. The development will be performed in EasyCrypt, a proof assistant jointly developed by the IMDEA Software Institute and Inria for verifying the security of cryptographic constructions in the computational model.
We invite applicants to read the following paper for more information:
https://eprint.iacr.org/2013/235 Detailed subject URL: https://www.easycrypt.info/Internships
Remarks: Remuneration possible
Co-advisors: F. Dupressoir, P.-Y. Strub





SM20710 Formal Verification of a Compiler for Masking-Based Countermeasures


Description


In the real world, attackers have access to side channels that may reveal information about intermediate variables of a computation, even those that depend on secrets. For instance, recent and very efficient attacks have allowed adversaries to recover the secret key using only AES power traces, and the design and implementation of countermeasures is now of central importance.
Masking is one such countermeasure. It consists in using a secret-sharing scheme to split each sensitive variable, and the computations on them, into (t + 1) shares such that observing any t of them does not reveal any information about the secrets. Intuitively, an implementation that is split over (t + 1) shares should be able to resist the leakage of t of its intermediate variables.
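As a concrete illustration of the sharing scheme (a sketch of Boolean/XOR masking of a single byte, not of the masking compiler itself):

```python
import secrets

# Sketch of Boolean (XOR) masking: a sensitive byte is split into t + 1
# shares; any t of them are uniformly distributed and independent of x.

def share(x, t):
    """Split byte x into t + 1 shares whose XOR equals x."""
    shares = [secrets.randbelow(256) for _ in range(t)]
    last = x
    for s in shares:
        last ^= s                  # the last share re-masks x
    return shares + [last]

def unshare(shares):
    """Recombine: the XOR of all t + 1 shares recovers the secret."""
    x = 0
    for s in shares:
        x ^= s
    return x

shares = share(0xAB, t=3)          # 4 shares; any 3 of them leak nothing
assert unshare(shares) == 0xAB
```

The hard part, which the masking compiler automates, is transforming every operation of the program so that it computes on shares without ever recombining them.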
In this internship, we propose to formally verify, in the Coq proof assistant, a masking compiler, i.e., a compiler that transforms an input program into a functionally equivalent program that is resistant to t intermediate observations. The implementation, the functional correctness proof, and the security proofs will all be performed in Coq.
We invite applicants to read the following paper for more information:
http://www.cs.berkeley.edu/~daw/papers/privcirccrypto03.pdf Detailed subject URL: https://www.easycrypt.info/Internships
Remarks: Remuneration possible
Co-advisors: F. Dupressoir, P.-Y. Strub





SM20711 Energy-efficient cloud elasticity for data-driven applications


Admin


Advisor: Anne-Cécile ORGERIE
Lab/Organization: IRISA, Myriads team
URL : http://www.irisa.fr/myriads 
City: Rennes



Description


Distributed and parallel systems offer users tremendous computing capacity. They rely on distributed computing resources linked by networks, and they require algorithms and protocols to manage these resources in a way that is transparent to users. Recently, the maturity of virtualization techniques has allowed the emergence of virtualized infrastructures (Clouds). These infrastructures provide resources to users dynamically, adapted to their needs. By benefiting from economies of scale, Clouds can efficiently manage and offer virtually unlimited numbers of resources, reducing costs for users.
However, the rapid growth of Cloud demand leads to a worrying and uncontrolled increase in electricity consumption. In this context, we will focus on data-driven applications, which require processing large amounts of data. These applications have elastic needs in terms of computing resources, as their workload varies over time. While reducing energy consumption and improving performance are orthogonal goals, this internship aims at studying possible trade-offs for energy-efficient data processing without performance impact. As elasticity comes at the cost of reconfigurations, these trade-offs will take into account the time and energy required by the infrastructure to dynamically adapt the resources to the application's needs.
The validation of the proposed algorithms may rely on the French experimental platform named Grid'5000. This platform comprises about 8,000 cores geographically distributed over 10 sites linked by a dedicated gigabit network. Some of these sites have wattmeters which provide the consumption of the computing nodes in real time. This validation step is essential, as it will ensure that the selected criteria are well observed: energy efficiency, performance, and elasticity. This work will be done in collaboration with the Lawrence Berkeley National Laboratory (USA) in the context of the DALHIS associate team.
References:
[RDG11] N. Roy, A. Dubey and A.S. Gokhale. "Efficient Autoscaling in the Cloud Using Predictive Models for Workload Forecasting". In Proceedings of IEEE CLOUD, 2011.
[Koo11] J. Koomey. "Growth in data center electricity use 2005 to 2010". Analytics Press, 2011.
[ADE10] D. Agrawal, S. Das and A. El Abbadi. "Big data and cloud computing: new wine or just new bottles?". Proceedings of the VLDB Endowment, vol. 3, no. 1-2, pp. 1647-1648, Sept. 2010.
[ODL13] A.-C. Orgerie, M. Dias de Assunção, L. Lefèvre. "A Survey on Techniques for Improving the Energy Efficiency of Large-Scale Distributed Systems". ACM Computing Surveys, 2013.
Advisors: Anne-Cécile Orgerie annececile.orgerie.fr Christine Morin christine.morin.fr Detailed subject URL: http://www.irisa.fr/myriads/openpositions/internships1314/document.20130925.9133673154
Remarks: Contact: Anne-Cécile Orgerie annececile.orgerie.fr Christine Morin christine.morin.fr





SM20712 Large-Scale Graph Processing Platform on Supercomputers


Description


Large-scale graph analytics is one of the important approaches to big data processing for a variety of purposes, such as web and social network mining, life science, cyber security, and so forth. We have been developing an open source billion-scale graph analytics library called ScaleGraph (URL: http://www.scalegraph.org) on top of a parallel and distributed programming language named X10. Over the last 3 years, we have been successful in demonstrating its high-performance nature and in handling the billion-scale whole Twitter follower-follower network on supercomputers with more than 3,000 CPU cores.
In this project, research project members will work on various research themes depending on their interests and skills, using a supercomputer named TSUBAME 2 located at the Tokyo Institute of Technology in Japan. The project is mainly divided into three teams:
A) Graph Algorithms Team: designs and implements the set of parallel and distributed graph algorithms that the ScaleGraph library is currently missing, such as community detection, clustering, link prediction, bipartite matching, and their incremental variants. The team will then test the algorithms on our crawled Twitter network and conduct performance evaluations on TSUBAME 2.
B) System Software Team: tackles various problems such as optimizing the underlying graph processing framework based on the Pregel model, designing and implementing asynchronous graph processing models, and implementing out-of-core graph processing with external memory. We could broaden the scope if you have good ideas.
C) Application Team: working together with the other teams and with domain experts, explores new application areas such as human brain networks, protein-protein interaction, linked data such as RDF, cyber security, and so forth, with the ScaleGraph library.
Each member is expected to deliver both software assets and publications for international conferences. For further information, see the ScaleGraph web page (http://www.scalegraph.org).
Recommended Skills:
- OS, programming languages, tools: Linux, C/C++, Java, MPI, performance profilers, Git/Subversion, etc.
- Knowledge of graph analytics, high-performance computing, parallel and distributed systems
- Ability to write and read publications for international conferences
Duration: 6 months (negotiable). Salary: €1,000-€1,300 (depending on skills). Work Location: UCD School of Computer Science and Informatics, Dublin, Ireland.
Contact: Please send your CV/resume (detailed information such as programming experience is preferable) and your preferred research theme to crest.research.com. If selected, you will be contacted and interviewed by Prof. Toyotaro Suzumura (University College Dublin / IBM Research) and the other team members. Detailed subject URL:
Remarks:





SM20713 Integrated black-box + white-box performance modeling of Cloud applications


Description


Context: Cloud computing offers unprecedented levels of flexibility to efficiently deploy demanding applications. Cloud users have access to a large variety of computing resources with various combinations of configurations and prices. This allows them, in principle, to use exactly the types and numbers of resources their applications need. However, this flexibility is also a curse, as choosing the best resource configuration for an application becomes extremely difficult: cloud users often find it very hard to accurately estimate the requirements of complex applications. It is therefore useful to model the performance of Cloud applications, so that the model can be used to choose the best resource configuration for an application.
The HARNESS European project (http://www.harnessproject.eu) is currently working on two complementary approaches to model the performance of cloud applications. First, a "black-box" method can build a performance model of arbitrary applications. This approach is very general, but it may require a significant number of profiling executions before capturing all the relevant specificities of a new application. We therefore also propose to develop one or more "white-box" performance modeling components. Each such component may be specialized for a particular class of applications, for example MapReduce applications. White-box performance modeling may be faster than black-box modeling at identifying key information, thanks to its domain-specific knowledge of a specific class of applications. When a white-box component can be applied to a user's application, it can thus considerably speed up the black-box performance modeling process.
Research topic:
The objective of this master project is to design and develop a unified performance modeling component which combines the best aspects of both models: a combined black-box/white-box model would be able to handle any type of application, but for a number of common application types it would make use of domain-specific information. The idea is to make the black-box model responsible for exploring possible resource configurations, measuring the performance and cost of the application on these resources, and deriving a performance model from these experiments. Measured metrics are also given to all the white-box models. If one of the white-box models generates useful information, it can inform the black-box model about configurations that should be avoided.
For example, we have a useful white-box model for MapReduce applications which can quickly decide whether using GPUs or FPGAs for executing the map or reduce tasks is likely to provide performance benefits.
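The intended interplay can be sketched as follows; every class, function, and metric name here is a hypothetical stand-in, not the ConPaaS or HARNESS API:

```python
# Hypothetical sketch of the black-box/white-box combination described
# above: the black-box loop profiles candidate configurations, while
# white-box models veto configurations they know to be useless.

class MapReduceModel:
    """Toy white-box model: GPUs only pay off for compute-heavy jobs."""
    def recognizes(self, app):
        return app["type"] == "mapreduce"
    def worth_trying(self, app, cfg):
        return not (cfg["gpu"] and app["compute_intensity"] < 10)

def explore(configs, measure, models, app):
    best, best_cost = None, float("inf")
    for cfg in configs:
        if any(m.recognizes(app) and not m.worth_trying(app, cfg)
               for m in models):
            continue                      # pruned by domain knowledge
        cost = measure(cfg)               # expensive black-box profiling run
        if cost < best_cost:
            best, best_cost = cfg, cost
    return best

def measure(cfg):                         # stand-in for a real profiling run
    return 100.0 / cfg["cores"] + (5.0 if cfg["gpu"] else 0.0)

app = {"type": "mapreduce", "compute_intensity": 2}
configs = [{"gpu": False, "cores": 4}, {"gpu": False, "cores": 8},
           {"gpu": True, "cores": 4}]
print(explore(configs, measure, [MapReduceModel()], app))
# -> {'gpu': False, 'cores': 8}: the GPU configuration was never profiled.
```

The benefit is that each white-box veto saves one expensive profiling run, while applications no white-box model recognizes still fall back to pure black-box exploration.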
The student will be in charge of proposing concrete ways to make the two performance models work together, following these principles. Developments will be based on ConPaaS (www.conpaas.eu), where the black-box model is already implemented. The validation of the proposed algorithms may rely on the French experimental platform named Grid'5000 (www.grid5000.fr). This platform comprises about 8,000 cores geographically distributed over 10 sites linked by a dedicated gigabit network.
This work will be carried out in collaboration with Imperial College London, which is developing white-box performance models for MapReduce and data-streaming applications.
References:
[1] S. Pllana, I. Brandic, and S. Benkner. A survey of the state of the art in performance modeling and prediction of parallel and distributed computing systems. International Journal of Computational Intelligence Research, 4(1), 2008.
[2] E. Ipek, B. de Supinski, M. Schulz, and S. McKee. An approach to performance prediction for parallel applications. In Proceedings of the Euro-Par Conference, 2005.
[3] ConPaaS: a Platform for Hosting Elastic Cloud Applications. Guillaume Pierre and Corina Stratan. IEEE Internet Computing 16(5), September-October 2012.
[4] Imperial-led project aims to unlock processor diversity in the cloud. InformationAge, April 2013. http://bit.ly/1mu926F Detailed subject URL:
Remarks:





SM20714 Using constraint programming to solve


Description


Constraint programming is a powerful tool to solve problems that can be modeled with decision variables and constraints, where a constraint is a relationship between one or more variables that limits the values simultaneously taken by the variables involved in it. Search algorithms in constraint programming typically rely on constraint propagation, to reduce the number of candidate solutions to explore, combined with a systematic search among the possible assignments of each variable. Such algorithms are guaranteed to find a solution when one exists, and they can prove that a problem has no solution if none has been found by the end of the exhaustive search.
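A minimal sketch of this principle on a toy problem (plain backtracking with consistency checks on partial assignments; production CP solvers add much stronger propagation to prune domains early):

```python
# Toy constraint problem: x, y, z in {1, 2, 3}, all different, and x < y.

def solve(variables, domains, constraints, assignment=None):
    """Systematic search: extend the assignment one variable at a time,
    backtracking as soon as a constraint is violated."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return dict(assignment)            # complete and consistent: solution
    var = next(v for v in variables if v not in assignment)
    for val in domains[var]:
        trial = {**assignment, var: val}
        if all(c(trial) for c in constraints):
            found = solve(variables, domains, constraints, trial)
            if found:
                return found
    return None                            # domain exhausted: no solution here

def all_diff(a):
    return len(set(a.values())) == len(a)

def x_lt_y(a):                             # only checked once both are set
    return "x" not in a or "y" not in a or a["x"] < a["y"]

vars_ = ["x", "y", "z"]
doms = {v: [1, 2, 3] for v in vars_}
print(solve(vars_, doms, [all_diff, x_lt_y]))  # -> {'x': 1, 'y': 2, 'z': 3}
```

In the cryptanalytic setting below, the variables encode truncated differences through the cipher rounds, and the constraints encode the propagation rules of the encryption process and key schedule.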
AES has been the block cipher standard since 2001. There are currently no attacks on the full encryption process, although recent results have led to progress in the understanding of its security. The purpose of this internship is to use constraint programming to reproduce the results of the attacks against AES obtained by A. Biryukov et al. in [1] and P.-A. Fouque et al. in [2], and to attempt to extend the method to the Rijndael block cipher. The attack by Biryukov et al., proposed in 2009, is a so-called chosen-key attack against AES. The main idea is to look for truncated differentials that form valid paths through both the encryption process and the key schedule itself. The authors proposed a branch-and-bound algorithm to solve this problem.
In this internship, we propose to solve this problem using constraint programming, building on the results we have already obtained in [3]. The first step will be to validate the results of Biryukov et al. and of Fouque et al. (CRYPTO 2013); the second step will be to extend the approach to the Rijndael cipher.
References:
[1] Alex Biryukov, Ivica Nikolic: Automatic Search for Related-Key Differential Characteristics in Byte-Oriented Block Ciphers: Application to AES, Camellia, Khazad and Others. EUROCRYPT 2010: 322–344
[2] Pierre-Alain Fouque, Jérémy Jean, Thomas Peyrin: Structural Evaluation of AES and Chosen-Key Distinguisher of 9-Round AES-128. CRYPTO (1) 2013: 183–203
[3] Marine Minier, Christine Solnon and Julia Reboul: Solving a Symmetric Key Cryptographic Problem with Constraint Programming. ModRef 2014, workshop of the CP 2014 conference
URL sujet détaillé :
:
Remarques : This internship will be carried out jointly with Christine Solnon from the LIRIS laboratory.





SM20715 An overlay network for a distributed cloud within the backbone


Description


While the Internet was built in a decentralised fashion, we are currently witnessing a strong recentralisation of computing facilities such as those run by Amazon, Google or Facebook. The intrinsic limitations of this model are increasingly visible, chiefly due to reliability, privacy, and energy issues, as users have recently experienced [1].
The Discovery project [2] is driven by the objective of proposing a decentralised cloud architecture grounded in local resources located within the core of the network. Indeed, the infrastructures set up by Internet providers are most of the time over-provisioned relative to actual needs. Installing computing resources along the backbone could therefore reduce the costs related to the construction, maintenance, and energy supply of a computing centre by capitalising on the existing infrastructure. It would also lead to a natural distribution of resources while favouring the use of resources geographically close to the user.
Operating such an infrastructure calls for means to abstract out the details of the complex, heterogeneous and unreliable underlying network, so as to offer a simple API that users can leverage to deploy their applications easily. This requires designing an overlay network, deployed between the infrastructure and the applications, on which the API will rely. A promising approach is the Vivaldi protocol [3], which places nodes in a Cartesian space according to the network distances between them.
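The core of a Vivaldi-style embedding can be sketched in a few lines. This is an illustrative simplification (the function name, the fixed step size `delta` and the 2-D coordinates are ours, not from the protocol specification): each node nudges its coordinate so that Euclidean distances track measured round-trip times.

```python
# Hedged sketch of a Vivaldi-style coordinate update: node i moves along
# the unit vector from node j, proportionally to the estimation error
# between the measured RTT and the current coordinate distance.
import math

def vivaldi_update(xi, xj, rtt, delta=0.25):
    """Return node i's new 2-D coordinate after measuring `rtt` to node j."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = math.hypot(dx, dy) or 1e-9     # avoid division by zero
    error = rtt - dist                    # positive: nodes are "too close"
    ux, uy = dx / dist, dy / dist         # unit vector pointing from j to i
    return (xi[0] + delta * error * ux, xi[1] + delta * error * uy)

# A node at (0, 0) measuring a 10 ms RTT to a peer at (3, 4) (estimated
# distance 5) moves away from the peer, shrinking the error at each step.
```

Repeated over many measurements, such updates converge to coordinates in which close nodes, in the sense needed by the routing layer, can be located cheaply.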
The work will consist in designing such an overlay and defining the routing strategy to be adopted within it. In particular, one needs to be able to efficiently locate close nodes and retrieve objects stored in the network, while minimising the maintenance cost of the overlay. The work will be firstly algorithmic and secondly experimental, validating the proposed overlay network and its routing and maintenance algorithms in a software prototype to be deployed on the Grid'5000 national platform [4]. URL sujet détaillé :
:
Remarques :  Advisor 1: Cédric Tedeschi, INRIA / Univ. Rennes 1 (http://people.irisa.fr/Cedric.Tedeschi)
 Advisor 2: Marin Bertier, INRIA / INSA Rennes (http://www.irisa.fr/asap/?page_id=195)
 Pay: ~500 €





SM20716 Dynamic distributed adaptation of data stream processing platforms


Description


The amount of data generated by social media such as Twitter makes it appealing to be able to extract relevant information from these data in real time. Storm [1,2], initially developed at Twitter, has been designed to offer a simple API for this problem. Developing a Storm application chiefly consists in defining the different steps a data stream will go through so as to extract information from it. Storm provides a way to ease the deployment of this topology over a computing platform.
In other terms, Storm automates the deployment of the set of processes composing the extraction application on the physical computation nodes available, for instance, in a cluster. To achieve and maintain such a deployment, Storm relies on a hierarchical structure monitoring worker nodes so as to replace them in case of a failure. One strong limitation of this approach is its hierarchical shape and its inability to cope with the dynamic nature of computing platforms. Indeed, the top of the hierarchy is occupied by an agent responsible for applying fault-tolerance [3] and load-balancing policies. Also, dynamically remapping the processes onto the computing nodes as the velocity of the stream quickly changes is still an open problem in Storm, and more generally in stream processing platforms [4].
The work will consist in modelling a Storm topology and proposing distributed algorithms (without the need for a monitoring hierarchy) to adapt to dynamic changes in the stream and the platform's conditions. More precisely, the computing power dedicated to the support of each process needs to be dynamically adjusted, and this needs to be done in a decentralised fashion, each node deciding, based on local information about its workload, to duplicate itself or to terminate.
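The decentralised decision each node could take can be sketched as follows. This is our own toy illustration (the thresholds, rates and function name are invented, and this is not Storm's API): a replica duplicates itself when the input rate exceeds the capacity of the current replicas, and terminates when it is mostly idle.

```python
# Hedged sketch of a purely local scaling decision for one stream
# operator: no monitoring hierarchy is consulted, only the node's own
# estimate of input rate and processing capacity.

def local_decision(input_rate, capacity, replicas, high=0.9, low=0.3):
    """Return 'duplicate', 'terminate' or 'steady' from local load only."""
    load = input_rate / (capacity * replicas)
    if load > high:
        return "duplicate"      # the stream's velocity outgrew the replicas
    if load < low and replicas > 1:
        return "terminate"      # release an idle replica
    return "steady"

assert local_decision(950, 100, 10) == "duplicate"
assert local_decision(200, 100, 10) == "terminate"
assert local_decision(500, 100, 10) == "steady"
```

In a real system, hysteresis between the two thresholds (here 0.9 and 0.3) is what prevents replicas from oscillating between duplication and termination as the stream velocity fluctuates.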
Validating this work will rely on the development of a prototype including these algorithms and its deployment over the Grid'5000 platform [5]. URL sujet détaillé :
:
Remarques :  The team:  Marin Bertier (Associate prof., INRIA / INSA Rennes)
 Matthieu Simonin (Engineer, INRIA)
 Cédric Tedeschi (Associate prof., INRIA / U. Rennes 1)
 Christine Morin (Senior Researcher, INRIA)
 Pay: ~500 €





SM20717 Secure multiparty computation using the bitcoin protocol


Description


General context:
The problem studied in the field of Secure Multi-Party Computation (SMPC) is as follows: there are n parties or nodes in a network. Each node has an input which is private. The node does not wish to reveal the input to any other node in the network. A secure multi-party computation protocol enables the nodes to collaboratively compute a function over their inputs such that all inputs remain private, i.e., known only to their owners. An example of such a protocol is private maximum: given n nodes, each with a private input, how to determine the maximum without revealing who holds it or any of the other private inputs. Other examples of possible private computations include geometric functions, set operations, reputation [1], etc.
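The core idea can be illustrated with the simplest SMPC building block, a secure sum via additive secret sharing (a simpler function than the private-maximum example; the modulus and function names are our own toy choices): each party splits its input into random shares, so no single share reveals anything, yet the shares of all parties add up to the result.

```python
# Toy additive secret sharing over a public prime modulus P: any n-1
# shares of a secret are uniformly random, but all n shares sum to it.
import random

P = 2**61 - 1  # public prime modulus, large enough for the toy inputs

def share(secret, n):
    """Split `secret` into n additive shares modulo P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def secure_sum(inputs):
    """Each party shares its input; party j sums the j-th share of every
    party; the public sum of these partial sums is the result."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]
    partial = [sum(col) % P for col in zip(*all_shares)]
    return sum(partial) % P

assert secure_sum([3, 10, 29]) == 42
```

Real protocols for richer functions (such as maximum) combine such sharing with comparison gadgets, zero-knowledge proofs or homomorphic encryption, which is precisely where the resource costs mentioned below come from.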
Objectives of the internship project:
The objective of this project is to study secure multi-party computation protocols. Such protocols are often resource-heavy, which limits their practical application. The specific objective of the project is to analyze and propose methods for rendering these protocols resource-efficient and thus practical. A study of cryptographic techniques and privacy-enhancing technologies such as zero-knowledge proofs, homomorphic encryption, electronic cash, etc. will be part of the project.
Recent advances in cryptographic electronic currencies (e.g. Bitcoin) are also impacting how secure multi-party computation protocols are constructed. In particular, the Bitcoin protocol may be used for building fully decentralized secure multi-party protocols that do not need to rely on trusted third parties. The project will thus also include a study of cryptographic electronic currency protocols.
References:
[1] A Decentralized Privacy Preserving Reputation Protocol for the Malicious Adversarial Model. Omar Hasan, Lionel Brunie, Elisa Bertino, and Ning Shang. IEEE Transactions on Information Forensics and Security (TIFS). April 2013. URL sujet détaillé :
:
Remarques :





SM20718 A spectral approach to symmetries generated by self-similarity transformations


Description


Symmetry plays an important role in our lives, providing our brain with a compact representation of the complex realities of the outside world. For this reason, symmetry processing is a very active branch of computer vision, image processing and image synthesis. Very recent work has made it possible to characterize the most common symmetries (bilateral, rotational or mirror symmetry) through the spectral study of so-called symmetric correspondence matrices. In this internship, we propose to study the possibility of applying spectral analysis to a more sophisticated type of symmetry: those generated by a self-similarity transformation. This kind of symmetry is found everywhere in the plant and animal kingdoms: in the arrangement of the branches and leaves of plants, in the arrangement of leaves, flowers and seeds in phyllotaxis, in the distribution of the chambers of molluscs, etc. This internship will require the student to show imagination and considerable mathematical knowledge, as well as mastery of spectral analysis tools: advanced matrix computation, eigenvalue analysis for large sparse matrices, etc. URL sujet détaillé :
:
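A minimal pure-Python illustration of the spectral viewpoint on symmetry correspondence matrices (the 5-fold example is our own): the correspondence matrix of an n-fold rotational symmetry is a cyclic permutation matrix P, whose eigenvalues are exactly the n-th roots of unity; equivalently, P^n is the identity.

```python
# Sketch: build the correspondence (permutation) matrix of a 5-fold
# rotational symmetry and check that P^5 = I, which forces every
# eigenvalue to be a 5th root of unity.

def cyclic_permutation(n):
    """Correspondence matrix sending point i to point (i+1) mod n."""
    return [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 5
P = cyclic_permutation(n)
M = P
for _ in range(n - 1):          # compute P^n by repeated multiplication
    M = matmul(M, P)
identity = [[int(i == j) for j in range(n)] for i in range(n)]
assert M == identity            # P^5 = I: eigenvalues are 5th roots of unity
```

For self-similarity symmetries the correspondence matrix is no longer a pure permutation, which is precisely what makes the spectral study proposed here non-trivial.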
Remarques :





SM20719 Study of the spectral properties of compact multi-sphere packings


Description


Sampling theory plays a prominent role in image synthesis. Advanced multidimensional sampling systems with controllable spectral qualities can also find applications in other domains that use multidimensional Monte Carlo and quasi-Monte Carlo methods, for example in simulation systems. We propose that the student examine the spectral properties of compact multi-sphere packings, as well as fast methods to generate such arrangements in 2D, 3D, and higher dimensions. This project is fascinating and very ambitious. To understand how non-trivial the compact packing problem is, read the article by Pfender and Ziegler: you will see that some answers to problems posed centuries ago were only provided in 2003 by a little-known mathematician. Or read the books by Thompson or by Conway & Sloane, to see how closely the compact packing problem is linked to other branches of science, for example group theory or the theory of error-correcting codes. References: Pfender, Florian; Ziegler, Günter M. (September 2004), "Kissing numbers, sphere packings, and some unexpected proofs", Notices of the American Mathematical Society: 873–883, http://www.ams.org/notices/200408/feapfender.pdf. Thomas M. Thompson (1983) "From Error-Correcting Codes through Sphere Packings to Simple Groups". Conway, J.H. & Sloane, N.J.H. (1998) "Sphere Packings, Lattices and Groups". URL sujet détaillé :
:
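As a point of entry, the densest 2-D arrangement, the single-radius hexagonal packing, can be generated in a few lines (the construction below is our own sketch, the multi-sphere generalisation being the subject of the internship): centres on a triangular lattice, with the minimal centre distance equal to one disc diameter, so discs touch but never overlap.

```python
# Sketch: generate the centres of a hexagonal packing of unit-diameter
# discs and check the minimal pairwise centre distance equals 1.
import math

def hex_centres(rows, cols):
    """Centres of a hexagonal (triangular-lattice) packing of unit discs."""
    dy = math.sqrt(3) / 2   # vertical spacing between rows
    return [(c + 0.5 * (r % 2), r * dy)   # odd rows shifted by half a step
            for r in range(rows) for c in range(cols)]

pts = hex_centres(4, 4)
dmin = min(math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:])
assert abs(dmin - 1.0) < 1e-12   # discs touch but never overlap
```

The spectral properties mentioned above would then be studied on the point set itself (e.g. via its radially averaged power spectrum), for one and then several disc radii.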
Remarques :





SM20720 Exploration of computational models for visual cues


Description


Visual cues play an essential role in the human visual system. They ease the reconstruction of 3D scenes from 2D retinal images, the only visual information human beings possess about the world. Consequently, understanding the underlying mechanisms is essential for better understanding the process of vision. It is also important for computer graphics: taking visual cues into account may dramatically reduce harmful rendering artifacts and also save computational resources. In this exploratory project, the student will first systematically study perceptual properties of the human visual system and identify the most important visual cues to model. Then, based on both bibliographical search and his or her own exploration, the student will develop computational models for the most tangible visual cues, implement them in the rendering framework, test them and evaluate the results. URL sujet détaillé :
:
Remarques :





SM20721 Static performance analysis for the Faust audio language


Description


Subject: development of an automatic WCET (Worst-Case Execution Time) analyzer for the Faust audio and music signal processing language.
Project: the Faust language, initially developed at Grame (Lyon), allows signal processing operations to be specified in a functional style; it is particularly well suited to audio and music applications (http://faust.grame.fr). Sound processing is naturally performed synchronously within Faust programs, and the question of the performance of this language arises. Indeed, a synchronous language must conform to the so-called "synchronous hypothesis", which stipulates that all processing can be executed between the arrivals of two successive sound samples. To this end, a static cost analysis, of the WCET (Worst-Case Execution Time) kind, of these computations is needed at compile time to verify that this hypothesis holds, thereby ensuring the correctness of the Faust code to be executed.
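The check the analysis ultimately has to perform is a budget comparison: the worst-case cost of processing one sample must fit in the interval between two samples. A back-of-the-envelope sketch (the cycle counts below are invented, and a real WCET analysis would derive them from the compiled code, not assume them):

```python
# Sketch of the synchronous-hypothesis check: the per-sample WCET,
# converted to seconds, must not exceed one sample period.

def synchronous_ok(wcet_cycles, cpu_hz, sample_rate_hz):
    """True iff the worst-case cost of one sample fits in one period."""
    wcet_s = wcet_cycles / cpu_hz
    period_s = 1.0 / sample_rate_hz
    return wcet_s <= period_s

# At 44.1 kHz a sample period is ~22.7 µs; on a 1 GHz CPU that is a
# budget of roughly 22676 cycles per sample.
assert synchronous_ok(20000, 1e9, 44100)
assert not synchronous_ok(30000, 1e9, 44100)
```

The hard part of the internship is of course the left-hand side of this comparison: statically bounding `wcet_cycles` for an arbitrary Faust program.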
This internship is part of the ANR FEEVER project (see http://feever.fr), whose ambition is to offer, via Faust, an innovative ubiquitous solution for the multimodal manipulation of sound.
Description: the goal of this Master's research internship, lasting a few months with a possible continuation as a PhD, is, starting for example from the Faustine evaluator developed at CRI (see http://feever.fr/Papers/propFaustine2013.pdf):
to study current WCET analysis techniques; to design and implement, via Faustine, a WCET analysis algorithm for the Faust language; and, if time permits, to produce a C++ version to be integrated into the Faust compiler.
Depending on the time available, a machine-checked proof, in Coq, of part of this process may also be addressed. URL sujet détaillé :
:
Remarques : Requirements: knowledge of functional programming and of compilation techniques is necessary. An interest in music or audio processing is a plus.





SM20722 Exploration of computational models for visual cues


Description


Visual cues play an essential role in the human visual system. They ease the reconstruction of 3D scenes from 2D retinal images, the only visual information human beings possess about the world. Consequently, understanding the underlying mechanisms is essential for better understanding the process of vision. It is also important for computer graphics: taking visual cues into account may dramatically reduce harmful rendering artifacts and also save computational resources. In this exploratory project, the student will first systematically study perceptual properties of the human visual system and identify the most important visual cues to model. Then, based on both bibliographical search and his or her own exploration, the student will develop computational models for the most tangible visual cues, implement them in the rendering framework, test them and evaluate the results. URL sujet détaillé :
:
Remarques :





SM20723 Behaviours as solutions for higherorder programs and processes


Description


This internship proposal focuses on operational methods to study the operational semantics of programs and processes (process calculi, lambdacalculus). A description can be found at http://perso.enslyon.fr/daniel.hirschkoff/Stages/dhcontr.pdf URL sujet détaillé : http://perso.enslyon.fr/daniel.hirschkoff/Stages/dhcontr.pdf
Remarques : This work is part of a collaboration with D. Sangiorgi (University of Bologna, Italy).





SM20724 Dynamic muscle model for virtual human animation


Description


This project concerns the accurate and real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models involving the anatomy and inner workings of the human body.
The purpose of this project is to design a new muscle action line model for the real-time simulation of virtual characters. Currently used muscle models rely on line segments (called action lines), which represent the action a muscle can produce only within certain limits. These limits are reached when muscle attachments are large, when the muscle fibers have a complex arrangement, or when they follow a complex routing (e.g. wrapped around bones or other muscles). Some muscle-based simulation platforms use predefined primitive surfaces around which these action lines can wrap, but these are still rough approximations of the deformations involved in the human body.
Muscle-based simulations are, however, very sensitive to the directions of the muscle forces, and the routings of the action lines define the direction of the applied forces. Different action line routings have a significant effect on the simulations. This project will investigate the benefits of using dynamic action lines, i.e. lines that adapt to the posture. These adjustments will be estimated from the geometric deformations of the muscle. As these deformations are expensive to compute, they will be part of an offline preprocess, and only a precalculated table of each muscle's moment arms will be used according to the current posture of the virtual character.
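The runtime side of the precomputed-table idea can be sketched as follows (the table values, muscle name and function are purely illustrative, not anatomical data): moment arms are tabulated offline per joint angle and linearly interpolated during simulation, which is far cheaper than recomputing the geometric deformation each frame.

```python
# Hedged sketch: runtime lookup of a muscle moment arm from a table of
# (joint angle in degrees, moment arm in metres) pairs sorted by angle,
# with linear interpolation between samples and clamping at the ends.

def moment_arm(angle, table):
    """Interpolate a moment arm from a sorted (angle, arm) table."""
    if angle <= table[0][0]:
        return table[0][1]
    if angle >= table[-1][0]:
        return table[-1][1]
    for (a0, m0), (a1, m1) in zip(table, table[1:]):
        if a0 <= angle <= a1:
            t = (angle - a0) / (a1 - a0)
            return m0 + t * (m1 - m0)

# Offline, the geometric deformation of the action line would fill this
# table; the numbers here are invented for illustration.
biceps_table = [(0, 0.020), (45, 0.035), (90, 0.045), (135, 0.040)]
assert abs(moment_arm(67.5, biceps_table) - 0.040) < 1e-9
```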
Expectations for this project are threefold. (1) The student will use a geometric method to calculate the deformations of the action lines. (2) The student will then assess the impact of this new model on muscle function (i.e. are the differences in forces significant?). (3) Finally, demonstration scenarios will be developed to validate the model and support the results. URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/DynamicActionLine.pdf
Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA). This internship is paid and may be continued as a PhD.





SM20725 Machine learning techniques for classification of viral genomic sequences


Description


Since the sequencing of the first bacterial genome in 1995, the genomes of thousands of microbial species have been sequenced, which covers only a tiny portion of known microbes. Genomes can be viewed as linear strings over four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). This enables the treatment of genome sequences as symbolic sequences and their characterization using mathematical models. The mathematical models of particular interest to us in this project are a class called genome signatures. Genome signatures are compact mathematical representations of DNA sequences (see e.g. [5] for a detailed presentation), and in the context of this project we will consider the decomposition of genomic sequences into k-mers, that is, words of size k.
Analysis of microbial communities has until recently been a complicated task due to their high diversity and the fact that many of these organisms cannot be cultured. Metagenomics has emerged as essentially the only way to characterize these unculturable microbes. Metagenomic studies generate sequence data at an even higher rate than classical genomics. Indeed, the number and diversity of microbial communities are not known, but current estimations place the number of species for certain environments in the range of millions [1]. The recent announcement from the Earth Microbiome Project is to generate petabases (10^15 bases) or more of environmental sequences [10]. Unfortunately, the general question of classifying complex sequence data coming from untargeted sequencing of a whole metagenomic sample is a particularly challenging computational problem. Moreover, the most blatant difficulty is to recognize novel viral sequences within a set of sequences.
Nonetheless, a number of efficient methods exist to classify the composition of a sample that mainly contains known species. Computational solutions for taxonomic assignment can be broadly organized in two main categories: (1) sequence similarity methods and (2) sequence composition methods.
Methods that rely on sequence similarity can themselves be subdivided into classical alignment-based techniques (mostly attempting to improve BLAST accuracy) and index-based ones. Alignment-based methods suffer from two limitations: speed and lack of sensitivity, see e.g. [2] and [3]. Alternative solutions have been suggested to overcome these limitations. These methods are based on long k-mers (subsequences of size k) and conceptually rely on the fact that when k is sufficiently large, sequences become species-specific. Consequently, the idea is to index the sequences by long k-mers. This is indeed the foundation of MegaBlast (a general-purpose sequence aligner using long seeds), but also of a number of methods specific to taxonomic assignment such as LMAT [4] and Kraken [3]. The downside of these sequence similarity approaches is over-specificity, which makes classification of unknown sequences problematic.
A complementary approach is based on sequence composition analysis. Such methods rely on the decomposition of sequences into frequencies of k-mers (with k not too large), and use machine learning techniques (e.g. SVM, kNN, Naive Bayes) to train a classifier on the reference database. The taxonomic assignment of novel sequences is then predicted by applying the pre-trained model. These methods are theoretically better suited to the task of novel species classification, as short k-mer distributions are less prone to overfitting. However, even these techniques fail to classify about 50% of species absent from the training set [6]. This is especially salient for viral sequences, as the vast majority of them fail to be uniquely assigned to any domain [7].
In this project we propose to explore the use of spaced k-mers [8] (that is, k-mers where certain letter positions are considered to be wildcards) for the classification of novel viral sequences. Indeed, it has recently been shown that such signatures can be better suited to certain genomic classification tasks, such as the construction of phylogenetic trees [9].
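Extracting spaced k-mers is straightforward to sketch (the mask and the toy sequence below are our own example): a binary mask marks match positions ('1') and wildcards ('0'), and only the match positions contribute to the extracted word, so k-mers differing only at wildcard positions collapse to the same feature.

```python
# Sketch of spaced k-mer extraction: slide the mask along the sequence,
# keeping only the letters at the mask's '1' positions.

def spaced_kmers(seq, mask):
    """Return the spaced k-mers of `seq` under the binary string `mask`."""
    keep = [i for i, m in enumerate(mask) if m == "1"]
    return [
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - len(mask) + 1)
    ]

# Mask 1101: positions 0, 1 and 3 are kept; position 2 is a wildcard.
assert spaced_kmers("ACGTAC", "1101") == ["ACT", "CGA", "GTC"]
```

The frequencies of these spaced words, rather than of contiguous k-mers, would then feed the composition-based classifiers described above.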
The intern will develop an algorithmic method for viral sequence classification based on spaced k-mers. For this purpose he/she will have to acquire and apply machine learning techniques as well as "Big Data" approaches for dealing with very large data volumes. The proposed method(s) will have to be implemented and will be tested both on public data and on novel data acquired by our collaborators at INRA. Good programming skills are most welcome, as well as a good working knowledge of algorithmics and statistics.
Advisors: Macha Nikolski and Hayssam Soueidan. Laboratory: Bordeaux Bioinformatics Center (CBiB) and LaBRI (Laboratoire Bordelais de Recherche en Informatique)
Contact: Please send your CV/resume to and
References:
[1] J. Gans, M. Wolinsky, and J. Dunbar. Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science, 309:1387–1390, 2005.
[2] A. Bazinet and M. Cummings. A comparative evaluation of sequence classification programs. BMC Bioinformatics, 13(1):92, 2012.
[3] Wood, D. E. and Salzberg, S. L. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology, 15(3), R46.
[4] Ames S.K., Hysom D.A., Gardner S.N., Lloyd G.S., Gokhale M.B. and Allen J.E. (2013). Scalable metagenomic taxonomy classification using a reference genome database. Bioinformatics, vol. 29, p. 2253–2260.
[5] Nalbantoglu O. Computational Genomic Signatures and Metagenomics. PhD dissertation, University of Nebraska-Lincoln, USA, 2011.
[6] Nalbantoglu O., Way S.F., Hinrichs S.H., Sayood K.: RAIphy: phylogenetic classification of metagenomics samples using iterative refinement of relative abundance index profiles. BMC Bioinformatics 2011, 12:41.
[7] Rosen G.L., Reichenberger E.R., Rosenfeld A.M. (2010). NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics 27, p. 127–129.
[8] Keich, U., Li, M., Ma, B. and Tromp, J. (2004). On spaced seeds for similarity search. Discrete Appl. Math., vol. 138, p. 253–263.
[9] Leimeister, C.A., Boden, M., Horwege, S., Lindner, S., and Morgenstern, B. (2014). Fast alignment-free sequence comparison using spaced-word frequencies. Bioinformatics, DOI: 10.1093/bioinformatics/btu177.
[10] J. Gilbert, J. Meyer, D. Antonopoulos, P. Balaji, C. Brown, N. Desai, J. Eisen, D. Evers, D. Field, W. Feng, D. Huson, J. Jansson, R. Knight, J. Knight, E. Kolker, K. Konstantindis, J. Kostka, N. Kyrpides, R. Mackelprang, A. McHardy, C. Quince, J. Raes, A. Sczyrba, A. Shade, and R. Stevens. Meeting report: the terabase metagenomics workshop and the vision of an earth microbiome project. Stand Genomic Science, 3:243–8, 2010.
URL sujet détaillé :
:
Remarques : This internship is, of course, paid.





SM20726 Motion control on soft or liquid support


Description


This project concerns the accurate and real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models capable of dealing with complex interactions with the environment. These interactions affect the simulation, and we seek to control the motion of the virtual human with respect to them.
The purpose of this project is to create a new physics-based motion controller for virtual characters. Current controllers are not designed to deal with complex interactions with soft bodies or liquids that can influence the movement. Yet these interactions happen in many real-life situations. For example, we can imagine a virtual human easily moving through 5 cm of water on a beach. This exact same human will not walk with the same gait pattern at all with water up to the knee. This means the controller must know how to adapt to such different situations. To do so, we need to define a mathematical formulation describing how we expect humans to move in these situations (e.g. by optimizing a cost function that describes the effectiveness of the gait).
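The cost-function formulation can be illustrated with a toy example (the cost terms, weights and candidate values are entirely invented, for illustration only): among candidate gait parameters, the controller would pick the one minimising a weighted sum of effort and deviation from a reference style.

```python
# Toy sketch of gait selection by cost minimisation: lower cost is
# better, trading muscular effort against staying close to a reference
# walking speed.

def gait_cost(speed, effort, ref_speed=1.4, w_effort=1.0, w_dev=2.0):
    """Weighted sum of effort and squared deviation from the reference."""
    return w_effort * effort + w_dev * (speed - ref_speed) ** 2

# Hypothetical (speed in m/s, effort) candidates for a given environment.
candidates = [(1.0, 0.8), (1.4, 1.0), (1.8, 1.6)]
best = min(candidates, key=lambda c: gait_cost(*c))
assert best == (1.4, 1.0)
```

With water up to the knee, the effort term of fast candidates would grow sharply, and the same minimisation would naturally select a slower, visibly different gait.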
Expectations for this project are twofold. (1) The student will implement a computational method for the interaction between rigid bodies (representing the virtual character) and soft bodies/liquids (water, mud, snow, sand, etc.) that is efficient enough to be applied to real-time animation. (2) The student will use this method (i.e. mainly collision handling) and a physics-based controller to produce plausible locomotion (walking and running). URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/SoftMotionControl.pdf
Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA). This internship is paid and may be continued as a PhD.





SM20727 Parametric verification of real-time systems


Description


Multiprocessor real-time systems have become ubiquitous in recent years. Some of them (automatic piloting of aircraft and drones, driverless cars) are critical in the sense that no error must ever occur. Testing such systems can possibly detect the presence of bugs, but cannot guarantee their absence. It is then necessary to use methods such as model checking to formally prove the correctness of a system.
Determining a correct scheduling of the tasks executed by the processors is crucial for the safety of the system. These systems are characterized by a set of timing constants, such as the reading period of an altitude sensor on a drone, or the minimal or maximal duration of a computation on a satellite. Although techniques exist to verify the system for one set of constants, formally verifying the system for many values of these constants can take an extremely long, even infinite, time if one seeks to verify dense sets of values.
It is then interesting to reason parametrically, considering these timing constants as unknown, i.e. as parameters, and to synthesize a constraint on these parameters guaranteeing that there exists a scheduling of the tasks to be executed by the processors such that the system is correct. Several techniques have been proposed to solve this problem, notably in the scheduling setting.
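A toy single-processor example of parameter synthesis (the task set is invented, and real multiprocessor analyses are far richer than this utilisation test): instead of checking one value of an unknown computation time C2, one derives the whole set of values keeping two periodic tasks EDF-schedulable, i.e. the constraint C1/T1 + C2/T2 <= 1.

```python
# Sketch: synthesise the largest C2 keeping {(C1,T1), (C2,T2)}
# EDF-schedulable on one processor, using exact rational arithmetic.
from fractions import Fraction

def max_c2(c1, t1, t2):
    """Largest C2 with C1/T1 + C2/T2 <= 1 (uniprocessor EDF bound)."""
    return (1 - Fraction(c1, t1)) * t2

# With C1 = 2, T1 = 5 and T2 = 20, the synthesised constraint is
# C2 <= 12: every value in [0, 12] yields a correct schedule.
assert max_c2(2, 5, 20) == 12
```

This is the flavour of the output expected from parametric verification: a symbolic constraint on the timing parameters rather than a yes/no answer for one instance.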
The goal of the internship is to develop parametric verification techniques for real-time systems dedicated to the multiprocessor scheduling case.
The student will also produce an implementation to validate the proposed approach. One possible option is to reuse the IMITATOR tool. URL sujet détaillé : http://lipn.univparis13.fr/~andre/verificationparametreedesystemestempsreel.pdf
Remarques : Co-supervised with Giuseppe Lipari (LIFL, Lille). The internship can take place at LIPN (Paris 13) or at LIFL (Lille), depending on the candidate's preference.
Standard pay.





SM20728 BigData in NeuroScience


Description


Models studied in computational neuroscience are fed with facts obtained from the large literature reporting knowledge gained from biological experimentation. Such knowledge is put into words, leading to phenomenological descriptions of the underlying behaviors. Then, part of this information is put into equations and distributed algorithms to check some quantitative or qualitative aspects of the presumed global understanding. Surprisingly enough, this work is mainly done in an artisanal way, more or less by intuition.
Other domains of knowledge have taken advantage of recent and powerful developments in semantic representations and introduced an intermediate framework between knowledge-in-words and knowledge-in-equations: knowledge-in-facts. From very large and heterogeneous corpora like Wikipedia, turned into the DBpedia semantic database, to very specialized fields of knowledge formalized as ontologies, it is clear that powerful methodological tools are now available to manage data at a higher scale.
The goal of the proposed work is to address this issue in three steps:
1/ Considering a well-defined topic (either early-vision processing from the retina to the thalamus, or functional models of the basal ganglia and related structures of action selection, the choice being made with the student), build an ontology that describes our current knowledge of these brain subsystems.
2/ From this real example, propose an intermediate language (say, Turtle) that allows computer scientists to easily annotate publications, building the related ontology. This step will be realized with colleagues specialized in semantic web formalisms.
3/ Considering existing tools and initiatives, propose the design of a collective platform allowing a scientific community in the field to construct such big data, beyond this seminal work. URL sujet détaillé : http://wwwsop.inria.fr/members/Thierry.Vieville/projects/neurobigdata/
Remarques : Paid internship, with a PhD topic to follow. The internship takes place at Inria Sophia Antipolis, in collaboration with the NeuroCampus team in Bordeaux.





SM20729 Study of P2P Systems for Video Live Streaming


Description


The goal of this internship is to study and propose models to describe P2P systems. These systems are cheap to operate and scale well, but their highly distributed nature raises questions about reliability, durability, availability, confidentiality, and routing of the data. In particular, the focus of this internship will be on streaming in peer-to-peer networks. In such systems, peers relay packets or video frames, and the goal is to play the video the nodes ask for. This is possible only if some delay in the playback is allowed, and there exist trade-offs between the number of customer nodes and the delay that the system can support. Moreover, some scheduling protocol must decide which packets are exchanged and who sends them in order to achieve the best possible behavior. The analysis of P2P systems involves a wide range of techniques between practice and theory, from mathematical analysis to simulations and the deployment of experiments. This will allow us to adapt the orientation of the project to the taste of the interested student.
Objective: During the internship, we will study and simulate a live video streaming system. We will study its efficiency under frequent arrivals and departures of peers. The metrics considered will be delay and bandwidth usage.
The internship can be followed by a Ph.D.
URL sujet détaillé : http://wwwsop.inria.fr/members/Frederic.Giroire/stagestreaming.pdf
Remarques : Remuneration possible.





SM20730 Energy, Software-Defined Networks and Scalability


Description


Software-Defined Networks (SDN), in particular OpenFlow, is a new networking paradigm enabling innovation through network programmability. Over the past few years, many applications have been built using SDN, such as server load balancing, virtual-machine migration, traffic engineering and access control. For example, Google has recently deployed, and operated for three years, an SDN network connecting its data centers across the planet.
In this internship, we will focus on using SDN for energy-aware routing (EAR). The energy efficiency of networking infrastructure (telecom, data center, ISP or enterprise networks) is a growing concern, due to both increasing energy costs and worries about CO2 emissions. Since traffic load has a small influence on the power consumption of routers, EAR saves energy by putting unused links into sleep mode. SDN can collect the traffic matrix and then compute routing solutions that satisfy QoS while being minimal in energy consumption. However, prior work on EAR has assumed that the forwarding tables can hold an infinite number of rules. In practice, this assumption does not hold, since these tables are implemented with Ternary Content Addressable Memory (TCAM), which is expensive and power-hungry. In [1], we proposed an optimization method to minimize the energy consumption of a backbone network while respecting capacity constraints on links and rule-space constraints on routers. In detail, we presented an exact formulation as a Mixed Integer Linear Program (MILP) and introduced an efficient greedy heuristic algorithm that was analyzed through simulations.
The objective of this internship is to extend this work to take into account the possibility of adding wildcards to the forwarding rules. For example, assuming that the rules map [(src, dst): port-to-forward], we can compact several rules with the same source s and the same port-to-forward p, but different destinations, into a single rule [(s, *): p]. In this case, the forwarding tables shrink, but the general problem becomes more complicated. The candidate will therefore study the extension of the linear program to take this into account, study new heuristics, and perform the corresponding simulations on a realistic topology and set of traffic demands.
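As a rough illustration of this compaction (our own sketch, not the formulation of [1]), the heuristic below folds, for each source, the most frequent port into a single wildcard rule and keeps exact rules only for the exceptions, assuming exact-match rules have higher matching priority than wildcard rules:

```python
from collections import Counter, defaultdict

def compact_rules(rules):
    """rules: dict mapping (src, dst) -> port.
    Returns (exact, wildcard) rule lists; exact rules are assumed to
    take priority over wildcard rules at match time."""
    by_src = defaultdict(dict)
    for (src, dst), port in rules.items():
        by_src[src][dst] = port
    exact, wildcard = [], []
    for src, dests in by_src.items():
        # route the most frequent port through a single (src, *) rule
        best_port, _ = Counter(dests.values()).most_common(1)[0]
        wildcard.append(((src, "*"), best_port))
        for dst, port in dests.items():
            if port != best_port:          # keep only the exceptions
                exact.append(((src, dst), port))
    return exact, wildcard
```

On a source with two destinations forwarded to port 1 and one to port 2, this stores two rules instead of three; the internship's actual question is how to make such choices globally optimal under TCAM and capacity constraints.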
Prerequisites for candidates: knowledge of and/or a taste for networking; knowledge of and/or a taste for discrete mathematics, graph theory and/or combinatorial optimization.
Contact information: For further information, please contact Frédéric Giroire (frederic.giroire.fr) or Joanna Moulierac (joanna.moulierac.fr).
Location: COATI - INRIA/I3S (CNRS/UNS), 2004 route des lucioles, 06906 Sophia Antipolis. https://team.inria.fr/coati
References:
[1] Optimizing Rule Placement in Software-Defined Networks for Energy-aware Routing. Frédéric Giroire, Joanna Moulierac and Khoa Truong Phan. In Globecom 2014.
[2] On the effect of forwarding table size on SDN network utilization. R. Cohen, L. Lewin-Eytan, J. S. Naor and D. Raz. In INFOCOM 2014.
[3] Software-Defined Networking: A Comprehensive Survey. Diego Kreutz, Fernando M. V. Ramos, Paulo Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky and Steve Uhlig. In arXiv:1406.0440.
URL sujet détaillé : wwwsop.inria.fr/members/Joanna.Moulierac/SDN_Energy_internship_2014_COATI.pdf
Remarques : Co-supervision: Frédéric Giroire frederic.giroire.fr





SM20731 Petri net unfolding of biological networks


Description


Biological networks, such as gene regulatory networks or signalling networks, can be modeled as automata networks or Petri nets. This results in concurrent models involving hundreds or thousands of interacting components evolving in parallel. Aiming at capturing the dynamical landscape of those models, and ultimately at proposing mutations to control biological cells, we are interested in computing several properties, such as the attractors (live/dead locks, characterizing the long-term behaviour) and the sets of mutations that would trigger a re-differentiation of the cell, i.e., that will change the current attractor.
In order to cope with the huge size of the networks, this internship will address the design, implementation, and evaluation of new algorithms based on Petri net unfoldings for computing dynamical properties on discrete models of biological networks. Petri net unfoldings exploit the concurrency between transitions to obtain so-called complete prefixes, which yield compact representations of all the possible behaviours of the net. Tools such as mole (http://www.lsv.enscachan.fr/~schwoon/tools/mole) or cunf (https://code.google.com/p/cunf) can efficiently compute the complete prefix of the unfolding of a Petri net.
This internship will first consist in evaluating and comparing existing algorithms for computing attractors in Petri net models of biological networks. Several variants and extensions of Petri net unfoldings will be explored in order to improve the scalability of the method. This task will involve adapting existing algorithms and implementing them to make them more efficient when applied to large biological networks. Then, new algorithms will be designed for identifying sets of components of the network whose activation or inhibition would (1) prevent the reachability of a particular attractor from a given initial state while preserving the reachability of the other attractors; (2) force a transition from one given attractor to another.
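For intuition only (unfoldings, not brute force, are what makes this scale), here is a naive sketch that enumerates the attractors of a tiny deterministic Boolean network by exhausting its state space; the two-gene mutual-inhibition network used is our own toy example:

```python
from itertools import product

def attractors(update, n):
    """Enumerate attractors of a deterministic Boolean network with n
    components by following the synchronous trajectory of every state
    until it loops; the reached cycle is an attractor."""
    found = set()
    for state in product((0, 1), repeat=n):
        seen = {}
        while state not in seen:
            seen[state] = len(seen)
            state = update(state)
        start = seen[state]                    # first index of the cycle
        cycle = tuple(sorted(s for s, i in seen.items() if i >= start))
        found.add(cycle)
    return found

# toy 2-gene mutual inhibition: x' = not y, y' = not x
toy = lambda s: (1 - s[1], 1 - s[0])
```

On the toy network this finds the two fixed points (0,1) and (1,0) plus one 2-cycle; the exponential cost of this enumeration is precisely what unfolding-based methods are meant to avoid.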
This work may be continued with a PhD thesis on developing causality and concurrency theory for the formal analysis of the dynamics of discrete biological networks. URL sujet détaillé : http://loicpauleve.name/unfoldingbioM2.pdf
Remarques : Co-supervised by Loïc Paulevé (CNRS / LRI, Université Paris-Sud)  http://loicpauleve.name





SM20732 Analysis of shared-variable access paths for online anticipation of placement and routing


Description


Nowadays, processors with several hundred cores, such as Intel's Xeon Phi chip (64 cores) or Kalray's MPPA chip (256 cores), promise significant performance gains as well as reduced power consumption. These gains strongly depend on the programmability of the chips, but also on the placement-and-routing strategies used when deploying the application. One way to obtain execution performance is to reduce the access time to shared memories. To do so, it suffices to reduce the costly messages of the shared-memory access protocol (e.g., blocking messages in synchronous programming, control messages, etc.) while placing data as close as possible to the task. In the LaSTRE laboratory, previous work has focused on the development of a multi-protocol compilation chain that makes it possible to associate a given protocol with each shared datum. The assignment of a protocol is done statically or dynamically. In the context of this internship, we would like to equip this compilation chain with analysis tools that characterize shared-memory access paths in order to anticipate placement and routing. The intern will have to identify and formalize the types of shared-memory accesses so as to help identify the best placement-and-routing strategy. The work will proceed in three phases: 1. State of the art on placement-and-routing strategies; 2. Formalization of the dependence analyses that would facilitate placement and routing; 3. Implementation of an analysis prototype. The internship will take place on the NanoInnov premises of CEA Saclay, within a research laboratory specialized in particular in code analysis, compilation chains, and embedded systems for massively parallel architectures.
This internship can lead to a PhD thesis. URL sujet détaillé :
Remarques :





SM20733 Analysis of shared-data access patterns for choosing coherence protocols on massively parallel architectures


Description


To meet demands for computing power while keeping energy consumption under control, processors with several hundred cores, so-called many-cores, are appearing in high-performance computing and embedded systems. Thus, Intel's Xeon Phi chip with 64 cores and Kalray's MPPA chip with 256 cores have reached the market. The performance of applications running on such architectures strongly depends on the programmability of the chip, and in particular on the mechanisms and techniques for managing on-chip memory. The choice of the data coherence model (the rules that determine the freshness of a datum when it is accessed) and of the coherence protocols directly impacts access efficiency; ill-suited choices can quickly become a bottleneck for the application. In previous work, we proposed a multi-protocol compilation chain that makes it possible to associate a given protocol with each shared datum. In this context, an open research direction is to rely on static analysis of the source code to identify and characterize shared-data accesses (e.g., identifying the application's tasks and shared variables, building dependence graphs of shared-data accesses). The work proposed in this internship consists in surveying the state of the art on the analysis of shared-data accesses in parallel application code, then proposing a solution that addresses the following points: 1. the ability to extract, from the source code, graphs representing the dependences of shared-data accesses; 2. the characterization of memory access patterns based on this representation; 3. an automated analysis of the dependence graphs to detect data-access patterns.
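As a toy illustration of the kind of per-variable characterization involved (our own sketch: it profiles a dynamic access trace rather than performing the static analysis targeted by the internship, and its thresholds and protocol labels are invented):

```python
from collections import defaultdict

def classify_accesses(trace):
    """trace: list of (task, variable, op) with op in {"R", "W"}.
    Builds a per-variable profile and suggests a coherence-protocol
    class; cutoffs and labels are purely illustrative."""
    profile = defaultdict(lambda: {"R": 0, "W": 0, "tasks": set()})
    for task, var, op in trace:
        profile[var][op] += 1
        profile[var]["tasks"].add(task)
    verdict = {}
    for var, p in profile.items():
        if len(p["tasks"]) == 1:
            verdict[var] = "private: no coherence protocol needed"
        elif p["W"] == 0:
            verdict[var] = "read-only: replicate freely"
        elif p["R"] >= 4 * p["W"]:
            verdict[var] = "read-mostly: replication with invalidation"
        else:
            verdict[var] = "write-shared: home-based/update protocol"
    return verdict
```

A static analysis would derive comparable profiles conservatively from the dependence graphs of point 1 instead of from an execution trace.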
Finally, the proposed solution will have to be implemented as a prototype to validate the internship. The internship will take place on the NanoInnov premises of CEA Saclay, within a research laboratory specialized in particular in code analysis, compilation chains, and embedded systems for massively parallel architectures. URL sujet détaillé :
Remarques :





SM20734 Lagrangian Duality in Online Algorithms


Description


Online algorithms is a domain studying problems with uncertainty about the future. In general, requests arrive over time and one needs to make irrevocable decisions without any information about the future. Such problems appear in many practical settings, for example in decision making, stock investment, network routing, etc. Hence, it is important to design algorithms/strategies with performance guarantees in uncertain environments.
The most successful technique so far in online algorithms is the potential argument. However, the technique has its limits (it does not give much insight into the problems, etc.). Currently, there is a trend toward a principled approach to designing and analyzing online algorithms. One direction [1] is to exploit the power of mathematical programming and use Lagrangian duality. The method is powerful and flexible, as it is based on natural relaxations (even non-convex ones), and it has successfully given theoretical evidence for practical algorithms.
Our project is to develop this technique, and the internship is part of the project. The goal is to study concrete online problems (online scheduling/matching/paging, etc.) using this approach. Ideally, a research publication could result from the internship.
Requirements: motivation and basic knowledge of algorithms and discrete mathematics.
[1] Nguyen Kim Thang. Lagrangian Duality in Online Scheduling with Resource Augmentation and Speed Scaling. European Symposium on Algorithms (ESA), 2013. URL sujet détaillé : https://www.ibisc.univevry.fr/~thang/Stages/lagrange.html
Remarques : Standard remuneration of 523.26 euros/month.





SM20735 Analysis of the Mifare Plus Distance Bounding Protocol


Description


* Keywords: Security, Cryptography, Contactless Smartcards, RFID.
* Objective: Check in practice the efficiency of the distance bounding protocol of the Mifare Plus RFID tag.
* Content: Radio Frequency Identification (RFID) makes it possible to identify objects or subjects without any physical or optical contact, using transponders (microcircuits with an antenna) queried by readers through a radio frequency channel. This technology is one of the most promising of this decade and is already widely used in applications such as access cards, transportation passes, payment cards, and passports. This success is partly due to the steady decrease in both size and cost of the passive transponders called "tags".
The "relay attack"  sometimes referred to as "Mafia fraud"  exhibited by Desmedt, Goutier, and Bengio recently became a major issue of concern for RFID authentication protocols. The adversary pretends to be the legitimate prover by relaying the messages that are exchanged during the execution of the protocol. This is illustrated through the following example. Consider an RFIDbased ticket selling machine in a theater. To buy a ticket, the customer is not required to show his theater pass, an RFID tag. The customer needs to be close enough to the machine (verifier) so that the pass (prover) can communicate with it. The pass can be kept in the customer's pocket during the transaction. Assume there is a line of customers waiting for a ticket. Bob and Charlie masterminded the attack. Charlie is in front of the machine while Bob is far in the queue, close to Alice, the victim. When the machine initiates the transaction with Charlie's card, Charlie forwards the received signal to Bob who transmits it to Alice. The victim's tag automatically answers since a passive RFID tag  commonly used for such applications  responds without requiring the agreement of its holder. The answer is then transmitted back from Alice to the machine through Bob and Charlie who act as relays. The whole communication is transparently relayed and the attack eventually succeeds: Alice pays Charlie's ticket.
When it was first introduced in the late eighties, the relay attack appeared unrealistic. Nowadays, the relay attack is one of the most effective and feared attacks against RFID systems. It can be easily implemented since the reader and the tag communicate wirelessly, and it is not easily detectable by the victim because queried (passive) tags automatically answer to the requests without agreement of their bearers. Relay attacks are for example used to steal cars in parking lots.
Distance bounding protocols are promising solutions to prevent such relay attacks. These protocols measure the round-trip time of messages exchanged between the reader (verifier) and the tag (prover): if the round-trip time is too long, then the reader considers that the tag is outside its neighborhood and that an attack is under way. Many distance bounding protocols exist, but most of them are theoretical proposals. The Mifare Plus tag is the only device that benefits from a distance bounding protocol. The implemented protocol is however much simpler than the theoretical ones: it mitigates relay attacks but cannot fully prevent them. Consequently, this project will explore the limits of the distance bounding protocol implemented in the Mifare Plus. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_1.pdf
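The core timing check can be sketched as follows (our own simplification: a single challenge/response round with a caller-supplied RTT threshold, whereas real protocols such as the Mifare Plus's run several rounds whose responses are bound to a shared key):

```python
import os
import time

def verify(prover, rtt_threshold):
    """One simplified distance-bounding round: accept only if the answer
    is correct AND came back fast enough. A relayed exchange adds
    propagation and processing delay, pushing the RTT above the bound."""
    challenge = os.urandom(4)          # fresh random challenge
    t0 = time.perf_counter()
    response = prover(challenge)       # prover must echo the challenge
    rtt = time.perf_counter() - t0
    return response == challenge and rtt <= rtt_threshold
```

An honest nearby prover answers within the bound, while a prover whose answer transits through relays (simulated by a sleep) is rejected even though its answer is correct.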
Remarques :





SM20736 Practical Cryptanalysis of Pseudo-Random Number Generators in RFID


Description


* Keywords: RFID, Contactless Smartcard, Practical Attacks, PRNG
* Objective: Analyze (in practice) PRNGs implemented on tags available in our everyday lives
* Content: Randomness is the keystone of any cryptographic system. Yet generating randomness is far from an easy task, especially on low-cost devices such as passive RFID tags. Indeed, on the one hand, few logic gates can be "wasted" just on generating randomness; on the other hand, passive RFID tags have no battery (hence no internal clock) and must therefore trust, in some way, the reader that feeds them. Most current pseudo-random number generators (PRNGs) implemented on RFID tags definitely suffer from poor designs. For example, the PRNG implemented on Mifare Classic RFID tags (1 billion such tags have been sold since 1995) is reinitialized each time the tag enters the field of a reader. Consequently, an adversary can easily drive the PRNG of a Mifare Classic tag just by measuring the time between when the tag is powered and when the PRNG is called.
This internship aims to analyze (in practice) PRNGs implemented on tags available in our everyday lives, e.g., tags for building access control, public transportation, ski lifts, etc. The analysis will depend on whether or not the specifications of the PRNG are known. When they are not known, which is the most usual case, the RFID tag is considered as a black box and two approaches are required: (1) determining whether the PRNG can be influenced by some external events; (2) determining whether the PRNG output is biased. Guidelines and tools provided by the US National Institute of Standards and Technology might be useful for this work. The expected aim of this project is to eventually break some RFID solutions in practice.
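For instance, approach (2) can start with the frequency (monobit) test from NIST SP 800-22; a minimal sketch:

```python
import math

def monobit_pvalue(bits):
    """NIST SP 800-22 frequency (monobit) test. Under the null hypothesis
    of unbiased, independent bits, the p-value is roughly uniform on
    [0, 1]; a tiny p-value flags a biased bit stream."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # +1/-1 partial sum
    return math.erfc(abs(s) / math.sqrt(2 * n))
```

A perfectly balanced stream yields p = 1, while a constant stream yields a p-value indistinguishable from zero; a full analysis would chain the other SP 800-22 tests (runs, block frequency, etc.) over captured nonces.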
Depending on the profile of the student, this topic can be tackled through different approaches, either practical or theoretical. For example, the PRNG is expected to be analyzed as a black box, but students familiar with reverse engineering techniques may wish to apply their knowledge to some PRNGs. Whatever the approach, the student will be required to implement tests, so programming skills are required. The choice of language is left to the discretion of the candidate. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_2.pdf
Remarques :





SM20737 Can SSD Memory Improve Cryptanalytic Time-Memory Trade-Offs?


Description


* Keywords: Cryptography, Time-Memory Trade-Off, Password Cracking, Algorithmics, Probability.
* Objective: Check whether SSD memory may improve cryptanalytic time-memory trade-offs.
* Content: Many cryptanalytic problems can be solved in theory using an exhaustive search in the key space, but are still hard to solve in practice because each new instance of the problem requires restarting the process from scratch. Such a problem is for example the cracking of passwords.
The basic idea of a cryptanalytic time-memory trade-off (TMTO) is to carry out an exhaustive search once and for all, so that subsequent instances of the problem become easier to solve. Thus, if there are N possible solutions to a given problem, a time-memory trade-off can solve it with T units of time and M units of memory. In the methods we are looking at, T is proportional to N^2/M^2 and a typical setting is T = M = N^(2/3).
The cryptanalytic time-memory trade-off was introduced in 1980 by Hellman and applied to DES. Given a plaintext P and a ciphertext C, the problem consists in recovering the key K such that C = E_K(P), where E is an encryption function assumed to behave like a random function. Encrypting P under all possible keys and storing each corresponding ciphertext allows immediate cryptanalysis, but needs N elements of memory. The idea of the trade-off is to use chains of keys, built with a reduction function R that generates a key from a ciphertext. Using E and R, chains of alternating ciphertexts and keys can thus be generated; the key point is that only the first and the last element of each chain are stored. In order to retrieve K, a chain is generated from C. If at some point it yields a stored end of chain, then the entire chain is regenerated from its starting point. However, finding a matching end of chain does not necessarily imply that the key will be found in the regenerated chain: the chain generated from C may merge with a stored chain that does not contain K. This situation is called a false alarm and is very costly.
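The chain mechanics can be sketched on a toy 16-bit key space (our own illustration: a hash stands in for E, the reduction R and all parameters are invented, and position-dependent reductions are used to limit chain merges):

```python
import hashlib

N_BITS = 16                        # toy key space of 2^16 keys

def E(key):                        # stand-in for the one-way function E_K(P)
    return hashlib.sha256(key.to_bytes(2, "big")).digest()

def R(ciphertext, i=0):            # reduction: ciphertext -> key space
    return (int.from_bytes(ciphertext[:4], "big") + i) % (1 << N_BITS)

def build_table(n_chains, chain_len):
    """Precomputation: walk each chain and store only (end -> start)."""
    table = {}
    for start in range(n_chains):
        k = start
        for i in range(chain_len):
            k = R(E(k), i)
        table[k] = start
    return table

def invert(cipher, table, chain_len):
    """Online phase: guess the key's position in a chain, walk to the end,
    and on a table hit regenerate the chain from its stored start."""
    for pos in range(chain_len - 1, -1, -1):
        k = R(cipher, pos)
        for i in range(pos + 1, chain_len):
            k = R(E(k), i)
        if k in table:
            cand = table[k]
            for i in range(chain_len):
                if E(cand) == cipher:
                    return cand
                cand = R(E(cand), i)   # hit without key: a false alarm
    return None
```

The regeneration step that fails to produce the key is exactly the false alarm described above; at realistic scale these table lookups are what RAM versus SSD placement would accelerate or slow down.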
A TMTO is efficient as long as the precomputed chains can be entirely stored in a fast-access memory. RAM is currently used, but its rather limited size is the bottleneck of TMTOs. This internship aims to check whether SSD memory may improve cryptanalytic time-memory trade-offs.
During this internship, the student will:
(1) implement the cryptanalytic time-memory trade-off (TMTO) technique, (2) evaluate the efficiency of a TMTO when an SSD is used, (3) find the best approach to combine SSD and RAM, (4) apply the technique to crack passwords. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_3.pdf
Remarques :





SM20738 Hardware-software co-design of TLS/SSL support for Zynq embedded systems


Description


Zynq platforms embed both a processor (ARM dual-core Cortex-A9) and an FPGA (Artix-7). We will study and prototype TLS/SSL support on this platform. At the hardware level, an ECC crypto-processor will be used to accelerate some cryptographic primitives.
Complete description at http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014codesignhwswtlsssl.pdf URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014codesignhwswtlsssl.pdf
Remarques : See complete description





SM20739 Software extensions for an ECC cryptoprocessor


Description


A dedicated processor for elliptic curve cryptography on FPGA circuits is under development at IRISA-CAIRN in Lannion.
In this work, we will study and prototype software extensions for advanced support of more curves, computation algorithms, and algorithmic protections against side-channel attacks. The proposed solutions will be implemented and tested on the power-analysis attack setup.
Complete description at http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014extensionsoftcryptoprocecc.pdf URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014extensionsoftcryptoprocecc.pdf
Remarques : See complete description





SM20740 Vehicle Sharing Systems


Description


The number of cities equipped with Vehicle Sharing Systems (VSS, BSS, VLS) is increasing rapidly. VSS are viewed as a promising way to reduce the usage of personal cars and to improve the quality of life by reducing traffic, noise and air pollution. Despite their growing number, VSS are still poorly understood and often suffer from congestion problems (absence of a vehicle, or of a parking spot at the destination).
The goal of the internship will be to contribute to the understanding of VSS by developing computational and mathematical methods. For completeness, we describe four research directions below. According to their preferences and skills, the student will choose one of them during the internship. The others can serve as a basis for a subsequent PhD proposal.
- Benchmarking the historical data of various real VSS, and developing visualization tools, to help humans deal directly with such data.
- Data mining, in order to estimate the underlying intensity of user flows depending on the stations, time, day, season, and weather.
- Modelling and simulation, so as to have a solid basis for investigating the behavior of future systems and optimizing them.
- Mathematical analysis of models of VSS, so as to have an analytic understanding of how VSS behave and to help optimize them.
URL sujet détaillé :
Remarques : The internship will be co-supervised by Nicolas GAST (CR Inria, Montbonnot) and Vincent JOST (CR CNRS, G-SCOP, Grenoble).





SM20741 A dynamical system approach to compute the transient behavior of stochastic distributed systems


Description


Let us consider a random walker over a finite grid (or torus) in dimension d. A natural question is how long does it take for the walker to have a chance to visit any state in the grid, or, more or less equivalently, how long does it take for the walker's initial position not to matter anymore?
These questions are all related to the mixing time of the random walk: the time it takes for the probability distribution at time t to be close (in total variation norm) to the stationary distribution. Computing the mixing time is an important challenge in modern Markov chain theory that has been approached in many ways (spectral decomposition of the transition matrix, coupling theory, perfect simulation).
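For small chains the mixing time can simply be computed by iterating the transition kernel; a brute-force sketch (our own example: a lazy walk on a cycle of n states, with an arbitrary closeness threshold eps = 1/4):

```python
def mixing_time(n, eps=0.25, max_t=10000):
    """Smallest t such that the lazy random walk on an n-cycle, started
    at state 0, is within eps of uniform in total variation norm."""
    dist = [0.0] * n
    dist[0] = 1.0                  # Dirac mass at the starting state
    pi = 1.0 / n                   # uniform stationary distribution
    for t in range(1, max_t + 1):
        nxt = [0.0] * n
        for i, p in enumerate(dist):
            nxt[i] += 0.5 * p              # lazy: stay with prob 1/2
            nxt[(i - 1) % n] += 0.25 * p   # step left
            nxt[(i + 1) % n] += 0.25 * p   # step right
        dist = nxt
        tv = 0.5 * sum(abs(p - pi) for p in dist)
        if tv <= eps:
            return t
    return None
```

The cost of each step grows with the state space, and the mixing time itself grows roughly like n^2 on the cycle, which is exactly why such direct computation breaks down in high dimension and an asymptotic approach becomes attractive.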
In this internship, we propose to investigate a new approach based on asymptotic arguments (letting the space dimension d go to infinity). This approach transforms the stochastic random walk into a deterministic dynamical system whose long-run behavior can (sometimes) be well understood.
This new approach can allow one to compute the mixing time of random walks that cannot be tackled using the classical tools. It is also useful to compute the complexity of distributed randomized algorithms such as cooperative protocols in games. Finally, it can help assess the expected time to extinction of population processes with intricate dynamics (such as populations of particles under general interaction processes). URL sujet détaillé :
Remarques : Will be cosupervised by Bruno Gaujal (DR Inria, Grenoble).





SM20742 Numerical Optimal Transport in 3d


Description


This project aims at developing new efficient numerical methods for optimal transport in 3d. The rich mathematical structure of the problem leads to interesting theoretical developments. It also has the potential to lead to new efficient numerical methods for solving some nonlinear PDEs of a certain class (Monge-Ampère); see Benamou, Carlier, Mérigot and Oudet's article on the JKO (Jordan-Kinderlehrer-Otto) scheme at http://arxiv.org/abs/1408.4536. My early experiments in 3d seem promising: the semi-discrete optimal transport between a measure with density and a sum of Dirac masses can be computed for up to 1 million Dirac masses. For solving Monge-Ampère PDEs, the JKO scheme requires solving multiple instances of optimal transport problems; the goal is to study the possibility of accelerating the convergence of the algorithm, to make it usable in practice. URL sujet détaillé : http://arxiv.org/abs/1409.1279
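As a one-dimensional toy (our own illustration, far from the semi-discrete 3-d setting above): for the squared cost on the line, the optimal transport between two uniform discrete measures with equally many atoms is simply the monotone matching of sorted samples.

```python
def ot_cost_1d(xs, ys):
    """Optimal transport cost between two uniform discrete measures on
    the real line, for the squared ground cost. In 1-d the optimal plan
    is the monotone rearrangement: match sorted points in order."""
    assert len(xs) == len(ys)
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys)))
```

In dimension three no such closed-form matching exists, which is why semi-discrete solvers based on Laguerre cells, as in the references above, are needed.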
Remarques : We have several possibilities of funding for continuing with a Ph.D. thesis.





SM20743 Understanding the pelvic dynamics through a 3D simulation of childbirth


Admin


Encadrant : Fabrice JAILLET 
Labo/Organisme : SAARA team (LIRIS, Lyon) and ER3:MaSTR team (Laboratory of Mechanics in Lille)
URL : https://liris.cnrs.fr/saga/wiki/doku.php 
Ville : Villeurbanne Doua 



Description


Context:
The SAARA team of the LIRIS in Lyon and the ER3:MaSTR team of the Laboratory of Mechanics in Lille are involved in the development of a biomechanical model for the simulation of interactions between pelvic dynamics of the pregnant woman and the fetus during delivery.
The purpose of this project is now to adapt the existing 3D model to allow interaction with the fetus during labor. This ultimately implies providing a more or less "patient-specific" geometric model, and setting it under different scenarios (orientation and shape of the pelvis and of the fetal head, altered ligament properties, etc.).
This 3D model can be used to perform biomechanical simulations, which will aim to:
1) assist in understanding the role of various anatomical structures,
2) observe the physiological damage consecutive to the childbirth, and characterize the possible induced trauma.
Proposed missions:
The physiological model of childbirth will be simplified to move towards more interactive simulations. This means, for example, using lower-resolution meshes and simpler constitutive laws (linear, nonlinear).
This work will include a phase of confrontation with the full model (taken as "ground truth") and with dynamic 2D MRI images. The challenge will be to control the error while ensuring mechanical validity.
It will be necessary to perform the analysis and the following developments:
- a literature review leading to the definition of a relevant error criterion, a posteriori and possibly during the simulation,
- software implementation (C++) of the adopted solution in the CamiTK platform,
- coupling with the simulation software Abaqus™.
Depending on progress, it may also be useful to adapt the 3D model (Catia™) to the different scenarios.
Area of expertise:
The candidate will ideally be skilled at the interface between mechanics and computer graphics applied to biomedical simulation. Skills in computational mesh processing and numerical analysis would be a plus. URL sujet détaillé : http://liris.cnrs.fr/fabrice.jaillet/data/M2.ErreurAccouchement201415.pdf
Remarques : Contract Type: Master research or engineering school regulated internship
Équipe SAARA,
LIRIS UMR CNRS 5205,
Domaine scientifique de la Doua,
Bâtiment Nautibus,
2325 Av. Pierre de Coubertin,
F69100 Villeurbanne Cedex.
Contact :
LIRIS : Fabrice JAILLET (fabrice.jaillet (at) liris.cnrs.fr)
Florence ZARA (florence.zara (at) liris.cnrs.fr)
LML : Mathias BRIEU (mathias.brieu (at) eclille.fr)
LIRISCHU Lyon : Géry LAMBLIN (gery.lamblin (at) chulyon.fr)





SM20744 Bounded graph coloring


Description


A proper vertex-coloring of a graph G = (V, E) is a mapping from the set of vertices V to a set of colors, usually represented by the set of integers {1, ..., k}, in such a way that every two adjacent vertices are assigned distinct colors.
There exist many variations of graph coloring: coloring edges instead of vertices, coloring with lists instead of integers, adding constraints such as forbidding bicolored cycles...
One quite natural constraint on the coloring is to bound the number of elements (vertices or edges) that may receive the same color. In that case, we speak of b-bounded coloring, where b denotes the maximum number of elements that may receive the same color. Such colorings have applications in timetable scheduling or job scheduling in multiprocessor systems, for instance.
The goal of this internship is: firstly, to review existing results on b-bounded colorings; secondly, to study the b-bounded version of other types of graph colorings (such as, for example, oriented vertex-colorings or weighted edge-colorings, two types of colorings well studied at LaBRI), both in their structural and algorithmic aspects.
An oriented coloring of an oriented graph is a proper coloring such that if there exists an arc uv with color(u) = a and color(v) = b, then there is no arc wx with color(w) = b and color(x) = a. A weighted edge-coloring of an undirected graph is a mapping from its set of edges to a set of weights W = {1, ..., w} such that the mapping that assigns to each vertex the sum of the weights of its incident edges is a proper vertex-coloring.
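To make the definitions concrete, here is a minimal sketch (the graph, the coloring and the helper names are ours, for illustration) checking that an assignment is a proper, b-bounded vertex-coloring:

```python
# Sketch: checking that an assignment is a proper, b-bounded vertex-coloring.
from collections import Counter

def is_proper_coloring(edges, color):
    """Every two adjacent vertices receive distinct colors."""
    return all(color[u] != color[v] for u, v in edges)

def is_b_bounded(color, b):
    """No color class contains more than b elements."""
    return all(count <= b for count in Counter(color.values()).values())

# A 4-cycle: properly 2-colorable, each color class containing 2 vertices.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
color = {0: 1, 1: 2, 2: 1, 3: 2}
print(is_proper_coloring(edges, color))  # True
print(is_b_bounded(color, 2))            # True
print(is_b_bounded(color, 1))            # False
```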
References. P. Hansen, A. Hertz and J. Kuplinsky. Bounded vertex colorings of graphs. Discrete Math. 111:305-312 (1993). M. Jarvis and B. Zhou. Bounded vertex coloring of trees. Discrete Math. 212:145-151 (2001). R. Rizzi and D. Cariolaro. Polynomial time complexity of edge colouring graphs with bounded colour classes. Algorithmica 69:494-500 (2014). URL sujet détaillé :
Remarques : Co-supervisor: Eric Sopena, Professor at LaBRI





SM20745 Types and interpretations for programs of polynomial time complexity


Description


Can one define simple and modular programming disciplines in a high-level language ensuring that the programs produced have a bounded complexity, for instance polynomial time? This question is explored by the field of implicit computational complexity: it aims at defining static complexity criteria which do not rely on explicit bounds on resources. This internship proposal focuses on implicit complexity criteria for functional programs, which can use higher-order subprograms. It aims at relating two approaches to this issue, based respectively on type systems and on the semantic notion of interpretation. URL sujet détaillé : http://perso.ens-lyon.fr/patrick.baillot/STAGES/2015/sujet_interpretations2015.pdf
Remarques :





SM20746 Secure Evaluation of an S-Box


Description


Executing a cryptographic algorithm (DES, AES, RSA, ...) on an embedded device induces, in the absence of specific precautions in the implementation, variations in the behaviour of the device. These variations can be observed (through power consumption, electromagnetic emissions, ...) and combined using statistical methods in order to deduce the secrets manipulated by the device. Implementing cryptographic functions for embedded applications (e.g. smart cards) therefore requires countermeasures against such attacks, known as side-channel attacks. A classical countermeasure, called masking, consists in splitting the inputs of the cryptographic primitives into several randomly generated shares, and then replacing the sensitive operations (i.e. those operating on secret data) by sequences of computations that each operate only on a limited number of these shares. The ways of applying this kind of countermeasure differ for each primitive, and they rely on techniques from secure multiparty computation, coding theory and/or efficient computation over finite fields.
Securing the evaluation of certain primitives called S-boxes by means of masking is particularly difficult, and it raises questions close to well-known problems in the efficient hardware implementation of polynomials defined over finite fields of characteristic 2. Recent works have shown that the difficulty of protecting the implementation of an S-box depends on the number of multiplications needed to evaluate the polynomial corresponding to the S-box (multiplications that are squarings being omitted from the complexity count). A series of articles has thus led to the publication of evaluation methods that try to minimize the number of multiplications. The most efficient of them, published in the proceedings of the CHES conference this year, can in several respects be considered ad hoc.
The goal of the proposed internship is to investigate ways of making the method of [CRR14] more formal and, possibly, of improving its efficiency. To this end, the student will first become familiar with the state of the art. He or she will then implement, in a computer algebra system such as Maple or Magma, the solution proposed in [CRR14]. The second part of the internship will consist in looking, together with the supervisors, for ways to improve the state of the art.
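The share-splitting principle behind masking can be sketched in a few lines (illustrative Python only, not a secure implementation; the hard part discussed in this topic is protecting the nonlinear S-box operations, not this linear splitting):

```python
# First-order Boolean masking sketch: a secret byte is split into randomly
# generated shares whose XOR equals the secret, so any proper subset of the
# shares is statistically independent of the secret.
import secrets

def mask(secret, n_shares=2):
    """Split `secret` (a byte) into n_shares random shares."""
    shares = [secrets.randbelow(256) for _ in range(n_shares - 1)]
    last = secret
    for s in shares:
        last ^= s
    return shares + [last]

def unmask(shares):
    """Recombine the shares by XOR to recover the secret."""
    acc = 0
    for s in shares:
        acc ^= s
    return acc

x = 0x3A
shares = mask(x, 3)
assert unmask(shares) == x
```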
[CRR14] J.-S. Coron, A. Roy, and S. Vivek. "Fast Evaluation of Polynomials over Finite Fields and Application to Side-channel Countermeasures". In CHES, 2014. To appear. URL sujet détaillé : http://www-salsa.lip6.fr/~perret/Site/stageEval.pdf
Remarques :





SM20747 FouCo


Description


Isabelle/HOL is a popular proof assistant based on higher-order logic. It has recently been extended with support for codatatypes and corecursion. The goal of this project would be to extend Isabelle so that it can process a larger class of corecursive functions, i.e., functions that produce possibly infinite values. This would enable Isabelle to go beyond the state of the art of its competitors (Agda, Coq, and Matita).
The implementation language is Standard ML (an Anglo-Saxon dialect of OCaml). The implementation would consist of two modules:
1. A command that would parse the specification of a function f, define f in terms of a so-called corecursor, and derive some theorems.
2. A command that would be used to derive the corecursor needed for point 1.
We explicitly do not expect the internship to result in the completion of both modules. The supervisor and one of the co-supervisors would also participate in the coding effort. URL sujet détaillé : http://www21.in.tum.de/~blanchet/corec.pdf
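For intuition only: corecursive functions build possibly infinite values by always being productive. A rough analogue in a mainstream language (illustrative; Isabelle's corecursor machinery is entirely different) is a lazy generator:

```python
# A "corecursive" stream: each step produces one element (the head) before
# continuing, so any finite prefix of the infinite value can be observed.
from itertools import islice

def fib_stream(a=0, b=1):
    while True:          # productive: always yields before continuing
        yield a
        a, b = b, a + b

print(list(islice(fib_stream(), 8)))  # [0, 1, 1, 2, 3, 5, 8, 13]
```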
Remarques : Co-supervisors: Andrei POPESCU and Dmitriy TRAYTEL.





SM20748 Relativized Implicit Computational Complexity and Declassification


Description


On the one hand, implicit computational complexity is the attempt to characterize complexity classes without having to refer to specific machines: statically analyzing a program may give bounds on the complexity of this program. On the other hand, confidentiality and integrity properties can be statically analyzed: each variable is given a security level (for the sake of simplicity, assume two levels: public and private) and various systems have been proposed to ensure that no information may leak from private to public variables. Both of these problems rely heavily on the non-interference property. Roughly, two variables are non-interferent when they are independent, i.e. when changing the value of one variable does not change the value of the other one. Several techniques of non-interference analysis have been successfully adapted to the field of implicit computational complexity: the fact that variables are non-interferent makes it possible to hierarchize computations into tiers.
Non-interference is too strict to cope with many everyday scenarios; consider for instance password checking: a secret value is compared to a public value and the result of the test is public. Many declassification frameworks have thus been proposed.
The aim of the internship is to explore the following idea: in some declassification settings, downgrading of information is possible through some fixed functions. Is it possible to reuse such frameworks from an implicit-complexity point of view, with these downgrading functions viewed as oracles? Is it possible to characterize complexity classes relativized to the oracle in this context?
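A toy illustration of the non-interference discipline (the variable names and the checker are ours, not from the references): every variable carries a level, and information may only flow from public to private, never the converse.

```python
# Each variable has a level; a flow target := f(sources) is legal when the
# target's level is at least the maximum level of its sources.
PUBLIC, PRIVATE = 0, 1

def check_flows(assignments, level):
    """assignments: list of (target, sources). Returns the targets of
    illegal (private -> public) flows."""
    leaks = []
    for target, sources in assignments:
        if any(level[s] > level[target] for s in sources):
            leaks.append(target)
    return leaks

level = {"password": PRIVATE, "guess": PUBLIC, "ok": PUBLIC, "log": PRIVATE}
prog = [("log", ["password"]),           # private -> private: fine
        ("ok", ["password", "guess"])]   # private -> public: illegal
print(check_flows(prog, level))  # ['ok']
```

The flagged assignment is exactly the password-checking situation above: making `ok` public requires a declassification step that plain non-interference forbids.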
References :
[1] Frédéric Prost: Enforcing Dynamic Interference Policy. SocialCom/PASSAT 2011.
[2] JeanYves Marion: A Type System for Complexity Flow Analysis. LICS 2011.
[3] Andrei Sabelfeld, David Sands: Declassification: Dimensions and principles. Journal of Computer Security 17(5): 517-548 (2009). URL sujet détaillé :
Remarques :





SM20749 Preventing SQL Injection Attacks using Information Flow Control


Description


Keywords : Security, SQL Injections, Access/Flow/Dissemination Control, Declassification
Over the last decade, the importance of web security has increased tremendously. In fact, all companies have a web portal and many collaborative services are nowadays provided by web applications. Software vulnerabilities like buffer overflows and format strings are less and less significant compared to higher-level attacks on web applications such as SQL injections, XSS, CSRF... In this work we focus on SQL injections.
An SQL injection is due to the fact that an application (e.g. in Java), responsible for most of the computations and input/output, interacts with a database application (e.g. in SQL), responsible for most of the data storage and retrieval. The Java application thus transmits SQL queries to the database without knowing what is being requested: an SQL query is just a string without any semantic meaning for the Java application (we voluntarily exclude parameterized queries, which only solve part of the problem). It is this blind interaction that allows malicious SQL queries (i.e. injections) to reach the DBMS without the Java application being able to detect them. Moreover, beyond the different languages used to develop these applications, this interaction is also complicated by the different paradigms used in the two programs: imperative for Java and declarative for SQL. In this work, we aim at preventing SQL injections by tracking end-to-end information flow through both the application and the database.
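The "blind" string interaction described above can be demonstrated in a few lines (Python with an in-memory SQLite table; the table and payload are ours, for illustration):

```python
# Contrast between a string-built query and a parameterized one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

payload = "x' OR '1'='1"  # attacker-controlled input

# Vulnerable: the query is just a string, so the input rewrites its structure.
rows = conn.execute(
    "SELECT secret FROM users WHERE name = '%s'" % payload).fetchall()
print(len(rows))  # 1 -- the injected OR clause matched every row

# Parameterized: the input stays data (though, as noted above, this only
# solves part of the problem).
rows = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (payload,)).fetchall()
print(len(rows))  # 0
```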
Many contributions have been proposed to deal with information flow control in imperative or declarative programs, but only a few of them [1,6] have tracked end-to-end information flow through the whole program (both imperative and declarative). The main goal of this work is to study this interaction and to propose an approach to track data from their input in the imperative system to their storage in the declarative system, and conversely to track the data output through the declarative layer and then through the imperative one. For this purpose, the dissemination control model TBAC [2], proposed for declarative systems, will be adapted and used throughout the whole system.
During this internship, the student will have to conduct a literature review on SQL injection countermeasures [7], on decentralized information flow control [3], dissemination control [4] and also on declassification [5] which is a central point in the TBAC approach. Then the student will have to propose a common imperative/declarative formalism tracking information flows through the application and the database and taking declassification into account. Finally, a prototype will be developed to validate the proposed approach and to evaluate the performance overhead. Please note that this internship may lead to a PhD thesis.
References : [1] David Schultz, Barbara Liskov. IFDB: decentralized information flow control for databases. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys), 2013.
[2] François Lesueur, Sabina Surdu, Romuald Thion, Yann Gripay, Meriam Ben GhorbelTalbi. Palpable Privacy through Declarative Information Flows Tracking for Smart Buildings. In Proceedings of the International Conference on Availability, Reliability and Security (ARES), Suisse, 2014.
[3] Andrew C. Myers, Barbara Liskov. A Decentralized Model for Information Flow Control. In Proceedings of the ACM Symposium on Operating Systems Principles, France, 1997.
[4] Roshan K. Thomas, Ravi Sandhu. Towards a Multidimensional Characterization of Dissemination Control. Proceedings of the Fifth IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2004.
[5] Andrei Sabelfeld, David Sands. Dimensions and Principles of Declassification. In Proceedings of the IEEE Computer Security Foundations Workshop, France, 2005.
[6] Peng Li and S. Zdancewic. Practical information flow control in Web-based information systems. 18th IEEE Workshop on Computer Security Foundations (CSFW), 2005.
[7] William G.J. Halfond, Jeremy Viegas, and Alessandro Orso. A Classification of SQL Injection Attacks and Countermeasures. Proceedings of the IEEE International Symposium on Secure Software Engineering, USA, 2006. URL sujet détaillé :
Remarques : This internship will be paid and co-supervised by Meriam Talbi.





SM20750 Optimizing avatars lifecycle in a Cloud infrastructure


Description


Avatars are virtual abstractions to extend physical objects on the Web. They are a central piece of the ASAWoO project (http://liris.cnrs.fr/asawoo/). A cloud infrastructure is an ideal location for avatars, since they can rely on the almost unlimited resources provided by interconnected cloud infrastructures. We therefore need to study several aspects of the lifecycle of these avatars, ranging from creation and destruction to migration, while considering resource usage (energy consumption, storage space, bandwidth usage, processing power, ...). Thus, we need to study how to optimize avatar instantiation, deployment, execution and hibernation in a cloud infrastructure, and ultimately how to migrate an avatar between two infrastructures when required.
The cloud software architecture will be built on top of an existing infrastructure currently hosted at the computer science department of Université Claude Bernard Lyon 1. This infrastructure is currently composed of 10 nodes based on hexa-processor machines, providing in total 640 GB RAM and 52 TB disk space. The cloud solution currently running is the free, open-source OpenStack.
Requirements: some knowledge of OSGi and OpenStack is preferred
Resources and links :
Orgerie, A.-C.; Lefevre, L.; Gelas, J.-P., "Demystifying energy consumption in Grids and Clouds", 2010 International Green Computing Conference, pp. 335-342, 15-18 Aug. 2010.
Lefevre, L.; Orgerie, A.-C., "Designing and evaluating an energy efficient Cloud", The Journal of Supercomputing, vol. 51, n. 3, March 2010.
Koomey, J., "Growth in data center electricity use 2005 to 2010", Analytics Press, 2011.
Orgerie, A.-C.; Dias de Assunção, M.; Lefèvre, L., "A Survey on Techniques for Improving the Energy Efficiency of Large Scale Distributed Systems", ACM Computing Surveys, 2013.
Topic: TIWE
URL sujet détaillé :
Remarques : Co-supervision: Jean-Patrick Gelas





SM20751 Asynchronous convergence of monotone Boolean networks


Description


Boolean networks are finite dynamical systems that consist of a finite number of binary variables evolving, in discrete time, through mutual interactions and according to a fixed law. They are used in many disciplines for their ability to model nonlinear behaviors. The most famous applications are in biology, where they provide classical models for the dynamics of neural and gene networks.
This internship consists in studying the asynchronous dynamics of monotone Boolean networks, which present an interesting intermediate level of complexity and have interesting applications in the context of biological networks. It has been proved in [1] that, given a monotone Boolean network and an initial configuration x, the set R of configurations reachable from x by some asynchronous updating has the following properties:
1/ R contains a fixed point reachable from x by updating each component at most once;
2/ R contains a unique maximal fixed point and a unique minimal fixed point, both reachable from x by updating each component at most twice.
These results suggest the following natural question, which is the core of the subject: is every fixed point of R reachable from x by updating each component a polynomial number of times?
The study could start with AND-OR nets, i.e. monotone Boolean networks in which each local transition function is either a disjunction or a conjunction of positive literals. These networks can thus be represented by a directed graph with two types of vertices (AND-vertices and OR-vertices), and the use of concepts from graph theory could allow rapid and informative progress on the main question, which is difficult in the general case.
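A small sketch of asynchronous updating (the particular AND-OR net and the greedy update policy are ours, for illustration; they are not the update schemes of [1]):

```python
# Asynchronously update one out-of-date component at a time until reaching a
# fixed point; for the small monotone network below this terminates quickly.
def async_fixed_point(f, x, max_steps=1000):
    """Returns (fixed point, number of component updates performed)."""
    x = list(x)
    for steps in range(max_steps):
        fx = f(x)
        changed = [i for i in range(len(x)) if fx[i] != x[i]]
        if not changed:
            return tuple(x), steps
        x[changed[0]] = fx[changed[0]]   # asynchronous: one component per step
    raise RuntimeError("no fixed point reached")

# x0 = x1 OR x2, x1 = x0 AND x2, x2 = x2 (all local functions are monotone)
f = lambda x: [x[1] or x[2], x[0] and x[2], x[2]]
print(async_fixed_point(f, (0, 0, 1)))  # ((1, 1, 1), 2)
```

Note that here each component is updated at most once, matching property 1/ above; the open question concerns how many updates per component are needed to reach an arbitrary fixed point of R.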
[1] T. Melliti, D. Regnault, A. Richard and S. Sené. On the convergence of Boolean automata networks without negative cycles. In J. Kari, M. Kutrib, A. Malcher (eds.), Cellular Automata and Discrete Complex Systems (Proceedings of AUTOMATA 2013), Springer Berlin Heidelberg, LNCS 8155, 124-138. PDF version: http://www.i3s.unice.fr/~richard/
URL sujet détaillé : http://www.i3s.unice.fr/~richard/stageM2/stage_M2_ENS_LYON.pdf
Remarques : Remuneration: standard internship gratification (€436.05)





SM20752 Generation of RDF triples from raw sensor data


Description


The objective of this MSc topic is to implement a tool capable of processing massive raw data gathered from primary sources of all kinds, such as sensor devices, surveillance videos, or manually entered inputs, and cleaning it up into uniform RDF representations. The generated RDF data is to form triplestores encompassing data to be analyzed and further processed into intensional knowledge characterizing the sort of specific information this data relies on. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic1.pdf
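A minimal sketch of the sensor-reading-to-RDF step (plain Python, no RDF library; the vocabulary URIs and field names are invented for the example):

```python
# Turn one raw sensor reading into an N-Triples line:
# subject (the sensor), predicate (the measured property), object (the value).
def sensor_to_ntriples(sensor_id, prop, value):
    s = "<http://example.org/sensor/%s>" % sensor_id
    p = "<http://example.org/vocab#%s>" % prop
    o = '"%s"' % value
    return "%s %s %s ." % (s, p, o)

raw = {"id": "t42", "temperature": "21.5"}
triple = sensor_to_ntriples(raw["id"], "temperature", raw["temperature"])
print(triple)
# <http://example.org/sensor/t42> <http://example.org/vocab#temperature> "21.5" .
```

A real tool would of course use an RDF library, handle datatypes and units, and deduplicate URIs, but the triple structure is the essential output format.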
Remarques :





SM20753 Derivation of ontologies from RDF triplestores


Description


This MSc topic concerns the analysis of RDF triplestores generated from raw sensor data in order to derive terminological schemas that the generated RDF triples abide by. The objective is to categorize these triplestore data using classes and attributes fitting their structure using Formal Concept Analysis (FCA). The outcome is to constitute a terminological schema for this data that can be used for reasoning about such data. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic2.pdf
Remarques :





SM20754 Use of terminological knowledge for monitoring applications


Description


This topic's objective is to develop a use case exploiting terminological knowledge over monitoring data, with reasoning based on Order-Sorted Feature constraint logic for smart information retrieval, data analytics, and query processing. The objective is to demonstrate how to leverage the capabilities brought by knowledge-based processing of monitoring data for intelligent applications, with a convincing use case based on actual gathered data. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic3.pdf
Remarques :





SM20755 DYNFO and Regular Trees


Description


The seminal result of Courcelle et al. states that every MSOL (optimisation/counting) property can be solved in polynomial time on graph classes of bounded clique-width. Interesting as this is, many objects in real life are dynamic (for instance, databases subject to updates), and recomputing a decomposition whenever the database is modified is counterproductive. Instead, we would like to maintain the answers as the database changes. This situation is addressed by the introduction of DynFO, the class of properties that can be maintained and queried in first-order logic. Since its introduction, the main interest has been in a small set of graph properties and in regular words. The goal is to extend the study to regular trees and then to structures that are images of trees (graphs of bounded tree-width/clique-width, matroids/hypergraphs of bounded width, etc.).
For more details see the link below. URL sujet détaillé : http://www.isima.fr/~kante/cours/DynFO.pdf
Remarques : funding is possible (around 437 euros)





SM20756 Algorithms for the Traveling Salesman Problem


Description


The well-studied symmetric Traveling Salesman Problem (STSP) is to find a minimum-cost tour in an undirected, weighted complete graph in which the edge weights are nonnegative and obey the triangle inequality. While the problem is NP-complete, there is a simple algorithm with an approximation guarantee of 2 based on doubling a minimum spanning tree. A slightly more sophisticated algorithm due to Christofides shows that the problem can be approximated to within a factor of 3/2. Improving on this factor is a well-known open problem.
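The tree-doubling 2-approximation mentioned above can be sketched compactly (the metric instance below is ours, for illustration): build a minimum spanning tree, then shortcut the doubled-tree Euler tour into a Hamiltonian tour via a preorder walk. Since the MST weighs at most as much as the optimal tour, and shortcutting never increases cost under the triangle inequality, the tour costs at most twice the optimum.

```python
# MST (Prim) + preorder shortcut: the classical 2-approximation for metric TSP.
def mst_preorder_tour(n, w):
    """w[i][j]: symmetric weights obeying the triangle inequality."""
    in_tree, tree = {0}, {i: [] for i in range(n)}
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        a, b = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: w[e[0]][e[1]])
        tree[a].append(b)
        in_tree.add(b)
    # Preorder walk of the tree = shortcut of the doubled-tree Euler tour.
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(tree[v]))
    return tour

# Four points on a unit square (Euclidean distances, hence metric).
pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
w = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts]
     for ax, ay in pts]
tour = mst_preorder_tour(4, w)
cost = sum(w[tour[k]][tour[(k + 1) % 4]] for k in range(4))
print(tour, cost)
```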
In the last few years, there have been many exciting results (e.g. Moemke-Svensson 2011, Mucha 2012, Sebo-Vygen 2012) pertaining to a special case known as Graph-TSP, where the edge weights correspond to the shortest path distances in a given unweighted graph. The current best-known approximation guarantee for this problem stands at 1.4. These new approaches have resulted in many interesting related questions in graph theory.
The goal of this project is to study the recent developments and related questions.
Requirement: interest in Algorithms and Graph Theory. URL sujet détaillé :
Remarques :





SM20757 Formal proofs for real-time analysis using Coq


Description


Scientific context: Real-time systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is not trivial because tasks executing on the same processor may have complex timing influence on each other due to scheduling. Busy window analysis [1] is a well-established method, based on a fixpoint computation, to compute bounds on worst-case response times in uniprocessor real-time systems. It was originally introduced for the Static Priority Preemptive scheduling policy (SPP), then reused for the non-preemptive case (SPNP) [2], and has since been an inspiration for the analysis of many new scheduling policies. However, these proofs are too often somewhat informal. For example, the original SPNP analysis had a flaw which was discovered only years after its original publication [3].
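The fixpoint computation at the heart of this analysis can be sketched in its classical SPP response-time form (the task set and the numbers are ours, for illustration): the worst-case response time R of task i satisfies R = C_i + sum over higher-priority tasks j of ceil(R / T_j) * C_j.

```python
# Classical response-time fixpoint iteration for SPP scheduling.
from math import ceil

def response_time(C, T, i, limit=10_000):
    """Tasks are indexed by decreasing priority; C = execution times,
    T = periods. Returns R_i, or None if the iteration exceeds `limit`."""
    R = C[i]
    while R <= limit:
        nxt = C[i] + sum(ceil(R / T[j]) * C[j] for j in range(i))
        if nxt == R:          # fixpoint reached
            return R
        R = nxt
    return None

C, T = [1, 2, 3], [4, 6, 12]   # execution times and periods
print([response_time(C, T, i) for i in range(3)])  # [1, 3, 10]
```

Formalizing in Coq why this iteration converges to a sound bound (monotonicity of the interference term, existence of the least fixpoint) is exactly the kind of argument the thesis would make precise.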
Thesis subject: The objective of this thesis is to initiate formal support, based on the proof assistant Coq [4], for these analyses. The student will have to: get acquainted with the SPP and SPNP scheduling policies, busy window analysis and its extension to arbitrary event models [5]; identify the proper concepts and lemmas of the original proofs; and formalize them in Coq. A particularly interesting side-effect of this work would be to identify the key properties of the concept of busy window so that it can be easily adapted for proofs of other scheduling policies.
Bibliography:
[1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[2] Ken Tindell, Alan Burns: Guaranteeing message latencies on Controller Area Network (CAN). Proceedings of the 1st international CAN conference (1994)
[3] Robert I. Davis, Alan Burns, Reinder J. Bril, Johan J. Lukkien: Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised. Real-Time Systems 35(3): 239-272 (2007)
[4] http://coq.inria.fr/
[5] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20758 Incremental proofs for real-time analysis à la SymTA/S


Description


Scientific context: Real-time systems must satisfy timing requirements such as response time bounds, i.e., bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is non-trivial as tasks executing on the same processor may have complex timing influence on each other due to scheduling. Furthermore, such effects propagate through the system, so that tasks executing on different processors may also influence each other. Compositional Performance Analysis (CPA), best known through its commercial implementation SymTA/S [1], is a compositional approach for computing such bounds. It is based on a fixpoint computation at the system level on top of busy window analyses [2] at the processor level, which means that there are nested fixpoint computations. In the current state of the art, any change to the system requires performing a new analysis from scratch.
Thesis subject: The objective of this master thesis is to establish simple criteria guaranteeing that it is correct to reuse, for a response-time analysis, the fixpoint obtained by a previous analysis. The student will have to: understand the principles of CPA and get acquainted with its free Python-based implementation pyCPA [3]; establish a list of structural changes (adding/removing a task, replacing/remapping a task, etc.) for which previous results may be reused, and formally prove the correctness of the approach; implement the established criteria in pyCPA and quantify the efficiency gain compared to the state of the art.
Bibliography:
[1] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
[2] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[3] pycpa.readthedocs.org/
URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis on a related topic. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20759 Formal mechanized models for distributed algorithms


Description


Distributed computing is one of the domains where informal reasoning is not an option, in particular when Byzantine failures are involved. Distributed computing is also characterised by its diversity of models, subtle modifications of which induce radical changes in system behaviour. We consider robot networks, that is, swarms of *autonomous* mobile entities that have to accomplish some task in *cooperation*. In this emerging framework, models can be distinguished by the capabilities of the robots, the topology of the considered space, the level of synchrony (that is, the properties of a demon), the type of failures likely to occur, etc.
We are interested in obtaining formal, and moreover mechanical, guarantees of properties for certain protocols, using the Coq proof assistant. A recently proposed Coq framework for robot networks can express quite a few variants of models for such networks, in particular regarding topology or demon properties. This framework is higher order and based on coinductive types. It allows us to prove in Coq positive results (the task will be fulfilled using the algorithm embedded in all robots, *for all* initial configurations) as well as negative results (*there cannot be any* embedded algorithm that works for this task for all initial configurations).
An objective of this proposal is to extend this framework to some particular topologies (higher dimensions) so as to propose a framework in which some new results could be formally established, in particular regarding the election of a leader, or the formation of geometrical patterns. Those two problems have been shown equivalent (meaning that there is an algorithm that solves one of them if one has as an input a solution for the other one) under some particular assumptions. Extending this equivalence to less constrained models is an interesting open problem. URL sujet détaillé : http://pactole.lri.fr/pub/M2_1415.pdf
Remarques : Co-supervised with:
- Pierre Courtieu (pierre.courtieu.fr)
- Sébastien Tixeuil (sebastien.tixeuil.fr)
Coq skills required.





SM20760 Strategies to deal with deadline misses in a weakly-hard real-time system


Description


Scientific context: Real-time systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. For some systems, called weakly-hard, the specified timing requirements may be violated once in a while, as long as the number of violations can be bounded. Establishing such properties is not trivial because tasks executing on the same processor may have complex timing influence on each other due to scheduling. Busy window analysis [1] is a well-established method, based on a fixpoint computation, to compute bounds on worst-case response times in uniprocessor real-time systems. In the current state of the art, the analyzed scheduler is deadline-agnostic: whenever a timing requirement is violated because the specified bound on response times has been exceeded, the task experiencing this violation still runs to completion. An alternative would be to kill the belated execution, which would reduce the delay experienced by subsequent executions.
Thesis subject: The objective of this thesis is to adapt busy window analysis for the Static Priority Preemptive (SPP) scheduling policy, assuming that executions are stopped as soon as they miss their deadline. The student will have to: get acquainted with the SPP scheduling policy, busy window analysis and weakly-hard systems analysis; adapt the standard response time analysis to support killing of task executions missing their deadline, as well as the corresponding weakly-hard analysis; implement the proposed method within the free Python-based response time analysis tool pyCPA [2] and test it on existing test cases. An interesting extension of this work would be to extend the approach to arbitrary event models (as opposed to fully periodic tasks) [3]. The long-term goal is to combine these results with an analysis of the impact of deadline misses on the controller receiving the output generated by the analyzed tasks.
Bibliography:
[1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[2] pycpa.readthedocs.org/
[3] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20761 Efficient symbolic resolution of systems of linear differential and difference equations


Description


In the context of problems involving solutions of linear differential or difference equations, computer algebra often crucially relies on the resolution of auxiliary systems in terms of rational functions. The simplest approach to this problem is to uncouple the system, i.e., to obtain a scalar equation of higher order for each component of the vector solutions. Rational solutions are then computed using algorithms due to Abramov for the resolution of scalar equations. The uncoupling procedure by itself is already time-consuming, so it seems advantageous to use direct solution algorithms that avoid uncoupling. The basis of such an algorithm exists (Barkatou, 1999), but it remains to adapt it to applications and to precisely quantify the expected gain.
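To fix ideas, here is uncoupling in its simplest instance (a toy system of our choosing, not from the references):

```latex
% The first-order system
\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'
  = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
    \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
\]
% gives $y_1' = y_2$ and $y_2' = y_1$, hence the scalar second-order equation
\[
  y_1'' = y_1 ,
\]
% to which Abramov's algorithm for rational solutions can be applied (here the
% only rational solution is $0$, the solution space being spanned by $e^x$ and
% $e^{-x}$). A direct method à la Barkatou instead works on the system itself,
% avoiding the cost of producing the scalar equation.
```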
The internship is expected to complete two goals. First, the algorithm described in (Barkatou, 1999) will be validated by an implementation in Maple, first in its simplest version, then in its parameterized version. This parameterized version will then be integrated in the Mgfun package that implements algorithms by (Chyzak, 2000) for the symbolic summation and integration in Maple.
- Moulay A. Barkatou: On rational solutions of systems of linear differential equations. Journal of Symbolic Computation 28 (1999), 547-567. - Frédéric Chyzak: An extension of Zeilberger's fast algorithm to general holonomic functions. Discrete Mathematics 217 (2000), 115-134. URL sujet détaillé : http://specfun.inria.fr/internships/
Remarques : Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20762 Efficient computation of power series coefficients


Description


The need to compute a single coefficient of a given formal power series is frequent in combinatorics. Indeed, in many cases, the enumeration of objects subject to certain structural constraints reduces to extracting a coefficient of a series. In more elaborate contexts, several parameters are involved, and the corresponding power series have several variables. The question is then to extract the coefficient of a monomial in several variables. For obvious efficiency reasons, it is important to be able to compute this coefficient without computing all previous coefficients. This internship focuses on two fast algorithms, due to Massazza and Radicioni, for extracting such coefficients for classes of power series whose coefficients are related by a set of linear recurrences. The main objective of the internship is to implement these algorithms and compare them on various applications.
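For the simpler C-finite case (constant-coefficient recurrences), a single coefficient can already be extracted without computing all previous ones, by binary powering of the companion matrix. A minimal Python sketch with Fibonacci as the running example (an illustration of the principle, not the Massazza-Radicioni algorithms):

```python
def mat_mul(A, B):
    # 2x2 integer matrix product
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0],
             A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0],
             A[1][0] * B[0][1] + A[1][1] * B[1][1]]]

def mat_pow(M, n):
    # binary powering: O(log n) matrix products
    R = [[1, 0], [0, 1]]
    while n:
        if n & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        n >>= 1
    return R

def fib(n):
    # n-th coefficient of x/(1 - x - x^2), without the previous ones
    return mat_pow([[1, 1], [1, 0]], n)[0][1]
```

For example, `fib(30)` returns 832040 after only about log2(30) matrix products.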
- P. Massazza and R. Radicioni: On computing the coefficients of rational formal series. In: Formal Power Series and Algebraic Combinatorics, Vancouver, 2004. - P. Massazza and R. Radicioni: On computing the coefficients of bivariate holonomic formal series. Theoretical Computer Science 346 (2005), no. 2-3, 418-438. URL sujet détaillé : http://specfun.inria.fr/internships/
Remarques : Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20763 Self-modifying code for the security of embedded systems


Description


The CEA is developing a technology for runtime code generation around a tool called deGoal, whose initial motivation is the observation that execution performance may be strongly correlated with the characteristics of the data to process (cf. related publications on this website).
Preliminary developments have been carried out to achieve the objectives of the project: the deGoal framework has been extended so that code polymorphism can be applied to constrained embedded platforms such as microcontrollers; a preliminary validation of our approach has been performed on AES, and experimental measurements have been carried out.
The objectives of this internship are to continue the implementation of our software solution for code polymorphism, and to focus in particular on the performance of runtime code generation to reduce the overall overhead of our approach.
The intern will be in charge of:
- identifying the main performance issues in the current implementation
- analysing the potential degree of polymorphism achievable with our solution
- designing and implementing more efficient solutions, assessed by the number of polymorphic instances that can be generated with the new solution
- validating the approach experimentally
URL sujet détaillé : http://www.cogitoanr.fr/posts/internshippositionatceagrenoble.html
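The idea of code polymorphism can be illustrated with a toy runtime code generator (a hypothetical Python sketch, not deGoal, which emits native code): each generated instance is syntactically different but semantically equivalent.

```python
import random

def make_variant(rng):
    # Generate a semantically equivalent variant of f(x) = x + 42 by
    # splitting the constant at random: every generated instance differs
    # syntactically (toy example only, far simpler than real polymorphic
    # code generation against side-channel observation).
    a = rng.randrange(0, 43)
    src = f"def f(x):\n    t = x + {a}\n    return t + {42 - a}\n"
    namespace = {}
    exec(src, namespace)
    return namespace["f"]

rng = random.Random(1)
variants = [make_variant(rng) for _ in range(5)]
```

All five variants compute the same function, while an attacker observing the generated code sees different instances each time.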
Remarques :





SM20764 Algorithmic proofs for the transcendence of power series


Description


A formal power series is called algebraic if it is a root of a polynomial, and transcendental otherwise. Establishing the transcendence of a generating function is a constant need in enumerative combinatorics. The goals of the internship are the study, comparison and implementation of algorithmic methods that answer the following question in a uniform manner: given a sequence of rational numbers satisfying a linear recurrence with polynomial coefficients, decide the transcendence of its generating function. The question is nontrivial even for sequences satisfying a first-order recurrence. A classical algorithm is sufficient, in principle, to answer the general case (Singer, 1979). However, this algorithm suffers from too high a complexity to be effective in practice. A more recent method has been proposed, on a nontrivial combinatorial example, in (Bostan et al., 2014). The central objectives of this internship are to understand the new method, to translate it into an explicit algorithm working uniformly on a class of inputs to be defined, to implement this algorithm, and to test it and measure its theoretical and practical efficiency.
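For contrast, *algebraicity* can be certified by exhibiting a polynomial relation. A small Python check, chosen here purely as an illustration: the Catalan generating function C(x) satisfies x*C(x)^2 - C(x) + 1 = 0, which we verify on truncated series with exact integer arithmetic.

```python
N = 12
# Catalan numbers via their convolution recurrence c[n+1] = sum c[i]*c[n-i]
c = [1]
for n in range(N):
    c.append(sum(c[i] * c[n - i] for i in range(n + 1)))

# truncated series square: (C^2)[n] = sum_i c[i]*c[n-i]
sq = [sum(c[i] * c[n - i] for i in range(n + 1)) for n in range(N + 1)]

# coefficients of x*C(x)^2 - C(x) + 1: all must vanish if C is a root
relation = [(sq[n - 1] if n >= 1 else 0) - c[n] + (1 if n == 0 else 0)
            for n in range(N + 1)]
```

Every entry of `relation` is 0, certifying the algebraic relation up to order N; proving *transcendence* is exactly the harder converse this internship addresses.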
- A. Bostan, M. Bousquet-Mélou, M. Kauers, S. Melczer: On 3-dimensional lattice walks confined to the positive octant (2014), http://arxiv.org/abs/1409.3669. - M. F. Singer: Algebraic Solutions of nth Order Linear Differential Equations. Queen's Papers in Pure and Applied Mathematics, 54 (1979), 379-420. URL sujet détaillé : http://specfun.inria.fr/internships/
Remarques : Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20765 Verification of Distributed Systems with Parameterized Network Topology


Description


We consider distributed systems that can be run on arbitrary network topologies from a given class of architectures (e.g., on all pipelines, all grids, or all ring topologies). Typically, such a program is given by a single sequential process, and a copy of that process can be run on any node in the given network topology.
Parameterized systems are hard to analyze, and their verification problems quickly turn undecidable. There have been several orthogonal approaches to regain decidability: (i) restricting the amount of information that can be exchanged (e.g., unary tokens instead of binary ones), (ii) relaxing point-to-point communication towards broadcasting or nondeterministic choice, or (iii) imposing a bound on the number of contexts that a process traverses; a context may, e.g., restrict communication to one fixed neighbor, or to one type of communication such as "send". In the internship, we propose to investigate some extensions of (iii). See also the detailed description of the internship and the references therein. URL sujet détaillé : http://www.lsv.enscachan.fr/~bollig/MPRI/stageM2.pdf
Remarques : Co-supervised by Benedikt Bollig: http://www.lsv.enscachan.fr/~bollig/





SM20766 Chords for the dynamic analysis of 2D and 3D shapes


Description


Whether for 2D/3D shape analysis or for interactive modeling, it is important to be able to characterize the thickness of a geometric object locally. A classical approach consists, at a point of the surface, in tracing a polyline inside the shape so as to reach a point located on the opposite side. From this polyline, called a chord, several pieces of geometric information can be extracted: its length as an estimate of the local thickness, its center as a point of an approximation of the medial axis of the shape, etc.
If we consider a set of chords issued from a set of points on the surface, we can also extract global descriptors of the 2D or 3D shape (thickness distribution). Moreover, in an interactive modeling context where the user deforms the surface, these chords are crucial for detecting topological changes (self-intersections, etc.).
Various works in the literature have studied these structures. The objectives of the internship are the following:
- From an algorithmic point of view, several approaches exist for constructing these chords (along the normal direction at the point, first contact point by ball growing, filtering of the medial-axis balls, etc.). A first step will be to implement these approaches in order to evaluate their costs.
- In a dynamic context, how will the set of chords evolve, and which events will this set undergo? This analysis will in particular be used to maintain a chord structure while the surface of the object is being deformed.
- For shape analysis via the thickness function, how robust/stable are these different approaches in the presence of noise?
- Can this chord system serve as the basis of an indexing and recognition process for 3D shapes?
The internship will take place at the LIRIS laboratory, Lyon 1 (Nautibus building).
References: D. Coeurjolly: Fast and Accurate Approximation of Digital Shape Thickness Distribution in Arbitrary Dimension. Computer Vision and Image Understanding (CVIU), Volume 116, Issue 12, December 2012, pp. 1159-1167. Shapira, L., Shamir, A., & Cohen-Or, D. (2008). Consistent mesh partitioning and skeletonisation using the shape diameter function. The Visual Computer, 24(4), 249-259. Rolland-Neviere, X., Doërr, G., & Alliez, P. (2013). Robust diameter-based thickness estimation of 3D objects. Graphical Models, 75(6), 279-296. Rémi Allègre, Raphaëlle Chaine, and Samir Akkouche: A Dynamic Surface Reconstruction Framework for Large Unstructured Point Sets. Computers & Graphics, vol. 31, issue 2. URL sujet détaillé :
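A minimal 2D sketch of the normal-direction chord construction (a hypothetical example on a convex polygon with floating-point arithmetic; the robustness questions raised above are deliberately ignored):

```python
def ray_hit(p, d, a, b):
    # Intersect the ray p + t*d (t > 0) with segment [a, b]; return t or None.
    (px, py), (dx, dy), (ax, ay), (bx, by) = p, d, a, b
    ex, ey = bx - ax, by - ay
    denom = dx * ey - dy * ex
    if abs(denom) < 1e-12:          # ray parallel to the segment
        return None
    wx, wy = ax - px, ay - py
    t = (wx * ey - wy * ex) / denom  # position along the ray
    s = (wx * dy - wy * dx) / denom  # position along the segment
    return t if t > 1e-9 and 0.0 <= s <= 1.0 else None

def chord_length(polygon, p, inward_normal):
    # Shoot the (unit) inward normal from surface point p; the first
    # boundary hit closes the chord, whose length estimates the local
    # thickness at p.
    n = len(polygon)
    hits = [t for i in range(n)
            if (t := ray_hit(p, inward_normal,
                             polygon[i], polygon[(i + 1) % n])) is not None]
    return min(hits) if hits else None

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

On the unit square, the chord shot from (0.5, 0) along (0, 1) has length 1, the expected local thickness.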
Remarques : Co-supervised by Raphaëlle Chaine. Internship allowance provided.





SM20767 Semi-computable geometry


Description


The internship is about computability theory over the real numbers (and not algorithmic geometry). More specifically, it is about the computability of simple geometrical figures in the plane. A subset of the plane is computable if there is a program that draws a pixelized version of the set at any scale on a screen (it colors white the pixels that intersect the set, and the others black). A subset of the plane is semi-computable if there is a program that, at each scale, progressively fills in the set on the screen (but we never know when the coloration is finished). For instance, some sets appearing in complex analysis (Julia sets) are semi-computable but not computable.
The goal is to understand what it means for a simple geometrical figure, such as a filled triangle, to be semi-computable. A filled polygon is computable iff its coordinates are computable real numbers. However, such a characterization is lacking in the case of semi-computable geometrical figures. Many questions arise and will be studied during the internship:
- How is the semi-computability of a triangle reflected in its coordinates?
- Is it possible to have a parametrization of triangles in such a way that the semi-computable triangles are exactly the triangles with semi-computable parameters (a real number x is semi-computable from below if there is a program semi-deciding the set of rational numbers below x)?
- If it is possible, what is the minimal number of parameters?
- What about other geometrical figures, such as disks?
Concepts from advanced computability theory will naturally come in when studying this problem. URL sujet détaillé : http://www.loria.fr/~hoyrup/geometry.pdf
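To make the pixel semantics concrete, here is a toy renderer for one *computable* figure, the filled triangle with vertices (0,0), (1,0), (0,1) (an illustration of the definition with exact rational tests, not a treatment of the semi-computable case):

```python
from fractions import Fraction

def pixel_intersects(i, j, n):
    # Closed triangle {x >= 0, y >= 0, x + y <= 1}; the pixel is the
    # square [i/n, (i+1)/n] x [j/n, (j+1)/n].  In the first quadrant the
    # pixel meets the triangle iff its corner nearest the origin
    # satisfies x + y <= 1.  Fractions keep the test exact, as a
    # computability argument demands.
    return Fraction(i, n) + Fraction(j, n) <= 1

n = 4  # scale: a 4x4 pixelization of the unit square
white = [(i, j) for i in range(n) for j in range(n)
         if pixel_intersects(i, j, n)]
```

At scale n = 4, 13 of the 16 pixels intersect the triangle and are colored white; for a semi-computable figure only the progressive filling, without a halting guarantee, would be available.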
Remarques :





SM20768 Geodesics on digital surfaces


Description


In many 3D shape analysis and processing applications, computing geodesics on the surface of objects is a very important preliminary problem. Being able to construct, on the surface, the shortest path between two points is a variational problem that is complex from both a theoretical and an algorithmic point of view.
Recently, approaches based on vector field optimization have been proposed with an even more general objective: no longer constructing the shortest path between two points, but the optimal transport between two measures on the surface.
The applicative context of this internship is the analysis of fibrous structures (cf. image), for which the construction of a metric intrinsic (geodesic) to the structure is crucial for the graph-based reconstruction of these objects, or for the simulation of thermoelectric diffusion in the material.
In the case of digital surfaces (boundaries of objects in Z^3), first experiments with these tools have been carried out (within the open-source DGtal library). The work will consist in:
- Continuing the experiments and in particular optimizing the computations (in time and in memory).
- In the case of digital surfaces, can convergence properties be obtained on these structures?
- Using these tools for the characterization of fibrous materials.
References: Solomon, J., Rustamov, R., Guibas, L., & Butscher, A. (2014). Earth mover's distances on discrete surfaces. ACM Transactions on Graphics (TOG), 33(4), 67. Crane, K., Weischedel, C., & Wardetzky, M. (2013). Geodesics in heat: a new approach to computing distance based on heat flow. ACM Transactions on Graphics (TOG), 32(5), 152. DGtal: http://dgtal.org URL sujet détaillé : http://liris.cnrs.fr/david.coeurjolly/sujets/
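As a crude baseline (not the heat method of the references), geodesic distances on a digital surface can be approximated by shortest paths in the adjacency graph of surface elements. A hedged Python sketch with Dijkstra's algorithm on a toy 4-connected grid standing in for a surfel graph:

```python
import heapq

def dijkstra(adj, source):
    # Shortest-path distances from `source` in a weighted graph given as
    # adj[u] = [(v, weight), ...] -- a rough geodesic approximation on
    # the adjacency graph of surface elements.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# toy "surface": a 3x3 grid of cells, 4-connectivity, unit edge weights
adj = {(i, j): [((i + di, j + dj), 1.0)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < 3 and 0 <= j + dj < 3]
       for i in range(3) for j in range(3)}
```

Graph distances like these overestimate true geodesics on a digitized surface (the staircase effect), which is precisely why convergence properties are worth investigating.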
Remarques : Internship allowance provided.





SM20769 Study of algorithms for different classes of stochastic games, and implicitly represented games


Description


We have a team at PRiSM working on algorithmic game theory. We are interested in games with two players and randomness. The first task will be to compare the different methods for computing optimal strategies in all possible variations of these games (the most classical examples being parity games and simple stochastic games). The aim is to understand what really makes a stochastic game hard (these games are not NP-complete, but no polynomial algorithm for finding optimal strategies is known for all but the most contrived classes of games).
Then we will try to give a relevant notion of games represented implicitly (for instance a graph whose states are given in some simple implicit manner). We will try to adapt the methods useful for classical stochastic games to these implicit games. We plan to use this kind of modeling for an application to traffic management algorithms.
We also plan to devise new methods to compute optimal strategies in classical games. We will try to adapt learning algorithms such as regret minimization to compute a sequence of strategies converging to the optimal strategy.
See the more detailed French subject at http://www.prism.uvsq.fr/~ystr/stages.html URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/stages.html
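The classical baseline such comparisons start from is value iteration. A minimal Python sketch on a hypothetical simple stochastic game (MAX/MIN/AVERAGE vertices plus 0/1 sinks; illustrative only, and value iteration is not guaranteed to be polynomial):

```python
def value_iteration(kind, succ, sinks, iters=100):
    # kind[v] in {'max', 'min', 'avg'}; sinks maps terminal vertices to
    # payoffs.  Repeatedly apply the one-step optimality operator.
    v = {u: 0.0 for u in kind}
    v.update(sinks)
    for _ in range(iters):
        for u, k in kind.items():
            vals = [v[w] for w in succ[u]]
            if k == 'max':
                v[u] = max(vals)
            elif k == 'min':
                v[u] = min(vals)
            else:                     # average vertex: coin flip
                v[u] = sum(vals) / len(vals)
    return v

# toy game: MAX vertex 'm' chooses between an average vertex 'a'
# (fair coin between the two sinks) and the losing sink directly
kind = {'m': 'max', 'a': 'avg'}
succ = {'m': ['a', 'z0'], 'a': ['z1', 'z0']}
v = value_iteration(kind, succ, {'z1': 1.0, 'z0': 0.0})
```

Here the values converge to v['a'] = v['m'] = 0.5: the MAX player prefers the coin flip over the certain loss.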
Remarques : Co-supervised by David Auger and Pierre Coucheney.





SM20770 Complexity of enumeration


Description


This internship is part of the ANR project AGGREG, in which several teams in Lille, Marseille, Caen and Paris 7 participate. In complexity theory, we are usually interested in the time needed to decide whether a problem has a solution. In enumeration complexity, we are interested in the time needed to produce all solutions, of which there can be very many. In this context, one of the relevant complexity measures is the time between two produced solutions.
Quite a few complexity classes for enumeration have been designed, their major weakness being that they have no complete problems. However, it is possible to separate some of these classes modulo complexity hypotheses such as P != NP. The aim of this internship will be to separate some of these classes, in particular incremental delay from polynomial delay. To do this, we will try to classify the problems whose solutions can be generated by simple saturation rules. We hope to be able to prove, as is done for CSPs, that some rules yield a polynomial-delay algorithm while the others are hard (in a sense to be determined) for incremental delay.
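A standard example of a polynomial-delay enumerator (illustrative only, not one of the saturation-rule classes to be studied) is backtracking where every branch is extendable to a solution, so consecutive outputs are separated by polynomially many steps. A Python sketch enumerating the independent sets of a graph:

```python
def independent_sets(vertices, adj):
    # "Flashlight" backtracking: at each vertex, branch on exclude/include.
    # Excluding always extends to a solution, and including is tried only
    # while the current set stays independent, so every internal node of
    # the search tree leads to an output: the delay between consecutive
    # outputs is polynomial in the graph size.
    def rec(i, current):
        if i == len(vertices):
            yield frozenset(current)
            return
        v = vertices[i]
        yield from rec(i + 1, current)                 # branch 1: exclude v
        if all(u not in current for u in adj.get(v, ())):
            current.add(v)                             # branch 2: include v
            yield from rec(i + 1, current)
            current.remove(v)
    yield from rec(0, set())
```

On the path 1-2-3 this yields the 5 independent sets {}, {1}, {2}, {3}, {1,3}; an incremental-delay procedure would instead be allowed delays growing with the number of solutions already produced.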
URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/stages.html
Remarques : Remuneration possible, as well as PhD funding.





SM20771 Connectivity inference in structural proteomics


Description


Context and multidisciplinary nature: Molecular assemblies involving from tens to hundreds of macromolecules (proteins or nucleic acids) are commonplace; yet very little is known about their structure. To fill this gap, a number of biophysical experiments (mass spectrometry, tandem affinity purification, etc.) providing information on the composition of subsystems called oligomers are being developed [SR07]. More precisely, given an assembly, one such experiment gives access to the composition of one subsystem, called an oligomer, so that experiments on multiple overlapping oligomers convey information on the overall connectivity. For example, if one oligomer reduces to two proteins, these proteins touch in the assembly. Connectivity inference is the problem concerned with the elucidation of this connectivity. Instances of the connectivity inference problem can be modeled as a graph in which each vertex corresponds to a protein, and an assembly (oligomer, i.e. set of proteins) is a connected subset of vertices.
The problem is multidisciplinary in nature, since developing solutions requires a virtuous circle between biophysics on the one hand, and algorithmics, optimization and graph theory on the other.
Goals: The connectivity inference problem in the particular case where each protein species is represented by a single copy has recently been studied using tools from graph theory and integer linear programming [AAC+13, ACCC]. Roughly, the minimum connectivity inference problem is to find the smallest subset of edges of a complete graph such that each subset of vertices associated with an assembly gets connected. This optimization problem has been proved NP-hard and hard to approximate in general. Nonetheless, when the number of proteins is moderate (fewer than 20), the problem can be solved efficiently using mixed integer linear programming (MILP) and a greedy approximation algorithm. These two types of algorithms generate ensembles of (optimal) solutions, from which consensus solutions were defined. Tests conducted on systems with up to 20 proteins show an almost perfect agreement between the predicted contacts and the experimentally determined ones.
The goal of this internship is to go beyond this state of the art, in the following directions: * Understand the relationship between the structure of the oligomers and the reported solutions. * Exploit hypotheses on the geometry of the proteins/oligomers to speed up the optimization process. * Generalize the modeling in order to handle cases where multiple copies of a protein are present.
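A hedged sketch of a greedy heuristic in the spirit described above (details assumed, not the algorithm of [AAC+13, ACCC]): repeatedly add the candidate edge that merges components inside the largest number of still-disconnected oligomers.

```python
from itertools import combinations

def components(vertices, edges):
    # number of connected components of the graph induced on `vertices`
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for u, w in edges:
        if u in parent and w in parent:
            parent[find(u)] = find(w)
    return len({find(v) for v in vertices})

def greedy_inference(oligomers):
    # oligomers: iterable of frozensets of protein identifiers
    edges = set()
    def unsatisfied():
        return [o for o in oligomers if components(o, edges) > 1]
    while unsatisfied():
        # candidate edges live inside oligomers; score an edge by the
        # number of disconnected oligomers in which it merges components
        best = max((e for o in unsatisfied()
                    for e in combinations(sorted(o), 2) if e not in edges),
                   key=lambda e: sum(components(o, edges | {e}) < components(o, edges)
                                     for o in unsatisfied()))
        edges.add(best)
    return edges
```

On the toy instance with oligomers {A,B} and {B,C}, the heuristic returns two edges, matching the optimum; in general a greedy choice gives no optimality guarantee, hence the MILP ensemble approach.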
References: [AAC+13] D. Agarwal, J. Araujo, C. Caillouet, F. Cazals, D. Coudert, and S. Perennes. Connectivity inference in mass spectrometry based structure determination. In H.L. Bodlaender and G.F. Italiano, editors, European Symposium on Algorithms (LNCS 8125), pages 289-300, Sophia-Antipolis, France, 2013. Springer.
[ACCC] D. Agarwal, C. Caillouet, F. Cazals, and D. Coudert. Unveiling contacts within macromolecular assemblies by solving minimum weight connectivity inference problems. submitted.
[DDC13] T. Dreyfus, V. Doye, and F. Cazals. Probing a continuum of macromolecular assembly models with graph templates of subcomplexes. Proteins: structure, function, and bioinformatics, 81(11):2034-2044, 2013.
[SR07] M. Sharon and C.V. Robinson. The role of mass spectrometry in structure elucidation of dynamic protein complexes. Annu. Rev. Biochem., 76:167-193, 2007. URL sujet détaillé : ftp://ftpsop.inria.fr/coati/Stages/coatiabs1415.pdf
Remarques : Co-supervisors of the internship: * Christelle Caillouet (COATI)
* Frédéric Cazals (ABS, https://team.inria.fr/abs/)
Remuneration according to the Inria grid.
Possibility to follow up with a PhD thesis.





SM20772 Formal mechanical models for distributed algorithms (probabilistic behaviours)


Description


Distributed computing is one of the domains where informal reasoning is not an option, in particular when Byzantine failures are involved. Distributed computing is also characterized by its diversity of models, subtle modifications of which induce radical changes in system behaviour. We consider robot networks, that is, swarms of *autonomous* mobile entities that have to accomplish some task in *cooperation*. In this emerging framework, models can be distinguished by the capabilities of the robots, the topology of the considered space, the level of synchrony (that is, the properties of a demon), the types of failures likely to occur, etc.
We are interested in obtaining formal, and moreover mechanical, guarantees of properties for certain protocols, using the Coq proof assistant. A recently proposed Coq framework for robot networks (http://pactole.lri.fr) can express quite a few variants of models for such networks, in particular regarding topology or demon properties. This framework is higher order and based on coinductive types. It allows one to prove in Coq positive results (the task will be fulfilled using the algorithm embedded in all robots, *for all* initial configurations) as well as negative results (there cannot be *any* embedded algorithm that will achieve this task for all initial configurations).
An objective of this proposal is to extend the formal framework so as to express, and allow one to work on, robot behaviours that are *probabilistic*. As a matter of fact, randomised algorithms are most convenient for handling situations where symmetry makes any deterministic protocol useless. See for instance a recent article by Yamauchi & Yamashita (DISC 14). The Coq library ALEA will be an interesting starting point towards this goal. URL sujet détaillé : http://pactole.lri.fr/pub/M2_1415_prob.pdf
Remarques : Co-supervised with: - Pierre Courtieu (pierre.courtieu.fr)
- Sébastien Tixeuil (sebastien.tixeuil.fr)
Coq proficiency required.





SM20773 Information flow type preserving compilation in the CompCert verified compiler


Description


Cache-based attacks are a class of side-channel attacks that are particularly effective in virtualized or cloud-based environments, where they have been used to recover secret keys from cryptographic implementations. One common approach to thwart cache-based attacks is to use constant-time implementations, i.e., implementations that do not branch on secrets and do not perform memory accesses that depend on secrets. In [1], Barthe et al. propose a new information-flow analysis [2] that checks whether an x86 application executes in constant time. The analysis is formally verified in the Coq proof assistant and integrated into the verified CompCert compiler [3]. The effectiveness of the tool has been demonstrated on several medium-sized cryptographic implementations such as AES, DES and SHA-256. The current tool rejects programs that may branch on secrets. The goal of this internship is to lift this limitation and allow such branches when we can guarantee that both branches have the same execution time. We would like the approach to be developed at the RTL representation of the CompCert compiler, and carried along the CompCert chain through to assembly code. This internship requires a strong interest and background in formal semantics, static analysis and proof assistants.
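The constant-time discipline can be illustrated with branchless selection, the pattern constant-time implementations use instead of secret-dependent branches (a Python sketch of the pattern only; Python gives no real timing guarantees, which is precisely why the verified analysis works at the x86/RTL level):

```python
def ct_select(secret_bit, a, b):
    # Branchless select: returns a if secret_bit == 1 else b, without any
    # branch on the secret.  The mask is all-ones (-1) or all-zeros, so
    # the same instructions execute regardless of the secret's value.
    mask = -secret_bit            # 1 -> -1 (all ones), 0 -> 0
    return (mask & a) | (~mask & b)
```

A conditional `a if secret_bit else b` computes the same value but takes a secret-dependent path, which is exactly what the analysis rejects.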
References : [1] G. Barthe, G. Betarte, J. D. Campo, C. Luna and D. Pichardie. Systemlevel noninterference for constanttime cryptography. 21st Conference on Computer and Communications Security (CCS 2014). [2] D. M. Volpano, C. E. Irvine, G. Smith: A Sound Type System for Secure Flow Analysis. Journal of Computer Security 1996 [3] X. Leroy. A Formally Verified Compiler Backend. Journal of Automated Reasoning 2009. URL sujet détaillé :
Remarques : The internship will take place at IRISA Rennes or IMDEA Software in Madrid. David Pichardie (ENS Rennes) and Gilles Barthe (IMDEA Software Madrid) will supervise the internship.





SM20774 Internet of Things


Description


The Internet of Things (IoT) is the latest phase in the expansion of the Internet, where billions of smart sensors and actuators will be reachable by their IPv6 address. A consequence of the large number of these devices is their low-cost requirement. This results in hardware that is typically very constrained in terms of processing power, memory and particularly energy. Nevertheless, 15 years of research have resulted in optimal distributed algorithms and protocols that allow "smart objects" to form multi-hop wireless sensor networks (WSN) in order to facilitate data exchange.
One of the most widely used mechanisms for the timing and suppression of messages in WSNs is the Trickle algorithm [1]. For example, Trickle regulates the emission of control messages of RPL (RFC 6550), the principal routing protocol for the Internet of Things, and therefore directly affects the lifetime of the network and of the individual nodes. Moreover, due to its widespread use, Trickle has been separately standardized in RFC 6206, and many other popular protocols build upon it, making the understanding of its behavior crucial for optimizing control overhead.
Recent research, however, suggests that Trickle may suffer from unfairness: it distributes the transmission load unequally among the nodes in a network. As a consequence of the limited energy supply of IoT devices, this leads to premature battery depletion and a shortened lifetime of individual nodes, and ultimately of the whole network. The root cause of this unfairness is that each node's probability of transmitting with Trickle depends on the network topology. Due to the distributed nature of networks, each node only has a local view of its surroundings and therefore a different probability of transmission with Trickle.
The goal of the internship is to study this transmission probability and to derive a distributed algorithm that equalizes the probability of transmission among the nodes in the network, based on existing analytical models of Trickle [2] [3] [4]. The algorithm may be coupled with a simple communication protocol allowing nodes to communicate their current transmission probabilities among themselves and converge to the optimal one, which will yield an equalized transmission load in a Trickle network. The derived algorithm will be implemented and validated by experiments on state-of-the-art IoT devices and the Contiki operating system.
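The single-node core of Trickle can be sketched in a few lines (a simplified reading of RFC 6206, not a network simulation: with no inconsistencies heard, the redundancy counter stays at 0, the node transmits once per interval, and the interval doubles up to Imax):

```python
import random

def trickle_intervals(imin, imax_doublings, rounds, k=1, seed=0):
    # Simplified single-node Trickle: per interval of length I, pick a
    # transmission point t uniformly in [I/2, I), transmit iff fewer than
    # k consistent messages were heard (here: none, so c stays 0), then
    # double I up to Imax = Imin * 2**imax_doublings.
    rng = random.Random(seed)
    imax = imin * 2 ** imax_doublings
    I, history = imin, []
    for _ in range(rounds):
        c = 0                        # consistent messages heard (none here)
        t = rng.uniform(I / 2, I)    # transmission point (unused in sketch)
        transmitted = c < k
        history.append((I, transmitted))
        I = min(2 * I, imax)
    return history
```

The unfairness question above arises precisely from the suppression branch (`c >= k`) in a multi-node network, where how often a node is suppressed depends on its neighborhood.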
Requirements:
- Solid math background and affinity for analytical models (see references), basic probability, distributed algorithmics
- C programming language
- Networking (UDP, IPv6)
Open to learn and/or exploit knowledge in:
- IoT systems
- Wireless communications
- Approximation algorithms
- Wireless sensor networks
- Python programming language
- Embedded systems
References: [1] P. A. Levis, N. Patel, D. Culler, and S. Shenker, "Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks," in Proceedings of the First USENIX/ACM Symposium on Networked Systems Design and Implementation, 2004.
[2] H. Kermajani, C. Gomez, and M. Arshad, "Modeling the message count of the trickle algorithm in a steady-state, static wireless sensor network," IEEE Communications Letters, vol. 16, no. 12, pp. 1960-1963, December 2012.
[3] T. M. Meyfroyt, S. C. Borst, O. J. Boxma, and D. Denteneer, "Data Dissemination Performance in Large-scale Sensor Networks," in The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '14. New York, NY, USA: ACM, 2014, pp. 395-406.
[4] T. Coladon, M. Vucinic, and B. Tourancheau, "Modeling the Trickle Algorithm and Increasing its Fairness with Variable Redundancy Constants," Technical Report, Grenoble Informatics Laboratory, July 2014, pp. 1-9. URL sujet détaillé :
Remarques : Co-supervisor: Malisa Vucinic. Remuneration: ~400 EUR/month.





SM20775 Optimal complexity bounds for real root refinement for polynomial systems


Description


Given isolating intervals for all the real roots of a univariate polynomial of degree d with coefficients of bitsize at most t, we can refine all the intervals by a factor of k = 2^L in almost optimal Boolean complexity, O~(d^2 t + d L). The algorithm uses a combination of bisections and Newton iterations. We are interested in an analogous optimal bound for the multivariate case.
The problem is as follows: we are given a 0-dimensional polynomial system, f_1(x) = ... = f_n(x) = 0, where x = (x_1, ..., x_n). The polynomials f_i have degree at most d and integer coefficients of maximum bitsize t. For each isolated real root z of the system, we are given an (isolating) hyperbox B_0 = [a_1, b_1] x ... x [a_n, b_n], where the a_k, b_k are rationals, that contains z and no other root of the system. Let w_0 = width(B_0) = max{b_1 - a_1, ..., b_n - a_n}. We aim at an algorithm, with optimal Boolean complexity, that obtains a refined hyperbox B_t such that width(B_t) = w_0/k, where k = 2^L, based on the Newton operator. URL sujet détaillé :
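For the univariate case, the bisection half of the bisection-Newton combination can be sketched with exact rational arithmetic (bisection alone already achieves the factor 2^L, only at a worse complexity than the stated bound, since Newton steps are what make it near-optimal):

```python
from fractions import Fraction

def refine(poly, a, b, L):
    # Refine the isolating interval [a, b] of a simple real root of the
    # polynomial `poly` (coefficients, highest degree first) by a factor
    # 2^L, using bisection only.  Exact Fractions avoid rounding issues.
    def p(x):
        r = Fraction(0)
        for c in poly:               # Horner evaluation
            r = r * x + c
        return r
    a, b = Fraction(a), Fraction(b)
    assert p(a) * p(b) < 0           # sign change certifies the root
    for _ in range(L):
        m = (a + b) / 2
        if p(m) == 0:
            return m, m              # hit the root exactly
        if p(a) * p(m) < 0:
            b = m
        else:
            a = m
    return a, b
```

For x^2 - 2 on [1, 2] with L = 10, the returned interval has width exactly 1/1024 and still brackets sqrt(2).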
Remarques :





SM20776 Automatic speech recognition


Description


Framework of ANR project ContNomina
The technologies involved in information retrieval in large audio/video databases are often based on the analysis of large, but closed, corpora, and on machine learning techniques and statistical modeling of written and spoken language. The effectiveness of these approaches is now widely acknowledged, but they nevertheless have major flaws, particularly concerning proper names, which are crucial for the interpretation of content. In the context of diachronic data (data which change over time), new proper names appear constantly, requiring dynamic updates of the lexicons and language models used by the speech recognition system. The ANR project ContNomina (2013-2017) therefore focuses on the problem of proper names in automatic audio processing systems, exploiting the context of the processed documents as efficiently as possible. To do this, the student will address the contextualization of the recognition module through dynamic adjustment of the language model in order to make it more accurate.
Subject
Current systems for automatic speech recognition are based on statistical approaches. They require three components: an acoustic model, a lexicon and a language model. This internship will focus on the language model. The language model of our recognition system is based on a neural network trained on a large text corpus. The problem is to re-estimate the language model parameters for a new proper name, depending on its context and on a small amount of adaptation data. Several tracks can be explored: adapting the language model, using a class model, or studying the notion of analogy. Our team has developed a fully automatic speech recognition system that transcribes a radio broadcast from the corresponding audio file. The intern will develop a new module whose function is to integrate new proper names into the language model. URL sujet détaillé :
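A deliberately naive sketch of the update step (a hypothetical unigram interpolation, far simpler than re-estimating the neural language model discussed above): reserve a small probability mass for the new proper name and renormalize so the model still sums to 1.

```python
def add_proper_name(unigram, name, alpha=0.01):
    # unigram: dict word -> probability (sums to 1).  Give the new proper
    # name a mass alpha and scale the rest by (1 - alpha), preserving a
    # valid distribution (toy stand-in for LM adaptation).
    updated = {w: p * (1 - alpha) for w, p in unigram.items()}
    updated[name] = updated.get(name, 0.0) + alpha
    return updated
```

Real adaptation would instead condition on the name's context (neighboring words, word classes), which is exactly the subject of the internship.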
Remarques : Familiarity with tools for automatic speech recognition, a background in statistics, and programming skills (C, object-oriented programming and Perl) are required. Candidates should email a detailed CV and diploma.





SM20777 Formal proof certificates using a zero-knowledge protocol


Description


Formal verification makes it possible to prove properties of programs. Once a property of a piece of software has been proven, it is still necessary to convince users that the proof has been done. While users generally trust big software companies, this may not be the case for independent developers or for free software. In that case, the developer has to prove to users that the proof exists; this is the role of a proof certificate. This kind of certificate has been studied in the "Proof Carrying Code" (PCC) framework defined in [1]. In PCC, the developer gives the user sufficient information to rebuild and replay the proof on the software. The proof is replayed thanks to a checker that the user has to trust blindly.
In this internship, we want to go further than PCC. We want the developer to give the user a certificate showing only that the proof exists. In particular, the user will not have to replay the proof or to trust a checker. This is possible by combining two areas of computer science that are, apparently, disconnected: mechanisms used in formal proofs on one side, and cryptography on the other. The idea is to see a logical property to prove as a ciphering key and the proof of this property as the secret key to decipher. Then, the user can send a random value (a nonce) ciphered with the property. If the developer has proven the property, then he has the secret key. Thus, he will be able to recover the nonce in the clear. To prove to the user that he has the proof, he will only have to send the nonce back: this is the certificate.
This is a form of zero-knowledge protocol [2], commonly used in cryptography. It is also related to the Curry-Howard isomorphism used in the proof assistant Coq. This tool can be used to produce the proof of a property of a program. In Coq, a property is represented by a type, and a proof by any lambda-term having this type. Thus, if we consider a type (a property) as an encryption key, then a lambda-term (a proof of this property) will be the decryption key. This new reading of the isomorphism has already been studied in [3]. The aim of this internship is to start from this work and to produce a complete certification solution for any property of propositional logic. In particular, it will be necessary to investigate which cryptographic functions are needed, and to prove that the solution is robust using the results of [4]. On the cryptographic side, it will also be interesting to compare this approach with [5] and [6].
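The challenge-response idea can be sketched as follows. This is only a crude analogy under an assumption we make for illustration: that the property corresponds to an RSA-style public key and the proof to the matching private key. All names and parameters are invented; the actual scheme studied in the internship (types as keys, lambda-terms as decryption keys) is quite different.

```python
# Toy sketch of the nonce challenge-response certificate.
# "Property" = public key (n, e); "proof" = secret exponent d.
import secrets

p, q = 1009, 1013                    # tiny primes, far too weak for real use
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))    # only someone holding the proof knows d

# User side: encrypt a random nonce under the "property".
nonce = secrets.randbelow(n)
challenge = pow(nonce, e, n)

# Developer side: having the "proof" (d), recover the nonce in clear.
response = pow(challenge, d, n)

# User side: the returned nonce is the certificate that the proof exists.
assert response == nonce
```

The point of the analogy is that the developer never transmits the proof itself, only evidence of possessing it.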
[1] George Necula. Proof carrying code. POPL'97.
[2] http://en.wikipedia.org/wiki/Zeroknowledge_proof
[3] Amrit Kumar, Pierre-Alain Fouque, Thomas Genet, Mehdi Tibouchi. Proofs as Cryptography: a new interpretation of the Curry-Howard isomorphism for software certificates. http://hal.inria.fr/hal00715726
[4] Martin Abadi, Véronique Cortier. Deciding knowledge in security protocols under equational theories. TCS 367(1-2):2-32 (2006). http://users.soe.ucsc.edu/~abadi/Papers/AbadiCortier.pdf
[5] Yael Tauman Kalai, Ran Raz, and Ron D. Rothblum. Delegation for bounded space. In Proceedings of the 45th annual ACM symposium on Theory of Computing (STOC), pp. 565-574, 2013. Available at: http://research.microsoft.com/enus/um/people/yael/publications/2013delegationboundedspace.pdf
[6] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, Madars Virza. SNARKs for C: verifying program executions succinctly and in zero knowledge. Proc. CRYPTO 2013, Part II, LNCS 8043, pp. 90-108, Springer, 2013. Available at http://www.tau.ac.il/~tromer/papers/csnark20131007.pdf URL sujet détaillé :
:
Remarques : Pierre-Alain Fouque (http://www.di.ens.fr/~fouque/)





SM20778 Complexity of homology computation methods on cellular structures


Description


Many combinatorial structures exist to encode subdivisions of an n-dimensional space. They are used, for instance, in topology-based geometric modeling. For such applications, it is particularly important to be able to compute topological characteristics of the constructed object, such as connected components, holes in any dimension, etc. These properties can be obtained through the computation of the so-called homology groups of the object.
We are particularly interested in the computation of such groups on structures derived from combinatorial maps. Several approaches have been theoretically explored, mainly incremental computation during the construction of an object, and direct computation on the whole object.
We are now studying the complexity of these approaches, and partial results have been obtained. The goal of the work is to explore the open questions related to complexity, from both theoretical and practical points of view.
The student will have to understand at least one of the methods, provide an efficient implementation of it, check its complexity experimentally on different classes of objects, and study open problems related to the evaluation of its theoretical complexity.
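To make concrete what such a computation returns, here is a minimal sketch of the "direct" approach on a toy example: the Betti numbers of the boundary of a triangle (a circle), computed from a boundary matrix over Z/2Z. This is far simpler than the combinatorial-map setting of the internship; everything below is our own illustrative choice.

```python
def rank_gf2(rows):
    """Rank of a 0/1 matrix over Z/2Z, by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# Triangle boundary: 3 vertices a, b, c and 3 edges ab, bc, ca, no 2-cells.
# d1 maps each edge to the sum of its endpoints (rows = edges, cols = vertices).
d1 = [[1, 1, 0],   # ab
      [0, 1, 1],   # bc
      [1, 0, 1]]   # ca
r1 = rank_gf2(d1)

b0 = 3 - r1        # #vertices - rank d1 : connected components
b1 = (3 - r1) - 0  # dim ker d1 - rank d2 (no 2-cells, so rank d2 = 0)
print(b0, b1)      # 1 1 : one component, one 1-dimensional hole
```

The incremental approach studied in the internship maintains this information while the object is built, instead of reducing matrices at the end.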
Do not hesitate to contact us for further information.
References: [1] N-dimensional generalized combinatorial maps and cellular quasi-manifolds, Lienhardt P., International Journal on Computational Geometry and Applications, Vol. 4, n° 3, pp. 275-324, 1994. [2] A Boundary Operator for Computing the Homology of Cellular Structures, Alayrangues S., Damiand G., Lienhardt P., Peltier S. [hal-00683031, version 2] URL sujet détaillé :
:
Remarques : co-advisors: _ Sylvie Alayrangues : sylvie.alayranguespoitiers.fr
_ Samuel Peltier : samuel.peltierpoitiers.fr





SM20779 Multi-agent temporal logic for the formal modeling of requirements


Description


Requirement engineering consists of methods and tools that provide a clear, unambiguous specification of a system to be designed. It can be seen as the step from the informal to the formal. Agent- and goal-oriented methods, such as KAOS and i*/Tropos, are particularly convincing academic proposals. They emphasize different concepts, in particular (1) agents, which represent the different parts/components of the system that play a role in its evolution and collaborate in order to fulfill goals; and (2) goals, which represent high-level needs and are refined into more and more precise goals until the requirements of the system are obtained. These methods have already been related to formal methods. In particular, many goals can be seen as temporal-logic formulas.
Onera has proposed a language (Khi) for analyzing the relations between goals and agents. To this end, the semantics of Khi relies on a multi-agent temporal logic called Updatable Strategy Logic (USL). USL extends temporal logic with the ability to reason about the strategies that agents follow in order to fulfill their goals.
In this internship, one of the following lines will be pursued, depending on the applicant:
- a verification line: the objective is to obtain a reasonable way to check that a formula is satisfied by a model (the model-checking problem). Indeed, the theoretical complexity of USL model checking is non-elementary. One goal is to define a fragment of (or a variation on) USL that enjoys a better complexity, and/or to propose an efficient implementation of the USL model-checking algorithm.
- a requirement-engineering line: the objective is to apply Khi to a concrete case study, and to assess the limits and advantages of Khi. In particular, a focus can be put on the way data structures can be formalized (with first-order logic, for instance). URL sujet détaillé : http://sites.onera.fr/stages/sites/sites.onera.fr.stages/files/DTIM201522_0.pdf
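USL model checking itself is far beyond a short example, but the model-checking problem it generalizes can be illustrated on a toy: the following checks the reachability property "EF goal" on a small Kripke structure by a backward fixpoint. The structure and names are invented for this sketch.

```python
def check_EF(transitions, labels, prop):
    """Return the set of states satisfying EF prop, by backward fixpoint."""
    sat = {s for s, props in labels.items() if prop in props}
    changed = True
    while changed:
        changed = False
        for s, succs in transitions.items():
            if s not in sat and any(t in sat for t in succs):
                sat.add(s)   # some successor already satisfies EF prop
                changed = True
    return sat

# A 4-state model; state 3 cannot reach the goal.
transitions = {0: [1], 1: [2, 0], 2: [2], 3: [3]}
labels = {0: set(), 1: set(), 2: {"goal"}, 3: set()}
print(sorted(check_EF(transitions, labels, "goal")))  # [0, 1, 2]
```

The verification line of the internship faces the same question (does the model satisfy the formula?) for a vastly richer logic, where strategies of agents replace simple paths.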
Remarques : remuneration: 500-600 euros/month





SM20780 Lightweight formal verification for OCaml


Description


Keywords: Semantics, OCaml, Static analysis, Rewriting, Regular tree languages, Tree automata, Abstract interpretation.
Some strongly typed programming languages (like Haskell, OCaml, Scala and F#) have a type inference mechanism. This mechanism makes it possible to automatically detect some kinds of errors in a program. To prove richer properties of programs (stronger than being well typed), people usually use proof assistants, which require a high level of expertise both to state the property and to carry out the proof.
We propose to define a lightweight formal verification tool for functional programming languages, relying on a representation finer than what simple types offer. Several proposals in this direction have been made using type systems extended with arithmetic constraints [1,2] or intersection types [3]. Here, we propose to use regular languages to represent the (infinite) set of possible inputs of a function. We already know that it is possible to compute an over-approximation of the image of a regular language by a term rewriting system, using a form of abstract interpretation [4]. Thus, using this technique, if we translate a functional program into a term rewriting system, it is possible to obtain an over-approximation of the (infinite) set of results of the function for a given (infinite) set of input values [5].
The objective of this internship is to define a translation of an interesting subset of OCaml into term rewriting systems. This translation should produce systems that can be approximated by the techniques defined in [4,5]. One of the challenges will be to deal with higher-order functions. Transforming a higher-order functional program into a first-order term rewriting system is possible [6] and gives encouraging results [7]. The results obtained during this internship will have to be compared with state-of-the-art analysis techniques for higher-order programs [8,9,10].
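The first half of this pipeline (a functional program seen as a term rewriting system) can be sketched on a toy. We only rewrite a finite input set here; the tree-automata techniques of [4,5] would replace it with an infinite regular set. The term encoding and rules below are our own illustration, not the internship's actual tool.

```python
# Peano addition as the TRS  plus(0,y) -> y,  plus(s(x),y) -> s(plus(x,y)).
ZERO = ("0",)
def s(t): return ("s", t)
def plus(a, b): return ("plus", a, b)

def rewrite(t):
    """Innermost rewriting of a term to its normal form."""
    if t[0] == "s":
        return ("s", rewrite(t[1]))
    if t[0] == "plus":
        a, b = rewrite(t[1]), rewrite(t[2])
        if a == ZERO:                       # plus(0,y) -> y
            return b
        if a[0] == "s":                     # plus(s(x),y) -> s(plus(x,y))
            return rewrite(("s", plus(a[1], b)))
        return ("plus", a, b)
    return t

def num(n):  # Peano numeral for n
    return ZERO if n == 0 else s(num(n - 1))

# Image of a *finite* input set; [4,5] over-approximate the image of an
# infinite regular set of inputs instead.
results = {rewrite(plus(num(i), num(j))) for i in range(3) for j in range(3)}
print(len(results))  # 5 : the sums 0..4 as distinct normal forms
```

The verification question then becomes whether the (over-approximated) set of results stays inside a regular set of "good" outputs.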
[1] P. Rondon, M. Kawaguci, R. Jhala: Liquid Types. PLDI'08.
[2] D. Xu, S. P. Jones, K. Claessen: Static contract checking for Haskell. POPL'09.
[3] G. Castagna, K. Nguyen, Z. Xu, H. Im, S. Lenglet, L. Padovani: Polymorphic functions with set-theoretic types: part 1: syntax, semantics, and evaluation. POPL'14.
[4] T. Genet, V. Rusu: Equational approximations for tree automata completion. J. Symb. Comput. 2010.
[5] T. Genet: Towards Static Analysis of Functional Programs using Tree Automata Completion. WRLA'14.
[6] N. D. Jones, N. Andersen: Flow analysis of lazy higher-order functional programs. TCS 2007.
[7] T. Genet, Y. Salmon: Tree Automata Completion for Static Analysis of Functional Programs. Technical Report. http://hal.inria.fr/hal00780124
[8] C. H. Broadbent, A. Carayol, M. Hague, O. Serre: C-SHORe: a collapsible approach to higher-order verification. ICFP'13.
[9] N. Kobayashi: Model Checking Higher-Order Programs. J. ACM 60. 2013.
[10] C.-H. L. Ong, S. J. Ramsay: Verifying higher-order functional programs with pattern-matching algebraic data types. POPL'11.
URL sujet détaillé :
:
Remarques : Barbara Kordy (http://satoss.uni.lu/members/barbara/)





SM20781 Formal proof certificates using a zero-knowledge protocol


Description


Keywords: Lambda-calculus, Typing, Curry-Howard isomorphism, Zero-knowledge proofs, Cryptography.
Formal verification makes it possible to prove properties of programs. When a property of a piece of software holds, it is still necessary to convince users that it has been proven. While users generally trust big software companies, this may not be the case for independent developers or for free software. In that case, the developer has to prove to users that the proof has been carried out. This is the role of a proof certificate. This kind of certificate has been studied in the "Proof Carrying Code" (PCC) framework defined in [1]. In PCC, the developer gives the user enough information to rebuild and replay the proof on the software. The proof is replayed using a checker that the user has to trust blindly.
In this internship, we want to go further than what PCC proposes. We want the developer to give the user a certificate showing only that the proof exists. In particular, the user will not have to replay the proof, nor to trust a checker. This is possible by combining two areas of computer science that are, apparently, disconnected: mechanisms used in formal proofs on one side, and cryptography on the other. The idea is to view a logical property to be proven as an encryption key, and a proof of this property as the secret key needed to decrypt. The user can then send a random value (a nonce) encrypted under the property. If the developer has proven the property, then he holds the secret key and is able to recover the nonce in clear. To prove to the user that he has the proof, he only has to send the nonce back: this is the certificate.
This is a form of zero-knowledge protocol [2], commonly used in cryptography. It is also related to the Curry-Howard isomorphism used in the proof assistant Coq. This tool can be used to produce the proof of a property of a program. In Coq, a property is represented by a type, and a proof by any lambda-term having this type. Thus, if we consider a type (a property) as an encryption key, then a lambda-term (a proof of this property) will be the decryption key. This new reading of the isomorphism has already been studied in [3]. The aim of this internship is to start from this work and to produce a complete certification solution for any property of propositional logic. In particular, it will be necessary to investigate which cryptographic functions are needed, and to prove that the solution is robust using the results of [4]. On the cryptographic side, it will also be interesting to compare this approach with [5] and [6].
[1] George Necula. Proof carrying code. POPL'97.
[2] http://en.wikipedia.org/wiki/Zeroknowledge_proof
[3] Amrit Kumar, Pierre-Alain Fouque, Thomas Genet, Mehdi Tibouchi. Proofs as Cryptography: a new interpretation of the Curry-Howard isomorphism for software certificates. http://hal.inria.fr/hal00715726
[4] Martin Abadi, Véronique Cortier. Deciding knowledge in security protocols under equational theories. TCS 367(1-2):2-32 (2006). http://users.soe.ucsc.edu/~abadi/Papers/AbadiCortier.pdf
[5] Yael Tauman Kalai, Ran Raz, and Ron D. Rothblum. Delegation for bounded space. In Proceedings of the 45th annual ACM symposium on Theory of Computing (STOC), pp. 565-574, 2013. Available at: http://research.microsoft.com/enus/um/people/yael/publications/2013delegationboundedspace.pdf
[6] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, Madars Virza. SNARKs for C: verifying program executions succinctly and in zero knowledge. Proc. CRYPTO 2013, Part II, LNCS 8043, pp. 90-108, Springer, 2013. Available at http://www.tau.ac.il/~tromer/papers/csnark20131007.pdf
URL sujet détaillé :
:
Remarques : Pierre-Alain Fouque (http://www.di.ens.fr/~fouque/)





SM20782 Towards an optimal data locality in MapReduce


Description


MapReduce [1] is emerging as a prominent tool for large-scale data analysis: it is often advocated as an easier-to-use, efficient and reliable replacement for the traditional programming model of moving the data to the computation. The popular open-source implementation of MapReduce, Hadoop [2], was developed primarily by Yahoo!, where it processes hundreds of terabytes of data on at least 10,000 cores, and is now used by other companies, including Facebook, Amazon, Last.fm, and the New York Times.
The Hadoop MapReduce system runs on top of the Hadoop Distributed File System (HDFS), where files are partitioned into chunks and each chunk is replicated to tolerate failures. Data processing is co-located with data storage: when a node is available to process a map task, the job scheduler consults the storage metadata service to find a hosted chunk as close as possible, in this order: on the same node, on another node within the same rack, and on a node outside the rack. The goal is to leverage data locality. Many studies have been dedicated to improving the locality of task executions in Hadoop through static data placement [3] or dynamic task scheduling [4]. Comparatively little attention has been devoted to investigating the impact of the several interleaved factors that contribute to data-local executions in Hadoop (i.e., data distribution, job size, node configuration, environment dynamicity), or to investigating the impact of data locality on the performance of MapReduce applications.
The proposed project aims at investigating the correlation between the aforementioned factors, as a step towards optimal data-locality execution in Hadoop, and therefore at improving the performance of MapReduce applications. This work will provide an important opportunity to learn how to run rigorous, large-scale Hadoop experiments on a scientific testbed (i.e., Grid'5000). Depending on its completion, this work could lead to the publication of a research article.
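The locality preference described above can be sketched in a few lines: given the nodes hosting a chunk's replicas, pick the placement closest to the task. The rack names and scoring below are invented for illustration; the real Hadoop scheduler consults HDFS metadata and is considerably more involved.

```python
def locality(task_node, replica_node, rack_of):
    """0 = node-local, 1 = rack-local, 2 = off-rack (lower is better)."""
    if task_node == replica_node:
        return 0
    if rack_of[task_node] == rack_of[replica_node]:
        return 1
    return 2

def best_replica(task_node, replicas, rack_of):
    """Choose the replica with the best locality for this task."""
    return min(replicas, key=lambda r: locality(task_node, r, rack_of))

# Toy cluster: n1 and n2 share rack r1, n3 sits in rack r2.
rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
print(best_replica("n1", ["n3", "n2"], rack_of))  # n2 : rack-local beats off-rack
```

The internship's question is precisely when this greedy preference pays off, given the interleaved factors listed above.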
References
[1] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[2] Hadoop Apache homepage: http://www.hadoop.org
[3] J. Xie, S. Yin, X.-J. Ruan, Z.-Y. Ding, Y. Tian, M. J. Manzanares and X. Qin, "Improving MapReduce performance through data placement in heterogeneous Hadoop clusters," Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on, pp. 1-9, 19-23 April 2010.
[4] S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu, S. Wu, "Maestro: Replica-aware map scheduling for MapReduce," in CCGrid 2012. URL sujet détaillé : https://www.irisa.fr/kerdata/doku.php?id=master
Remarques : Co-supervisors: Gabriel Antoniu
Luc Bougé





SM20783 Exploring the Limits of Distributed Computation


Description


In the setting of distributed computing, the graph coloring problem is of fundamental practical importance, and also serves as a sort of "benchmark problem" for testing our understanding, and the limitations, of local network-based computational models. Despite more than 20 years of intensive study, many fascinating research questions remain open.
This internship is intended to focus on a theoretical study of the interplay between time and quality of distributed graph coloring algorithms, taking into account aspects of determinism/randomization and properties of special graph classes.
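To make the problem concrete, here is a toy synchronous simulation of a classic randomized scheme: each uncolored node proposes a random color from a palette of deg+1 colors and keeps it when no neighbor proposes or holds the same color. The graph and all details are our own choices; the algorithms studied in the internship, and their round-complexity analyses, are considerably more subtle.

```python
import random

def randomized_coloring(adj, seed=0):
    """Simulate synchronous rounds until every node holds a proper color."""
    rng = random.Random(seed)
    color = {v: None for v in adj}
    while any(c is None for c in color.values()):
        # Each uncolored node proposes a color from its palette of deg+1 colors.
        proposal = {v: rng.randint(0, len(adj[v]))
                    for v in adj if color[v] is None}
        # A proposal is kept only if no neighbor proposes or holds it.
        for v, c in proposal.items():
            if all(proposal.get(u) != c and color[u] != c for u in adj[v]):
                color[v] = c
    return color

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # a small graph
col = randomized_coloring(adj)
assert all(col[u] != col[v] for u in adj for v in adj[u])  # proper coloring
```

The trade-off the internship studies is exactly visible here: a larger palette (lower "quality") makes each round succeed more often (less "time").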
For more details, please consult the PDF file of the proposal. URL sujet détaillé : http://www.liafa.fr/~kosowski/stagecoloring.pdf
Remarques : Co-supervision: Pierre FRAIGNIAUD, Adrian KOSOWSKI.





SM20784 Real-time online emulation of real applications on SimGrid with Simterpose


Description


The goal of this project is to design an evaluation environment for distributed applications (e.g., P2P applications) in which instances of the real application (a standard, unmodified BitTorrent client) are executed in a virtual environment simulated by the SimGrid simulator. URL sujet détaillé : http://www.loria.fr/~quinson/Research/Students/2015simterpose.pdf
Remarques : Remuneration provided.





SM20785 State Space Reduction for the Dynamic Formal Verification of Legacy Distributed Applications


Description


The state space explosion problem greatly hinders the applicability of exhaustive verification techniques. The goal of this work is to explore new reduction techniques specifically tailored to the dynamic verification of large legacy MPI applications. URL sujet détaillé : http://www.loria.fr/~quinson/Research/Students/2015reduction.pdf
Remarques : Remuneration provided.





SM20786 Optimizing Incremental MapReduce Computations for Live Data Streams


Description


Big Data analytics on cloud infrastructures has seen a rapid rise in recent years, thanks to the promise of delivering valuable insights in an affordable fashion for the masses. However, so far most attention has been focused on optimizing the performance and cost-effectiveness of the computations, while largely neglecting an important aspect: users need to upload massive datasets to the cloud for their computations.
The goal of this internship is to study how to run Big Data analytics applications based on MapReduce [1] while jointly optimizing the performance and cost of both the data upload and the corresponding computation. To this end, one approach would be to analyze the feasibility of incremental MapReduce frameworks (e.g., HourGlass [2] from LinkedIn) for advancing the computation as much as possible during the data upload, by using already-transferred data to calculate intermediate results.
An interesting research direction is to investigate the possibility of dynamically adapting the computation to account for fluctuations and variability of the cloud environment. We also plan to tackle the cost-optimization aspect by making use of a complex model that takes both computation and storage into consideration, following the new Resource-as-a-Service paradigm [3] introduced for next-generation clouds. In this context, we plan to study the viability of several storage elasticity features introduced by our previous work [4].
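The incremental idea sketched above (start reducing while data is still being uploaded) can be illustrated with a word-count toy that folds each newly arrived chunk into a running partial result. This mimics the spirit of incremental frameworks such as HourGlass, not their actual API; all names here are illustrative.

```python
from collections import Counter

def map_chunk(chunk):
    """Map phase on one chunk: emit (word, 1) pairs."""
    for line in chunk:
        for word in line.split():
            yield word, 1

def incremental_job(chunk_stream):
    """Fold each chunk into the partial reduce as soon as it arrives."""
    partial = Counter()                   # intermediate result, kept between chunks
    for chunk in chunk_stream:            # chunks become available during upload
        for key, value in map_chunk(chunk):
            partial[key] += value
    return dict(partial)

uploads = [["a b a"], ["b b"], ["a"]]     # three chunks arriving over time
print(incremental_job(uploads))           # {'a': 3, 'b': 3}
```

The optimization problem of the internship is then to decide how much of this folding to do during upload, given its computation and storage costs.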
Experiments to validate this approach will be carried out on the Grid'5000 platform [5], in the context of the H2020 BigStorage project.
URL sujet détaillé : http://www.irisa.fr/kerdata/doku.php?id=master#optimizing_incremental_mapreduce_computations_for_live_data_streams
Remarques : This internship will be co-advised by Bogdan Nicolae (IBM Research) and will be remunerated.





SM20787 A Theory of Complexity, Condition and Roundoff


Description


In 1989, Blum, Shub and Smale presented a theory of complexity over the real numbers that was meant to model numerical computations as usually performed in numerical analysis. At the end of their paper, they noted that their theory did not account for iterative algorithms or for the effects of finite precision, and explicitly stated as an open problem to incorporate these features into the theory.
In the last year I developed a theory that does precisely this. The paper can be downloaded at
http://arxiv.org/abs/1403.6241v5
It presents versions of the classes P and NP in terms of notions of "input size" and "cost of a computation" that capture the usual features of finite-precision computations.
The subject of the internship is to explore the extension of other common complexity classes within this framework. URL sujet détaillé :
:
Remarques : Joint supervision is possible (Pascal Koiran would be an appropriate supervisor at the LIP). Financial support is also possible.





SM20788 Study and implementation of an intent recognition model in the context of the Web of Things


Description


This work is part of the ASAWoO ANR project (http://liris.cnrs.fr/asawoo/), which proposes a software infrastructure based on avatars [1,2] to build applications for the Web of Things. An avatar is a logical abstraction of a physical object. The avatar makes it possible to extend the behavior and knowledge of physical objects, exploiting semantic Web and Web service technologies and reasoning about their contextual environment. In addition, an avatar provides inter-avatar cooperation mechanisms.
An avatar can participate in the realization of several objectives related to different applications driven by users and/or organizations. To do so, it must cooperate with other avatars to compensate for its lack of knowledge or functionality. In order to collaborate with entities of its environment that have compatible objectives, an avatar must be able to identify the intents of other avatars based on its observation of their activities. We use a plan- and intent-recognition model [3], based on an artificial corpus, to realize this observation.
It is necessary to demonstrate the applicability of this model in the context of the Web. In particular, a resource-oriented solution must be designed and developed that supports the different interactions necessary in the context of our intent-recognition model.
References:
[1] J.-P. Jamont, L. Médini, M. Mrissa: A Web-Based Agent-Oriented Approach to Address Heterogeneity in Cooperative Embedded Systems. PAAMS (Special Sessions) 2014: 45-52.
[2] M. Mrissa, L. Médini, J.-P. Jamont: Semantic Discovery and Invocation of Functionalities for the Web of Things. IEEE International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises, Parma. pp. 281-286. 2014.
[3] G. Sukthankar, R. P. Goldman, C. Geib, D. V. Pynadath, H. H. Bui (editors): Plan, Activity and Intent Recognition. Morgan Kaufmann, 2013.
[4] J. Ferber: Les systèmes multi-agents. Vers une intelligence collective. InterEditions, Paris, 1995.
URL sujet détaillé :
:
Remarques : Supervision:
Jean-Paul Jamont (LCIS, Université de Grenoble Alpes),
Michael Mrissa (LIRIS CNRS, Université de Lyon)





SM20789 A Security Property and Constraint Description Language


Description


The internship is part of the SEMCO project (www.semcomdt.org), a framework to assist system and software developers in the domain of security- and safety-critical systems in capturing, implementing and documenting distributed system applications. SEMCO aims at supporting model- and pattern-based development approaches by defining and providing several artifact types representing the different related engineering concerns (security, dependability, safety and resources) and architectural information. These artifacts are stored in a model-based repository and provided in the form of modeling languages (SEMCO-ML), tools (SEMCO-MDT) and methodologies (SEMCO-PM).
Goal: The objective of this work is to design a modeling language for the specification of security properties and constraints in the context of model- and pattern-based security engineering. This language is closely related to a modeling, formal specification and validation framework for security patterns studied in the context of the SEMCO project.
Tasks: 1) Identify a set of roles that a security pattern description language should fulfill to support a security engineering process, a reuse process, and the representations commonly used in industry and academia. 2) Define a set of design requirements that should be considered when building a security property and constraint description language. 3) Design a security property and constraint description language satisfying these requirements, providing a language for the specification of security pattern properties and constraints. 4) Develop an Xtext-based editor and suggest appropriate transformations in order to integrate this tool into the SEMCO-MDT tool suite. URL sujet détaillé :
:
Remarques : The internship is remunerated





SM20790 Faster, Faster, Faster


Description


The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weights into account. The weights are then taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions). Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI). Vcsn aims at speed and user-friendliness. To achieve this goal, it is built in three layers:
- static: a templated C++ library with efficient data structures and algorithms;
- dyn: a C++ API that supports type erasure and shields the user from the complex API of static while preserving performance; it also supports code generation and linking on the fly;
- Python: on top of dyn, a Python binding allows pleasant use of Vcsn, without any C++ knowledge required.
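The weighted setting can be sketched compactly: the weight of a word is the semiring sum, over all accepting paths, of the product of the transition weights. With the tropical semiring (min, +) this computes the cheapest accepting path. The automaton below is invented for illustration; Vcsn's actual static/dyn/Python layers are far richer.

```python
class Tropical:
    """The tropical semiring (min, +): 'sum' is min, 'product' is +."""
    zero, one = float("inf"), 0
    add, mul = staticmethod(min), staticmethod(lambda a, b: a + b)

def weight(word, semiring, initial, final, delta):
    """delta: dict mapping (state, letter) -> list of (next_state, weight)."""
    current = dict(initial)              # state -> accumulated weight
    for letter in word:
        nxt = {}
        for state, w in current.items():
            for dst, wt in delta.get((state, letter), []):
                prev = nxt.get(dst, semiring.zero)
                nxt[dst] = semiring.add(prev, semiring.mul(w, wt))
        current = nxt
    total = semiring.zero
    for state, w in current.items():
        if state in final:
            total = semiring.add(total, w)
    return total

# Toy automaton: stay in 0 for 1, jump to 1 for 3, stay in 1 for 2.
delta = {(0, "a"): [(0, 1), (1, 3)], (1, "a"): [(1, 2)]}
print(weight("aaa", Tropical, {0: 0}, {1}, delta))  # 5 : cheapest path 1+1+3
```

Swapping `Tropical` for a Boolean or probabilistic semiring changes the interpretation without touching the evaluation code, which is the genericity Vcsn exploits with templates.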
The objective of this internship is to improve the performance of Vcsn in every possible way. This includes:
- design of a binary format to save and restore automata (possibly with mmap);
- improvements to the core data structures: design of efficient alternatives to classical data structures such as those found in the C++ standard library;
- improvement of the existing algorithms from automata theory, such as epsilon-transition removal;
- optimization of the dyn type-erased dynamic dispatch;
- development of benchmarks and comparison with other platforms;
- etc. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_AD_Faster,_Faster,_Faster
Remarques :





SM20791 Symmetric Difference Automata


Description


The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weights into account. The weights are then taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions). The field 𝔽₂ is of particular interest and shows quite a few unique properties, as demonstrated by current active research under various names (e.g., "symmetric difference automata"). Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
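What makes 𝔽₂ special can be shown on a tiny example: in an F2-weighted automaton, the weight of a word is the number of accepting paths modulo 2, so a word is accepted iff it has an odd number of accepting runs. The automaton below is a toy invented for this sketch.

```python
def parity_accepts(word, n, initial, final, delta):
    """delta: dict (state, letter) -> set of successor states.
    Accept iff the number of accepting paths is odd (weights in F2)."""
    vec = [1 if s in initial else 0 for s in range(n)]   # row vector over F2
    for letter in word:
        nxt = [0] * n
        for s in range(n):
            if vec[s]:
                for t in delta.get((s, letter), ()):
                    nxt[t] ^= 1                          # addition in F2
        vec = nxt
    return sum(vec[s] for s in final) % 2 == 1

# Two states; on 'a', state 0 branches to both 0 and 1, state 1 loops on 1.
delta = {(0, "a"): {0, 1}, (1, "a"): {1}}
print(parity_accepts("a", 2, {0}, {1}, delta))   # True  : one path to state 1
print(parity_accepts("aa", 2, {0}, {1}, delta))  # False : two paths cancel out
```

The cancellation of pairs of paths (2 = 0 in 𝔽₂) is exactly the behavior that distinguishes this semiring from the Boolean one.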
The objective of this internship is to exploit the existing features of Vcsn to apply them to 𝔽₂ in a first step, and then, in a second step, to use Vcsn as a tool to explore novel results. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Symmetric_Difference_Automata
Remarques :





SM20792 Syntactic Monoids


Description


Many properties and characteristics of an automaton can easily be computed from its syntactic monoid. Such properties are of particular importance to theoreticians. Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI). Vcsn has a sound base of data structures and algorithms for automata and rational expressions. However, it currently offers no support for syntactic monoids at all.
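A minimal sketch of what computing such a monoid involves: each letter of a DFA acts as a function on states, and closing these actions under composition yields the transition monoid (which, for a minimal DFA, is the syntactic monoid of its language). The DFA below, recognizing words over {a, b} with an even number of a's, is a toy chosen for this example.

```python
def transition_monoid(n_states, actions):
    """actions: letter -> tuple t where t[q] is the successor of state q.
    Returns the closure of the identity under composition with the letters."""
    identity = tuple(range(n_states))
    monoid = {identity}
    frontier = [identity]
    while frontier:
        elem = frontier.pop()
        for act in actions.values():
            composed = tuple(act[q] for q in elem)   # apply the letter after elem
            if composed not in monoid:
                monoid.add(composed)
                frontier.append(composed)
    return monoid

# DFA: state 0 = "even number of a's"; 'a' swaps the states, 'b' fixes them.
m = transition_monoid(2, {"a": (1, 0), "b": (0, 1)})
print(sorted(m))   # [(0, 1), (1, 0)] : the two-element group Z/2Z
```

Decision procedures (e.g., aperiodicity tests) then inspect algebraic properties of this finite monoid rather than the automaton itself.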
The objective of this internship is to develop support for syntactic monoids in Vcsn, and to implement recent research results in Automata Theory that use the syntactic monoid. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Syntactic_Monoids
Remarques :





SM20793 Vcsn for Linguists


Description


The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weights into account. The weights are then taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions). Automata are heavily used in computational linguistics, and conversely, automata used in computational linguistics are "heavy". Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI). Vcsn has a sound base of data structures and algorithms for automata and rational expressions. It is already able to deal with many of the typical needs of linguists. However, some specific semirings have not been implemented, and some well-known algorithms are missing.
The objective of this internship is to develop a complete computational linguistics toolchain on top of Vcsn, for instance the "SMS to French" project of François Yvon (sms.limsi.fr). To this end, Vcsn will have to be completed with the needed data structures and algorithms, and possibly the existing implementation will have to be revised to cope with the extremely demanding size of these real-life automata. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Vcsn_for_Linguists
Remarques :





SM20794 Development of an IPython interface, and a couple of web interfaces for the Spot library


Description


The Spot library (http://spot.lip6.fr/), written in C++, offers several algorithms and data structures to build model checkers. It contains many algorithms for handling linear-time temporal logic (LTL) formulas, and for different variants of Büchi automata. Spot also distributes several command-line tools built on top of the library. Finally, Spot includes some Python bindings, as well as a web interface (http://spot.lip6.fr/ltl2tgba.html) for easy access to one of the features of the library.
The goal of the internship is to improve the last two points (Python bindings and web interface): we would like to access all features of Spot from Python (and in particular from IPython), and on top of this Python interface, we would like more online services that people can use without having to install Spot. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_Python_Interfaces_for_Spot
Remarques :





SM20795 Minimization of Büchi automata using SAT-solving


Description


The Spot library (http://spot.lip6.fr/) contains many algorithms for translating LTL formulas into Büchi automata, and for simplifying these formulas and automata. A recently added technique (in Spot version 1.2) allows us to minimize deterministic Büchi automata using a SAT solver. To minimize an n-state deterministic Büchi automaton (an NP-complete problem) we encode its equivalence with an (n-1)-state deterministic Büchi automaton as a SAT problem, and let a SAT solver do the work of finding a solution. If such a solution is found, we try again, looking for an (n-2)-state automaton, etc.
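The shrink loop can be sketched as follows. This is a minimal sketch, not Spot's implementation: `encode_equivalence` is a placeholder for the real CNF encoding (here it merely simulates an automaton whose minimal size is 3 states), and a brute-force search stands in for a real SAT solver.

```python
from itertools import product

def brute_force_sat(clauses, n_vars):
    """Tiny stand-in for a real SAT solver (clauses in DIMACS-like form:
    lists of non-zero ints, positive = true literal)."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return bits
    return None

def encode_equivalence(aut, n):
    """PLACEHOLDER for the real encoding: should produce a CNF that is
    satisfiable iff `aut` is equivalent to some n-state deterministic
    Buchi automaton.  Here we simulate a minimal size of 3 states."""
    return ([[1]], 1) if n >= 3 else ([[1], [-1]], 1)

def minimize(aut, n):
    """The shrink loop from the text: try n-1 states, then n-2, ..."""
    best = n
    while best > 1:
        clauses, n_vars = encode_equivalence(aut, best - 1)
        if brute_force_sat(clauses, n_vars) is None:
            break          # no smaller equivalent automaton exists: stop
        best -= 1          # success: decode the model, then try smaller
    return best

print(minimize("dummy-automaton", 7))  # -> 3
```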
Presently, our first implementation saves the SAT problem as a huge file before calling the SAT solver, and it does this for each iteration. The goal of this internship is to improve this situation in multiple ways:
- implement purely technical optimizations (like direct communication with the SAT solver, bypassing any temporary files)
- take advantage of features offered by modern SAT solvers (for instance an "incremental SAT solver" can be given new constraints after it has found a solution, without needing to restart from scratch)
- implement optimizations of the encoding of the SAT problem, based on structural properties of the automaton (we have some ideas).
In a second step, we would like to generalize the existing technique to nondeterministic automata, or to different types of acceptance conditions (i.e., not Büchi). URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_SATbased_Minimization
Remarques :





SM20796 Algorithms for automata over infinite words using a general acceptance condition


Description


The formal method community uses several types of automata over infinite words (called "omega-automata"): Büchi automata, Streett automata, Rabin automata... These automata differ by their acceptance condition, i.e., the condition under which an infinite run of the automaton is accepted. For each of these acceptance conditions there exist specialized algorithms, for instance to decide if an automaton accepts any word, to change one acceptance condition into another, or to determinize an automaton... We have recently defined a file format for exchanging omega-automata between tools. This format can represent all the above types of automata thanks to a very general way of representing the acceptance condition. In fact the acceptance condition specification is so general that it allows us to represent combinations of existing conditions.
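As an illustration, such a generalized acceptance condition is a positive boolean formula over atoms Inf(i) and Fin(i) (color i seen infinitely, respectively finitely, often), which can be evaluated directly against the set of colors a run visits infinitely often. A minimal sketch; the tuple encoding of formulas is invented here for illustration:

```python
# Acceptance formulas as nested tuples:
#   ("inf", i) / ("fin", i) -- color i occurs infinitely / finitely often
#   ("and", f, g) / ("or", f, g) -- positive boolean combinations
def accepts(formula, inf_colors):
    """Does a run whose set of infinitely recurring colors is
    `inf_colors` satisfy the acceptance formula?"""
    op = formula[0]
    if op == "inf":
        return formula[1] in inf_colors
    if op == "fin":
        return formula[1] not in inf_colors
    if op == "and":
        return accepts(formula[1], inf_colors) and accepts(formula[2], inf_colors)
    if op == "or":
        return accepts(formula[1], inf_colors) or accepts(formula[2], inf_colors)
    raise ValueError(op)

buchi    = ("inf", 0)                        # Buchi: Inf(0)
rabin1   = ("and", ("fin", 0), ("inf", 1))   # one Rabin pair: Fin(0) & Inf(1)
streett1 = ("or", ("fin", 1), ("inf", 0))    # one Streett pair

print(accepts(rabin1, {1}))      # True: color 1 recurs, color 0 does not
print(accepts(rabin1, {0, 1}))   # False: color 0 recurs
```

A generalized emptiness check then has to decide whether the automaton admits a lasso whose set of recurring colors satisfies such a formula.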
The goal of this internship is to survey existing algorithms that differ from one acceptance condition to the other, and to develop algorithms that work with the generalized acceptance condition. The first algorithms we would like to have are:
- emptiness check (is the language of the automaton empty?)
- conversion to Büchi acceptance
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_General_Acceptance_Condition
Remarques :





SM20797 Lisp for the design and the development of DSLs


Description


DSL design and implementation is booming. There are many approaches to this problem; one consists in using the extensibility capabilities of a general-purpose programming language to meet specific requirements. Many multi-paradigm approaches (e.g., functional languages / compile-time metaprogramming) also offer benefits. Lisp is a language that is particularly well suited to this type of problem, but it is almost totally absent from the literature on DSLs. Indeed, most researchers are not (or no longer) familiar with this language. They prefer a language based on a static approach, or have simply forgotten everything Lisp is capable of.
The purpose of this internship is to (begin to) fill this gap by adding Lisp to the comparative literature. The work will consist in studying literature demonstrating some useful approaches to the design of DSLs, proposing an alternative approach based on Lisp, and comparing the results. The steps of this internship will include:
- Familiarization with Common Lisp and in particular its extensibility capabilities.
- Getting acquainted with the literature on the design and implementation of DSLs in other languages.
- Choice of an application-centric approach or a comparative approach based on this literature, and implementation of a Lisp equivalent.
- Finally, analysis of the results along aspects such as ease of implementation, expressiveness, performance, etc.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_DV_Conception_DSL
Remarques :





SM20798 Lisp for statistics


Description


R is a programming language for statistics. Because of its historical development, it has many similarities with Scheme (a language from the Lisp family), but also many problems: in particular, it is a language which is not compiled. The biggest complaint of its users is therefore its lack of performance. For this (bad) reason a number of totally esoteric features have been added to it.
The goal of this internship is to study how to port a particular application of R to an equivalent one in Common Lisp. The first objective (which is certain) is to achieve a much higher level of performance than that of the original application. The long term goal is to consider a new implementation of R (or of a subset of R) over Common Lisp, with the intuition that it will result in a much simpler and better system, which is in addition much more efficient. URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_DV_Lisp_For_Statistics
Remarques :





SM20799 Verified graph transformations with attributes


Description


Within the ANR Climt project, we are working on graph transformations, and in particular the verification of graph transformations.
We are currently developing an imperative language for graph transformations, and a family of Hoare-style logics for reasoning about programs of this language. The logics are variants of Description Logics. A distinctive feature of our work is that both the programming language and the logic have been formalized in a proof assistant.
The graph transformations we have considered so far only take into account structural aspects of graphs, but not attributes, i.e. numeric values, strings and the like. Our graph transformation language and its associated program logic have been designed to yield decidable proof problems (with an effective proof procedure). The internship aims at exploring what happens when adding attributes: conceptually, how to preserve decidability; practically, how to implement corresponding proof procedures. URL sujet détaillé : http://www.irit.fr/~Martin.Strecker/Teaching/Master/
Remarques : Co-supervised with Rachid Echahed (LIG, Grenoble)





SM207100 Parsing Algorithms for Abstract Categorial Grammars


Description


Abstract Categorial Grammar (ACG) is a framework dedicated to the description of the syntax and the semantics of natural languages. It is based on the implicative fragment of linear logic and uses linear λ-terms. The latter allow for the modeling of both strings and trees, two data structures that are central in computational linguistics. From a formal language-theoretic point of view, ACGs can encode several classes of languages. In particular, second-order ACGs encode mildly context-sensitive languages (mCSL) such as Tree-Adjoining Grammars (TAG) or multiple context-free grammars (MCFG). M. Kanazawa showed how to reduce the parsing problem for this class of grammars to Datalog querying.
In order to model natural language phenomena, using higher-order grammars has also proved useful. The goal of the research topic we propose is to adapt the results that are available for second-order grammars to higher-order grammars, using the reduction of parsing to a proof-search problem in multiplicative exponential linear logic (MELL). The decidability of the latter fragment being an open problem, we do not know whether the parsing problem is decidable for arbitrary higher-order ACGs. For lexicalized ACGs, we know the problem to be decidable but NP-complete. For these reasons, we aim at identifying suitable fragments (particular lexicalizations) or principles (bounded typing contexts) that would provide decidable parsing algorithms with a controlled complexity. The ACG toolkit, developed in the Polymnie ANR project, will provide a development and experimental environment. URL sujet détaillé : http://semagramme.loria.fr/lib/exe/fetch.php?media=projects
Remarques : The internship will be cosupervised with Sylvain Pogodalla (http://www.loria.fr/~pogodall).
The internship position is supported (rémunération) by the ANR project Polymnie (http://semagramme.loria.fr/doku.php?id=projects:polymnie).





SM207101 Version management for big-data storage


Description


Many modern applications access very large objects, such as the friendship graph of a social network, or the directory used for storing a large email collection. The cost of copying such a large object, either to or from disk or over the network, may overwhelm the computational cost of using the object. However, copying is a frequent operation, for instance when in-place update is undesirable, e.g., to accommodate concurrent updates, to ensure atomicity, or for fault tolerance.
Techniques for addressing this problem include copy-on-write (CoW) page management (as used in operating system kernels), and so-called persistent objects (as used by the implementation of functional languages). Unfortunately, existing CoW techniques are low-level and hard to use, and furthermore are not applicable in modern managed language environments such as Java or Erlang. Conversely, persistent object techniques are general but have not been applied to a distributed big-data setting. One recent result is the LSM-tree, used in the LevelDB database, for the special case of a large index tree.
The above techniques all follow a common pattern. A frozen snapshot of the main (large) data structure is created. A new version consists of both a pointer to the previous version and some kind of buffer. An update is stored in the buffer, and does not modify the snapshot. Reading the object state first checks the buffer, then (possibly) the snapshot. Versions can be stacked or chained, several possibly sharing the same base snapshot. When the chain of recent-update buffers becomes too large or too complex, they are merged (according to the object's semantics) into a new frozen snapshot.
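The pattern can be sketched in a few lines. This is a minimal, single-process Python sketch of the snapshot/buffer/merge scheme described above; the class and method names are our own, not taken from any of the systems cited:

```python
class Version:
    """One link in a chain of versions: a pointer to the previous
    version plus a buffer of recent updates.  The root of the chain
    holds the frozen snapshot, which is never mutated."""
    def __init__(self, parent=None, base=None):
        self.parent = parent
        self.base = base if base is not None else {}
        self.buffer = {}

    def get(self, key):
        node = self
        while node is not None:          # check buffers, newest first
            if key in node.buffer:
                return node.buffer[key]
            if node.parent is None:      # reached the frozen snapshot
                return node.base.get(key)
            node = node.parent

    def set(self, key, value):
        self.buffer[key] = value         # updates never touch the snapshot

    def fork(self):
        return Version(parent=self)      # stack a new version on top

    def merge(self):
        """Collapse the whole chain into a new frozen snapshot."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        snap = dict(chain[-1].base)
        for node in reversed(chain):     # apply buffers oldest-first
            snap.update(node.buffer)
        return Version(base=snap)

root = Version(base={"alice": 3})
v1 = root.fork(); v1.set("bob", 5)
v2 = v1.fork(); v2.set("alice", 7)
print(v2.get("alice"), v2.get("bob"), root.get("alice"))   # 7 5 3
print(v2.merge().get("bob"))                               # 5
```

Several forks may share the same parent, which gives the "several versions sharing one base snapshot" case for free.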
This pattern has been used so far mostly in an ad-hoc way and has not been studied scientifically. We wish to study its theoretical and practical foundations and limitations, and to leverage the mergeability, commutativity, associativity and idempotence properties of Conflict-Free Replicated Data Types (CRDTs) to optimise and generalise it in a principled way. The specific objectives of the internship are the following:
- Develop a library of big-data versionable data types based on this technique. Data types of interest include maps (directories), trees and directed graphs.
- Implement them efficiently in the context of large multicore computers and/or clusters.
- Deploy them in some representative applications.
- Perform experiments and measure performance, compared to alternative implementations.
- Publish the findings in the scientific literature.
URL sujet détaillé :
Remarques : Requirements. The intern must have:
- An excellent academic record.
- A strong interest in and good knowledge of big-data applications, distributed systems, distributed algorithms, or distributed databases.
Knowledge of node.js is a plus.
The internship is fully funded, and will take place in the Regal group, at Laboratoire d'Informatique de Paris 6 (LIP6), in Paris. A successful intern will be invited to apply for a PhD.
To apply, contact Marc Shapiro with the following information:
- A resume or Curriculum Vitæ
- A transcript of studies and grades of the last two years of study (an informal transcript is OK).
- Names and contact details of two references (people who can recommend you). We will contact these people directly.





SM207102 A distributed log for scalable big-data storage


Description


A well-known technique from databases and file systems, used to support fast writes and to be able to recover from failures, is to track all update activity in a log. A log is an append-only, write-mostly file, organised as an array of update records; this ensures that writes are fast. If all activity is logged to disk, this supports recovery and rebuilding state.
Another well-known database technique is to partition the storage, i.e., to logically divide it into disjoint parts (also known as shards). This improves scalability, manageability, performance, availability and distribution. Shards are kept independent. For fault tolerance, a shard is replicated over multiple nodes.
The subject of this internship is to explore the design of a distributed, partitioned and replicated log. To support both performance and fault tolerance, log records are replicated among different servers (instead of being written to disk). Each shard's replica must remain totally ordered, in order to provide consistent reads.
Big-data applications manipulate data structures such as maps, trees or general graphs. The aim of the log is to combine fast and persistent writes to such structures, delaying the actual application of each operation to later asynchronous processing. This enables maintaining different views and versions of the database efficiently, which is important for transaction processing. Upper layers replay the log and build their state on demand, off the write path.
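The append-then-replay idea can be sketched as follows. A minimal single-process Python sketch: the record format and class names are invented for illustration, and replication to other servers is elided (a local list stands in for the replicated, totally ordered shard log):

```python
import itertools

class ShardLog:
    """Append-only, totally ordered log for one shard.  In the real
    design, records would be replicated to several servers; here a
    local list stands in, keeping only the per-shard total order."""
    def __init__(self):
        self.records = []                # array of update records
        self.seq = itertools.count()

    def append(self, op, key, value=None):
        """Fast write path: record the operation and return."""
        rec = {"seq": next(self.seq), "op": op, "key": key, "value": value}
        self.records.append(rec)
        return rec["seq"]

    def replay(self, upto=None):
        """Build a map view on demand, off the write path.  Stopping
        at an older sequence number yields an older version."""
        state = {}
        for rec in self.records:
            if upto is not None and rec["seq"] > upto:
                break
            if rec["op"] == "put":
                state[rec["key"]] = rec["value"]
            elif rec["op"] == "del":
                state.pop(rec["key"], None)
        return state

log = ShardLog()
log.append("put", "x", 1)
s = log.append("put", "y", 2)
log.append("del", "x")
print(log.replay())          # {'y': 2}
print(log.replay(upto=s))    # {'x': 1, 'y': 2}  (an older version)
```

Replaying up to different sequence numbers is what gives the "different views and versions" mentioned above essentially for free.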
The work of the intern will be to design and implement this log, studying:
- A replication scheme supporting high performance.
- Mapping read and write operations to replicas.
- Comparing to existing solutions, such as disk- or SSD-based logs.
The intern's findings shall be written up for publication in the scientific literature. URL sujet détaillé :
Remarques : Requirements. The intern must have:
- An excellent academic record.
- A strong interest in and good knowledge of big-data applications, distributed systems, distributed algorithms, or distributed databases. Knowledge of the Erlang programming language is a plus.
The internship is fully funded and will take place in the Regal group, at Laboratoire d'Informatique de Paris 6 (LIP6), in Paris. A successful intern will be invited to apply for a PhD.
To apply, contact Alejandro Tomsic with the following information:
- A resume or Curriculum Vitæ
- A transcript of studies and grades of the last two years of study (an informal transcript is OK).
- Names and contact details of two references (people who can recommend you). We will contact these people directly.





SM207103 Types and interpretations for programs of polynomial time complexity


Description


Can one define simple and modular programming disciplines, in a high-level language, ensuring that the programs produced have bounded complexity, for instance polynomial time? This question is explored by the field of implicit computational complexity: it aims at defining static complexity criteria that do not rely on explicit bounds on resources.
This internship proposal focuses on implicit complexity criteria for functional programs, which can use higher-order subprograms. It aims at relating two approaches to this issue, based respectively on type systems and on the semantic notion of interpretation. URL sujet détaillé : http://perso.enslyon.fr/patrick.baillot/STAGES/2015/sujet_interpretations2015.pdf
Remarques :





SM207104 Modeling and structural analysis of preference relations in the qualitative approach to multicriteria decision making


Description


*Basic setting*
In decision theory there are two main problems: the modelling of preferences, and the axiomatization of such models, which allows, in particular, a justification of the chosen model. A third problem adds to these two: the elicitation, mining and learning of preferences. For references see, e.g., [BDPP,DeBaets,PRSR09].
Aggregation functions [GMMP] constitute an important tool when dealing with preference modelling. Preference relations can be represented by global utility functions that score the different alternatives, thus providing a simple decision procedure: alternatives with higher scores are preferred to those with lower scores. In this procedure, aggregation functions play a relevant role as they can be used to aggregate (merge) the different criteria (describing each alternative) into a single scale. This refined model for representing preference relations R over a set of alternatives X = X_1 × ... × X_n (e.g., apartments to buy) that are described by different attributes X_i (such as price, surface, location, etc.) can be formalised by the rule: a R a' iff U(a) ≤ U(a'), where U: X → Y is a global utility function defined as the composition
U(a) = A(g_1(a_1), ..., g_n(a_n))    (*)
of an aggregation function A: Y^n → Y (e.g., a weighted sum or a majority function) with local utility functions g_i: X_i → Y that express our (local) preferences on each attribute X_i. For instance, concerning the price it is reasonable to assume that cheaper apartments are preferred to more expensive ones, whereas for surface, bigger apartments are usually preferred to smaller ones.
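For concreteness, the composition (*) with the discrete Sugeno integral as the aggregation function A can be sketched as follows. The capacity mu and the local utility functions are toy values invented purely for illustration:

```python
from itertools import chain, combinations

def subsets(n):
    """All non-empty subsets of criteria {0, ..., n-1}."""
    return chain.from_iterable(combinations(range(n), r)
                               for r in range(1, n + 1))

def sugeno(mu, x):
    """Discrete Sugeno integral of scores x w.r.t. the capacity mu
    (mu maps frozensets of criteria to values on the same ordinal
    scale as the scores): max over subsets A of min(mu(A), min_{i in A} x_i)."""
    return max(min(mu[frozenset(A)], min(x[i] for i in A))
               for A in subsets(len(x)))

# Toy instance: two criteria, price (0) and surface (1),
# scored on the ordinal scale 0..3.
g_price   = lambda p: 3 if p < 200 else 1    # cheaper is better
g_surface = lambda s: 3 if s > 60 else 2     # bigger is better
mu = {frozenset({0}): 1, frozenset({1}): 2, frozenset({0, 1}): 3}

def U(apartment):
    """Global utility of form (*): Sugeno integral of local utilities."""
    price, surface = apartment
    return sugeno(mu, [g_price(price), g_surface(surface)])

a1, a2 = (150, 70), (250, 50)   # (price, surface)
print(U(a1), U(a2))             # -> 3 2, so a1 R-dominates a2
```

Note that different capacities or local utilities may yield the same global U, which is exactly the multiplicity of factorisations of (*) discussed below.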
*Qualitative setting and goals*
This proposal explores a qualitative approach to multicriteria decision making where the aggregation procedure A of choice is the Sugeno integral [DubPraSab01,DubMarPraRouSab01]. Axiomatizations of model (*) were proposed in [CW1,CW2]. As it turned out, different aggregation procedures may culminate in the same global preference relation for suitable rankings over attributes. This called for algorithmic procedures to obtain all possible factorisations of form (*), that is, all Sugeno integrals A and all local utility functions g_i whose composition results in the same global utility function U. These were provided in [CW2], led to logical characterizations of preference relations in this qualitative setting [CW3], and gave tools for analysing the effect of local preferences on global preferences modelled by (*). The first objective is to familiarise the student with the emerging domain of aggregation theory and its applications in decision making and preference learning. Then the student will be asked to implement the algorithmic procedures given in [CW1,CW2] with the following objectives:
1. Study the behaviour of global preferences with respect to local changes (variations of local preferences),
2. Analyse the computational complexity and the effect of considering simplified models (e.g., by limiting the interactions between criteria),
3. Explore methods of elicitation and learning of preferences in this qualitative setting.
*Desired skills*
The candidate should be motivated and show initiative. Basic knowledge of discrete mathematics and good programming skills (for algorithmic implementations over ordered structures) are desirable.
*Bibliography*
[BDPP] D. Bouyssou, D. Dubois, H. Prade, M. Pirlot (eds). Decision-Making Process - Concepts and Methods, ISTE/John Wiley, 2009.
[DeBaets] K. Cao-Van, B. De Baets, S. Lievens. A probabilistic framework for the design of instance-based supervised ranking algorithms in an ordinal setting, Annals of Operations Research 163 (2008) 115-142.
[CW1] M. Couceiro, T. Waldhauser. Axiomatizations and factorizations of Sugeno utility functions, Internat. J. Uncertain. Fuzziness Knowledge-Based Systems 19(4) (2011) 635-658.
[CW2] M. Couceiro, T. Waldhauser. Pseudo-polynomial functions over finite distributive lattices, Fuzzy Sets and Systems 239 (2014) 21-34.
[CW3] M. Couceiro, D. Dubois, H. Prade, T. Waldhauser. Decision making with Sugeno integrals: DMU vs. MCDM. 20th European Conference on Artificial Intelligence (ECAI 2012), 288-293, 2012.
[DubMarPraRouSab01] D. Dubois, J.-L. Marichal, H. Prade, M. Roubens, R. Sabbadin. The use of the discrete Sugeno integral in decision making: a survey. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9:5 (2001) 539-561.
[DubPraSab01] D. Dubois, H. Prade, R. Sabbadin. Decision-theoretic foundations of qualitative possibility theory. European Journal of Operational Research 128 (2001) 459-478.
[GMMP] M. Grabisch, J.-L. Marichal, R. Mesiar, E. Pap. Aggregation Functions. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2009.
[FH] J. Fürnkranz, E. Hüllermeier (eds). Preference Learning. Springer, 2010.
[PRSR09] H. Prade, A. Rico, M. Serrurier, E. Raufaste. Eliciting Sugeno integrals: Methodology and a case study, in Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty 2009, pp. 712-723. URL sujet détaillé :
Remarques :





SM207105 XPath-like query logics


Description


XPath is a language for describing XML trees. It can be seen as a logic featuring navigation modalities over trees, possibly together with primitives for comparing the data attached to tree nodes. The satisfiability problem for XPath queries becomes very hard as soon as data manipulation is allowed, since data values are drawn from an infinite set.
Recent work suggests that a unifying viewpoint can be developed to better understand the decidability and complexity results for XPath with data. This viewpoint relies on proof theory, and more precisely on the use of substructural logics.
The internship fits into this programme. We will consider a modal logic corresponding to a simplified fragment of XPath with data, and will aim at characterising the complexity of its satisfiability problem, approached as a proof-search problem in a well-structured sequent calculus that will first have to be developed.
See the PDF below for more details. URL sujet détaillé : http://www.lsv.enscachan.fr/Stages/Fichier/m215ssdb.pdf
Remarques : The internship will be co-supervised by Sylvain Schmitz and David Baelde. It can take place either entirely at the University of Warwick (where Sylvain Schmitz will be a visiting professor during the second semester of 2014/2015) or half there and half at LSV. The internship can lead to a PhD at LSV.





SM207106 Coinductive and symbolic algorithms for non-standard automata


Description


We recently proposed algorithms for checking language equivalence of nondeterministic finite automata (NFA), relying 1/ on techniques from concurrency and process algebra theory: "bisimulations up to" [1], and 2/ on a symbolic representation of automata, using BDDs [2]. Thanks to these techniques, our algorithms seem to be more efficient than all algorithms known up to now (Hopcroft & Karp, partition refinement, antichains). We propose to study the possibility of extending these algorithms to other kinds of automata that are widely used in practice: Büchi automata, alternating automata, tree automata, etc.
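For reference, the baseline that these "up to" techniques improve upon is the naive on-the-fly check that explores pairs of determinized macro-states. A minimal Python sketch (the dictionary encoding of automata is invented for illustration; this is the plain algorithm, without the congruence-based pruning of [1]):

```python
from collections import deque

def post(nfa, states, letter):
    """Successor macro-state in the determinized automaton."""
    return frozenset(q2 for q in states
                     for q2 in nfa["delta"].get((q, letter), ()))

def equivalent(n1, n2, alphabet):
    """Naive on-the-fly language-equivalence check: explore reachable
    pairs of macro-states, failing on an acceptance mismatch."""
    init = (frozenset(n1["init"]), frozenset(n2["init"]))
    todo, seen = deque([init]), {init}
    while todo:
        x, y = todo.popleft()
        if bool(x & n1["final"]) != bool(y & n2["final"]):
            return False                 # one side accepts, the other not
        for a in alphabet:
            pair = (post(n1, x, a), post(n2, y, a))
            if pair not in seen:
                seen.add(pair)
                todo.append(pair)
    return True

# The language a+ recognised by two different toy NFAs:
A = {"init": {0}, "final": {1},
     "delta": {(0, "a"): {1}, (1, "a"): {1}}}
B = {"init": {0}, "final": {1, 2},
     "delta": {(0, "a"): {1}, (1, "a"): {2}, (2, "a"): {1}}}
print(equivalent(A, B, "a"))   # True: both accept a, aa, aaa, ...
```

The "up to congruence" refinement of [1] prunes pairs that are already implied by the congruence closure of `seen`, which is where the observed speed-ups come from.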
[1] F. Bonchi and D. Pous. Checking NFA equivalence with bisimulations up to congruence, in Proc. POPL 2013. [2] D. Pous. Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests. To appear in Proc. POPL 2015. URL sujet détaillé :
Remarques :





SM207107 Formalisation and certification of ownership in a specialised Java virtual machine


Description


(Joint work with Fabienne Boyer and Olivier Gruber, Grenoble) One major challenge of reconfigurable component-based systems is to maintain the consistency of the system despite failures occurring at reconfiguration time. Fabienne Boyer and Olivier Gruber recently proposed a low-level model allowing one to guarantee that a given system is always consistent. This model was implemented as a Java virtual machine, and we proved in Coq that the reconfiguration algorithms are correct [1]. The model includes many other features for component-based programming, including a well-defined notion of ownership [2]. We propose to formalise this aspect of the model in Coq: the corresponding algorithms deal with potentially cyclic graphs of objects, making it a really challenging task to certify them.
[1] F. Boyer, O. Gruber, and D. Pous, Robust Reconfigurations of Component Assemblies. In Proc. ICSE'13, IEEE / ACM, 2013. Associated Coq development: http://perso.enslyon.fr/damien.pous/rrca/
[2] F. Boyer, O. Gruber. Ownership-Based Isolation for Concurrent Actors on Multicore Machines. In Proc. ECOOP'13, 2013. URL sujet détaillé :
Remarques : Co-supervised with Fabienne Boyer and Olivier Gruber (LIG, Grenoble)





SM207108 PiCoq


Description


We would like to apply recently introduced proof techniques to help with the formalisation in Coq of the operational theory of pi-calculus-like programming languages. The pi-calculus is a formalism introduced by R. Milner [1,2] for the specification and the analysis of concurrent systems or processes. Reasoning about these processes usually involves behavioural equivalence relations: two processes are considered equal when they exhibit the same behaviour. While the theory of these equivalences is well established, proving formally that two concrete processes are equal can be hard and tedious. This is where a proof assistant would really be useful, both to make sure that all cases are covered, and to solve numerous subproof obligations automatically. The internship would consist both in applying known techniques within the Coq proof assistant, to prove various results about the pi-calculus, and in developing new techniques or new abstractions allowing one to overcome the difficulties that are inherent to the mechanised formalisation of such proofs.
[1] R. Milner. Communicating and Mobile Systems: the Pi-Calculus, Cambridge University Press. [2] M. Hennessy. A Distributed Pi-Calculus, Cambridge University Press. URL sujet détaillé :
Remarques :





SM207109 Finite approximations of Markov chains modelled by probabilistic regular graphs


Description


Markov chains are a simple and powerful tool to model uncertainty in systems. They make it possible to define a probability measure on the set of executions, and thus to quantify various properties of the system itself.
Probabilistic pushdown automata are certainly the most abundantly studied infinite-state Markov chains [1]. Such an automaton is defined like a classical pushdown automaton, except that transitions are also labelled with probabilities. More recently, similar systems defined by deterministic graph grammars (with probabilistic labelling functions) have been considered [2]. These grammars define structures in which finitely many patterns are repeated according to a finite set of rules.
In this work we plan to study finite abstractions of such infinite systems. In particular we expect to characterize systems whose finite restrictions contain a subset of executions of strictly positive measure. In such cases we also want to produce algorithms constructing finite restrictions that contain a subset of executions of measure p, for an arbitrary p. We also want to compare this approach with the "probability of termination" of Etessami [3]. We would also like to establish the precise complexity of each algorithm. Finally, it would be interesting to connect this work with that of Brazdil and Kucera, who define approximations for more general chains [4].
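As a toy instance of the quantities involved, the termination probability of the simplest pushdown-style chain (a biased random walk on a counter, stepping down with probability p and up with probability 1-p) is the least fixed point of a polynomial equation, which Kleene iteration from 0 computes; a minimal sketch:

```python
def termination_prob(p, iters=200):
    """Least fixed point of  x = p + (1-p) * x**2 :
    the termination probability of a counter that decrements with
    probability p and increments (spawning two pending decrements)
    with probability 1-p.  Iterating from 0 converges to the LEAST
    solution, which is the actual probability."""
    x = 0.0
    for _ in range(iters):
        x = p + (1 - p) * x * x
    return x

print(round(termination_prob(0.7), 4))   # 1.0: p > 1/2, terminates a.s.
print(round(termination_prob(0.3), 4))   # 0.4286, i.e. p / (1-p)
```

The finite-restriction question above asks, dually, how large a finite prefix of such a system must be for its executions to carry measure at least p.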
[1] J. Esparza, A. Kucera and R. Mayr. Model Checking Probabilistic Pushdown Automata, LMCS 2006. [2] N. Bertrand and C. Morvan. Probabilistic regular graphs, Infinity 2010. [3] K. Etessami. The complexity of analyzing infinite-state Markov chains, Markov decision processes, and stochastic games, STACS 2013. [4] T. Brazdil and A. Kucera. Computing the Expected Accumulated Reward and Gain for a Subclass of Infinite Markov Chains. URL sujet détaillé : http://master.irisa.fr/index.php/fr/370?id_stage=39
Remarques :





SM207110 Mitigation of Harmful Network Interactions for Extreme-Scale Computing Platforms


Description


Context 
Most current supercomputers have a unique, multipurpose interconnection network. Such a network carries both I/O traffic and the internal communications of jobs. Blue Waters, with its 3D torus network, is concerned by such issues. The network is shared by all the jobs running on the machine. This leads to harmful effects such as congestion, and hinders applications' performance.
Two orthogonal lines of research have already been identified:
1. Enhanced allocation strategies:
- Constructing allocations using specific shapes for jobs (e.g., convex or contiguous jobs) would allow most of the network traffic generated by a job to be contained within its physical geometric boundaries. With such shapes we are able to optimize the network load induced by an application. Solving this problem is however not enough, as the I/O nodes are spread across the machine. It may also be specific to the 3D torus topology.
- Another strategy would be to over-allocate and then wisely use the spare nodes allocated to a job. Still, there is a clear trade-off to determine.
2. I/O pacing:
Network use tends to be bursty, hence leading to congestion. By pacing or delaying the I/O of some nodes, we may avoid congestion and thus gain in performance. Instead of detecting such patterns through packet drops, we could be proactive by using the gathered metrics.
Tough points 
As of today, we know the global behaviour of the Blue Waters machine with a time granularity of about a minute. However, such granularities (in space and time) seem too coarse to deal efficiently with network congestion and other related issues.
Program 
1. Improve data collection to get per-job information: which time/space granularity trade-off is the right one?
2. Identify and characterize applications that could benefit from one or both approaches. This step entails identifying the relevant metrics and the corresponding temporal scales needed to characterize such applications.
3. Design algorithms with time complexity compatible with the identified scales. These algorithms aim at mitigating jobs' harmful network interactions.
URL sujet détaillé :
Remarques : Duration: 4 to 6 months





SM207111 Solving combinatorial games with the help of graphs


Description


Combinatorial games are a relatively recent research area. They are two-player games with perfect information and no element of chance. Among the best-known examples are chess, Go, Hex... The goal of research in this area is to obtain a winning strategy for one of the two players. Such a strategy remains difficult to obtain in many cases, given the complexity of the combinatorics involved. Moreover, the mathematical theory for the "exact" resolution of such games is still under construction, and notably involves concepts from graph theory and number theory. In this internship, the student will start by getting acquainted with the main concepts of combinatorial game theory, by solving on their own some first games whose winning strategies are known. The notions of sums of games and of the Grundy function will be covered. The student will then study so-called "miserable" games (introduced by Gurvich in 2012), i.e., those for which the "normal" version (the player who makes the last move wins) and the "misère" version (the player who makes the last move loses) have nearly identical solutions. Such a study is a first step towards solving misère games, which constitute the most ambitious topic in the area. URL sujet détaillé :
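The Grundy function mentioned above can be computed by the classical mex recursion, and the Sprague-Grundy theorem then resolves sums of games. A minimal sketch for a toy subtraction game under normal play (the game and heap sizes are chosen purely for illustration):

```python
from functools import lru_cache

def mex(s):
    """Minimum excludant: smallest non-negative integer not in s."""
    g = 0
    while g in s:
        g += 1
    return g

@lru_cache(maxsize=None)
def grundy(n, moves=(1, 2, 3)):
    """Grundy value of a heap of n tokens in the subtraction game
    where a move removes 1, 2 or 3 tokens (normal play: the player
    who makes the last move wins)."""
    return mex({grundy(n - m, moves) for m in moves if m <= n})

# Sprague-Grundy theorem: the player to move in a SUM of games loses
# iff the XOR of the component Grundy values is 0.
heaps = [5, 7, 9]
xor = 0
for h in heaps:
    xor ^= grundy(h)
print([grundy(n) for n in range(8)])   # [0, 1, 2, 3, 0, 1, 2, 3]
print("first player wins" if xor else "second player wins")
```

Misère play is precisely where this clean theory breaks down, which is what makes the "miserable" games above interesting.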
Remarques : Co-supervised with Aline Parreau.





SM207112 A logical study of scheduling in process algebras


Description


The Curry-Howard correspondence establishes an isomorphism between intuitionistic proofs and functional programs and provides fruitful methods for studying both proofs and programs. In its extension to classical logic by means of sequential control operators, the well-known continuation-passing-style translations appeared as the operational counterpart of translations from classical to intuitionistic logic; this extension has been very fruitful both in logic and in programming theory. The transposition of these methods to concurrent models of computation is still problematic, because crucial features of concurrent systems, like nondeterminism and explicit temporal constraints between events, have not yet found their logical counterpart.
The point of this internship is to explore a new approach to this question in which proofs are put in correspondence not with concurrent programs but with schedulers. The technical objective is to reinterpret known transformations on processes from synchronous formalisms to asynchronous ones as translations between logical systems, as a concurrent analogue of CPS translations. URL sujet détaillé : http://iml.univmrs.fr/~beffara/stagem2if.pdf
Remarques :





SM2071-13 Termination of programs with complex data structures


Description


Proving the termination of a flowchart program can be done by exhibiting a ranking function, i.e., a function from the program states to a well-founded set that strictly decreases at each program step. In [compsysTermination:sas2010] we proposed a general algorithm for the computation of linear ranking functions, which is an adaptation of the Linear-Programming-based scheduling method of [DarteVivien97].
However, the limits of our approach show up as soon as we are faced with recursive programs or irregular data. In fact, the underlying program model, namely the polyhedral model, has intrinsic restrictions that have only been partially lifted.
In this internship, we will focus on the relationship between termination techniques and sequential scheduling/termination techniques, aiming to prove the termination of sequential programs with data structures (without translating them into term rewriting systems, TRSs).
The candidate is expected to: * Read the state-of-the-art research articles on sequential termination and rewriting termination. * Propose a framework able to deal with sequential programs with inductively defined data structures and with recursive programs. * Implement a prototype. URL sujet détaillé : http://laure.gonnord.org/pro/research/termdata_2015.pdf
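To make the ranking-function idea concrete, here is a minimal sampled check (a sketch only: the loop, invariant and candidate function are illustrative, and a real tool would discharge these conditions with Linear Programming as in [DarteVivien97] rather than by sampling):

```python
# A ranking function certifies termination of a loop if, whenever the loop
# guard holds, the function is bounded below and strictly decreases across
# one iteration. Below, a sampled (non-exhaustive) check for the toy loop
#   while x > 0: x = x - y     (under the invariant y >= 1)
def guard(x, y):
    return x > 0

def step(x, y):
    return x - y, y

def rank(x, y):
    # candidate linear ranking function f(x, y) = x
    return x

def check_ranking(samples):
    for x, y in samples:
        if guard(x, y):
            x2, y2 = step(x, y)
            # bounded below on guarded states, strictly decreasing
            if not (rank(x, y) >= 0 and rank(x2, y2) < rank(x, y)):
                return False
    return True

# sample states satisfying the invariant y >= 1
samples = [(x, y) for x in range(-5, 50) for y in range(1, 10)]
```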
Remarques : Possible co-supervision with D. Monniaux (Verimag) and Carsten Fuhs (UCL, London)





SM2071-14 Affine Transformations of Digital Images using Arbitrary Precision Computations


Description


Using affine or linear transformations such as rotations, symmetries, homotheties and shear mappings is commonplace in digital image processing. These transformations are usually modelled as matrices of real numbers. Because floating-point numbers cannot represent all real numbers, but only a finite set of rational numbers, implementations of affine transformations often have properties that differ from those of the real transformations they are supposed to represent.
Quasi-Affine Applications (QAAs) are obtained by discretizing affine applications. They can be represented as matrices of rational numbers and make it possible to describe linear or affine transformations such as discrete rotations. An Ω-QAA can be viewed as a sequence of such Quasi-Affine Applications whose successive (rational) values come ever closer to the actual real application. This representation both captures the discrete features of the plane (or space) and allows exact computation with a discrete, integer-only model of the continuum: the Harthong-Reeb line.
In this internship, we shall state and prove the mathematical properties of the Ω-QAA. Proofs shall be carried out using the Coq proof assistant. We shall also study some algorithms based on the Ω-QAA and formally prove their properties. A possible extension consists in adapting these algorithms and their correctness proofs to a three-dimensional setting.
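To give the flavour of integer-only discretized transformations, the following sketch (illustrative only, not the actual Ω-QAA formalism) applies a rotation whose cosine and sine are approximated by rationals with denominator 2^k, using integer arithmetic exclusively:

```python
# Integer-only discretized rotation: approximate cos(theta) and sin(theta)
# by rationals c_num/2^k and s_num/2^k, then map each integer point using
# integer arithmetic only. Floor division realizes the discretization of
# the underlying affine map (a crude stand-in for a quasi-affine application).
def discrete_rotate(points, c_num, s_num, k):
    d = 1 << k
    return [((c_num * x - s_num * y) // d, (s_num * x + c_num * y) // d)
            for x, y in points]
```

For a quarter-turn (cos = 0, sin = 1, i.e. c_num = 0 and s_num = 2^k), the map sends (1, 0) to (0, 1) and (0, 1) to (-1, 0) exactly, with no floating-point arithmetic involved.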
URL sujet détaillé : http://dptinfo.ustrasbg.fr/~magaud/TrADiCont/SujetStageM2English.pdf
Remarques : Co-supervised with Marie-Andrée Da Col-Jacob and Loïc Mazo (MIV team, ICube). Internship funded by the ICube internal project "TrADiCont".





SM2071-15 In situ analysis of molecular dynamics simulation data on very large supercomputers


Description


The computing capacity of the next generations of machines will enable very large numerical simulations producing massive amounts of data. Managing these data, from the moment they are produced until their analysis and visualization, is a major challenge. This internship fits into this context, focusing on molecular dynamics simulations. When data processing (compression, indexing, analysis, visualization) is performed while the data still reside in the main memory of the machine running the simulation, it is called "in situ" processing. When it takes place once the results have been stored on disk after the simulation has finished, one speaks of "post-processing". This research area, at the frontier between Big Data and Exascale issues, has been identified as critical for very-large-scale numerical simulation on the coming generations of machines [5]. It concerns a broad community going well beyond molecular dynamics. Our team at CEA develops the classical molecular dynamics code ExaStamp, intended for very-large-scale simulations on future Exaflop machines. ExaStamp can already run on several thousand hybrid machines combining multi-core CPUs with accelerators such as GPUs or Xeon Phis [1], and it already supports computations with several billion particles (atoms and molecules). Without suitable management of the data produced, it will be very difficult, if not impossible, to fully exploit the potentially very voluminous results of these simulations. At INRIA, we have developed a software environment for parallel in situ and post-mortem data processing [2,3]. This very flexible environment makes it possible to evaluate a wide variety of scenarios, but it is only a first step toward efficient support of Exascale machines.
The goal of this internship is to study the parallelization of an in situ processing algorithm operating on the atom positions provided by the simulation. From these data, the algorithm will compute a grid dividing the simulation space into voxels, each carrying pressure, temperature and electromagnetic-field information. The algorithm will generate computations and communications whose impact on the performance of the concurrently running simulation must be minimized. To this end, the candidate will study several strategies for interleaving the simulation and the developed algorithm (for example, asynchronous processing on a dedicated core). The computed grid will then be used for further processing, such as extracting a pressure isosurface that the user can visualize while the simulation is running.
For these experiments, the candidate will have access to very large supercomputers. He or she will join the team developing the ExaStamp code, located at the CEA site of Bruyères-le-Châtel, while being co-supervised by Bruno Raffin, researcher at INRIA Grenoble and specialist in in situ analysis and visualization. The intern will have the opportunity to continue with a PhD, for which funding is already secured.
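The voxelization step described above can be sketched as follows (a serial toy version only: the real algorithm must distribute this work across MPI ranks and overlap it with the running simulation, and the per-atom "energy" field here is an illustrative stand-in):

```python
# Bin atom positions into a voxel grid and accumulate a per-voxel scalar
# (here a mean "temperature" proxy from per-atom kinetic energies).
# Positions are assumed to lie in [0, box) along each axis.
def voxelize(positions, energies, box, nvox):
    counts = {}
    sums = {}
    for (x, y, z), e in zip(positions, energies):
        v = (int(x / box * nvox), int(y / box * nvox), int(z / box * nvox))
        counts[v] = counts.get(v, 0) + 1
        sums[v] = sums.get(v, 0.0) + e
    # per-voxel mean of the accumulated scalar
    return {v: sums[v] / counts[v] for v in counts}
```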
[1] A. Heinecke, M. Klemm and H.-J. Bungartz, "From GPGPUs to Many-Core: NVIDIA Fermi and Intel Many Integrated Core Architecture," Computing in Science and Engineering, vol. 14, no. 2, pp. 78-83, 2012. [2] Matthieu Dreher, Bruno Raffin, "A Flexible Framework for Asynchronous In Situ and In Transit Analytics for Scientific Simulations", 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (2014). (http://hal.inria.fr/hal00941413/en) [3] Matthieu Dreher, Jessica Prevoteau-Jonquet, Mikael Trellet, Marc Piuzzi, Marc Baaden, Bruno Raffin, Nicolas Férey, Sophie Robert, Sébastien Limet, "ExaViz: a Flexible Framework to Analyse, Steer and Interact with Molecular Dynamics Simulations", Faraday Discussions of the Chemical Society (2014). (http://hal.inria.fr/hal00942627/en) [4] C. Docan, M. Parashar, and S. Klasky, "DataSpaces: an Interaction and Coordination Framework for Coupled Simulation Workflows," Cluster Computing, vol. 15, no. 2, pp. 163-181, 2012. [5] Synergistic Challenges in Data-Intensive Science and Exascale Computing. US Department of Energy, March 2013 (http://science.energy.gov/~/media/40749FD92B58438594256267425C4AD1.ashx). [6] F. Zheng, H. Zou, G. Eisenhauer, K. Schwan, M. Wolf, J. Dayal, T. A. Nguyen, J. Cao, H. Abbasi, S. Klasky, N. Podhorszki, and H. Yu, "FlexIO: I/O Middleware for Location-Flexible Scientific Data Analytics," in IPDPS'13, 2013. [7] M. Dorier, G. Antoniu, F. Cappello, M. Snir, and L. Orf, "Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O," in CLUSTER, IEEE International Conference on Cluster Computing, IEEE, Sep. 2012. [8] M. Dorier, S. Ibrahim, G. Antoniu, R. Ross, "Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction," in Proc. of ACM/IEEE SC 2014 (New Orleans, November 2014). [9] Fang Zheng, Hongfeng Yu, Can Hantas, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Scott Klasky,
"GoldRush: Resource Efficient In Situ Scientific Data Analytics Using Fine-Grained Interference Aware Execution," Proceedings of the ACM/IEEE Supercomputing Conference (SC'13), November 2013. URL sujet détaillé :
Remarques : Location: CEA, Bruyères-le-Châtel, France. Salary: about 1000 euros/month.
This internship may be followed by a PhD (funding already secured).





SM2071-16 In situ analysis and large-scale parallel simulations in fluid mechanics


Description


The computing capacity of the next generations of machines will enable very large numerical simulations producing massive amounts of data. Managing these data from the moment they are produced until their analysis is a major challenge. This internship fits into this context, focusing on fluid mechanics simulations. When data processing (compression, indexing, analysis, visualization) is performed while the data still reside in the main memory of the machine running the simulation, it is called "in situ" processing. When it takes place once the results have been stored on disk after the simulation has finished, one speaks of "post-processing". This research area, at the frontier between Big Data and Exascale issues, has been identified as critical for very-large-scale numerical simulation on the coming generations of machines [5].
The R&D department of EDF develops parallel simulation codes such as the fluid mechanics code SATURNE, which runs on several thousand compute cores. Studies carried out with such codes produce large masses of data, essentially described as scalar fields (for example, the temperature at each position in space in a thermodynamics problem) and vector fields (for example, the velocity of water particles). Figure 1 shows a flow visualization case in an industrial environment. The amount of data produced per simulation is very large, and each simulation is very often repeated several times with different parameter sets.
A first in situ data analysis prototype has been developed with SATURNE and the Catalyst/ParaView library [1]. The goal of this internship is to study strategies for efficiently sharing resources between analysis and simulation. Some approaches dedicate one compute core per simulation node to the analyses (a "helper core"); one then speaks of in situ processing. Others try to interleave analysis and simulation on the same cores by triggering the analyses during the idle phases of the simulation. Other nodes can instead be dedicated 100% to the analyses, in which case one speaks of "in transit" nodes. After a study of existing work, the intern will develop his or her own strategy in the context of the analysis of SATURNE simulations. The work will consist of development and experimentation on large supercomputers. The internship will take place at INRIA Grenoble (with the possibility of working at Clamart in agreement with the supervisors) in a team specialized in parallel computing and in situ analysis, in close collaboration with the EDF R&D teams in charge of the development of SATURNE and of data analysis. URL sujet détaillé : http://moais.imag.fr/membres/bruno.raffin/MISC/stageinsituINRIA+EDF.pdf
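The helper-core strategy mentioned above can be sketched with a dedicated analysis thread fed through a bounded queue (a minimal illustration only, not EDF's or INRIA's actual infrastructure):

```python
import queue
import threading

# Helper-core pattern in miniature: the simulation loop pushes snapshots
# into a bounded queue; a dedicated worker thread analyses them
# asynchronously, so the simulation is only delayed when the queue is full.
def run(n_steps, analyse):
    q = queue.Queue(maxsize=4)
    results = []

    def worker():
        while True:
            item = q.get()
            if item is None:          # sentinel: simulation finished
                break
            results.append(analyse(item))

    t = threading.Thread(target=worker)
    t.start()
    for step in range(n_steps):
        field = [step * 0.5] * 8      # stand-in for one simulation snapshot
        q.put(field)                   # in situ hand-off: data stays in RAM
    q.put(None)
    t.join()
    return results
```

Because the queue is FIFO and there is a single worker, the analysis results come back in simulation-step order.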
Remarques : Internship in collaboration with EDF R&D (a few trips to Clamart expected). Salary: depending on situation (minimum 430 euros/month).





SM207117 Social network and language change


Description


Context: Research on modeling and analyzing human language with complex networks has been on the rise in recent years [1]. An important subfield focuses on multi-agent modeling of the influence of the social network on language use and language change [2-10]. Indeed, understanding the links between society and language benefits from the insights of dynamic modeling of populations of agents interacting with one another and sharing simulated linguistic variants under social and cognitive constraints [11]. Many attempts have been made in this direction, each of them achieving local goals and accounting for certain linguistic, social, and cognitive parameters: structure and size of the social network, type of learning (statistical vs. categorical), functional value and frequency of the alternating linguistic variants [3], social status of the agents and social distance between them [5], spatial distribution of the population [10], birth and death of the agents as well as preference for interacting with other agents using the same dialect [7], trend toward social distinction [4], delay of decay of the memorized instances of the variants [6]. Surveying the field, however, one is struck by the huge diversity of models implementing overlapping sets of parameters: we lack a general view of what is to be modeled and how to model it. Answering these questions matters because modeling is the only way to test hypotheses in a field where experimentation in the real environment is not possible, and it requires interdisciplinary work bringing together Network Science and Linguistics. In this context, the work proposed for the internship will be carried out in an interdisciplinary environment. It will follow two stages corresponding to two objectives.
Objectives: 1/ Taking an overview of the issues at stake: * What parameters are included in the available models? * Into what categories could they be classified (cognitive, social, linguistic parameters, etc.)? * In the light of the work on language change in the field of linguistics, which parameters should be prioritized? * What implementation tools have been used in the different modeling attempts? 2/ Building on this overview, the internship should then aim at: * formulating a general framework for this type of modeling, in comparison with similar models (e.g. opinion models); * implementing a model inspired by opinion models and variationist linguistics, with reals between 0 and 1 accounting for the weight of the linguistic variants ([12] for a first attempt); * observing how the spread of the linguistic variants through the social network depends on the structure of the communities within the network and the overlap between them; * laying the foundation for implementing the reverse influence, that is, the influence of the agents' linguistic states on the structure and frequency of their interactions. URL sujet détaillé :
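As a starting point for objective 2/, a minimal opinion-style model with variant weights in [0, 1] might look like this (a sketch under simplifying assumptions; the accommodation rule and rate are illustrative, not taken from the works cited above):

```python
import random

# Minimal opinion-style model of variant spread: each agent carries a
# weight in [0, 1] for one linguistic variant and, at every interaction
# along a directed edge (speaker, hearer)... here (b, a)... the hearer
# moves its weight a small step toward the speaker's.
def simulate(edges, init, steps, rate=0.1, seed=0):
    rng = random.Random(seed)
    w = dict(init)
    for _ in range(steps):
        a, b = rng.choice(edges)       # pick a random interaction
        w[a] += rate * (w[b] - w[a])   # hearer accommodates the speaker
    return w
```

On a two-agent network with reciprocal edges, the two weights converge toward a shared value, the toy analogue of variant levelling within a community.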
Remarques : Co-supervised with
JeanPierre Chevrot
Institut Universitaire de France
http://iuf.amue.fr/
Université Grenoble 3
Laboratoire Lidilem  BP 25, 38040, Grenoble cedex
France
http://lidilem.ugrenoble3.fr/spip.php?article30&var_mode=calcul&mem_id=4





SM207118 Machine learning and data mining for vehicle simulation results


Description


ANALYSIS OF THE IMPACT OF ADVANCED VEHICLE TECHNOLOGIES USING LARGE-SCALE SIMULATIONS
Context: Vehicle energy consumption and powertrain operations for future vehicle powertrain technologies (from current vehicles to year 2045) are predicted through vehicle simulations using Autonomie, a vehicle simulation tool developed at Argonne. Over 4000 vehicles, with different powertrain types (conventional and numerous electrified powertrains) and component technologies (advanced engines, batteries, electrical machines, lightweighting), are simulated for vehicle fuel consumption and cost. The output of this study is used by the US government to evaluate the impact of its R&D funding on energy security and CO2 emissions.
Objectives and Tasks: The project will involve running large-scale simulations of advanced technology vehicles using the internally developed large-scale simulation process. The student will also help automate this process. The project will involve improving/developing analysis methods for the simulation results, in particular: 1. Developing algorithms to extract important parameters/values from large sets of simulation results and the results database: a. for quality checks and analysis using mathematical techniques (statistical methods, machine learning, fuzzy clustering and data mining for data validation); b. for data analysis. 2. Developing standardized plots to explain component operation and power/energy requirements (e.g. engine operation, battery power and energy requirements) and trends with powertrain technology and time. 3. Time permitting, continuing the development of a graphical interface allowing access to the results and assumptions. This will include making the tool more user-friendly, more customizable and with more plotting possibilities (histograms, scatter plots, pie charts...)
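As an example of the kind of quality check meant in task 1.a, a simple statistical outlier pass over a batch of results could look like the sketch below (thresholds and data are illustrative, and the actual work would use Matlab rather than Python):

```python
import statistics

# Quality-check pass over a batch of simulation results: flag runs whose
# value (e.g. fuel consumption) lies more than k population standard
# deviations from the batch mean.
def flag_outliers(values, k=3.0):
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []                     # all runs identical: nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mean) > k * sd]
```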
Skills/prerequisites (desired and/or required): strong mathematical background (statistical analysis, machine learning, data mining), good knowledge of Matlab, graphical user interface programming (C# preferred). URL sujet détaillé :
Remarques : Salary: approx. /month - Benefits: round-trip ticket to Chicago, visa expenses
- Duration: 5.5 months max, 4 months minimum
Please send your resume in English to Ayman Moawad: amoawad.gov





SM2071-19 Structure of classes of graphs defined by constraints on chords




SM2071-20 Evaluating and Exploiting Dynamic Pricing Strategies


Description


Some online shops and other e-commerce providers, including plane or train ticketing, car rental, or hotel booking websites, use dynamic pricing. With dynamic pricing, the offered price is not identical for all customers but varies as a function of many parameters, for example time of day, day of the week, geographic location, IP address, cookies, browsing history, device or browser used, etc. For example, flights can be sold at different prices in different places, or may become more expensive if you already looked for them in the past. Some web shops offer cheaper prices on price comparison websites, which means that customers arriving via these websites pay lower prices than customers who consulted the shop's website directly.
The goal of these strategies is to maximize profits by always charging the highest price the customer is willing or able to pay. Since different customers might not all be willing to pay the same price, the website providers aim to determine the optimal price for each customer, based on the available information.
However, using privacy-enhancing techniques such as proxies or VPN connections, many of these parameters can be influenced by the user. For example, HTTP headers and cookies can be modified, IP addresses changed, etc. Using such techniques, users could potentially exploit dynamic pricing strategies to their advantage and obtain lower prices than they would normally have been offered.
The goals of this project are twofold. The first part is to experimentally evaluate the dynamic pricing strategies used by different websites. The second part is to identify effective price-reducing strategies for customers and to implement them in a tool that automatically adapts the user's access to a website to reduce prices.
For the first part, a system to experimentally measure prices as a function of the various parameters has to be developed. For example, header rewriting tools or proxies can be used to prevent browser tracking and to simulate various possible clients (different browsers, operating systems, devices, etc.). VPN servers or systems like Tor can be used to change IP addresses, and in particular to obtain IP addresses from various geographical locations all over the world. Finally, a website crawler can connect to the targeted websites via these tools to automatically collect the relevant prices for various configurations. To obtain sufficient data and meaningful statistics, this data acquisition process, i.e. configuration of the tools and price crawling, should be automated.
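A sketch of such a measurement harness is shown below (the user-agent strings, locations and price selector are illustrative assumptions, and the actual fetching through proxies/VPN endpoints is deliberately omitted):

```python
import itertools
import re

# Enumerate client configurations (browser fingerprint x exit location)
# that a crawler would cycle through when collecting prices.
BROWSERS = ["Mozilla/5.0 (Windows NT 10.0)", "Mozilla/5.0 (Macintosh)"]
LOCATIONS = ["de", "us", "in"]

def configurations():
    for ua, loc in itertools.product(BROWSERS, LOCATIONS):
        yield {"User-Agent": ua, "exit_country": loc}

# Extract a price from fetched HTML; the class name is a hypothetical
# per-site selector the crawler would be configured with.
def parse_price(html):
    m = re.search(r'class="price">\s*([0-9]+(?:\.[0-9]{2})?)', html)
    return float(m.group(1)) if m else None
```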
In the second part, once the data is collected, it should be analyzed to identify opportunities for customers to exploit these techniques to their advantage. These strategies then have to be implemented in a tool helping users to automatically obtain the lower prices. The tool will automatically configure the header rewriting tools, proxies and VPN servers (already used in the data collection process) so that a customer can simply connect to a website via the tool to get lower offers.
If precise models of the pricing strategy can be inferred from the collected data, these models could be validated and used to predict e.g. further price evolution. This would be done in collaboration with a behavioural economist from ETH Zurich. Depending on how far this modelling and prediction is taken, the project could also be suitable for two students working in collaboration. URL sujet détaillé : http://people.inf.ethz.ch/jdreier/projects/dynamicpricing.pdf
Remarques : Co-supervised with Prof. David Basin. Remuneration possibly available.





SM2071-21 Analyzing Accountability in (E-)Exams


Description


Many companies and universities increasingly use electronic exam (e-exam for short) systems: for example, Cisco's CCNA certifications and language tests such as IELTS or TOEFL are organized using computer systems, and at ETH Zurich electronic exams are also held regularly. This naturally raises the question of the security of such systems. From the point of view of the organizing institution, cheating is a major concern, but candidates (i.e., students) are typically more concerned with the fairness and transparency of the process. The latter concerns in particular are often ignored by current systems: typically the system is assumed to be trusted, and the candidates can only partly verify whether everything went well. Yet, as recent scandals have shown, (e-)exam systems cannot always be trusted entirely.
In security protocol design, the notions of verifiability, accountability and auditability have been known for some time. A verifiable protocol allows the participants to verify, after the protocol's execution, whether its outcome is correct. When verification fails, accountability additionally makes it possible to identify which party is responsible for the failure. Finally, a protocol is auditable if it logs sufficient evidence to convince an outside judge of the absence of errors.
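As a toy illustration of the auditability notion, consider a hash-chained log in which each entry commits to its predecessor, so that later tampering with recorded evidence is detectable by an outside judge (a sketch only, not a full exam protocol):

```python
import hashlib

# Hash-chained audit log: each entry's hash covers the previous hash,
# so modifying any recorded entry breaks the chain from that point on.
def append(log, entry):
    prev = log[-1][1] if log else "0" * 64
    h = hashlib.sha256((prev + entry).encode()).hexdigest()
    log.append((entry, h))

# The "judge" recomputes the whole chain from the published entries.
def audit(log):
    prev = "0" * 64
    for entry, h in log:
        if hashlib.sha256((prev + entry).encode()).hexdigest() != h:
            return False
        prev = h
    return True
```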
Some preliminary work on verifiability in e-exams exists; however, accountability and auditability have not yet been studied in this context.
The goal of this project is to study the accountability and auditability of (e-)exams. In the first part, building on previous work, formal definitions of accountability and auditability have to be developed. This requires identifying the stakeholders and the properties requiring verification. Moreover, a suitable model for the definitions has to be chosen.
In the second part, these definitions will be applied to a case study. The goal is to validate the model and the definitions, but also to analyze to what extent current systems ensure the desired properties. Possible case studies include existing real e-exam systems (e.g., the system used at ETH) or cryptographic protocols from the literature. A formal analysis using a protocol verification tool such as Tamarin, ProVerif, CryptoVerif or similar is expected.
Optionally, a traditional pen-and-paper exam system can also be studied, or new and improved e-exam protocols can be proposed. Moreover, if shortcomings are identified in the case studies, fixes can be developed. URL sujet détaillé : http://people.inf.ethz.ch/jdreier/projects/eexam.pdf
Remarques : Co-supervised with Prof. David Basin. Remuneration possibly available.





SM2071-22 Cloud resource management


Description


Clouds are changing the way computing resources are used. From Google to Amazon, most major IT companies are decoupling their computing resources from their usage through virtualization techniques. Treating the contents of virtual machines as black boxes, it becomes possible to save energy at unprecedented scale using migration techniques and DVFS. The SEPIA team at IRIT is investigating two complementary directions to improve the sustainability of large-scale datacenters:
1. Distributed work stealing applied to virtual machines: here the focus will be on a decentralized, easily scalable decision mechanism in which some physical machines may steal work from others, provided given budget thresholds (financial, service-level violations, power, thermal) are not exceeded. We will obviously study the trade-off between optimal solutions and possibly inaccurate decentralized ones. 2. Thermal- and energy-aware scheduling: here the focus will be on thermal and energy models of clouds, from the architecture level to the application level. These models will feed decision mechanisms that schedule (and reschedule) the virtual machines among the set of physical hosts in order to achieve good performance while saving energy globally.
The goal of this research project is to merge these two approaches and to investigate how the thermal and energy characteristics of nodes and applications can help reduce the energy consumption of heterogeneous data centers. This project will use the in-house cloud system CloudMip (32 nodes, up to 256 VMs) and Grid'5000 (5000 cores, 11 sites in France). URL sujet détaillé :
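Direction 1 above can be illustrated by a greedy, threshold-guarded stealing step (a sketch with illustrative power budgets and VM sizes, not the project's actual algorithm):

```python
# One "work stealing" step for VMs: the least loaded host steals the
# smallest VM from the most loaded host, unless doing so would push the
# thief over its own power budget. hosts maps a host name to a list of
# per-VM power draws; budget maps a host name to its power cap.
def steal(hosts, budget):
    donor = max(hosts, key=lambda h: sum(hosts[h]))
    thief = min(hosts, key=lambda h: sum(hosts[h]))
    if donor == thief or not hosts[donor]:
        return None
    vm = min(hosts[donor])
    if sum(hosts[thief]) + vm > budget[thief]:
        return None                   # stealing would break the budget
    hosts[donor].remove(vm)
    hosts[thief].append(vm)
    return donor, thief, vm
```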
Remarques :





SM2071-23 Energy-efficient Kernel-level DVFS for High Performance Computing


Description


Classically, High Performance Computing systems follow the principle: the faster, the better, whatever the cost. Next-generation HPC hardware has to be able to reduce power and energy consumption, not only to reduce cost but to make its very existence possible: an HPC system needing a dedicated power plant would be impossible to build. One key element is fine-grained management of processor and core frequencies. In HPC systems, the trade-off between energy and performance leans toward the latter. The goal of this project is to optimize frequency so as to reduce power consumption without impacting performance. To reach this objective, we will first detect the type of bottleneck a particular core is facing: a compute-, memory- or IO-intensive application. This detection will rely only on black-box models, using system information as well as hardware performance counters. Depending on the particular bottleneck, the kernel DVFS governor will adapt the processor frequency. A second step will be to design a processor model in order to generalise the approach. Finally, a holistic model will be proposed in order to apply DVFS not only at the processor level, but also to manage memory banks and bandwidth and the network card's Adaptive Link Rate. URL sujet détaillé :
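The intended governor logic can be sketched as a classification of counter-derived ratios followed by a frequency choice (all thresholds, ratios and frequency steps below are illustrative assumptions, not measured values):

```python
# Toy DVFS governor: classify a core's bottleneck from two counter-derived
# metrics (instructions per cycle and the fraction of cycles stalled on
# memory), then pick a frequency step accordingly.
FREQS_MHZ = [1200, 1800, 2400]

def classify(ipc, mem_stall_ratio):
    if mem_stall_ratio > 0.5:
        return "memory"        # stalls dominated by memory accesses
    if ipc > 1.0:
        return "compute"
    return "io"

def pick_frequency(ipc, mem_stall_ratio):
    kind = classify(ipc, mem_stall_ratio)
    if kind == "compute":
        return FREQS_MHZ[-1]   # compute-bound: run at full speed
    if kind == "memory":
        return FREQS_MHZ[1]    # memory-bound: mid frequency costs little time
    return FREQS_MHZ[0]        # IO-bound: lowest frequency saves the most
```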
Remarques :





SM2071-24 SmartCampus Geolocation


Description


With thousands of inhabitants, campuses are now small towns. For all users (students, teachers, support staff), on-campus movements are part of the everyday job, and simply finding a particular service or lecture hall can be difficult, all the more so as last-minute changes often occur (schedule changes, logistic problems in lecture or meeting rooms). But geolocation on a campus is not only a problem; it can be a precious resource. The university can use anonymous geolocation data to improve road infrastructure, but also scheduling. Using these data at a personal level also makes it possible to automate some tasks (switching off lights and computers when everyone leaves a room, finding an available room for small meetings, tutoring, or students willing to work on particular projects). The goal of this project is to propose a global geolocation system based on a new language using geolocation conditions to perform tasks in accordance with a global behavioral model based on differentiated profiles (students, teachers, ...). The system will include a cooperative learning component able to detect global changes in the paths followed by people, to help detect mobility problems automatically. This part of the system will be of great use in helping motor-disabled people reach their destinations easily. URL sujet détaillé :
Remarques : Part of neOCampus operation (www.irit.fr/neocampus)





SM2071-25 Green smart campus


Description


In the ambient context of the campus, the objective is to improve the life of the users of a university campus while meeting ecological and financial objectives: the campus is then a place of education but also an ecological place. We propose to consider the set of machines of a university as a cloud of machines and to virtualize the users' applications. The project will study various scientific aspects in order to propose a dynamic decision center for executing the applications, with the global objective of reducing energy consumption. From the model of the set of physical machines and the characteristics of the virtual machines, this project will propose placement algorithms to choose where each VM executes (consolidation, reduction of energy consumption, reduction of hot spots). This work will be complemented by a study of adaptation/elasticity so that the decisions also adapt to load peaks (for example, upon the simultaneous arrival/departure of students). URL sujet détaillé :
Remarques : Part of neOCampus operation (www.irit.fr/neocampus)





SM2071-26 Energy-aware task management in hybrid environments


Description


In this project, we will work on the dynamic management of services (possibly composed of several interdependent individual tasks) offered to users in a hybrid (multi-cloud) environment. These services can be executed either on a dedicated cloud or on the users' own resources. We aim at providing services to users while optimizing several objectives related to the user experience (performance, latency, cost, ...) and reducing the overall energy consumption as much as possible, which calls for trade-off studies.
The objective will be to propose distributed algorithms for the placement, replacement and scheduling of services in heterogeneous systems. A genetic-algorithm approach will be compared to vector packing. URL sujet détaillé :
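As a baseline for the comparison mentioned above, a first-fit vector-packing placement could look like the sketch below (demand and capacity vectors are illustrative; a genetic search would then try to improve on this greedy assignment):

```python
# First-fit vector packing for service placement: each service has a
# (cpu, mem) demand vector and each node a (cpu, mem) capacity vector.
# Returns, for each service, the index of its host node (or None).
def first_fit(services, nodes):
    load = [(0.0, 0.0)] * len(nodes)
    placement = []
    for cpu, mem in services:
        for i, (ccap, mcap) in enumerate(nodes):
            lc, lm = load[i]
            if lc + cpu <= ccap and lm + mem <= mcap:
                load[i] = (lc + cpu, lm + mem)
                placement.append(i)
                break
        else:
            placement.append(None)    # no node can host the service
    return placement
```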
Remarques :





SM2071-27 Dynamics of Abstract Turing Machines


Description


The purpose of this internship is to study the dynamical properties of abstract Turing machines, which correspond intuitively to Turing machines whose set of states may be an arbitrary (but compact) set.
More information may be found at the link below.
A familiarity with the course CR05 (Tilings: between Dynamical Systems and Computability) is appreciated but not mandatory.
URL sujet détaillé : http://www.loria.fr/~ejeandel/files/stageM2.2.eng.pdf
Remarques :





SM207128 Type refinement in λΠ-calculus modulo


Description


Dedukti is an experimental language designed to write proof checkers for logics. It is used to implement independent proof checkers for proof assistants such as Coq and Isabelle, and trace checkers for automated deduction tools such as Zenon and iProver Modulo. The logic underlying Dedukti is called λΠ-modulo: it is an extension of λΠ, the simplest form of dependently typed λ-calculus, with user-defined rewriting rules for type equality.
The goal of this internship is to develop a type-inference engine for Dedukti. In dependently typed languages, type inference, unification and proof search merge. Therefore, we propose to develop, in this internship, a common abstract interface for these, improving on the work which has been done for the proof-search engine of Coq.
This line of research contains both theoretical and practical programming aspects. The internship can be geared towards one or the other depending on the student's preferences. URL sujet détaillé : http://assertfalse.net/arnaud/internships/deduktife.pdf
Remarques : Co-supervised by Olivier Hermant





SM207129 Proof of convergence for a spacecraft guidance rendezvous algorithm


Description


The rendezvous process for spacecraft is an orbital operation involving at least two vehicles (satellite or International Space Station servicing, debris removal, formation flight) and consists in bringing the actuated vehicle (chaser) into the proximity of the passive vehicle (target) via a series of elementary phases (phasing, far range rendezvous, close range rendezvous). Each phase is composed of orbital maneuvers that must be carefully planned to reach their goal.
We focus on impulsive rendezvous problems for technological reasons, since most satellites are equipped with propellant thrusters. The impulsive approximation for the thrust means that an instantaneous velocity increment is applied to the chaser for each impulse. We intend to obtain validated solutions to these problems under control and continuous trajectory constraints.
Since rendezvous problems are naturally formulated as optimal control problems, we distinguish direct approaches, which transform the dynamic optimal control problem into a static optimization problem [Hull97], and indirect approaches, which are based on solving optimality conditions obtained from the Pontryagin Maximum Principle [Pontryagin62].
Concerning indirect methods, the derivation of the optimality conditions from the Pontryagin principle is known as the primer vector theory [Lawden63]. In [Arzelier13] the framework of the minimum-fuel fixed-time rendezvous problem is presented and necessary and sufficient conditions of optimality are recalled. These conditions are non-convex and transcendental, so a mixed iterative algorithm using dynamic gridding and polynomial optimization techniques is developed. This algorithm is based on a heuristic approach which consists mainly of two stages: (1) an initialization step solving a 2-impulse rendezvous problem; (2) an iterative procedure which builds up the final k-impulse solution by adding/merging impulses in an attempt to minimize the maximum of the norm of the primer vector.
While the convergence of this algorithm is observed in all practical cases tested, no analytical proof exists so far.
The objective of this internship is to assess the systematic convergence of this algorithm. One intuitive starting point would be to compare and contrast the idea of this iterative approach to classical minimax, "points exchange technique" approximation algorithms, like the one given by Remez [Remez34], [Chap. 3, Cheney98].
Hence, the candidate should analyse the primer vector theory in order to understand the construction and functioning of the algorithm developed at LAAS-CNRS. Thereafter, he or she should highlight links between this algorithm and the Remez algorithm that would eventually lead to the construction of a proof of convergence.
Keywords: space rendezvous, primer vector theory, minimax approximation
Supervisors: Mioara Joldes, Christophe Louembet, MAC, LAAS-CNRS.
References
[Arzelier13] Arzelier, D. and Louembet, C. and Rondepierre, A. and Kara-Zaitri, M., A New Mixed Iterative Algorithm to Solve the Fuel-Optimal Linear Impulsive Rendezvous Problem, Journal of Optimization Theory and Applications, 2013, vol. 159, no. 1, pages 210-230.
[Cheney98] Cheney, E.W.,Introduction to approximation theory, Amer. Mathematical Society, 1998.
[Hull97] Hull, D. G., Conversion of optimal control problems into parameter optimization problems, J. Guidance, Control and Dynamics, AIAA, 1997, vol. 20, no. 1.
[Lawden63] Lawden, D.F., Optimal trajectories for space navigation, Butterworth, London, England, 1963.
[Pontryagin62] Pontryagin, L.S. and Boltyanskii, V.G. and Gamkrelidze, R.V. and Mishchenko, E.F., The Mathematical Theory of Optimal Processes, Interscience, 1962.
[Remez34] E. Remes, Sur le calcul effectif des polynomes d'approximation de Tchebichef, Comptes rendus hebdomadaires des séances de l'Académie des Sciences, no. 199, pages 337-340, 1934. URL sujet détaillé :
Remarques :





SM207130 Compiling process networks on GPU


Description


A process network is a model of parallel computation, which serves as an intermediate representation for parallelizing compilers.
GPU accelerator cards, initially designed to meet the performance requirements of video games, now serve as basic building blocks for high-performance supercomputers.
The goal of this internship is to find a way to map our process networks on GPU. That is, to find out how to allocate the process network elements (processes, buffers) on the GPU components (memory hierarchy, SPs, warps, cores).
The trainee will:
- study the state of the art on process networks and automatic parallelization, notably on GPU;
- translate "by hand" several simple process networks (vector sum, matrix multiply, 1D Jacobi, etc.);
- propose a translation algorithm, and possible improvements to the process model;
- implement (in C++), test and validate the approach experimentally. URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/stage_gpu_en.pdf
Remarques : This internship can lead to a PhD thesis.
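To make the process-network model concrete, here is a minimal sketch (an assumed encoding for illustration, not the compiler's intermediate representation) of a Kahn-style network computing the vector sum mentioned above: processes communicate only through FIFO channels, which is exactly the structure that must be mapped onto GPU memories and cores.

```python
from collections import deque

def producer(values, out):
    # A source process: writes its values to an output FIFO channel.
    for v in values:
        out.append(v)

def adder(in_a, in_b, out):
    # A compute process: blocking reads are modelled by popping from
    # non-empty FIFOs; it fires as long as both inputs have tokens.
    while in_a and in_b:
        out.append(in_a.popleft() + in_b.popleft())

# Wire up the network: two producers feeding one adder.
a, b, c = deque(), deque(), deque()
producer([1, 2, 3], a)
producer([4, 5, 6], b)
adder(a, b, c)
result = list(c)  # [5, 7, 9]
```

On a GPU, the open question is precisely where `a`, `b`, `c` live (shared memory? global memory?) and how process firings map to warps and cores.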





SM207131 Links between dynamical properties and algorithmics


Description


Given a class of effective dynamical systems, computability-related necessary conditions naturally appear when describing dynamical properties. One then asks whether these are the only ones. This approach has been widely studied for cellular automata and for tilings defined by local rules. For the latter, the reals realizing the entropy are exactly the positive reals that are recursively approximable from above. This type of result relies on technical constructions that encode a Turing machine, but one would like to identify which features of the class of dynamical systems make such an encoding possible.
One approach in this direction, accessible for an M2 internship and which certainly opens many research directions, is to look at interval maps (i.e., given a function f : [a,b]^d -> [a,b]^d, one considers the dynamical system defined by u_{n+1} = f(u_n) with u_0 in [a,b]^d). Indeed, it is easy to vary the class of dynamical systems according to the properties of the functions considered (continuous, piecewise affine, piecewise linear, ...). For instance, computability-related limitations appear for the entropy, but one would like to know under which restrictions on the class of systems all possible entropies are obtained. The aim of this line of work is to establish a dictionary between dynamical properties and properties related to computability, or even to algorithmics, in the setting of effective dynamical systems. URL sujet détaillé : http://www.latp.univ-mrs.fr/~sablik/Stage.html#SystemeDynamiqueCalculable
Remarques :
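The interval dynamics in question can be made concrete in a few lines. A minimal sketch, assuming nothing beyond the definition u_{n+1} = f(u_n): the piecewise-linear tent map on [0,1] (a standard example whose topological entropy is log 2), iterated from an initial point:

```python
def tent(x):
    # Piecewise-linear interval map T : [0,1] -> [0,1].
    return 2 * x if x <= 0.5 else 2 * (1 - x)

def orbit(f, x0, n):
    # First n+1 points of the trajectory u_{k+1} = f(u_k), u_0 = x0.
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

xs = orbit(tent, 0.3, 20)  # the orbit stays inside [0, 1]
```

Varying the class of allowed maps f (continuous, piecewise affine, piecewise linear, ...) is exactly the knob the internship proposes to turn when asking which entropies become realizable.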





SM207132 Information content of programs


Description


Algorithms usually process finite inputs such as natural numbers, finite strings, finite graphs, etc. In computability theory one is often interested in providing algorithms with infinite inputs such as functions on natural numbers, infinite graphs, real numbers, etc. There are mainly two ways of making this possible. In the Type-1 model, one only allows computable infinite inputs, which can be represented by finite programs; these programs are the actual inputs to the algorithm. In the Type-2 model, the infinite input is given by an oracle, or is entirely written on an infinite input tape. The first model is in a sense more powerful: from a program computing an input one can recover the input, so every Type-2 algorithm can be converted into a Type-1 algorithm. Comparing these two models has been a central topic for a long time, but the Type-1 model is not well understood. Celebrated results such as the undecidability of the halting problem proved by Turing, or Rice's theorem, essentially say that having a program does not help in understanding its behavior, so they provide cases where the Type-1 model is not more powerful than the Type-2 model. However, there exist situations where the two models differ. Our main question is: what makes the Type-1 model strictly more powerful than the Type-2 model? What additional information does a program computing an input provide about the input?
We have recently shown that, to some extent, giving a program computing an input is the same as giving the input itself together with an upper bound on the size of a program computing it (its Kolmogorov complexity). Hence each Type-1 algorithm can be converted to a Type-2 algorithm that is additionally given an upper bound on the Kolmogorov complexity of its input. This equivalence holds to some extent, depending on the class of inputs and the task performed by the algorithm.
The objective of the internship is to continue this work. Several problems are raised about this equivalence: understanding its consequences and its limits, and understanding what the "information content" of a program is. URL sujet détaillé : http://www.loria.fr/~hoyrup/programs.pdf
Remarques :





SM207133 Proof of convergence for a spacecraft guidance rendezvous algorithm


Description


The rendezvous process for spacecraft is an orbital operation involving at least two vehicles (satellite or International Space Station servicing, debris removal, formation flight) and consists in bringing the actuated vehicle (chaser) into the proximity of the passive vehicle (target) via a series of elementary phases (phasing, far range rendezvous, close range rendezvous). Each phase is composed of orbital maneuvers that must be carefully planned to reach their goal.
We focus on impulsive rendezvous problems for technological reasons, since most satellites are equipped with propellant thrusters. The impulsive approximation for the thrust means that an instantaneous velocity increment is applied to the chaser for each impulse. We intend to obtain validated solutions to these problems under control and continuous trajectory constraints.
Since rendezvous problems are naturally formulated as optimal control problems, we distinguish direct approaches, which transform the dynamic optimal control problem into a static optimization problem [Hull97], and indirect approaches, which are based on solving optimality conditions obtained from the Pontryagin Maximum Principle [Pontryagin62].
Concerning indirect methods, the derivation of the optimality conditions from the Pontryagin principle is known as the primer vector theory [Lawden63]. In [Arzelier13] the framework of the minimum-fuel fixed-time rendezvous problem is presented and necessary and sufficient conditions of optimality are recalled. These conditions are non-convex and transcendental, so a mixed iterative algorithm using dynamic gridding and polynomial optimization techniques is developed. This algorithm is based on a heuristic approach which consists mainly of two stages: (1) an initialization step solving a 2-impulse rendezvous problem; (2) an iterative procedure which builds up the final k-impulse solution by adding/merging impulses in an attempt to minimize the maximum of the norm of the primer vector.
While the convergence of this algorithm is observed in all practical cases tested, no analytical proof exists so far. The objective of this internship is to assess the systematic convergence of this algorithm. One intuitive starting point would be to compare and contrast the idea of this iterative approach to classical minimax, "points exchange technique" approximation algorithms, like the one given by Remez [Remez34], [Chap. 3, Cheney98]. Hence, the candidate should analyse the primer vector theory in order to understand the construction and functioning of the algorithm developed at LAAS-CNRS. Thereafter, he or she should highlight links between this algorithm and the Remez algorithm that would eventually lead to the construction of a proof of convergence. Keywords: space rendezvous, primer vector theory, minimax approximation, Remez algorithm
Supervisors: Mioara Joldes, Christophe Louembet, MAC, LAAS-CNRS.
References
[Arzelier13] Arzelier, D. and Louembet, C. and Rondepierre, A. and Kara-Zaitri, M., A New Mixed Iterative Algorithm to Solve the Fuel-Optimal Linear Impulsive Rendezvous Problem, Journal of Optimization Theory and Applications, 2013, vol. 159, no. 1, pages 210-230.
[Cheney98] Cheney, E.W.,Introduction to approximation theory, Amer. Mathematical Society, 1998.
[Hull97] Hull, D. G., Conversion of optimal control problems into parameter optimization problems, J. Guidance, Control and Dynamics, AIAA, 1997, vol. 20, no. 1.
[Lawden63] Lawden, D.F., Optimal trajectories for space navigation, Butterworth, London, England, 1963.
[Pontryagin62] Pontryagin, L.S. and Boltyanskii, V.G. and Gamkrelidze, R.V. and Mishchenko, E.F., The Mathematical Theory of Optimal Processes, Interscience, 1962.
[Remez34] E. Remes, Sur le calcul effectif des polynomes d'approximation de Tchebichef, Comptes rendus hebdomadaires des séances de l'Académie des Sciences, no. 199, pages 337-340, 1934. URL sujet détaillé : https://www.laas.fr/boreal/web/fr/stage/voirStage/224
Remarques : Compensation possible





SM207134 Lambda Calculus on the reals compared to other real models of computation


Description


Contrary to discrete computation, there is no Church-Turing thesis for models of computation on the reals. Various models exist: Recursive Analysis [1], which uses Turing machines to compute reals to any precision; the General Purpose Analog Computer [2], which combines basic continuous bricks in a fashion similar to circuits; the Blum-Shub-Smale model [3], which transforms the RAM model so that it can read and store real numbers in registers; abstract geometrical machines [4], inspired by cellular automata... Some comparability and incomparability results exist between them, but none of these models is clearly the right one.
Recently, an extension of the lambda calculus was introduced by Di Gianantonio and Edalat [5] for capturing continuous functions with the help of a differentiation operator. This model presents similarities with the GPAC, as differentiation is a core operator; however, it also has the ability to produce non-continuous functions through a test, which can otherwise only be found in the BSS model. Where this model can be placed on the map of models of real computation is hence an open question.
The goal of this internship will be to compare this Lambda Calculus with Differential (LCD) with the GPAC. Are all functions generated by the GPAC definable in LCD, and conversely? For possible simulations, do they preserve computing time?
References
1. Weihrauch, K.: Computable Analysis: an Introduction. Springer (2000).
2. Shannon, C.E.: Mathematical theory of the differential analyzer. J. Math. Phys. MIT. 20, 337-354 (1941).
3. Blum, L., Cucker, F., Shub, M., Smale, S.: Complexity and Real Computation. Springer (1998).
4. Durand-Lose, J.: Abstract geometrical computation 5: embedding computable analysis. Natural Computing. 10, 1261-1273 (2011).
5. Di Gianantonio, P., Edalat, A.: A language for differentiable functions. Foundations of Software Science and Computation Structures. pp. 337-352 (2013). URL sujet détaillé : http://www.loria.fr/~hainry/sujet2_en.html
Remarques : Co-advisor: Romain Péchoux
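To give a concrete flavour of the GPAC side of the comparison: the GPAC generates functions as solutions of differential equations built from integrator blocks, e.g. the exponential as the solution of y' = y, y(0) = 1. The following forward-Euler sketch is only a crude numerical approximation of that analog behaviour, shown for intuition:

```python
import math

def gpac_exp(t, steps=100000):
    # Numerically integrate y' = y, y(0) = 1: the output of the
    # integrator block is fed back into its own input.
    h = t / steps if steps else 0.0
    y = 1.0
    for _ in range(steps):
        y += h * y
    return y

approx = gpac_exp(1.0)  # close to math.e
```

The GPAC proper is an analog, continuous-time model; discretization here is purely illustrative of which functions it generates.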





SM207135 Complexity Analysis in Java with a Tier-based Type System


Description


Domain
The aim of Implicit Complexity is to design criteria (type systems, semantic interpretations) to prove that programs belong to given complexity classes. A new implicit complexity analysis based on a type system for imperative and object-oriented languages was proposed in articles [1], [2] and [7]. This analysis is inspired by Data Ramification techniques [3, 4] and by non-interference [5]. It ensures that if a program can be typed and terminates, it will run in polynomial time (or, in a different context, polynomial space).
Goal
The goal will be to extend the existing type system to capture more features of object-oriented programs, while still ensuring polynomial-time or polynomial-space execution.
This work should focus on the following ideas:
- increase the number of programs that can be analysed through the use of program transformation techniques;
- improve the expressivity of the considered language by considering, for example, forks, threads, functions and exceptions;
- explore the common cases in real-world programs for which this analysis fails, and correct the type system to capture them.
Prerequisite
Knowledge of Java (or a similar object-oriented language), and a good understanding of complexity theory and type theory, would be welcome.
References
(1) J.Y. Marion, A Type System for Complexity Flow Analysis. LICS 2011, pp. 123-132 (2011)
(2) J.Y. Marion, R. Péchoux, Complexity Information Flow in a Multithreaded Imperative Language, TAMC 2014, LNCS.
(3) S. Bellantoni and S. Cook, A new recursion-theoretic characterization of the polytime functions, Computational Complexity 2 (1992), p. 97-110.
(4) D. Leivant and J.Y. Marion, Lambda calculus characterizations of polytime, Fundamenta Informaticae 19 (1993), no. 1,2, p. 167-184.
(5) Dennis M. Volpano, Cynthia E. Irvine, Geoffrey Smith, A Sound Type System for Secure Flow Analysis. Journal of Computer Security 4(2/3): 167-188 (1996)
(6) Byron Cook, Andreas Podelski, Andrey Rybalchenko, Terminator: Beyond Safety. CAV 2006, 415-418
(7) E. Hainry, R. Péchoux, Type-based heap and stack space analysis in Java, http://hal.inria.fr/hal-00773141, 2013
URL sujet détaillé : http://www.loria.fr/~hainry/sujet1.html
Remarques : Co-advisor: Emmanuel Hainry
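To convey the data-ramification intuition behind such tier-based systems (this is only an illustration in Python, not the actual type system of [1], [2], [7]): loop bounds must be controlled by "tier 1" input data, and results computed in the loop body live at "tier 0" and may not flow back into loop bounds. The first function below respects this discipline and is polynomial; the second feeds its accumulator back into itself and exhibits exponential growth, so a ramified discipline would reject it:

```python
def typable_mult(n, k):
    # Tier-1 input n controls the loop; the tier-0 accumulator never
    # influences how many iterations run. Runs in time O(n).
    acc = 0
    for _ in range(n):
        acc += k
    return acc

def untypable_double(n):
    # The computed value doubles at each step: x reaches 2**n, an
    # exponentially large result that ramification is designed to forbid.
    x = 1
    for _ in range(n):
        x += x
    return x
```

In the Java setting of the internship, the analogous distinction is enforced by typing variables and fields with tiers and restricting information flow between them.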





SM207136 Study of Schnyder woods in graphs on surfaces


Description


A "Schnyder wood" is an orientation and colouring of the internal edges of a planar triangulated graph satisfying a nice local property. Every vertex must have exactly three outgoing edges of different colours: an outgoing red, green and blue edge appearing in clockwise order around the vertex (see Figure 1). All other edges incident to this vertex must be incoming. The incoming red edges must enter in the sector between the outgoing blue edge and the outgoing green edge, and symmetrically for the other colours. This simple local property implies a global property whereby the internal edges of the triangulation are partitioned into 3 trees according to the colour of the edges, hence the name Schnyder wood (see Figure 2). Schnyder woods have many applications in computer science and mathematics, such as compact encoding, graph drawing, representations by geometric objects, etc.
We propose a generalization of this structure to triangulated graphs on orientable surfaces of arbitrary genus: sphere, torus, double torus, pretzel, etc. In this generalization a vertex has out-degree equal to 0 modulo 3, and the local property on the colours is then repeated several times around a vertex (see Figure 3). By a result of Barat and Thomassen, we know that there always exists an orientation of the edges such that every vertex has out-degree equal to 0 modulo 3. But for the moment, the colour property has only been proved in the case of the sphere and the torus.
The goal of this internship is to study the case of the double torus in depth. It is easy to see that, given an orientation in which every vertex has out-degree equal to 0 modulo 3, it would suffice to flip oriented cycles to also obtain the colour property. The student could first write a program to understand how these cycle flips behave on examples of triangulations of the double torus, and then try to prove the existence of the studied object in full generality.
The internship proposal, with the Figures, is available here: http://www.lirmm.fr/~leveque/Stage.pdf URL sujet détaillé : http://www.lirmm.fr/~leveque/Stage.pdf
Remarques : A stipend is possible for non-normaliens
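As a hedged sketch of the kind of program the internship suggests writing first: the checker below verifies only the out-degree part of the (generalized) local condition, namely that each internal vertex has out-degree 3k with k outgoing edges of each colour (k = 1 in the planar case). The cyclic sector condition around each vertex is not checked, and the edge encoding is invented for illustration:

```python
from collections import Counter

def check_outdegrees(oriented_edges, internal_vertices):
    # oriented_edges: list of (tail, head, colour) triples.
    out = {v: Counter() for v in internal_vertices}
    for (u, _w, colour) in oriented_edges:
        if u in out:
            out[u][colour] += 1
    for counts in out.values():
        total = sum(counts.values())
        if total == 0 or total % 3 != 0:
            return False  # out-degree must be a positive multiple of 3
        if len(counts) != 3 or len(set(counts.values())) != 1:
            return False  # equally many outgoing edges of each colour
    return True

# Smallest planar example: an outer triangle a, b, c with one inner
# vertex v; the three internal edges leave v with the three colours.
edges = [("v", "a", "red"), ("v", "b", "green"), ("v", "c", "blue")]
ok = check_outdegrees(edges, {"v"})  # True
```

On a double-torus triangulation, the same check would accept vertices of out-degree 3, 6, 9, ..., which is where the cycle-flipping experiments would start.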





SM207137 Rounding errors in numerical analysis


Description


This internship concerns the verification of programs using floating-point arithmetic. This arithmetic is an approximation of the usual arithmetic on real numbers. Computations are performed in finite precision; intermediate results are not represented exactly. This causes so-called rounding errors, which may propagate, grow and potentially make the results diverge.
One approach to certifying a program is deductive verification, that is, proving logical properties extracted from the program. This relies on a base of theorems formally proved with the Coq system.
Here we are interested in stable numerical schemes suited to ordinary differential equations, and in their rounding error. We have proved a particular case of very significant error compensation in a simple setting for a partial differential equation. It would therefore be interesting to know whether, in general, numerical schemes behave well or can be pathological with respect to floating-point computations.
The goal of this internship is to clarify, for ordinary differential equations only, the link between two notions: stability in the sense of numerical analysis and stability in the sense of floating-point arithmetic, i.e., the fact that the computation goes "well", meaning that the errors compensate one another because of the particular form of a stable and convergent numerical scheme. URL sujet détaillé : https://www.lri.fr/~sboldo/files/stageM2_an.pdf
Remarques :
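The phenomenon at stake can be observed in a few lines: binary floating point cannot represent 0.1 exactly, so a naive sum of ten 0.1s misses 1.0, while a compensated scheme illustrates how rounding errors can cancel. This is only a textbook illustration (Kahan's compensated summation), not one of the ODE schemes studied in the internship:

```python
def kahan_sum(xs):
    # Compensated summation: c accumulates the low-order bits that the
    # running total loses at each addition, and feeds them back in.
    total, c = 0.0, 0.0
    for x in xs:
        y = x - c
        t = total + y
        c = (t - total) - y
        total = t
    return total

data = [0.1] * 10
naive = sum(data)              # not exactly 1.0: each 0.1 carries a
                               # representation error that accumulates
compensated = kahan_sum(data)  # the compensation recovers the lost bits
```

The internship's question is whether stable, convergent ODE schemes enjoy this kind of compensation structurally, by virtue of their form.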





SM207138 Bisimulations for the rho-calculus


Description


The rho-calculus [1] extends the lambda-calculus with rewriting notions, such as first-order patterns, which can be passed to functions as arguments and manipulated by terms. The calculus is useful for studying the theory of rule-based programming languages such as Tom [2]. The goal of this internship is to study the behavioural theory of the rho-calculus, i.e., to find how to prove formally that two rho-calculus terms behave in the same way. To do so, one has to define relations called bisimulations and prove their soundness. More precisely, we want to extend the existing bisimulations for the lambda-calculus to the rho-calculus. We aim at proving equivalences between patterns or strategies in the rho-calculus, or maybe between programs in Tom.
[1] rho.loria.fr [2] tom.loria.fr URL sujet détaillé :
Remarques : Internship co-supervised with Horatiu Cirstea





SM207139 Multiscale classification and analysis of data on networks


Description


Keywords: signal processing, networks, graph wavelets, semi-supervised learning, clustering
In a growing number of scientific studies, it is necessary to add to the data a description layer in terms of networks: this is true when studying transportation networks, biological networks, social networks, sensor networks, Internet networks [New10, BBV08], or even images, possibly with irregular or only point samples [LG12]. For processing these data, whose relational structure is encoded by networks, the field of data processing on networks is emerging [SNF+13] and needs to rely jointly on approaches from computer science, physics and signal processing.
Elaborating on our previous work on multiscale community detection in networks [TB14] and semi-supervised learning on graphs [GGFSM14], the general idea proposed for this internship is to study graph-based multiscale processing, transformations and filtering of data on networks. One promising approach is to consider that many detection, segmentation or estimation problems for data on graphs can be cast as a learning, or classification, problem with a varying degree of supervision. An objective of this project will be to probe how semi-supervised learning (SSL) [AGMS12] can be designed for data on graphs so as to fall between the supervised and unsupervised cases.
This internship is clearly envisioned as the preparatory stage of a possible PhD work within the framework of the ANR grant "Graph Data and Signal Processing", in collaboration with the Physics Lab of ENS de Lyon, GREYC (Caen), GIPSA-lab (Grenoble) and LIGM (Marne-la-Vallée).
[AGMS12] K. Avrachenkov, P. Gonçalves, A. Mishenin, and M. Sokol. Generalized optimization framework for graph based semi-supervised learning. In SIAM Data Mining, 2012.
[BBV08] A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical processes on complex networks. Cambridge University Press, 2008.
[GGFSM14] B. Girault, P. Gonçalves, E. Fleury, and A. Singh Mor. Semi-supervised learning for graph to signal mapping: a graph signal Wiener filter interpretation. In IEEE Int. Conf. on Acoust. Speech and Sig. Proc., Firenze (Italy), May 2014.
[LG12] O. Lezoray and L. Grady, editors. Image Processing and Analysis with Graphs. Theory and Practice. CRC Press, 2012.
[New10] M. E. J. Newman. Networks: An Introduction. Oxford University Press, 2010.
[SNF+13] D. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83-98, 2013.
[TB14] N. Tremblay and P. Borgnat. Graph wavelets for multiscale community mining. IEEE Transactions on Signal Processing, 62(20), October 2014.
URL sujet détaillé :
Remarques : The internship will be co-supervised by Paulo Gonçalves (Inria DR, LIP ENS Lyon) and Pierre Borgnat (CNRS CR, Physics Lab. ENS Lyon).
Internship compensation at the standard rate
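The "varying degree of supervision" idea can be sketched in a few lines. The following toy label propagation (a crude illustration, not the regularization framework of [AGMS12]) spreads the classes of a few labelled seed nodes to unlabelled neighbours along the edges:

```python
def label_propagation(adj, seeds, n_iter=50):
    # adj: adjacency lists; seeds: {vertex: class}. Labelled nodes are
    # clamped; unlabelled nodes average their neighbours' soft scores.
    classes = sorted(set(seeds.values()))
    scores = {v: {c: 0.0 for c in classes} for v in adj}
    for v, c in seeds.items():
        scores[v][c] = 1.0
    for _ in range(n_iter):
        new = {}
        for v in adj:
            if v in seeds:
                new[v] = scores[v]  # keep the known label fixed
                continue
            agg = {c: sum(scores[u][c] for u in adj[v]) for c in classes}
            norm = sum(agg.values()) or 1.0
            new[v] = {c: s / norm for c, s in agg.items()}
        scores = new
    return {v: max(scores[v], key=scores[v].get) for v in adj}

# Two communities joined by one edge, with one labelled seed in each.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj, {0: "a", 5: "b"})
```

With one seed per community, each community inherits its seed's class; the internship's multiscale question is how such decisions should depend on the scale at which the graph structure is viewed.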





SM207140 The total s-energy of a multi-agent system


Description


The objective of the internship is to study the algorithmic proof of the general upper bound on the s-energy of a multi-agent system with bidirectional communication graphs, and to extend it to other models for which convergence is also guaranteed (e.g., the decentralized model with semi-simple communication graphs). One application of particular interest is the variant of the Hegselmann-Krause model in social epistemology which assumes a cognitive division of labor [6]: truth-seeking systems consist of one fixed agent (the truth) and mobile agents (truth seekers and ignorants). Agents thus play non-symmetric roles and communication graphs are non-bidirectional. However, each truth-seeking system corresponds to the decentralized model, and the convergence result for this model implies that truth-seeking systems converge. An upper bound on the s-energy in the semi-decentralized model would provide information on the convergence time of truth-seeking systems. URL sujet détaillé : http://www.lix.polytechnique.fr/~charron
Remarques : Remuneration possible within the framework of a DGA-LIX contract
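For intuition on the dynamics whose convergence time the s-energy bounds, here is a minimal sketch of the (symmetric, truth-free) Hegselmann-Krause model: at each step every agent moves to the average of the opinions within its confidence radius eps. Opinions freeze into clusters:

```python
def hk_step(x, eps):
    # Each agent averages the opinions at distance at most eps
    # (its own opinion is always included).
    return [sum(xj for xj in x if abs(xj - xi) <= eps)
            / sum(1 for xj in x if abs(xj - xi) <= eps)
            for xi in x]

def hk_run(x, eps, steps=50):
    for _ in range(steps):
        x = hk_step(x, eps)
    return x

final = hk_run([0.0, 0.1, 0.2, 0.8, 0.9, 1.0], eps=0.25)
# two clusters form, around 0.1 and 0.9
```

The truth-seeking variant discussed above breaks this symmetry: the "truth" agent never moves, so the communication graph is no longer bidirectional, which is precisely what makes the s-energy analysis harder.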





SM207141 Toward a certified encyclopedia of triangle centers


Description


Everyone knows a few characteristic points of triangles: the centroid, the circumcenter, the orthocenter, etc. These points have many properties that geometers have studied for centuries: for example, the centroid, the circumcenter and the orthocenter are aligned on a line called the Euler line [1]. More recently, under the impetus of Clark Kimberling, an encyclopedia of triangle centers has been developed: http://faculty.evansville.edu/ck6/encyclopedia/ETC.html This encyclopedia currently contains the definition of more than 6000 triangle centers and lists many of their properties. Many properties of these triangle centers can be verified automatically with a computer algebra system. It is also possible to verify some of these properties with the Coq proof assistant. For example, testing whether three points are aligned amounts to checking whether a determinant is the zero polynomial. Our objective is to create a certified version of the ETC encyclopedia. Preliminary work has already been carried out (by David Braun) to study the feasibility of the project and to automatically generate the proof of many properties. But this work takes certain facts for granted, for example: the fact that the determinant of the 3*3 matrix formed by the trilinear coordinates of three points is zero iff the points are collinear, the characterization of the circle circumscribed to a triangle as an equation in trilinear coordinates, etc. The objective of this internship is to fill these gaps by proving, in the context of Tarski's geometry (http://dpt-info.u-strasbg.fr/~narboux/tarski.html), the various lemmas needed to carry out computations using trilinear or barycentric coordinates. URL sujet détaillé : https://docs.google.com/document/d/1R39fSKGhBU3PPLyjemCs6x8yVGUh1kyDsYljTiZ_yXU/pub
Remarques :
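The kind of fact the certified encyclopedia would establish can be checked exactly with rational arithmetic. A sketch (in Cartesian rather than trilinear coordinates, for simplicity) verifying the Euler-line collinearity of the centroid G, circumcenter O and orthocenter H on one triangle, with an exact zero test as a proof assistant would require:

```python
from fractions import Fraction as F

def circumcenter(a, b, c):
    # Solve 2(b-a).p = |b|^2-|a|^2 and 2(c-a).p = |c|^2-|a|^2 for p.
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * ((bx - ax) * (cy - ay) - (by - ay) * (cx - ax))
    ux = ((bx**2 - ax**2 + by**2 - ay**2) * (cy - ay)
          - (cx**2 - ax**2 + cy**2 - ay**2) * (by - ay)) / d
    uy = ((cx**2 - ax**2 + cy**2 - ay**2) * (bx - ax)
          - (bx**2 - ax**2 + by**2 - ay**2) * (cx - ax)) / d
    return (ux, uy)

def collinear(p, q, r):
    # Exact determinant test: zero iff the three points are aligned.
    return ((q[0] - p[0]) * (r[1] - p[1])
            - (q[1] - p[1]) * (r[0] - p[0])) == 0

A, B, C = (F(0), F(0)), (F(4), F(0)), (F(1), F(3))
G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)
O = circumcenter(A, B, C)
# Vector identity with circumcenter O: H = A + B + C - 2*O.
H = (A[0] + B[0] + C[0] - 2 * O[0], A[1] + B[1] + C[1] - 2 * O[1])
on_euler_line = collinear(G, O, H)  # True
```

The internship's point is that such checks should not merely compute to zero on examples, but be proved once and for all as lemmas about trilinear or barycentric coordinates.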





SM207142 Sleep/Memory and Advanced Signal Analysis in Neuroscience


Description


Despite the huge progress made recently in understanding memory formation processes, the function of sleep, which is a critical step in these mechanisms, remains largely unknown.
We are currently studying the mechanisms of plasticity related to the different stages of sleep after learning in rats, a consistent model for understanding this function in humans. For this purpose, we have made very long-term recordings of freely behaving rats, sleeping and learning. These recordings include local field potentials, which are spontaneous electrical oscillations varying according to vigilance states and slightly different between distant brain regions. For the first time, we are coupling these measurements to synaptic evoked potentials reflecting the strength of the connections between these areas.
We have developed a Matlab-based graphical software package for the visualization and analysis of such signals according to vigilance state. This software allows the quantification of various advanced signal-analysis parameters, such as coherence between oscillations, phase differences, cross-correlation in amplitude at specific frequencies, detection of phasic bursts of oscillations, and other spectral parameters informing us about the evolution of functional connectivity between the brain regions of interest.
We are looking for a student motivated by a transdisciplinary and highly fundamental approach to continue the development of this software, improving its functionality, and to help us in the processing and management of the big data we acquired, in order to uncover the roles of sleep in memory. URL sujet détaillé :
Remarques :
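One of the signal-analysis primitives mentioned above, cross-correlation, can be sketched in a few lines: the lag at which the circular cross-correlation of two oscillations peaks estimates the delay (or phase shift) between them. This is a pure-Python illustration only; the actual software is Matlab-based:

```python
import math

def circular_xcorr(x, y, lag):
    # Circular cross-correlation of equal-length signals at one lag.
    n = len(x)
    return sum(x[i] * y[(i + lag) % n] for i in range(n))

n, period, shift = 200, 50, 5
x = [math.sin(2 * math.pi * i / period) for i in range(n)]
y = [x[(i - shift) % n] for i in range(n)]  # delayed copy of x
# The correlation peaks at the true delay of 5 samples.
best_lag = max(range(period // 2), key=lambda l: circular_xcorr(x, y, l))
```

Between two brain regions, the same idea applied per frequency band yields the phase-difference and amplitude cross-correlation measures the software quantifies.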





SM207143 Absorbing in colours


Description


This internship will deal with the following conjecture attributed to Erdős:
For every integer k there exists a constant f(k) such that for any k-arc-coloured tournament on n vertices there exists an absorbing subset of vertices of size less than f(k).
URL sujet détaillé : http://www.isima.fr/~limouzy/absorption.pdf
Remarques : Internship co-supervised by L. Beaudou and V. Limouzy





SM207144 Formal Semantics for JavaScript Proxies


Description


JavaScript has become the de facto assembly language of the web. Not only is it used in the setting of web applications, but it is now the compilation target of several languages, enabling many programs to run directly in a browser. Unfortunately, JavaScript is a complex and very dynamic language; thus JavaScript programs are difficult to analyze and even harder to prove correct. The combination of widespread use and complexity motivates the setting of this internship: researching formal techniques for safer JavaScript development.
To control access to resources, new programming language constructs have been proposed, as illustrated by Secure EcmaScript [1] and the Caja language [2]. These constructs rely on "object capabilities" and "proxies" [3], the latter currently being standardized in EcmaScript 6 for the next version of JavaScript.
The topic of this internship is to formally define these constructs in the setting of the JSCert formalization of JavaScript in Coq [4] and to show that resource isolation may be achieved using proxies.
The internship will be supervised by Alan Schmitt (http://www.irisa.fr/celtique/aschmitt/), researcher in the Celtique team (http://www.irisa.fr/celtique/) at Inria Rennes. The student will have the opportunity to interact with the other members of the team, but also with researchers of the JSCert project from Inria Saclay and Imperial College. This internship opens up the opportunity of a PhD thesis in the setting of the AJACS ANR project.
[1] https://code.google.com/p/eslab/
[2] https://developers.google.com/caja/
[3] Trustworthy Proxies: Virtualizing Objects with Invariants. T. V. Cutsem, M. S. Miller. Proceedings of ECOOP '13. http://research.google.com/pubs/pub40736.html
[4] A Trusted Mechanised JavaScript Specification. M. Bodin, A. Charguéraud, D. Filaretti, P. Gardner, S. Maffeis, D. Naudziuniene, A. Schmitt, G. Smith. Proceedings of POPL '14. http://jscert.org/ URL sujet détaillé : http://ajacs.inria.fr/internships/proxies.html
Remarques : This internship may be paid.





SM207145 Construction of a statistical atlas for the modeling and registration of dynamic meshes


Description


We will construct a statistical atlas of a population of time-varying (dynamic) 3D objects, in particular human meshes. Linear and nonlinear models will be adopted in order to reduce the dimension of the original data. Several types of distributions will also be studied, such as Gaussian and non-Gaussian. Finally, we will propose an improved registration algorithm by using the constructed atlas. URL sujet détaillé : https://drive.google.com/open?id=0Bxt4xLT1VDZiRDJsdlMtSXBjaGM&authuser=0
Remarques : * Co-supervision with  Christian Heinrich (ICube, Strasbourg, christian.heinrich.fr)
 Frederic Cordier (ICube, Strasbourg et UHA, Mulhouse, fcordier.fr)





SM207146 Eye tracking and a visual saliency model for three-dimensional objects


Description


In this project, we will use eye-tracker devices to study the visual saliency of 3D objects, with the aim of building a ground truth for automatic saliency extraction methods. Once the ground truth is constructed, we will propose an error metric or evaluation method for existing saliency extraction methods. Finally, we will propose a new saliency extraction method based on the constructed ground truth, for instance by adopting a learning method. URL sujet détaillé : https://drive.google.com/open?id=0Bxt4xLT1VDZic2poMzMtNWtWU3c&authuser=0
Remarques : * Co-supervision with  Guillaume Lavoué (LIRIS, INSA)
 Chaker Larabi (XLIM, Poitiers)





SM207147 Multidimensional deformation tools


Description


This work is about geometric modeling. Interactive editing of 3D object deformations is a tool for creating and modifying objects.
The objective of this work is to apply deformations based on tools of different dimensions simultaneously (points, curves, cages, etc.).
The difficulty of the work comes from the different mathematical formulations required for tools of different dimensions.
A method for mixing the different tools will have to be proposed. It will ensure the continuity of the deformation at the interfaces between these tools. URL sujet détaillé : https://dptinfo.ustrasbg.fr/~bechmann/M2ISIStage2015Deformation3D.pdf
Remarques :





SM207148 Reconstruction of vessels from their topological structure


Description


Reconstructing the surface of organs from a medical image is a difficult problem, especially for structures such as blood vessels or vascular networks.
3D medical images are composed of voxels, from which traditional algorithms reconstruct surface meshes composed, for example, of triangles. However, these reconstruction algorithms produce defects at the ends of the vessels. We therefore wish to design and implement a reconstruction method adapted to these structures. This is the subject of this work. URL sujet détaillé : https://dptinfo.ustrasbg.fr/~bechmann/M2ISIStage2015ReconstructionVaisseaux.pdf
Remarques :





SM207149 Multi-organ reconstruction of medical images


Description


Reconstructing the surface of organs from a medical image is a difficult problem, especially if the algorithm must ensure topological consistency between the organs.
3D medical images are composed of voxels with different gray levels. These images are difficult to interpret. In addition, they are not suitable for more advanced processing. Thus, it is of interest to analyze these images and model the structures they represent.
A well-known algorithm that reconstructs such meshes, for all configurations of voxels, by working on the boundary of each organ is Marching Cubes.
Its results create inconsistencies between the meshes of two adjacent anatomical structures. These topological problems are visually annoying, but above all they do not respect the anatomical reality.
The subject of this work is to develop a multi-organ Marching Cubes algorithm for the simultaneous reconstruction of all the anatomical structures present in the image. URL sujet détaillé : https://dptinfo.ustrasbg.fr/~bechmann/M2ISIStage2015MarchingCubeMultiOrganes.pdf
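At the core of Marching Cubes (and of its 2D analogue, marching squares) is the per-cell classification step: each corner of a cell is compared to the iso-value, producing a case index into a lookup table. A minimal sketch of that step, with an invented function name:

```python
def cell_index(corner_values, iso):
    """Case index of a cell: bit i is set when corner i lies on or above
    the iso-value. A square cell yields 16 cases, a cube cell 256."""
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= iso:
            index |= 1 << bit
    return index
```

In a multi-organ setting, the binary above/below test would be replaced by a per-corner organ label, which is one reason the standard single-material lookup table no longer suffices.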
Remarques :





SM207150 Dynamic modeling of a 3D tongue using an articulograph


Description


** Context The human tongue plays a significant role in speech production. It participates in several kinds of sounds, and its position is critical for some phonemes. Even though it is only partially visible from outside the mouth, the tongue helps to improve the intelligibility of audiovisual speech. The tongue is a complex organ that is highly flexible, expandable and compressible, can be curved, and displays a fine degree of articulation. The dynamic aspects of tongue articulation (mainly coarticulation) are important, as they influence the intelligibility of the overall gestures. It is possible to observe the tongue's behavior and measure its dynamics using electromagnetic articulography (EMA). In particular, the AG501 articulograph records 3D positions and 2 rotation angles at 250 Hz. By gluing sensors on the tongue, it is possible to track its movement spatially and temporally.
** Missions In our work, we mainly focus on visual intelligibility. In this context, the tongue is partially visible and thus the dynamics matter more than finely modeling the shape of the tongue. For these reasons, the objective of this work is to build a simple dynamic 3D tongue model. The challenge is to find a compromise between a highly flexible 3D structure with very complex movements and a simple representation controlled by a few parameters. The initial model will be composed of a few geometric primitives. Then, articulatory data will be used to refine the tongue shape. This data will be recorded using an articulograph that we have in our lab. The regions of the tongue where the sensors are glued will be carefully chosen for modeling the shape. In addition, the dynamic behavior of the tongue model will be trained on this collected data. The final result of this work will be a 3D deformable tongue that can be controlled by a few parameters and that will be integrated into an existing talking head.
A strong background in optimization and 3D geometry is recommended. URL sujet détaillé : http://www.loria.fr/~slim/prop/propositionTongueSujetM22015.pdf
Remarques : Internship stipend. Possibility of continuing with a PhD thesis (doctorate).





SM207151 Modeling the facial expressions of a 3D talking head


Description


** Context Speech communication is essentially bimodal: the acoustic signal carries the auditory modality, and facial deformation carries the visual modality. This audiovisual communication does not just express the linguistic content of spoken text; it also conveys an emotion (joy, sadness, disgust, etc.) or an expressivity.
In the field of audiovisual speech, we develop talking heads to study how to make audiovisual speech communication expressive. We recall that a talking head is a virtual 3D face animated synchronously with audio. In this context, we focus on the implementation of facial expressions of the major emotions (joy, fear, sadness, disgust, surprise) to build an expressive talking head.
** Mission The goal of this work is to model the facial expressions of a talking head. The control parameters of the facial expressions have to be determined so as to synthesize the predefined emotions. In the context of this project, we first acquire 3D facial data of several emotions using two motion capture techniques: (1) motion capture software based on a Kinect; (2) a Vicon motion capture system that tracks markers glued on the face of the human speaker. Next, we will study the facial gestures that produce each emotion. The facial features of each emotion will be determined. Finally, we will propose synthesis techniques for the emotions allowing the modification of the talking head (either using machine learning techniques or using control parameters).
Good knowledge of algorithmic geometry and data modeling is appreciated. URL sujet détaillé : http://www.loria.fr/~slim/prop/proposition_ExpressionsFacialesM22015.pdf
Remarques : Internship stipend. Possibility of continuing with a PhD thesis (doctorate).





SM207152 Categorical operational semantics


Description


Operational semantics is a method for describing programming languages, or at least their mathematical models, proposed by Plotkin around 1980. The starting point of this internship is the distinction between, on the one hand, the data gathered in an operational semantics (terms, possibly with bound variables, labels, and transition rules) and, on the other hand, the labelled transition system generated by such data.
The subject of the internship is to organize the passage from one to the other using category theory. The goal is to mathematize the process of describing programming languages, the advantage of the categorical approach being that each language so described comes with a notion of denotational model.
More precisely, one will define a category whose objects are operational semantics, and a category whose objects include the generated labelled transition systems; one will then construct what is called an "adjunction" between these two categories. One will then consider certain programming languages already equipped with a categorical semantics, in order to compare the standard notion of model with the one obtained in the new framework.
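To fix intuitions about the distinction between the data of an operational semantics and the transition system it generates, here is a toy sketch on a tiny CCS-like fragment (the encoding is invented for this illustration; it is not the categorical construction studied in the internship):

```python
def transitions(term):
    """Labelled transitions generated by two syntax-directed rules:
    ('act', a, P) --a--> P, and ('sum', P, Q) inherits the moves of P and Q."""
    kind = term[0]
    if kind == "act":
        _, label, continuation = term
        return [(label, continuation)]
    if kind == "sum":
        _, left, right = term
        return transitions(left) + transitions(right)
    return []  # 'nil' has no transitions

nil = ("nil",)
proc = ("sum", ("act", "a", nil), ("act", "b", nil))
```

Here the rules are the static data, while `transitions` computes the generated labelled transition system; the internship asks how this passage can itself be organized functorially.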
The work will be guided by an earlier article [1], which does roughly the same thing in the more limited setting of higher-order rewriting.
[1] T. Hirschowitz. Cartesian closed 2-categories and permutation equivalence in higher-order rewriting. Logical Methods in Computer Science 9(3:10). 2013.
http://www.lmcsonline.org/ojs/viewarticle.php?id=757 URL sujet détaillé :
:
Remarques :





SM207153 Combinatorial study of trees and hyperplane arrangements


Description


The set of rooted labelled trees is well studied and appears in many problems. We define a generalization, for reasons that will be explained later, by coloring the edges. The number of rooted labelled trees on n vertices is known to be n^{n-1} (Cayley's formula), and so the number of colored trees is (kn)^{n-1}, where n is the number of vertices and k the number of colors.
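The count of edge-colored rooted labelled trees can be checked by brute force for small n and k; the sketch below (an illustration, not part of the internship material) enumerates rooted trees as acyclic parent functions and colors each of the n-1 edges independently, confirming the value (kn)^{n-1}:

```python
from itertools import product

def reaches_root(v, parent, root):
    """Follow parent pointers from v; True iff we reach the root (no cycle)."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True

def count_colored_rooted_trees(n, k):
    """Rooted labelled trees on n vertices whose n-1 edges each carry
    one of k colors, counted by brute force."""
    total = 0
    for root in range(n):
        others = [v for v in range(n) if v != root]
        for parents in product(range(n), repeat=n - 1):
            parent = dict(zip(others, parents))
            if all(reaches_root(v, parent, root) for v in others):
                total += k ** (n - 1)  # each edge colored independently
    return total
```

For instance, n = 3 and k = 2 gives 36 = (2·3)^2 colored rooted trees.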
Among these colored trees we consider some special conditions, such as: for a given color, the children are smaller than the parent (decreasingness), or at least one of the children is smaller than the parent (non-increasingness). These definitions being new, many things remain to be found.
We defined these conditions on colored trees in relation to the study of some hyperplane arrangements. In dimension d, a hyperplane arrangement is a set of affine hyperplanes A = {H_1, ..., H_n}, each hyperplane H_ℓ being given by its equation a_1^{(ℓ)} x_1 + a_2^{(ℓ)} x_2 + ... + a_d^{(ℓ)} x_d = b^{(ℓ)}. People look for combinatorial properties of the arrangement, like describing or counting its regions. A more general piece of combinatorial information is the characteristic polynomial. Some special hyperplane arrangements will be of interest for us because they have simple defining equations: equations of the form x_i - x_j = b. These arrangements are sometimes called deformations of the braid arrangement, or affinographic arrangements. The first of these families is the braid arrangement, given by the equations x_i - x_j = 0. The case x_i - x_j = 1 is called the Linial arrangement, and the case x_i - x_j ∈ {0, 1} the Shi arrangement. These three special families, together with many others obtained by changing the values of b, have been the object of many questions and results in the last twenty years.
The first question when looking at such arrangements is to determine the number of regions, or preferably the characteristic polynomial. A good way to determine the number of regions is to give a bijective proof, which means giving a bijection between the family we want to count and a family already known. For this kind of proof we need to find the (known) target family. The colored trees with the special properties above have been defined because we proved them to be in bijection with the regions of these kinds of arrangements. The main reference for the study of this kind of arrangement is: Postnikov and Stanley, Deformations of Coxeter hyperplane arrangements. Journal of Combinatorial Theory, Series A (2000).
The questions that should be studied in this project are:
- the combinatorial study of the colored trees with decreasing or non-increasing conditions. Some results have just been found but there are still many things to look at.
- understanding the relation with the hyperplane arrangements. Finding bijective proofs was our starting point. It still needs some more work.
- there are some obvious direct generalizations to look at. The deformations of the braid arrangement are very specific. Since our tools now work reasonably well, we can try to apply them to other similar families. URL sujet détaillé :
:
Remarques : Co-supervision with David Forge (LRI, Université Paris-Sud, forge.fr)





SM207154 Reoptimization of the Shortest Superstring Problem


Description


The Shortest Superstring Problem (SSP) is a well-known problem. Given a set P of finite strings, it seeks the shortest superstring (a string w such that each string of P is a substring of w). This problem is NP-hard (Gallant et al. 1980) and there exist many approximation algorithms (Blum et al. 1994, Paluch 2014). The reoptimization of the Shortest Superstring Problem describes the following scenario: given a set of strings together with an optimal solution for it, we want to find a good solution for a locally modified instance (addition, deletion or modification of a string or a set of strings).
This internship aims at surveying the literature and proposing ideas for one of the following goals:
1. Find an algorithm for the reoptimization of the Shortest Superstring Problem for different locally modified instances, for example: addition of a string (Bilo et al. 2011), or modification of a string (insertion, deletion or mutation).
2. Given a set P of strings, an algorithm A with approximation ratio r and a solution w of A for P, find the ratio between w and the solution of A for different locally modified instances of P.
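For reference, the classical greedy heuristic for SSP (repeatedly merge the pair with the largest overlap; conjectured to be a 2-approximation) can be sketched as follows. This is textbook material, not the reoptimization algorithm sought in the internship:

```python
def overlap(a, b):
    """Length of the longest suffix of a that is a prefix of b."""
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_superstring(strings):
    """Greedy merge: repeatedly fuse the pair of strings with maximum overlap."""
    # drop duplicates and strings already contained in another string
    pool = [s for s in dict.fromkeys(strings)
            if not any(s != t and s in t for t in strings)]
    while len(pool) > 1:
        length, a, b = max((overlap(a, b), a, b)
                           for a in pool for b in pool if a != b)
        pool.remove(a)
        pool.remove(b)
        pool.append(a + b[length:])
    return pool[0]
```

In the reoptimization setting, the question is how much of such a solution can be reused when a single string of P is added or modified.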
This topic is well suited to a candidate with a good background in algorithmics and complexity. URL sujet détaillé : http://www2.lirmm.fr/~rivals/INTERN/
Remarques : The compensation is that of the CNRS for internships.





SM207155 Algorithmics of locally optimal RNA structures


Description


RiboNucleic Acids (RNAs) are single-stranded macromolecules which can be crudely described as sequences of length ranging from 20 to 3,000 letters over the four-letter alphabet {A,C,G,U}. Due to its single-stranded nature, any synthesized RNA molecule undergoes a folding process, in which connections, also known as base pairs, mediated by hydrogen bonds, are established between its nucleotides. The outcome of RNA folding is a large variety of structures, or spatial conformations, which strongly determine the function of a given RNA within cellular mechanisms. From a computer science perspective, the structure(s) of an RNA can be acceptably abstracted as a graph, or a tree.
The main goal of this internship is the design and implementation of one (or several) algorithm(s) for the efficient enumeration and random generation of locally optimal RNA structures. Such structures act as kinetic traps in the energy landscape, aka local minima of the energy function, and are generally believed to significantly slow down, or even disrupt, the folding of structured RNAs. Their enumeration represents a mandatory first step towards an efficient in silico analysis of the kinetic behavior of an RNA from its sequence. Previous works [A] have resulted in a polynomial-time algorithm for the enumeration of locally optimal structures in a combinatorial setting based on base-pair maximization. This algorithm will be the starting point of the internship, and it will be completed/extended to capture the complete set of features supported by the more accurate and realistic Turner model [B].
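As a concrete anchor for the base-pair-maximization setting mentioned above, here is a minimal Nussinov-style dynamic program computing the maximum number of nested base pairs of a sequence (a standard textbook algorithm, not the enumeration of locally optimal structures itself):

```python
def max_base_pairs(seq, min_loop=3):
    """Nussinov-style DP: maximum number of nested base pairs, forbidding
    hairpin loops with fewer than min_loop unpaired bases."""
    pairs = {("A", "U"), ("U", "A"), ("C", "G"),
             ("G", "C"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case: position i left unpaired
            for k in range(i + min_loop + 1, j + 1):  # case: i pairs with k
                if (seq[i], seq[k]) in pairs:
                    inside = dp[i + 1][k - 1]
                    after = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + after)
            dp[i][j] = best
    return dp[0][n - 1]
```

Locally optimal structures in this setting are those to which no base pair can be added without crossing an existing one; enumerating them efficiently is the non-trivial part.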
[A] Azadeh Saffarian, Mathieu Giraud, Antoine de Monte, and Hélène Touzet. RNA locally optimal secondary structures. J Comput Biol, 19(10):1120–1133, Oct 2012. [B] D. H. Mathews, J. Sabina, M. Zuker, and D. H. Turner. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol, 288(5):911–940, May 1999. URL sujet détaillé : http://www.lix.polytechnique.fr/~ponty/stages/LocOptStageBONSAIRNALands2014EN.pdf
Remarques : Co-supervision with Yann Ponty (LIX, École Polytechnique). Possible continuation with a PhD thesis within the RNALands project





SM207156 Estimation of the photometric properties for digitized 3D objects acquired in uncontrolled lighting conditions


Description


Current 3D digitization technologies for real objects considerably simplify the process of creating 3D models, especially for the digital content production industry (video games, virtual museography, e-business, etc.). In order to perform realistic rendering with variable lighting conditions, the common approach consists in estimating the parameters of a Bidirectional Reflectance Distribution Function (BRDF) model that characterizes the optical properties of a material. Most existing methods for the estimation of the parameters rely on photometric acquisitions performed in perfectly controlled lighting environments, which imposes too many constraints in an industrial context. Currently there is no method to estimate photometric properties in uncontrolled lighting conditions that is sufficiently reliable and robust. The goal of this internship is twofold. The first objective is to study and evaluate an existing method based on a Phong model. The second objective is to propose an improvement, e.g. based on a more complex BRDF model like the Lafortune model. The implementation will be achieved in C++ and it will be incorporated into the digitization software of the IGG group. URL sujet détaillé : http://dptinfo.ustrasbg.fr/~allegre/SUJETS/MASTER/Stage_M2_Numerisation_2015_EN.pdf
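For orientation, the Phong model mentioned above combines a diffuse and a specular lobe. A minimal evaluation sketch (pure Python, invented helper names; all direction vectors assumed normalized and pointing away from the surface):

```python
def dot(u, v):
    """Dot product of two 3D vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def phong_intensity(n, l, v, kd, ks, shininess):
    """Reflected intensity under the Phong model:
    kd * max(0, n.l) + ks * max(0, r.v)^shininess,
    where r is the mirror reflection of the light direction l about n."""
    nl = dot(n, l)
    diffuse = max(0.0, nl)
    r = tuple(2.0 * nl * nc - lc for nc, lc in zip(n, l))  # reflect l about n
    specular = max(0.0, dot(r, v)) ** shininess
    return kd * diffuse + ks * specular
```

Estimating kd, ks and the shininess exponent from images amounts to fitting this function to observed intensities; under controlled lighting, n, l and v are known, which is exactly the assumption the internship tries to relax.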
Remarques :  Advisor: Jean-Michel Dischler, Professor of Computer Science  Co-advisor: Rémi Allègre, Associate Professor (Maître de Conférences) in Computer Science
 Duration: 6 months
 Funding: About 520 euros per month, net salary





SM207157 Sloppy Equations


Description


Context: Computational scientists spend a lot of time and effort implementing their "equations" as programs to run their simulations. Why can't the equations themselves be executed if they define the computation? Because the equations are usually too "sloppy" to serve as formal executable specifications.
The devil is usually in the details and boundary cases, and it is tedious and error-prone to complete them. The main topic of this internship is to explore the limits of the "sloppiness" one could enjoy while writing programs/equations.
Subject: The main goal of this internship is to develop methods to derive a precise (non-sloppy) definition of the computation starting from a loosely defined (sloppy) input. The "sloppiness" may come in various forms, such as an unspecified domain, unspecified boundary cases, and so on.
The main focus is on Systems of Affine Recurrence Equations (SAREs), which can be extracted from regular loop programs. SAREs can be analyzed and transformed using the linear algebraic techniques used in automatic parallelizers today. The result of this work is expected to be used for the verification of programs. URL sujet détaillé : http://perso.enslyon.fr/tomofumi.yuki/M2SloppyEquations.pdf
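To make "sloppiness" concrete, consider the affine recurrence S(i) = S(i-1) + X(i): as written, it is silent about the domain of i and about S at the boundary. A sketch of the completed (non-sloppy) version, invented for this illustration:

```python
def prefix_sums(x):
    """Completed form of the affine recurrence S(i) = S(i-1) + X(i).
    The sloppy equation leaves S(-1) undefined; the completion fixes
    the domain 0 <= i < len(x) and the boundary value S(-1) = 0."""
    s = []
    acc = 0  # plays the role of the boundary case S(-1) = 0
    for value in x:
        acc += value
        s.append(acc)
    return s
```

Deriving such a completion automatically from the loose equation is the kind of transformation the internship aims to systematize.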
Remarques : Possible interactions with Dr. Sanjay Rajopadhye at Colorado State University.
Stipends are possible.





SM207158 Development of Miniapps for Polyhedral Compilation


Description


Context: Experimental validation is an essential part of research in many domains, including optimizing compiler design. New ideas for program transformations are usually implemented as "proof of concept" tools and tested against various programs prior to publication. However, compiler writers are often criticized by application writers for focusing too much on benchmark kernels that can be far from what is found in real-world applications.
Miniapps attempt to bridge this gap by developing smaller and simpler versions of the full application that can still serve as a performance proxy for the full version. The main topic of this internship is to develop such miniapps for polyhedral compilers, and then to test the limits of current tools.
Subject: The polyhedral model is one of the few successful techniques for automatic parallelization, and is being actively developed today. However, prototype research tools have difficulty handling real-world applications, as they are not as robust as production compilers.
We seek to develop miniapps based on applications from bioinformatics (sequence alignment), numerical simulations (shallow water modeling), and more. Once the miniapps are ready, we will evaluate state-of-the-art tools from polyhedral compilation, and explore opportunities for improvement.
URL sujet détaillé : http://perso.enslyon.fr/tomofumi.yuki/M2PolyApps.pdf
Remarques : Stipends are possible.





SM207159 Tile assembly and errors


Description


The first important result in the study of tilings defined by local interactions was to find local rules that form only quasiperiodic tilings (i.e. for every pattern that appears in such a tiling, there exists a radius r such that this pattern appears in every ball of radius r). Quite naturally, these objects have been used to model the quasicrystals discovered in 1982 by D. Shechtman, a discovery that earned him the Nobel Prize in Chemistry in 2011.
One is generally interested in the structure of quasiperiodic tilings defined by local rules. One can also be interested in their formation, that is, how to form arbitrarily large patterns while respecting the local rules and allowing only local transformations (i.e. transformations that do not have access to the whole pattern). Different growth models have been established: self-assembly (one starts from a correct pattern and adds tiles respecting the local rules) or stochastic flips (one starts from an arbitrary pattern and performs local perturbations to re-establish the local rules). But the various studies establish few links between structural properties and growth models.
The goal of this internship is to study the speed of formation of a tiling as a function of the algorithmic properties of the local rules that form it. URL sujet détaillé : http://www.latp.univmrs.fr/~sablik/Stage.html#FormationCristal
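A minimal model of "local rules" is given by Wang tiles: unit squares with colored edges, where adjacent tiles must agree on their shared edge. A small validity check, invented for this illustration:

```python
def valid_tiling(grid, tiles):
    """Check that a grid of tile indices respects Wang-tile local rules:
    tiles[i] = (north, east, south, west) edge colors; horizontally
    adjacent tiles must match east/west, vertically adjacent ones
    south/north."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            north, east, south, west = tiles[grid[r][c]]
            if c + 1 < cols and east != tiles[grid[r][c + 1]][3]:
                return False
            if r + 1 < rows and south != tiles[grid[r + 1][c]][0]:
                return False
    return True

# two tiles: tile 0 has east color 1, tile 1 has west color 1
TILES = [(0, 1, 0, 1), (0, 0, 0, 1)]
```

Growth models such as self-assembly or stochastic flips operate on exactly this kind of local constraint, adding or perturbing tiles while trying to keep the check above satisfied.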
Remarques :





SM207160 Stretch factor of the 3D Delaunay triangulation


Description


The Delaunay triangulation is a popular structure in computational geometry (a particular triangulation, easy to compute and with numerous applications). The stretch factor of a triangulation is the largest ratio between the length of a shortest path in the triangulation and the Euclidean distance between two vertices. In 2D, finding a tight upper bound on the stretch factor is still open, even though progress has been made recently.
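The definition can be made operational on any geometric graph: compute all shortest paths, then take the worst ratio to the Euclidean distance. A stdlib-only sketch (Floyd-Warshall, invented names; a real study would of course build the Delaunay triangulation first):

```python
import math
from itertools import combinations

def stretch_factor(points, edges):
    """Worst-case ratio (shortest-path length) / (Euclidean distance)
    over all pairs of vertices of a geometric graph."""
    n = len(points)
    d = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    for i, j in edges:
        d[i][j] = d[j][i] = math.dist(points[i], points[j])
    for k in range(n):  # Floyd-Warshall all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return max(d[i][j] / math.dist(points[i], points[j])
               for i, j in combinations(range(n), 2))

# unit square triangulated with one diagonal: the worst pair is the
# other diagonal, reached by a path of length 2
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangulation = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
```

Here the stretch factor is 2/sqrt(2) = sqrt(2), attained by the pair of opposite corners not joined by the diagonal.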
The aim of the internship is to obtain results in 3D, where almost nothing is known. URL sujet détaillé : http://www.loria.fr/~odevil/stretch3d.pdf
Remarques : a priori the legal internship stipend of 436 euros/month





SM207161 Analysis of realworld largescale mobile traffic datasets


Description


The usage of the Internet is becoming increasingly mobile and nomadic, fostered by the popularity of smartphones and mobile apps. This has led to an explosion of mobile traffic, whose compound annual growth rate (CAGR) was 146% between 2006 and 2013: not even fixed traffic on the Internet grew that fast during the golden age of the World Wide Web at the turn of the millennium. In addition, forecasts by prominent stakeholders, such as Cisco, tell us that the mobile communication hype is far from having reached its peak. Global mobile data traffic is still expected to grow at a compound annual rate of 61%, which means that the volume of mobile traffic in 2017 will be 13 times that recorded in 2012 [1].
As a result, current and upcoming radio access technologies will struggle to keep pace with the demand generated by bandwidth-hungry high-end smartphone applications. This manifests through poor service availability, and represents a major impediment for advanced applications. Mobile telecommunication operators are thus studying new solutions to accommodate the capacity demand in densely populated metropolitan areas. A number of approaches aiming at enhancing the access network infrastructure are under study. Yet, many of these solutions need knowledge of the spatiotemporal dynamics of the mobile traffic demand, which, in other words, means answering the question "when, where and how do mobile users consume cellular network resources?"
Today, we do not have a precise answer to this question. This project falls within a larger initiative aiming at providing a definitive answer to the question above. To that end, we collected, in cooperation with cellular network operators, multiple datasets of real-world mobile traffic in large European and African cities, and we are developing a framework to automatically analyze the datasets and characterize the dynamics of the mobile traffic demand [2]. The successful candidate will work on developing new analytic tools, inspired by fields such as data mining, machine learning, statistical analysis or complex systems, and integrate these tools in the framework. The work will be carried out in tight cooperation with PhD students, postdoctoral fellows, and researchers based at Inria, France, and CNR-IEIIT, Italy.
References [1] Cisco, "Global Mobile Data Traffic Forecast Update", 2013 [2] D. Naboulsi, R. Stanica, M. Fiore, "Classifying Call Profiles in Large-scale Mobile Traffic Datasets", IEEE Infocom, 2014 URL sujet détaillé :
:
Remarques :





SM207162 Authenticating logs in collaborative editing systems


Description


In recent years, collaborative editing systems such as wikis, Google Drive and version control systems have become very popular. In order to ensure high availability, these systems replicate data. Users work simultaneously on replicas of shared documents, following cycles of divergence and convergence. Document replicas diverge when users work in isolation and converge later when users synchronize their changes. Most of these systems keep user modifications in a log of operations that are then exchanged among replicas during synchronisation.
However, users can misbehave by tampering with logs for their convenience. This can be critical for some collaborative systems such as version control systems. If logs can be modified, revisions do not correspond to the expected behavior of the software. Moreover, developers cannot be held responsible for the revisions to which they contributed. Furthermore, by modifying the log, a contributor may introduce security holes in the system under the name of another contributor. Therefore, there is a need to ensure the integrity of the log and, in case the log was tampered with, to detect the misbehaving user.
The Coast team proposed a solution relying on hash-chain based authenticators [1] for authenticating logs, which ensures the authenticity and integrity of logs, and user accountability [2]. Algorithms for authenticator construction and log verification were proposed. Our solution is based on a hash chain that is aggregated over time as new interactions among users are performed, corresponding to the addition of new entries to the log. Each time a hash value is computed, all the data used for hashing is needed and the computation starts from scratch.
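The hash-chain construction described above can be sketched in a few lines (SHA-256 and the entry format are arbitrary choices made for this illustration, not the team's actual scheme):

```python
import hashlib

def chain_authenticator(entries, seed=b"genesis"):
    """Hash-chain authenticator over a log: each entry is hashed together
    with the authenticator of the prefix, so modifying any entry changes
    every later authenticator, and verification recomputes from scratch."""
    h = seed
    for entry in entries:
        h = hashlib.sha256(h + entry.encode()).digest()
    return h.hex()

log = ["alice: insert 'x' at 3", "bob: delete at 7"]
authenticator = chain_authenticator(log)
```

The recomputation cost visible here (every verification replays the whole prefix) is the cost the project proposes to reduce by exploiting commutativity.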
The aim of this project is to investigate solutions for better performance in constructing authenticators by taking advantage of commutativity among certain elements involved in hashing. The application of one-way accumulators with the quasi-commutative property will be studied [3]. Theoretical complexity as well as experimental analyses will be performed.
A successful master student can continue with a PhD thesis on related topics in collaborative editing.
References: [1] Di Ma and Gene Tsudik. 2009. A new approach to secure logging. ACM Transactions on Storage, 5(1), Article 2, March 2009 [2] Authenticating Operation-based History in Collaborative Systems. Hien Thi Thu Truong, Claudia-Lavinia Ignat and Pascal Molli. In Proceedings of the ACM International Conference on Supporting Group Work, Group 2012, pages 131–140, Sanibel Island, Florida, USA, October 2012 [3] Josh Benaloh and Michael de Mare. One-way accumulators: a decentralized alternative to digital signatures. In Workshop on the theory and application of cryptographic techniques on Advances in cryptology, EUROCRYPT '93, pages 274–285, Lofthus, Norway, May 23–27, 1993 URL sujet détaillé : http://www.loria.fr/~ignatcla/pmwiki/pmwiki.php/Main/Thesis14
Remarques :





SM207163 Datamining for Parallel Corpus for Quality Estimation Applied to Machine Translation


Description


Introduction  In the scope of Machine Translation, Quality Estimation [5] consists in automatically evaluating the quality (correctness, fluency) of a translated sentence, or the correctness of each word in this sentence. This automatic evaluation can be useful to help an expert post-edit and correct the sentence (or the document) [7], or to decide whether the translated document can be disseminated as is [6].
To put it more directly: "Should this sentence t be considered as a correct translation of this sentence s?"
To enable a computer to make this decision, one uses a machine learning approach. This approach is based on a training corpus made up of a set of triplets (s,t,score_st), where s is a source sentence, t is the automatic translation of s by a machine translation system, and score_st is an evaluation score given to the translation t by a human expert. Each pair (s,t) is automatically described by a vector v_st made up of numerical values (see below). A machine learning approach (SVM, Bayesian classifier, neural networks...) is then applied to the pairs (v_st,score_st) in order to estimate a function that predicts the score for any vector. It is then possible to use this function to give a score to another pair (s',t') even if it does not occur in the training corpus.
A vector v_st describes the source sentence s and the target sentence t. It is made up of numerical values/features extracted from s and t. There are two kinds of features: glass-box and black-box. Glass-box features are given by the machine translation system (confidence measures from the translation tables and from the language model). Black-box features are independent of the system (for example, the lengths of s and t). Both kinds have drawbacks: glass-box features are not available if the translation system is an online service, while black-box features, being independent of the translation system, are not necessarily correlated with the translation quality.
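To make the pipeline concrete, the Python sketch below (all feature choices and sentences are invented for illustration) computes a few black-box features for a pair (s,t) and predicts a score for a new pair with a trivial nearest-neighbour learner standing in for the SVM / Bayesian / neural models mentioned above.

```python
def blackbox_features(s, t):
    """A few system-independent features of a (source, translation) pair."""
    sw, tw = s.split(), t.split()
    return [len(sw), len(tw), len(tw) / len(sw), len(set(sw) & set(tw))]

def predict(train, s, t):
    """1-nearest-neighbour stand-in for a real learner: return the score of the
    training pair whose feature vector is closest to that of (s, t)."""
    v = blackbox_features(s, t)
    def dist(u):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda ex: dist(ex[0]))[1]

train = [(blackbox_features("the cat sleeps", "le chat dort"), 0.9),
         (blackbox_features("the cat sleeps", "chat le le dormir maison"), 0.3)]
print(predict(train, "a dog runs", "un chien court"))  # → 0.9
```

The new pair has the same length profile as the good training translation, so the 1-NN predictor returns its score; a real system would use richer features and a proper learner.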
Subject  In this work, our objective is to add to the vector v_st new information that is independent of the translation system but remains correlated with the quality of translation, because it is extracted from a parallel corpus representative of the language pair.
We propose to use a parallel corpus C. In C, each source sentence C_si is paired with a translation C_ti made by a human expert. For each input sentence s and its automatic translation t, we propose to mine the parallel corpus in order to retrieve the sentences C_si that have parts in common with s. We argue that if C_sk is close to s, then C_tk is also close to t. We first want to verify this hypothesis in this research work. Second, we want to propose a measure based on this idea and to add this measure to v_st.
Expected work  The student will first become familiar with the approaches to quality estimation for machine translation. They will then propose a measure of the degree of intersection between a source sentence and the corpus. For that, one line of inquiry could be to use the state-of-the-art automatic measures of machine translation quality [4,3].
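One possible intersection measure, sketched below in Python (the function names are ours, and the measure is a simplified, BLEU-like clipped n-gram precision rather than anything prescribed by the subject), retrieves the corpus sentence sharing the most n-grams with the input:

```python
from collections import Counter

def ngram_overlap(s, ref, n=2):
    """Fraction of the n-grams of s that also occur in ref (clipped counts)."""
    def ngrams(words):
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    a, b = ngrams(s.split()), ngrams(ref.split())
    total = sum(a.values())
    if total == 0:
        return 0.0
    return sum(min(c, b[g]) for g, c in a.items()) / total

def closest(corpus_src, s, n=2):
    """Index of the corpus source sentence with the largest n-gram overlap with s."""
    return max(range(len(corpus_src)), key=lambda i: ngram_overlap(s, corpus_src[i], n))

corpus = ["the cat is on the mat", "dogs bark loudly"]
print(closest(corpus, "the cat is here"))  # → 0 (shares the bigrams "the cat", "cat is")
```

In the full proposal, the translation C_tk of the retrieved sentence would then be compared with t, and the resulting similarity added as a feature to v_st.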
Work environment  The student will be integrated into the SMarT (http://smart.loria.fr) group. This research group has strong skills in machine translation and quality estimation. We have participated in two evaluation campaigns: Quality Estimation (QE) 2012 (http://www.statmt.org/wmt12/qualityestimationtask.html) and 2013 (http://www.statmt.org/wmt13/qualityestimationtask.html) [1,2]. The results of the student will be incorporated into our research platform. This work will lead to participation in the QE 2015 evaluation campaign.
References  [1] D. Langlois, S. Raybaud, and K. Smaïli. LORIA system for the WMT12 quality estimation shared task. In Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012. [2] D. Langlois, S. Raybaud, and K. Smaïli. LORIA system for the WMT13 quality estimation shared task. In Proceedings of the Eighth Workshop on Statistical Machine Translation, 2013. [3] G. Doddington. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research, pages 138–145. Morgan Kaufmann Publishers Inc., 2002. [4] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318. Association for Computational Linguistics, 2002. [5] S. Raybaud, D. Langlois, and K. Smaïli. "This sentence is wrong." Detecting errors in machine-translated sentences. Machine Translation, 25(1):1–34, 2011. [6] R. Soricut and A. Echihabi. TrustRank: Inducing trust in automatic translations via ranking. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 612–621, 2010. [7] L. Specia. Exploiting objective annotations for measuring translation post-editing effort. In Proceedings of the 15th Conference of the European Association for Machine Translation, pages 73–80, 2011. URL sujet détaillé : http://smart.loria.fr/sujet_datamining_qe.pdf
Remarques : The student will be supervised by David Langlois (http://www.loria.fr/~langlois) and Kamel Smaïli (http://www.loria.fr/~smaili).
The remuneration will be standard.





SM207164 Alliances in graphs


Description


We are interested in the study of defensive alliances and secure sets of a graph. Alliances occur in many contexts: members of different political parties, people who unite by friendship, companies with common economic interests, nations wishing to form defensive coalitions. A defensive alliance in a graph, introduced in [KHH04], is a set of vertices S such that each vertex of S has at least as many neighbors in S as outside S; that is, each vertex of S, together with its neighbors in S, can defend against any attack coming from its neighbors outside S. Finding a defensive alliance of minimum size in a graph is NP-hard, and nothing is known about the approximation of this problem. A defensive alliance involves the defense of a single vertex. In many contexts, alliances must be formed so that any attack on any subset X of S can be defended by X and its neighbors in S. In this case, the set S is called secure. The security number of a graph, introduced in [BDT07], is the smallest size of a secure set. Nothing is known about the complexity and the approximation of this number. Another interesting question is, given an insecure set N, how to add a minimum number of vertices in order to transform N into a secure set. Some related problems, including complexity and approximation results, were studied in [BTV10]. A generalization of defensive alliances to defensive 0-alliances was defined in [RYS09]: a set of vertices S is a defensive 0-alliance if every vertex of S has at least as many neighbors in S as outside S. A similar generalization for secure sets can be considered.
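As a small worked example, the following Python sketch checks the defensive-alliance condition stated above and finds a minimum defensive alliance by brute force (exponential time, consistent with the NP-hardness of the problem; the graph encoding is our own choice):

```python
from itertools import combinations

def is_defensive_alliance(adj, S):
    """adj maps each vertex to its set of neighbours; S is a set of vertices.
    Every vertex of S must have at least as many neighbours in S as outside S."""
    return all(len(adj[v] & S) >= len(adj[v] - S) for v in S)

def min_defensive_alliance(adj):
    """Brute force over all non-empty vertex subsets, smallest first."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for S in combinations(vertices, k):
            if is_defensive_alliance(adj, set(S)):
                return set(S)

# On the 4-cycle, no single vertex defends itself, but any edge does.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(min_defensive_alliance(cycle4))  # → {0, 1}
```

On the 4-cycle, a lone vertex has 0 neighbours inside versus 2 outside, while the two endpoints of an edge each have 1 inside and 1 outside, so the minimum alliance has size 2.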
The goal of the internship is to study the complexity and the approximability of these problems. In order to tackle them, we will build on our experience with problems that ask whether a graph admits a partition of its vertex set into two nonempty parts satisfying degree constraints. We wrote a survey on the complexity and the approximation of such problems in [BTV10]. URL sujet détaillé : http://www.lamsade.dauphine.fr/~bazgan/Stages/stagesecure.pdf
Remarques :





SM207165 Non-monotony in interaction networks


Description


Most studies and known results on the dynamics of Boolean automata networks assume that the set of local transition functions defining a network contains only locally monotone functions. Intuitively, this means that if an automaton i influences an automaton j in the network, then there is no ambiguity about the nature of this influence: it must be either positive (i tends to activate j) or negative (i tends to inhibit j). This local monotony is a strong property that de facto induces a strong restriction on the networks studied. However, no biological evidence supports such a restriction a priori. So, now that many interesting properties have been proven for locally monotone networks, it is important and relevant to study networks that no longer have this feature. Moreover, our specific interest in this class of Boolean automata networks comes from previous works on the differences in their behaviours depending on their synchronous or asynchronous nature. In particular, we have shown that the smallest Boolean automata networks that are not robust to changes of synchronicity are non-monotone ones. We have also shown that monotone networks can have this non-robustness property. One of the remarkable points here is that non-robust monotone networks can apparently always be encoded by a smaller non-monotone network. At present, no general results explaining this phenomenon have been found, although certain conditions for a monotone network to be non-robust to synchronicity perturbations have shown the need for very specific cycles in its definition. In our view, this emphasises the essential role of non-monotone architectural patterns in the complex dynamical behaviour of interaction networks.
Thus, both for biological reasons related to the possibly intrinsic non-monotone nature of regulations and for computer science reasons related to the influence of synchronicity, studying the impact of non-monotony is very pertinent and promising, and should open other interesting questions. URL sujet détaillé : http://pageperso.lif.univmrs.fr/~sylvain.sene/files/stage2015_nonmon_en.pdf
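For readers unfamiliar with local monotony, the following Python sketch (our own formalisation, purely illustrative) tests whether a Boolean local function is locally monotone, i.e., whether each input variable acts with a single, unambiguous sign; XOR is the classic locally non-monotone example.

```python
from itertools import product

def locally_monotone(f, n):
    """f maps n-bit tuples to 0/1. True iff, for each coordinate i, flipping
    x_i from 0 to 1 never both increases and decreases f (a single sign)."""
    for i in range(n):
        signs = set()
        for x in product((0, 1), repeat=n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                d = f(y) - f(x)
                if d:
                    signs.add(d)
        if signs == {1, -1}:
            return False  # variable i acts both positively and negatively
    return True

print(locally_monotone(lambda x: x[0] & x[1], 2))  # → True  (AND)
print(locally_monotone(lambda x: x[0] ^ x[1], 2))  # → False (XOR)
```

In interaction-graph terms, a variable with both signs corresponds to an arc that is neither purely activating nor purely inhibiting, which is exactly the ambiguity the internship proposes to study.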
Remarques :  Co-supervision: Kévin Perrot (kevin.perrot@univ-mrs.fr)  Duration: 6 months
 Remuneration: €523.26/month
 PhD continuation envisaged and desired





SM207166 Alternatingtime Temporal Logic with Imperfect Information


Description


Alternating-time Temporal Logic (ATL) has recently been put forward as a logical language to specify strategic properties of autonomous agents in distributed systems [1]. ATL extends the branching-time temporal logic CTL by introducing modal formulas ⟨⟨A⟩⟩φ, for a group A of agents, whose intended meaning is that the agents in A have a strategy by which they are guaranteed to achieve φ, irrespective of the moves of A's opponents. ATL has been successfully applied to the specification and verification of game structures. Indeed, the ATL language is rich enough to express key game-theoretic concepts such as (strong) Nash equilibria and Pareto optimality, as well as typical safety properties (e.g., "the agents in group A have a strategy to perpetually avoid a deadlock": ⟨⟨A⟩⟩ G ¬deadlock) and liveness properties (e.g., "once a message is sent, the agents in A cannot prevent that it will eventually be received": sent → ¬⟨⟨A⟩⟩ G ¬received).
Alternating-time Temporal Logic has been thoroughly investigated with regard to its theoretical features (decidability and complexity of the satisfiability and model checking problems [2,3]), and it has been observed that these depend on the strategic capabilities of agents. In one dimension, the strategies played by agents might depend on the whole sequence of states visited during the system's execution. If that is the case, we say that agents have perfect recall of visited states and use this information to decide which action to perform next. On the other hand, if agents only use information available in the current state to decide which action to take next, they are said to deploy memoryless (or positional) strategies. In another dimension, agents might have perfect knowledge of the game structure (local states of other agents, available actions, etc.; consider chess or Go, for instance), or only a partial view of the game might be accessible to them, i.e., agents might have imperfect knowledge of the game structure (e.g., poker or bridge). It is well known that perfect/imperfect recall and perfect/imperfect knowledge affect the decision problems for ATL. For instance, deciding ATL with perfect recall is typically computationally harder than the corresponding problem for its imperfect-recall counterpart (even though there are some notable exceptions).
This project aims at contributing to the study of Alternating-time Temporal Logic under the assumption of imperfect information. The state of the art for ATL is much more clear-cut for game structures with perfect knowledge: in that setting we have decision procedures for the model checking and satisfiability problems [2,3], and extensions are available [4]. However, in contexts of imperfect information the techniques to decide satisfiability for ATL fail. As a result, satisfiability of ATL with imperfect information is still an open problem. The project is intended to map the boundary between perfect and imperfect information in ATL, building on recent contributions on the subject [5]. Specifically, the envisaged work packages are as follows: 1) analysis of the state of the art on tableau- and automata-based decision procedures for ATL; 2) diagnosis of the failure of the aforementioned procedures in contexts of imperfect knowledge; 3) development of a method to decide ATL satisfiability with imperfect knowledge; 4) possible implementation of the results of (3) in a tool.
The project programme is ambitious and challenging. If the logic turns out to be undecidable, it will be of interest to explore syntactic and semantic restrictions that still admit a decidable satisfiability problem. The candidate needs a strong background in formal methods for computer science and logic, as well as the ability to work autonomously and proactively. Further details on the project proposal are available upon request.
Selected References
1) Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. Alternating-time temporal logic. J. ACM 49(5), September 2002, pages 672–713. DOI=10.1145/585265.585270 http://doi.acm.org/10.1145/585265.585270
2) N. Bulling, J. Dix and W. Jamroga. Model Checking Logics of Strategic Ability: Complexity. In M. Dastani, K. V. Hindriks and J.-J. C. Meyer (Eds.), Specification and Verification of Multi-agent Systems, Springer US, 2010, pages 125–159.
3) V. Goranko and G. van Drimmelen. Complete axiomatization and decidability of Alternating-time Temporal Logic (2003).
4) Serenella Cerrito, Amélie David and Valentin Goranko. Optimal Tableaux-based Decision Procedure for Testing Satisfiability in the Alternating-time Temporal Logic ATL+. Automated Reasoning (IJCAR), LNAI 8562, 2014.
5) Nils Bulling and Wojciech Jamroga. Comparing variants of strategic ability: how uncertainty and memory influence general properties of games. Auton. Agent. Multi-Agent Syst. (2014) 28:474–518, DOI 10.1007/s10458-013-9231-3. URL sujet détaillé :
Remarques : The internship is co-supervised with Catalin Dima, professor at Université Paris-Est Créteil.





SM207167 Effective formalisation of algebraic reals




SM207168 Exact QR decomposition over the rationals


Description


Exact linear algebra, computing exactly over arbitrary-precision integers, rationals and finite fields, has developed over the last 15 years, motivated by applications in cryptology, computational mathematics and high-precision numerical computation. State-of-the-art implementations, proposed in libraries such as LinBox (http://linalg.org) and fflas-ffpack (http://linalg.org/projects/fflasffpack), rely on efficient dense linear algebra kernels over word-size finite fields, and on lifting to the integers and rationals by the Chinese remainder algorithm or p-adic representations.
Most elimination-based computations in exact linear algebra rely on efficient Gaussian elimination and the related PLUQ decomposition, which has already been well studied for both sequential and parallel implementations. The QR decomposition, which transforms a matrix to triangular form by unitary transformations, although widely used in numerical linear algebra, had been given less attention in computer algebra until recently. However, applications such as lattice reduction have raised a need for high-performance and high-precision QR decomposition, which motivates this research project.
After reviewing the state-of-the-art numerical algorithms for this computation, the student will design several variants of an exact QR decomposition algorithm, adapting the techniques of exact linear algebra used for Gaussian elimination: block-tiled and slab-recursive structure, sub-cubic matrix arithmetic, the Chinese remainder algorithm and rational reconstruction. Communication-efficient variants and their analysis will also be investigated.
This exact QR decomposition will then be adapted in a symbolic-numeric framework to offer an arbitrary-precision numerical QR algorithm. It will be benchmarked against alternative approaches, such as the direct use of the multiprecision floating-point numbers of the MPFR library (http://www.mpfr.org). Finally, applications to lattice basis reduction could be considered, time permitting. URL sujet détaillé : http://ligmembres.imag.fr/pernet/Publications/exactQR.pdf
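To fix ideas, here is a minimal exact Gram-Schmidt sketch in Python over the rationals, using `fractions.Fraction` (our own toy code, assuming full column rank, and far from the block-recursive, CRT-based variants the internship targets). Since column norms involve square roots, the exact factorisation keeps Q with pairwise orthogonal, non-normalised rational columns and R unit upper triangular:

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def exact_qr(cols):
    """cols: columns of A as lists of Fractions (assumed linearly independent).
    Returns (Q, R) with A = Q R, where Q's columns are pairwise orthogonal
    rational vectors and R is unit upper triangular (stored column-wise)."""
    Q, R = [], []
    for j, a in enumerate(cols):
        r = [Fraction(0)] * len(cols)
        v = list(a)
        for i, q in enumerate(Q):
            r[i] = Fraction(dot(a, q)) / dot(q, q)  # exact projection coefficient
            v = [x - r[i] * y for x, y in zip(v, q)]
        r[j] = Fraction(1)
        Q.append(v)
        R.append(r)
    return Q, R

A = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(0)]]  # columns of A
Q, R = exact_qr(A)
assert dot(Q[0], Q[1]) == 0  # exact orthogonality, no rounding error
```

Every quantity stays a rational number, so orthogonality holds exactly; the numerators and denominators grow quickly, which is precisely what CRT and p-adic techniques are meant to control.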
Remarques :





SM207169 Ultimate periodicity in chip-firing games


Description


Chip-firing games are played on directed graphs. Each vertex is given a number of chips, and a single rule is applied in parallel to every vertex at each time step: if a vertex contains enough chips then it gives one chip to each of its out-neighbors; otherwise it does not give any chip.
Chip-firing games are very expressive (the model has been shown to be Turing-powerful). Nonetheless, on a finite graph and with a finite number of chips, the trajectory of every game is ultimately periodic.
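The parallel rule and the detection of the ultimate period can be sketched in a few lines of Python (our own toy encoding: a vertex fires when it holds at least as many chips as its out-degree):

```python
def step(adj, chips):
    """One parallel step: every vertex holding at least out-degree chips
    sends one chip along each of its out-arcs; all vertices fire simultaneously."""
    firing = [v for v in adj if chips[v] >= len(adj[v])]
    new = dict(chips)
    for v in firing:
        new[v] -= len(adj[v])
        for w in adj[v]:
            new[w] += 1
    return new

def ultimate_period(adj, chips):
    """Iterate until a configuration repeats; the trajectory is ultimately
    periodic since configurations on a finite graph are finitely many."""
    seen, t, state = {}, 0, chips
    while True:
        key = tuple(sorted(state.items()))
        if key in seen:
            return t - seen[key]
        seen[key] = t
        state = step(adj, state)
        t += 1

# One chip circulating on the directed triangle 0 -> 1 -> 2 -> 0: period 3.
triangle = {0: [1], 1: [2], 2: [0]}
print(ultimate_period(triangle, {0: 1, 1: 0, 2: 0}))  # → 3
```

On the directed triangle the single chip simply circulates, giving a limit cycle of length equal to the number of vertices; the internship's classification asks on which graph families such periods stay linear or polynomial, and on which they can blow up exponentially.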
According to the interests of the candidate, the internship may consist in:
- producing a digest inventory of known results;
- continuing the classification of families of graphs according to whether: 1. the period of every trajectory is at most the number of vertices; 2. the period of every trajectory is polynomial; 3. the period of some trajectories may be exponential;
- establishing and exploiting links with other models, for example threshold automata;
- studying the impact of the update scheme on the limit cycles. URL sujet détaillé : http://pageperso.lif.univmrs.fr/~kevin.perrot/documents/sujetM2CFGPeriods_en.pdf
Remarques :  Co-supervision: Sylvain Sené (sylvain.sene@univ-mrs.fr)  Duration: 6 months
 Remuneration: €523.26/month
 PhD continuation envisaged and desired





SM207170 Independent sets in hypergraphs


Description


Balogh, Morris and Samotij, and Saxton and Thomason have recently developed novel methods for approaching quite diverse problems in asymptotic graph and hypergraph theory. They have shown that certain rather general results on independent sets in hypergraphs can be proved from simple hypotheses and applied very effectively in quite different areas, from colouring problems to asymptotic enumeration and probabilistic combinatorics.
Students involved in this project will be expected to study the original papers by the authors above, and will be given related literature for the applications. The main aim of this project is to reach a deep understanding of this method and to seek new applications in extremal and probabilistic combinatorics.
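As a warm-up for the central notion, this brute-force Python snippet (ours, purely illustrative; the container method itself is about bounding such counts, not enumerating them) counts the independent sets of a small hypergraph, i.e., the vertex subsets containing no hyperedge entirely:

```python
from itertools import combinations

def count_independent_sets(n, edges):
    """Count subsets S of {0,...,n-1} such that no hyperedge is contained in S."""
    count = 0
    for k in range(n + 1):
        for S in combinations(range(n), k):
            s = set(S)
            if not any(e <= s for e in edges):
                count += 1
    return count

# One 3-uniform edge on 3 vertices: every subset except {0,1,2} is independent.
print(count_independent_sets(3, [{0, 1, 2}]))  # → 7
```

Brute force is of course exponential; the container theorems give structural control over the family of independent sets precisely where enumeration is hopeless.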
References
J. Balogh, R. Morris and W. Samotij, Independent sets in hypergraphs, J. Amer. Math. Soc., to appear.
D. Saxton and A. Thomason, Hypergraph containers, arXiv:1204.6595 [math.CO]. URL sujet détaillé :
Remarques :





SM207171 Architecture synthesis for linear time-invariant filters


Description


Linear time-invariant filters are a widely used class of digital signal processing blocks. They can be found in most signal processing applications (from sound processing to software-defined radio) but also in control applications. These mathematical objects can be implemented in a variety of ways. Such realizations always consist of a combination of registers and a number of scalar products with constant vectors. However, the choice of a realization impacts performance, cost, and accuracy.
The literature offers papers comparing different realizations of the same filter and observing that more accuracy can be obtained at the expense of more multipliers. Such studies are relevant to software implementations, where the multiplier size is a given. In the context of hardware or FPGA implementations, more accuracy can also be bought with larger multipliers. This internship will study this trade-off and design tools that explore it. The ultimate objective is to generate the smallest or fastest filter satisfying a given accuracy specification.
For this, the state of the art is Hilaire's formalism of the Specialized Implicit Form (SIF), which captures all possible realizations of a filter. Lopez's thesis at LIP6 recently showed how to formally compute the accuracy of a realization expressed in SIF form, depending on the fixed-point format used. In parallel, we worked in Socrate on the minimal implementation of a scalar product with a constant vector. The internship should build upon these recent complementary works. URL sujet détaillé :
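A toy experiment in Python (coefficients and formats invented for illustration; a direct-form FIR is just one of the realizations the SIF formalism covers) shows the accuracy side of the trade-off: quantising the coefficients to fewer fractional bits shrinks the multipliers but perturbs the filter.

```python
def fir(coeffs, signal):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * x[n-k]."""
    return [sum(c * signal[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(signal))]

def quantize(x, frac_bits):
    """Round x to the nearest fixed-point value with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

coeffs = [0.21, 0.58, 0.21]              # a small low-pass-like filter
q4 = [quantize(c, 4) for c in coeffs]    # 4 fractional bits: cheap, inaccurate
q12 = [quantize(c, 12) for c in coeffs]  # 12 fractional bits: larger multipliers

err4 = max(abs(a - b) for a, b in zip(coeffs, q4))
err12 = max(abs(a - b) for a, b in zip(coeffs, q12))
assert err12 < err4  # more bits buy strictly more coefficient accuracy here
```

Choosing, for each scalar product of a realization, the smallest word length that still meets the overall accuracy specification is exactly the kind of exploration the internship's tools should automate.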
Remarques :





SM207172 Decompositions of Graphs


Description


Area: Graph Theory
Title: Decompositions of Graphs
Abstract: The topic of research comprises problems on graph decompositions, in a very broad sense. We mention some problems on this topic that we are investigating or would like to welcome students to investigate.
Note: a PDF file corresponding to the text below (with some definitions and references) can be found at: http://www.ime.usp.br/~yw/2014/projects/project.pdf
Given a graph G, a family of subgraphs {H_1,...,H_k} of G is a decomposition of G if these subgraphs are pairwise edge-disjoint and their union is precisely G. If each H_i is a path (resp. cycle), then we say that G admits a path (resp. cycle) decomposition. If each H_i is isomorphic to a fixed graph H, then we say that G admits an H-decomposition.
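The definition translates directly into a small Python check (our encoding: a graph is a set of edges, each edge a frozenset of two endpoints), together with a witness that K3 decomposes into 2 paths, matching Gallai's bound (3+1)/2 = 2 discussed below:

```python
def is_decomposition(graph_edges, subgraphs):
    """True iff the given edge sets are pairwise edge-disjoint and
    their union is exactly the edge set of the graph."""
    seen = set()
    for h in subgraphs:
        if seen & h:
            return False  # a shared edge: not edge-disjoint
        seen |= h
    return seen == graph_edges

e = lambda a, b: frozenset((a, b))
K3 = {e(0, 1), e(1, 2), e(2, 0)}
# The path 0-1-2 plus the single-edge path 2-0 decompose K3 into 2 paths.
print(is_decomposition(K3, [{e(0, 1), e(1, 2)}, {e(2, 0)}]))  # → True
```

Verifying a candidate decomposition is easy; the research questions concern how few paths (or copies of a fixed tree) suffice, and for which graphs.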
The path number of a graph G, denoted by pn(G), is the minimum number of paths in a path decomposition of G. According to Lovász, in 1966 Erdős asked about this parameter, and Gallai stated the following conjecture.
Conjecture [Gallai, 1966] If G is an nvertex connected graph, then pn(G) is at most (n+1)/2.
This parameter is not known for most graph classes. Lovász (1968) found an upper bound for a similar parameter: he proved that every n-vertex connected graph G can be decomposed into at most n/2 paths and cycles. From this result, it follows that Gallai's conjecture holds for connected graphs with at most one even-degree vertex. Some upper bounds for the path number of a connected graph in terms of its numbers of odd-degree and even-degree vertices have been obtained by Lovász (1968), and improved by Donald (1980) and by Dean and Kouider (2000). Finding good upper bounds for pn(G) is a challenging problem; even for special classes of graphs not much is known.
Another related problem concerns H-decompositions of a graph, where H is some fixed graph. This is also a topic that has raised much interest among researchers, and a number of interesting results have been found. An interesting variant concerns the case in which H is a tree. In this respect, there is a conjecture posed by Barát and Thomassen (2006), which states that, for each tree T, there exists a natural number k(T) such that, if G is a k(T)-edge-connected graph and |E(T)| divides |E(G)|, then G admits a T-decomposition. In a series of papers, Thomassen has proved this conjecture for stars, some bistars, paths of length 3, and paths whose length is a power of 2. Recently, Botler, Mota, Oshiro and Wakabayashi (2014) have verified this conjecture for paths of length 5.
The topic is very broad, so an interested student may choose one of the variants to focus on, or may write a survey collecting the known results. Some PhD students and postdoc fellows of our Combinatorics group at IME-USP have been working (or have worked) on this topic. Some other results on decompositions of regular graphs into paths or bistars have also been proved by members of this group. We would be happy to welcome students interested in working with us. URL sujet détaillé : http://www.ime.usp.br/~yw/2014/projects/project.pdf
Remarques :





SM207173 Social networks


Description


Large networks (social networks, the web graph, neuronal networks, ...) are at the core of complex systems: nodes correspond to the entities of a system, while arcs describe their interactions. While there is a huge body of work providing models and analyses of complex networks, there are very few attempts to model and exploit the possible underlying geometric structure of networks.
Our goal is to exploit tools from geometric modeling and computational geometry (such as spectral methods) in order to analyze, visualize and better represent such large graphs. The main idea is to take advantage of the underlying geometric constraints of the spatial embedding of the graph in order to better characterize its structure. This could help in designing new, more efficient algorithms for network visualization and network compression. URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_GeometricNetworks.pdf
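A standard starting point for such spectral methods is the Laplacian embedding, sketched here with NumPy (assumed available; taking the two eigenvectors with smallest non-zero eigenvalues is the classical Laplacian-eigenmaps / spectral-drawing recipe, not a method prescribed by the subject):

```python
import numpy as np

def spectral_layout(adjacency):
    """2-D coordinates for the vertices of an undirected graph, taken from the
    eigenvectors of the graph Laplacian with the two smallest non-zero eigenvalues."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A   # combinatorial Laplacian L = D - A
    _, vecs = np.linalg.eigh(L)      # eigenvalues returned in ascending order
    return vecs[:, 1:3]              # skip the constant (eigenvalue-0) eigenvector

# Path graph on 3 vertices: 0 - 1 - 2.
P3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
coords = spectral_layout(P3)
print(coords.shape)  # → (3, 2)
```

The first coordinate (the Fiedler vector) already orders the path's endpoints on opposite sides of the middle vertex, a small instance of the geometric structure the internship proposes to exploit at scale.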
Remarques : Internship co-supervised by Maks Ovsjanikov (maks@polytechnique.fr)





SM207174 Complex networks


Description


We address the problem of compressing and compactly representing the connectivity of large graphs. In particular we consider complex networks: social networks, web graphs and biological networks are a few examples of classes of complex networks arising from real-world applications. Complex networks are usually very sparse graphs of huge size (with millions and even billions of nodes) and exhibit a surprisingly high number of structural properties. The main goal of this internship is to exploit methods from network analysis in order to provide rigorous and precise evaluations (and comparisons) of compression schemes, from both the practical and theoretical points of view. We also hope to take advantage of some structural properties of complex networks in order to design more efficient representations, finely tuned for certain classes of graphs (such as social networks). URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_ComplexNetworks.pdf
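A building block common to many such compression schemes (e.g., the gap encoding used by WebGraph-style representations) is easy to sketch in Python; the function names are ours:

```python
def gap_encode(neighbors):
    """Delta-encode a sorted adjacency list: store the first identifier, then
    the gaps between consecutive neighbours. In complex networks the gaps are
    small on average, so they compress well with variable-length codes."""
    if not neighbors:
        return []
    return [neighbors[0]] + [b - a for a, b in zip(neighbors, neighbors[1:])]

def gap_decode(gaps):
    """Inverse of gap_encode: rebuild the original sorted adjacency list."""
    out, acc = [], 0
    for i, g in enumerate(gaps):
        acc = g if i == 0 else acc + g
        out.append(acc)
    return out

print(gap_encode([3, 7, 8, 20]))  # → [3, 4, 1, 12]
```

Evaluating rigorously how much such schemes gain on different classes of complex networks, and why, is precisely the kind of analysis the internship targets.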
Remarques :





SM207175 Graphs embedded on surfaces


Description


Our main goal is to study, from both the combinatorial and algorithmic points of view, a class of structural properties characterizing the notion of graph planarity. We will consider the characterization known as "Schnyder woods" (and the closely related canonical orderings), which can be expressed in terms of edge orientations and colorings: they lead to a number of applications in several domains, such as graph drawing and encoding, random and exhaustive generation, compact representations, plane spanners, ... In particular we deal with higher-genus graphs, i.e., graphs embedded on higher-genus surfaces. For these graphs, two recent generalizations of Schnyder woods have been proposed, leading to new efficient algorithms for drawing and encoding graphs on surfaces. One of the goals of the internship will be to extend to the toroidal case the algorithms for efficiently generating and embedding triangulated graphs: the solutions to these two problems, in the planar case, make use of canonical orderings. URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_GenerationTriangulations.pdf
Remarques :





