SM2071 Performance Analysis of Congestion in Commodity Networks for Exascale Computing Platforms


Description


The Mont-Blanc project aims at building a supercomputer based on commodity low-power hardware such as ARM processors and Ethernet. It will not reach Exascale, but it is a first step in that direction.
Although there are numerous articles explaining how to tune or modify TCP to optimize end-to-end bandwidth in an HPC context, most of them completely ignore flow control and congestion management. Our initial experiments with TCP/Ethernet revealed non-trivial network congestion phenomena in the presence of intensive HPC workloads. More precisely, some MPI collective operations and the NAS Parallel Benchmarks conjugate gradient can lead, even at moderate scale, to critical congestion situations that force the whole application to freeze until a TCP timeout is triggered. Such blocking situations last only 200 milliseconds, but this is unacceptable on a platform where typical communication times are on the order of a few microseconds. Indeed, while such timeouts are harmless in a wide-area context over the Internet, they can occur systematically under such HPC workloads and slow down the whole execution by more than 30%. Beyond a simple application slowdown, it should be understood that the whole set of processors, even when forced into idleness, keeps consuming energy uselessly. Given the scale evolution of HPC platforms, such issues are likely to quickly become impossible to overlook.
There are several options to circumvent this issue:
1. Such timeouts seem to be linked to the TCP retransmission timeout (RTO), which could be decreased. The minimum value of this timeout (200 ms) is unfortunately hardcoded in the kernel. Although having kernels specifically tuned for HPC workloads is perfectly legitimate, it is quite difficult to know how to set such a value. Indeed, aggressively decreasing this timeout may trigger massive retransmissions that could make the network collapse even further.
2. Since most TCP variants are likely to run into the same kind of issues, another option could be to use a completely different protocol such as SCTP, which manages congestion in a very different way.
3. Another option could be to have user-space daemons that smartly inject packets into the system to prevent such timeouts from occurring.
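To see why a 200 ms stall is so damaging at microsecond timescales, a back-of-envelope computation is enough (the timeout frequency below is purely hypothetical; only the 200 ms RTO comes from the observations above):

```python
# Hypothetical model of the slowdown caused by 200 ms TCP retransmission
# timeouts in a run whose communications normally take ~100 microseconds.

def slowdown(comm_time_us, rto_ms, timeout_every_n_ops):
    """Average slowdown factor if one communication out of
    `timeout_every_n_ops` stalls for a full RTO."""
    normal = timeout_every_n_ops * comm_time_us * 1e-6  # seconds
    stalled = normal + rto_ms * 1e-3                    # one RTO added
    return stalled / normal

# One 200 ms timeout every 5000 communications of 100 us each already
# yields a ~40% slowdown, consistent with the >30% figure above.
print(slowdown(100, 200, 5000))  # ~1.4
```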
The goal of this project is to evaluate these different options. Experiments will be conducted on the Grid'5000 platform (probably on Nancy's Graphene cluster). In a first step, the student will familiarize themselves with the experimental procedures and with the techniques for controlling the experimental environment (e.g., recompiling the kernel, preparing custom images, etc.).
If time allows, similar experiments will be conducted on the Mont-Blanc prototype. Such a study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. URL sujet détaillé : http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_tcp_analysis.pdf
Remarques : This internship is of course paid.





SM2072 Modeling and Simulation of Dynamic Applications for Exascale Computing Platforms


Description


Multicore architectures comprising several GPUs have become mainstream in the field of High-Performance Computing. However, obtaining the maximum performance of such heterogeneous machines is challenging, as it requires carefully offloading computations and managing data movements between the different processing units. The most promising and successful approaches so far rely on task-based runtimes that abstract the machine and rely on opportunistic scheduling algorithms. As a consequence, the problem shifts to choosing the task granularity and task graph structure, and to optimizing the scheduling strategies. Trying different combinations of these alternatives is itself a challenge. Indeed, getting accurate measurements requires reserving the target system for the whole duration of the experiments. Furthermore, observations are limited to the few systems at hand and may be difficult to generalize. Finally, since execution times on real machines exhibit variability, dynamic schedulers tend to make varying scheduling decisions, and the obtained performance is thus far from deterministic. This makes performance comparisons questionable and the debugging of non-deterministic deadlocks inside such runtimes even harder.
Simulation is a technique that has proven extremely useful for studying complex systems and would be a very powerful way to address these issues. Performance models can be collected for a wide range of target architectures and then used to simulate different executions on a single commodity platform. Since the execution can be made deterministic, experiments become /completely reproducible/, which also makes debugging a lot easier. Additionally, it is possible to extrapolate to target architectures, for instance by increasing the available PCI bandwidth or the number of GPU devices, and thus to estimate the performance that would be obtained on hypothetical platforms.
In a recent Euro-Par article, we showed how we crafted a coarse-grain hybrid simulation/emulation of StarPU, a dynamic runtime system for heterogeneous multicore architectures, on top of SimGrid, a simulation toolkit specifically designed for distributed system simulation. This approach obtains performance predictions accurate within a few percent on classical dense linear algebra kernels in a matter of seconds, which allows both runtime and application designers to quickly decide which optimizations to enable or whether it is worth investing in higher-end GPUs.
It is thus currently possible to precisely evaluate the performance of *dynamic* HPC applications running on a single hybrid node, but evaluating such applications at larger scale remains quite challenging. StarPU was recently extended to exploit clusters of hybrid machines by relying on MPI. Since SimGrid's ability to accurately simulate MPI applications has already been demonstrated (PMBS'13), combining both works should make it possible to investigate performance predictions of complex applications on large-scale high-end HPC infrastructures.
The goal of this internship is thus to experiment with dynamic HPC applications based on StarPU-MPI and to evaluate their performance in simulation. In a first step, medium-size clusters of hybrid nodes will be used. If time allows, such validation could be conducted on the Mont-Blanc prototypes, which rely on ARM processors and GPUs similar to the ones that may be used in future Exascale platforms. Such a performance study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. URL sujet détaillé : http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_starpu_mpi.pdf
Remarques : This internship is of course paid.





SM2073 Analysis and Modeling of Cache Performance for Exascale Computing Platforms


Description


The Mont-Blanc project aims at building a supercomputer based on ARM technology. It will not reach Exascale, but it is a first step in that direction. ARM processors have almost no memory hierarchy and may thus be easier to model than the more recent multicore architectures used in PetaScale systems.
The goal of this internship is thus to experiment with multicore ARM processors similar to the ones that may be used in future Exascale platforms, and to derive from such experimentation a performance and power consumption model. Measuring the parameters of such a model will require designing specific micro-benchmarks that reach the peak performance of these processors, which in turn requires exploring a wide parameter space of compilation options, loop unrolling and vectorization options. To ease such code generation, we will rely on the [[https://forge.imag.fr/projects/boast/][BOAST]] framework, which allows metaprogramming computing kernels and their optimizations.
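The exploration loop that such micro-benchmarking requires can be sketched as follows; the variant space and the cost model below are purely illustrative stand-ins for compiling and timing BOAST-generated kernels:

```python
import itertools

def measure(variant):
    # Stand-in for generating, compiling and timing one kernel variant;
    # a real study would invoke the BOAST-generated code here.
    unroll, vectorized = variant
    return 1.0 / (unroll * (2 if vectorized else 1))  # synthetic runtime

# Hypothetical parameter space: unroll factor x vectorization on/off
space = list(itertools.product([1, 2, 4, 8], [False, True]))
best = min(space, key=measure)
print(best)  # the fastest variant under this toy cost model
```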
Such a model will then be implemented in the [[http://simgrid.gforge.inria.fr/][SimGrid]] toolkit in order to evaluate the performance of simple benchmarks like LINPACK on such platforms. Such a study should provide invaluable feedback to Exascale platform architects, in particular with respect to network and CPU provisioning. URL sujet détaillé : http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_cache_analysis.pdf
Remarques : This internship is of course paid.





SM2074 Response Time Optimization of Multi-User BOINC Projects


Description


Recent evolutions in internetworking technology and the decreasing cost-performance ratio of computing components have enabled Volunteer Computing (VC) systems. Among these platforms, BOINC ([[http://boinc.berkeley.edu/][Berkeley Open Infrastructure for Network Computing]]) is the most representative, with over 580,000 hosts that deliver over 2,300 TeraFLOP per day and many deployed projects. In such platforms, volunteers usually donate their machines' idle CPU time to scientific projects and indicate preferences on how to share their processing power among them. BOINC projects usually have hundreds of thousands of independent CPU-bound tasks and are interested in overall throughput, i.e., in maximizing the total number of tasks completed per day. Each project has its own server, which is responsible for distributing work units to clients, collecting results, and validating them.
To handle this complexity, BOINC implements a simple and robust distributed protocol which ensures that local shares and volunteer priorities are respected. This protocol has been used for many years now, and its efficiency and fairness have been assessed in the context of throughput-oriented projects.
Lately, the BOINC workload has evolved in two noticeable ways. First, to reduce the burden of deploying and maintaining BOINC servers, multi-user projects (also called /umbrella/ projects) have appeared. Such BOINC projects host several scientific projects (in the following, we will talk about users rather than scientific projects to avoid confusion with BOINC projects), each with its own application, which provides a much larger audience to every user than if they had deployed their own server. For example, the [[http://www.worldcommunitygrid.org/][World Community Grid]], funded by IBM, has been one of the first such BOINC projects, helping research on drug search for Leishmaniasis, clean water localization, childhood cancer, etc. More recently, [[http://casathome.ihep.ac.cn/][CAS]], a Chinese project, has been set up to help protein structure, nanotechnology, cancer genomics, and high energy physics research. Likewise, [[http://www.ibercivis.es/][Ibercivis]], a Spanish-Portuguese initiative, has been set up to provide a huge BOINC umbrella project for scientists in those countries, and [[http://rechenkraft.net/yoyo/][Yoyo]] helps research on harmonious trees, elliptic curve factorization, and DNA evolution, among others. In such BOINC projects, demand for computing power generally exceeds supply, hence the need for a mechanism to divide the supply among users. Second, although in the past most users had an unlimited supply of jobs to process (throughput-oriented users), the popularization of BOINC has attracted other types of users. In CAS, many users have campaigns (batches) made of fewer tasks and are interested in the response time of campaigns. For such users, throughput is not meaningful and response time should be minimized (response-time-oriented users). In such multi-user projects, each user has a numeric quota representing the fraction of the computing resource they should get. What are the precise semantics of quotas?
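One candidate semantics for these quotas, sketched below with entirely hypothetical user names and a simple deficit rule (this is not the BOINC scheduler), is to serve, at each work request, the user whose received share lags furthest behind its quota:

```python
def pick_user(quotas, served):
    """Return the user whose received share lags furthest behind its quota.
    quotas: user -> numeric quota; served: user -> work units received."""
    total_q = sum(quotas.values())
    total_s = sum(served.values()) or 1   # avoid division by zero at start

    def deficit(u):
        return quotas[u] / total_q - served[u] / total_s

    return max(quotas, key=deficit)

# Hypothetical umbrella project with three users sharing a 2:1:1 quota
quotas = {"drug-search": 2, "cancer": 1, "physics": 1}
served = {"drug-search": 10, "cancer": 10, "physics": 0}
print(pick_user(quotas, served))  # "physics": it has received nothing yet
```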
Ideally, the mechanism should accommodate the previous two classes of users. The GridBot project [gridbot, Silberstein11, Silberstein12] recently made use of a hybrid computing platform composed of grids, clusters, and VC to execute workloads resulting from a mix of throughput- and response-time-oriented applications. Such initiatives are very encouraging but somewhat /ad hoc/: they consider neither the problem of streams of batches nor the resource-sharing issue. We think that regular BOINC servers can be improved to seamlessly handle such workloads.
In the last BOINC workshops ([[http://boinc.berkeley.edu/trac/wiki/WorkShop11][2011]], [[http://boinc.berkeley.edu/trac/wiki/WorkShop12][2012]], [[http://boinc.berkeley.edu/trac/wiki/WorkShop13][2013]]), members of the Mescal team, the BOINC team, the World Community Grid project, and the OurGrid project have brainstormed to design a new server architecture that would make it possible to fairly and efficiently handle complex workloads such as the ones previously described. Yet, many parts of this architecture (batch prioritizing, batch completion estimation, client reliability estimation, etc.) remain to be specified.
The goal of this internship is thus first to implement a simulation of such a prototype and to evaluate its performance under realistic workloads from the World Community Grid or from CAS. Such evaluation will then guide the design of different scheduling and estimation mechanisms. The project could involve visits to Berkeley, California to present results, which will later be implemented in the BOINC server and deployed on the World Community Grid and in CAS. URL sujet détaillé : http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_boinc.pdf
Remarques : This internship is of course paid.





SM2075 Learning algorithms for optimal routing in large-scale systems


Description


Consider the following multi-flow routing problem: N flows (each given by a source, a destination, and a volume) share a communication network with limited throughput on the links. We want to design a distributed algorithm that discovers the best routes for all flows with only partial knowledge of the network topology and of the routes chosen by other flows.
Classical algorithms are based on game theory and see the flows as players whose possible strategies are all the routes from source to destination. The best algorithms are shown to scale well with the number of flows, but not so well with the number of possible routes per flow. The goal here is to explore alternative approaches where the nodes of the communication network become the active players, with strategies limited to their output links. These new approaches have the potential to solve very large routing problems very fast. URL sujet détaillé :
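The classical flows-as-players baseline can be sketched with best-response dynamics on a toy, entirely hypothetical topology (links are simply named a-d, and a flow's cost is the load it observes on its links):

```python
from collections import Counter

# flow -> candidate paths (each path is a tuple of link names)
paths = {"f1": [("a", "b"), ("c",)],
         "f2": [("a",), ("c", "d")]}
choice = {f: 0 for f in paths}  # currently chosen path index per flow

def link_loads():
    loads = Counter()
    for f, i in choice.items():
        loads.update(paths[f][i])
    return loads

for _ in range(10):  # iterate best responses until (hopefully) stable
    for f in paths:
        def cost(i):
            loads = link_loads()
            loads.subtract(paths[f][choice[f]])  # remove f's own load
            return sum(loads[link] + 1 for link in paths[f][i])
        choice[f] = min(range(len(paths[f])), key=cost)

print(choice)  # the two flows end up avoiding the shared link 'a'
```

Note how the strategy space is the set of whole paths, which is exactly what scales poorly and motivates the nodes-as-players alternative above.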
Remarques :





SM2076 Learning and mean field approximations.


Description


Q-learning is an algorithmic technique that learns how to act optimally in a Markov decision process, without prior knowledge of its transition probabilities, while maintaining an optimal reward over an infinite horizon. While this makes the approach particularly attractive from a conceptual point of view, it suffers acutely from the curse of dimensionality, which makes it inapplicable in practice. In this internship, we propose to explore the possibility of aggregating the state space of a large Markov decision process made of several identical objects and of using Q-learning on the resulting mean field approximation, so as to avoid the state space explosion.
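For concreteness, the basic tabular Q-learning update looks as follows on a toy two-state chain (all numbers are hypothetical); it is this per-state table that blows up with the number of objects and that mean-field aggregation would shrink:

```python
# Minimal tabular Q-learning sketch on a toy 2-state MDP.
import random

random.seed(0)
states, actions = [0, 1], [0, 1]
Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma = 0.1, 0.9  # learning rate and discount factor

def step(s, a):
    # toy dynamics: action 1 moves to state 1, which pays reward 1
    s2 = a
    return s2, 1.0 if s2 == 1 else 0.0

s = 0
for _ in range(2000):
    # epsilon-greedy action selection (20% exploration)
    a = random.choice(actions) if random.random() < 0.2 else \
        max(actions, key=lambda a: Q[(s, a)])
    s2, r = step(s, a)
    # the Q-learning update rule
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions)
                          - Q[(s, a)])
    s = s2

# The learned greedy policy should prefer action 1 in both states
print(max(actions, key=lambda a: Q[(0, a)]))
```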
This project involves both mathematical modeling and algorithmic techniques. No background knowledge is needed, but some familiarity with probability theory (or a taste for it) will help. URL sujet détaillé :
Remarques :





SM2077 Dynamic solution of the memory affinity problem.


Description


One of the main problems faced by HPC applications that run on modern multicore NUMA architectures is the memory affinity problem: place each task on a processor close to the memory where the data used by the task is stored. Considered statically, optimal placement is NP-complete. One way to approach the problem is to dynamically change the task allocation according to an adaptive algorithm based on the aggregate performance of all the allocations used since the start of the program, rather than on the current allocation only, and thus to learn from past mistakes. This approach has never been used in this context but should prove very efficient compared with the static heuristics based on graph partitioning that are currently used. URL sujet détaillé :
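One hedged way to read the adaptive idea above is as a bandit problem over candidate placements, picking allocations by their aggregate observed performance; the sketch below uses entirely hypothetical placements and a synthetic noisy performance measure:

```python
import random
random.seed(1)

allocations = ["local", "remote", "interleaved"]   # hypothetical placements
true_speed = {"local": 1.0, "remote": 0.6, "interleaved": 0.8}
total = {a: 0.0 for a in allocations}  # aggregate performance per placement
count = {a: 0 for a in allocations}

def choose(eps=0.1):
    """Epsilon-greedy choice on aggregate (average) performance so far."""
    if random.random() < eps or not all(count.values()):
        return random.choice(allocations)
    return max(allocations, key=lambda a: total[a] / count[a])

for _ in range(500):
    a = choose()
    perf = true_speed[a] + random.gauss(0, 0.05)   # noisy measurement
    total[a] += perf
    count[a] += 1

print(max(allocations, key=lambda a: total[a] / count[a]))
```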
Remarques :





SM2078 Assembly and fitness estimation of Dengue virus genomes


Description


390 million people worldwide are infected yearly by Dengue virus through mosquito bites. No therapy is currently available. To control the epidemics, the most promising strategies target the mosquito host to stop viral transmission. However, little is known about how the virus interacts with its mosquito host, which makes it very hard to predict whether those strategies could be successful. To understand how the virus interacts with the mosquito cell, a team of virologists headed by Marlène Dreux runs viral evolution experiments in mosquito cells and sequences the viral population throughout its evolution. The analysis of the resulting sequences will help us identify the sites of the viral genome that interact with the mosquito cell machinery. The master internship aims at developing bioinformatic and statistical methods to (i) assemble complete viral genomes from the sequences and (ii) predict the fitness of those assembled genomes in order to identify important sites. These tasks require parallel algorithms to handle massive amounts of sequence data, as well as handling of the quasispecies equation to infer genome fitness. The intern will implement the above-described algorithms. Good programming skills are most welcome, as well as a good working knowledge of probability and statistics. Moreover, interest in evolution, virology, or public health issues is essential. The host laboratory (LBBE) is a stimulating and pleasant place to work, where one can meet biologists, physicians, computer scientists, mathematicians, and statisticians working on problems that range from ecology to medicine, through genomics and evolution. URL sujet détaillé : https://lbbe.univlyon1.fr/IMG/pdf/project_m2_student_bastien_boussau.pdf?1229/4b82a603f0cc81c9cba9f30b14690da6e18d2747
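For reference, the quasispecies equation mentioned above is, in its standard (Eigen) form, a system of ODEs over genotype frequencies: writing x_i for the frequency of genotype i, f_j for the fitness of genotype j, and q_{ji} for the probability that replicating genotype j yields genotype i,

```latex
\dot{x}_i \;=\; \sum_{j} f_j \, q_{ji} \, x_j \;-\; \phi(t)\, x_i,
\qquad \phi(t) \;=\; \sum_{j} f_j \, x_j ,
```

where the mean fitness \phi(t) keeps the frequencies normalized. One hedged reading of the fitness-inference task above is to fit the f_j to the observed frequency trajectories under this model.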
Remarques :





SM2079 Formal Verification of Security Properties of E-voting Systems


Description


Designing and implementing cryptographic systems that offer effective security guarantees is known to be a complicated task. These are widely deployed applications involving complex information flows, which makes code reviews and testing less effective. They use modern cryptographic primitives that must be carefully designed: the history of cryptography (including the last decade) is full of examples of flawed cryptographic systems that were thought secure for a long time (and sometimes even proved secure). Furthermore, their security should hold in hostile environments.
Electronic voting schemes are one example of such systems. They are meant to offer at least the same security guarantees as traditional paper-based systems. They make use of cryptographic primitives and rely on protocols that are notoriously difficult to design and verify.
In this internship, we propose to formally define and verify security properties of Helios, one of the best-established academic voting protocols. The development will be performed in EasyCrypt, a proof assistant jointly developed by the IMDEA Software Institute and Inria for verifying the security of cryptographic constructions in the computational model.
We invite applicants to read the following paper for more information:
https://eprint.iacr.org/2013/235 URL sujet détaillé : https://www.easycrypt.info/Internships
Remarques : Remuneration possible.
Co-advisors: F. Dupressoir, P.-Y. Strub





SM20710 Formal Verification of a Compiler for Masking-Based Countermeasures


Description


In the real world, attackers have access to side channels that may reveal information about intermediate variables of a computation, even those that depend on secrets. For instance, recent and very efficient attacks have allowed adversaries to recover the secret key using only AES power traces, and the design and implementation of countermeasures is now of central importance.
Masking is one such countermeasure. It consists in using a secret-sharing scheme to split each sensitive variable, and the computations on it, into (t + 1) shares such that observing any t of them does not reveal any information about the secrets. Intuitively, an implementation that is split over (t + 1) shares should be able to resist the leakage of t of its intermediate variables.
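The sharing itself can be illustrated in a few lines of Boolean (XOR) masking, the simplest instance of the scheme described above: any t shares are uniformly random, while XORing all t + 1 shares recovers the secret (a minimal sketch, not the compiler to be verified):

```python
import functools
import secrets

def share(secret_byte, t):
    """Split a byte into t+1 XOR shares: t random bytes plus one
    correction share so that the XOR of all shares is the secret."""
    shares = [secrets.randbelow(256) for _ in range(t)]
    last = functools.reduce(lambda x, y: x ^ y, shares, secret_byte)
    return shares + [last]

def reconstruct(shares):
    return functools.reduce(lambda x, y: x ^ y, shares)

s = share(0xAB, t=3)           # 4 shares resist 3 observations
print(len(s), hex(reconstruct(s)))  # 4 0xab
```

Masked *computation* (e.g., refreshing shares and multiplying shared values without recombining them) is the hard part that the compiler automates.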
In this internship, we propose to formally verify, in the Coq proof assistant, a masking compiler, i.e., a compiler that transforms an input program into a functionally equivalent program that is resistant to t intermediate observations. The implementation, the functional correctness proof, and the security proofs will all be performed in Coq.
We invite applicants to read the following paper for more information:
http://www.cs.berkeley.edu/~daw/papers/privcirccrypto03.pdf URL sujet détaillé : https://www.easycrypt.info/Internships
Remarques : Remuneration possible.
Co-advisors: F. Dupressoir, P.-Y. Strub





SM20711 Energy-efficient cloud elasticity for data-driven applications


Admin


Encadrant : Anne-Cécile ORGERIE
Labo/Organisme : IRISA, Myriads team
URL : http://www.irisa.fr/myriads 
Ville : Rennes 



Description


Distributed and parallel systems offer users tremendous computing capacity. They rely on distributed computing resources linked by networks, and require algorithms and protocols to manage these resources transparently for users. Recently, the maturity of virtualization techniques has allowed the emergence of virtualized infrastructures (Clouds). These infrastructures provide resources to users dynamically, adapted to their needs. By benefiting from economies of scale, Clouds can efficiently manage and offer virtually unlimited numbers of resources, reducing the costs for users.
However, the rapid growth of Cloud demand leads to a worrying and uncontrolled increase in electricity consumption. In this context, we will focus on data-driven applications, which require processing large amounts of data. These applications have elastic needs in terms of computing resources, as their workload varies over time. Since reducing energy consumption and improving performance are conflicting goals, this internship aims at studying possible trade-offs for energy-efficient data processing without performance impact. As elasticity comes at the cost of reconfigurations, these trade-offs will consider the time and energy required by the infrastructure to dynamically adapt the resources to the application's needs.
The validation of the proposed algorithms may rely on the French experimental platform Grid'5000. This platform comprises about 8,000 cores geographically distributed across 10 sites linked by a dedicated gigabit network. Some of these sites have wattmeters which provide the consumption of the computing nodes in real time. This validation step is essential, as it will ensure that the selected criteria are well observed: energy efficiency, performance, and elasticity. This work will be done in collaboration with the Lawrence Berkeley National Laboratory (USA) in the context of the DALHIS associate team.
Advisors: Anne-Cécile Orgerie annececile.orgerie.fr, Christine Morin christine.morin.fr URL sujet détaillé : http://www.irisa.fr/myriads/openpositions/internships1314/document.20130925.9133673154
Remarques : Contact: Anne-Cécile Orgerie annececile.orgerie.fr, Christine Morin christine.morin.fr





SM20712 LargeScale Graph Processing Platform on Supercomputers


Description


Large-scale graph analytics is one of the important approaches to big data processing, for a variety of purposes such as web and social network mining, life science, cyber security, and so forth. We have been developing an open-source billion-scale graph analytics library called ScaleGraph (URL: http://www.scalegraph.org) on top of X10, a parallel and distributed programming language. Over the last 3 years, we have successfully demonstrated its high-performance nature, handling the billion-scale whole-Twitter follower network on supercomputers with more than 3,000 CPU cores.
In this project, members will work on various research themes depending on their interests and skills, using the TSUBAME 2 supercomputer located at Tokyo Institute of Technology in Japan. The project is mainly divided into three teams. A) Graph Algorithms Team: designs and implements a set of parallel and distributed graph algorithms that the ScaleGraph library is currently missing, such as community detection, clustering, link prediction, bipartite matching, and their incremental variants; the team will then test the algorithms on our crawled Twitter network and conduct performance evaluations on TSUBAME 2. B) System Software Team: tackles various problems such as optimizing the underlying graph processing framework based on the Pregel model, designing and implementing asynchronous graph processing models, and implementing out-of-core graph processing with external memory; we could broaden the scope if you have good ideas. C) Application Team: working together with the other teams and with domain experts, explores new application areas such as human brain networks, protein-protein interaction, linked data such as RDF, cyber security, and so forth, with the ScaleGraph library. Each member is expected to deliver both software assets and publications for international conferences. For further information, see the ScaleGraph web page (http://www.scalegraph.org).
Recommended Skills:
- OS, programming languages, tools: Linux, C/C++, Java, MPI, performance profilers, Git/Subversion, etc.
- Knowledge of graph analytics, high-performance computing, and parallel and distributed systems
- Ability to write and read publications for international conferences
Duration: 6 months (negotiable). Salary: €1000-€1300 (depending on skills). Work Location: UCD School of Computer Science and Informatics, Dublin, Ireland.
Contact: Please send your CV/resume (detailed information such as programming experience is preferable) and your preferred research theme to crest.research.com. If selected, you will be contacted and interviewed by Prof. Toyotaro Suzumura (University College Dublin / IBM Research) and other team members. URL sujet détaillé :
Remarques :





SM20713 Integrated blackbox+whitebox performance modeling of Cloud applications


Description


Context: Cloud computing offers unprecedented levels of flexibility for efficiently deploying demanding applications. Cloud users have access to a large variety of computing resources with various combinations of configurations and prices. In principle, this allows them to use exactly the types and numbers of resources their applications need. However, this flexibility also comes as a curse, as choosing the best resource configuration for an application becomes extremely difficult: cloud users often find it very hard to accurately estimate the requirements of complex applications. It is therefore useful to model the performance of Cloud applications, so that the model can be used to choose the best resource configuration for an application.
The HARNESS European project (http://www.harnessproject.eu) is currently working on two complementary approaches to model the performance of cloud applications. First, a "black-box" method can build a performance model of arbitrary applications. This approach is very general, but it may require a significant number of profiling executions before capturing all the relevant specificities of a new application. We therefore also propose to develop one or more "white-box" performance modeling components. Each such component may be specialized for a particular class of applications, for example MapReduce applications. White-box performance modeling may be faster than black-box modeling in identifying key information, thanks to its domain-specific knowledge of a specific class of applications. When a white-box component applies to a user's application, it can thus considerably speed up the black-box performance modeling process.
Research topic:
The objective of this master project is to design and develop a unified performance modeling component which combines the best aspects of both models: a combined black-box/white-box model would be able to handle any type of application, but for a number of common application types it would make use of domain-specific information. The idea is to make the black-box model responsible for exploring possible resource configurations, measuring the performance and cost of the application using these resources, and deriving a performance model from these experiments. Measured metrics are also given to all the white-box models. If one of the white-box models generates useful information, it can inform the black-box model about configurations that should be avoided.
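The interaction just described can be sketched as follows; this is a hypothetical illustration of the principle, not the ConPaaS implementation, and the configuration space, the cost function, and the veto predicate are all invented for the example:

```python
def combined_search(configs, measure, whitebox_vetoes):
    """configs: candidate resource configurations; measure: profiling run
    returning a cost; whitebox_vetoes: predicates from domain-specific
    models flagging configurations predicted to be useless."""
    best, best_cost = None, float("inf")
    for c in configs:
        if any(veto(c) for veto in whitebox_vetoes):
            continue                      # white-box: skip, no profiling run
        cost = measure(c)                 # black-box: actually profile it
        if cost < best_cost:
            best, best_cost = c, cost
    return best

configs = [{"gpus": g, "cores": n} for g in (0, 1, 2) for n in (4, 8)]
# e.g. a MapReduce white-box model predicting GPUs bring no benefit here
vetoes = [lambda c: c["gpus"] > 0]
best = combined_search(configs, measure=lambda c: 100 / c["cores"],
                       whitebox_vetoes=vetoes)
print(best)  # {'gpus': 0, 'cores': 8}
```

The white-box vetoes shrink the space the black-box explorer must profile, which is exactly the intended speed-up.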
For example, we have a useful white-box model for MapReduce applications which can quickly decide whether using GPUs or FPGAs for executing the map or reduce tasks is likely to provide performance benefits.
The student will be in charge of proposing concrete ways to make the two performance models work in conjunction, following these principles. Developments will be based on ConPaaS (www.conpaas.eu), where the black-box model is already implemented. The validation of the proposed algorithms may rely on the French experimental platform Grid'5000 (www.grid5000.fr). This platform comprises about 8,000 cores geographically distributed across 10 sites linked by a dedicated gigabit network.
This work will be carried out in collaboration with Imperial College London, which is developing white-box performance models for MapReduce and data-streaming applications.
URL sujet détaillé :
Remarques :





SM20714 Using constraint programming to solve


Description


Constraint programming is a powerful tool for solving problems that can be modeled with decision variables and constraints, where a constraint is a relationship between one or more variables that limits the values simultaneously taken by the variables involved in the constraint. Search algorithms in constraint programming typically rely on constraint propagation, to reduce the number of candidate solutions to explore, combined with a systematic search among the possible assignments of each variable. Such algorithms are guaranteed to find a solution when one exists, and can prove that a problem has no solution if none has been found at the end of the exhaustive search.
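The systematic search just described can be sketched on a toy, hypothetical problem; real solvers additionally propagate constraints to prune variable domains before branching, whereas this minimal sketch only checks constraints whose variables are all assigned and backtracks on failure:

```python
def solve(domains, constraints, assignment=None):
    """domains: var -> iterable of values; constraints: list of
    (variables, predicate) pairs. Returns a satisfying assignment or None."""
    assignment = dict(assignment or {})
    unassigned = [v for v in domains if v not in assignment]
    if not unassigned:
        return assignment
    var = unassigned[0]
    for val in domains[var]:
        assignment[var] = val
        # check every constraint whose variables are all assigned
        ok = all(pred(*[assignment[v] for v in vs])
                 for vs, pred in constraints
                 if all(v in assignment for v in vs))
        if ok:
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
    return None  # exhausted the domain: backtrack

domains = {"x": [1, 2, 3], "y": [1, 2, 3]}
constraints = [(("x",), lambda x: x % 2 == 1),         # x must be odd
               (("x", "y"), lambda x, y: x + y == 4)]  # x + y = 4
print(solve(domains, constraints))  # {'x': 1, 'y': 3}
```

In the cryptanalytic setting below, the variables would instead encode byte differences through the AES rounds and key schedule, at a vastly larger scale.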
The AES has been the block cipher standard since 2001. There are currently no attacks on the full cipher, although recent results have improved the understanding of its security. The purpose of this internship is to use constraint programming to reproduce the attacks against the AES obtained by A. Biryukov et al. in [1] and P.-A. Fouque et al. in [2], and to attempt to extend the method to the block cipher Rijndael. The attack by Biryukov et al., proposed in 2009, is a so-called chosen-key attack against the AES. The main idea is to look for truncated differentials that form valid paths through both the encryption process and the key schedule itself. The authors proposed a branch-and-bound algorithm to solve this problem.
In this internship, we propose to solve this problem using constraint programming, building on the results we have already obtained in [3]. The first step will be to validate the results of Biryukov et al. and of Fouque et al. (CRYPTO 2013); the second will be to extend the approach to the Rijndael cipher.
References:
[1] Alex Biryukov, Ivica Nikolic: Automatic Search for Related-Key Differential Characteristics in Byte-Oriented Block Ciphers: Application to AES, Camellia, Khazad and Others. EUROCRYPT 2010: 322–344
[2] Pierre-Alain Fouque, Jérémy Jean, Thomas Peyrin: Structural Evaluation of AES and Chosen-Key Distinguisher of 9-Round AES-128. CRYPTO (1) 2013: 183–203
[3] Marine Minier, Christine Solnon and Julia Reboul: Solving a Symmetric Key Cryptographic Problem with Constraint Programming. ModRef 2014, workshop of the CP 2014 conference
URL sujet détaillé :
:
Remarques : this internship will be carried out jointly with Christine Solnon from the LIRIS laboratory.





SM20715 An overlay network for a distributed cloud within the backbone


Description


While the Internet was built in a decentralised fashion, we are currently witnessing a strong re-centralisation of computing facilities such as those run by Amazon, Google or Facebook. The intrinsic limitations of this model are increasingly visible, chiefly due to reliability, privacy, and energy issues, as users have recently experienced [1].
The Discovery project [2] is driven by the objective of proposing a decentralised cloud architecture grounded in local resources located within the core of the network. Indeed, the infrastructures set up by Internet providers are most of the time over-provisioned given the actual needs. Installing computing resources along the paths of the backbone could consequently reduce the costs related to the construction, maintenance, and energy supply of a computing centre by capitalising on the existing infrastructure. It would also lead to a natural distribution of resources while favouring the use of resources that are geographically close to the user.
Operating such an infrastructure calls for means to abstract away the details of the complex, heterogeneous and unreliable underlying network, so as to offer a simple API that users can leverage to deploy their applications easily. This requires designing an overlay network, deployed between the infrastructure and the applications, on which the API will rely. A promising approach is the Vivaldi protocol [3], which places nodes in a Cartesian space according to the network distance between them.
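As an illustration, one Vivaldi-style coordinate update can be sketched as follows (a simplified version with a fixed timestep `delta`; the actual protocol [3] also adapts the timestep to the confidence of the estimate):

```python
import math

def vivaldi_step(xi, xj, rtt, delta=0.25):
    """Move node i's coordinate `xi` so that its Euclidean distance to
    node j's coordinate `xj` better matches the measured round-trip time."""
    dist = math.dist(xi, xj)
    if dist == 0:
        # Coincident coordinates: pick an arbitrary direction to separate.
        unit = [1.0] + [0.0] * (len(xi) - 1)
    else:
        unit = [(a - b) / dist for a, b in zip(xi, xj)]
    error = rtt - dist  # positive: the two nodes are embedded too close
    return [a + delta * error * u for a, u in zip(xi, unit)]
```

Iterating this update over repeated latency measurements makes inter-node distances in the Cartesian space converge towards the measured latencies, which is what lets the overlay locate close nodes cheaply.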
The work will consist in the design of such an overlay and in the definition of the routing strategy to be adopted within it. In particular, one needs to be able to efficiently locate close nodes and retrieve objects stored in the network, while minimising the maintenance cost of the overlay. The work will be firstly algorithmic and secondly experimental, through the validation of the proposed overlay network and its routing and maintenance algorithms in a software prototype to be deployed on the Grid'5000 national platform [4]. URL sujet détaillé :
:
Remarques :  Advisor 1: Cédric Tedeschi, INRIA / Univ. Rennes 1 (http://people.irisa.fr/Cedric.Tedeschi)
 Advisor 2: Marin Bertier, INRIA / INSA Rennes (http://www.irisa.fr/asap/?page_id=195)
 Pay: ~500 €





SM20716 Dynamic distributed adaptation of data stream processing platforms


Description


The amount of data generated by social media such as Twitter makes it appealing to be able to extract relevant information from these data in real time. Storm [1,2], initially developed at Twitter, was designed to offer a simple API for this problem. Developing a Storm application chiefly consists in defining the different steps a data stream will go through so as to extract information from it. Storm then eases the deployment of this topology over a computing platform.
In other words, Storm automates the deployment of the set of processes composing the extraction application on the physical computation nodes available, for instance, in a cluster. To achieve and maintain such a deployment, Storm relies on a hierarchical structure monitoring worker nodes so as to replace them in case of failure. One strong limitation of this approach is its hierarchical shape and its inability to cope with the dynamic nature of computing platforms. Indeed, the top of the hierarchy is occupied by an agent responsible for applying fault-tolerance [3] and load-balancing policies. Moreover, dynamically remapping the processes onto the computing nodes as the velocity of the stream quickly changes is still an open problem in Storm, and more generally in stream processing platforms [4].
The work will consist in modelling a Storm topology and in proposing distributed algorithms (without the need for a monitoring hierarchy) to adapt to dynamic changes in the stream and in the platform's conditions. More precisely, the computing power dedicated to the support of each process needs to be dynamically adjusted, and this needs to be done in a decentralised fashion, each node deciding, based on local information about its workload, whether to duplicate itself or to terminate.
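A sketch of such a local rule (all names and thresholds here are hypothetical, for illustration only): each replica compares its input rate to its processing capacity and decides alone whether to duplicate itself, terminate, or do nothing.

```python
def local_scaling_decision(queue_length, processing_rate, input_rate,
                           high=0.8, low=0.2, max_queue=1000):
    """Decentralised decision taken by one replica from purely local
    information; no monitoring hierarchy is consulted."""
    if processing_rate <= 0:
        return "duplicate"  # cannot keep up at all
    load = input_rate / processing_rate
    if load > high or queue_length > max_queue:
        return "duplicate"  # overloaded: request one more replica
    if load < low and queue_length == 0:
        return "terminate"  # underused: release this replica
    return "keep"
```

The interesting (and open) part is choosing thresholds and damping so that many nodes applying this rule independently do not oscillate; that is precisely the kind of behaviour the Grid'5000 prototype would let us observe.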
Validating this work will rely on the development of a prototype including these algorithms and its deployment over the Grid'5000 platform [5]. URL sujet détaillé :
:
Remarques :  The team:  Marin Bertier (Associate prof., INRIA / INSA Rennes)
 Matthieu Simonin (Engineer, INRIA)
 Cédric Tedeschi (Associate prof., INRIA / U. Rennes 1)
 Christine Morin (Senior Researcher, INRIA)
 Pay: ~500 €





SM20717 Secure multiparty computation using the bitcoin protocol


Description


General context:
The problem studied in the field of Secure Multi-Party Computation (SMPC) is as follows: there are n parties or nodes in a network. Each node has an input which is private; the node does not wish to reveal it to any other node in the network. A secure multi-party computation protocol enables the nodes to collaboratively compute a function over their inputs such that all inputs remain private, i.e., known only to their owners. An example of such a protocol is private maximum: given n nodes, each with a private input, determine the maximum without revealing which node holds it or any of the other private inputs. Other examples of possible private computations include geometric functions, set operations, reputation [1], etc.
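To give the flavour of such protocols, here is a minimal additive secret-sharing sketch for a private sum (deliberately simpler than private maximum, and in no way a hardened implementation): each party splits its input into random shares modulo a public prime, so that no individual share reveals anything, while the shares of all parties add up to the result.

```python
import random

P = 2**61 - 1  # public prime modulus

def share(secret, n):
    """Split `secret` into n additive shares modulo P; any n-1 of the
    shares are uniformly random and reveal nothing about the secret."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def private_sum(inputs):
    """Each party i shares its input and sends share j to party j; each
    party publishes only the sum of the shares it received; adding the
    published values yields the total and nothing else."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]  # row i: party i's shares
    published = [sum(row[j] for row in all_shares) % P for j in range(n)]
    return sum(published) % P
```

Even this toy protocol already shows the cost issue mentioned below: every party sends a share to every other party, i.e. O(n^2) messages for one addition.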
Objectives of the internship project:
The objective of this project is to study secure multi-party computation protocols. Such protocols are often resource-heavy, which limits their practical application. The specific objective of the project is to analyze and propose methods for rendering these protocols resource-efficient and thus practical. A study of cryptographic techniques and privacy enhancing technologies such as zero-knowledge proofs, homomorphic encryption, electronic cash, etc. will be part of the project.
Recent advancements in cryptographic electronic currencies (e.g. Bitcoin) are also impacting how secure multi-party computation protocols are constructed. In particular, the Bitcoin protocol may be used to build fully decentralized secure multi-party protocols that do not need to rely on trusted third parties. The project will thus also include a study of cryptographic electronic currency protocols.
References:
[1] A Decentralized Privacy Preserving Reputation Protocol for the Malicious Adversarial Model. Omar Hasan, Lionel Brunie, Elisa Bertino, and Ning Shang. IEEE Transactions on Information Forensics and Security (TIFS). April 2013. URL sujet détaillé :
:
Remarques :





SM20718 A spectral approach to symmetries generated by self-similarity transformations


Description


Symmetry plays an important role in our lives, providing our brain with a compact representation of the complex realities of the outside world. For this reason, symmetry processing is a very active branch of computer vision and of image processing and synthesis. Very recent work has characterised the most common symmetries (bilateral, rotational or mirror symmetry) through the spectral study of so-called symmetric correspondence matrices. In this internship, we propose to study the possibility of applying spectral analysis to a more advanced type of symmetry: symmetries generated by a self-similarity transformation. This kind of symmetry is found everywhere in the plant and animal kingdoms: in the arrangement of the branches and leaves of plants; in the arrangement of leaves, flowers and seeds in phyllotaxis; in the layout of the chambers of molluscs; etc. This internship will require the student to show imagination and considerable mathematical knowledge, as well as mastery of the tools of spectral analysis: advanced matrix computation, eigenvalue analysis of large sparse matrices, etc. URL sujet détaillé :
:
Remarques :





SM20719 Study of the spectral properties of compact packings with several sphere sizes


Description


Sampling theory plays a major role in image synthesis. Advanced multidimensional sampling systems with controllable spectral qualities can also find applications in other fields that rely on multidimensional Monte Carlo and quasi-Monte Carlo methods, for example in simulation systems. We propose that the student examine the spectral properties of compact packings with several sphere sizes, as well as fast methods for generating such arrangements in 2D, 3D, and higher dimensions. This project is exciting and very ambitious. To understand how non-trivial the compact packing problem is, read the article by Pfender and Ziegler: you will see that some answers to problems posed centuries ago were only provided in 2003, by a little-known mathematician. Or read the books by Thompson or by Conway & Sloane to see how closely the compact packing problem is linked to other branches of science, for example group theory or the theory of error-correcting codes. References: Pfender, Florian; Ziegler, Günter M. (September 2004), "Kissing numbers, sphere packings, and some unexpected proofs", Notices of the American Mathematical Society: 873–883, http://www.ams.org/notices/200408/feapfender.pdf. Thomas M. Thompson (1983), "From Error-Correcting Codes through Sphere Packings to Simple Groups". Conway, J.H. & Sloane, N.J.A. (1998), "Sphere Packings, Lattices and Groups". URL sujet détaillé :
:
Remarques :





SM20720 Exploration of computational models for visual cues


Description


Visual cues play an essential role in the human visual system. They ease the reconstruction of 3D scenes from 2D retinal images, the only visual information that human beings possess about the world. Consequently, understanding the underlying mechanisms is essential for better understanding the process of vision. It is also important for computer graphics: taking visual cues into account may dramatically reduce harmful rendering artifacts and also save computational resources. In this exploratory project, the student will first systematically study the perceptual properties of the human visual system and identify the most important visual cues to model. Then, based on both a bibliographical search and his or her own exploration, the student will develop computational models for the most tangible visual cues, implement them in the rendering framework, test them, and evaluate the results. URL sujet détaillé :
:
Remarques :





SM20721 Static performance analysis for the Faust audio language


Description


Topic: development of an automatic WCET (Worst-Case Execution Time) analyser for the Faust audio and music signal processing language.
Project: the Faust language, initially developed at Grame (Lyon), allows signal processing operations to be specified in a functional style; it is particularly well suited to audio and musical applications (http://faust.grame.fr). Sound processing naturally takes place synchronously within Faust programs, which raises the question of the language's performance. Indeed, a synchronous language must comply with the so-called "synchrony hypothesis", which states that all processing can be executed between the arrivals of two successive sound samples. To this end, a static cost analysis of the WCET kind is necessary at compile time to check that this hypothesis holds, thereby ensuring the correctness of the Faust code to be executed.
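The check underlying such an analysis is simple to state: the worst-case cost of all processing triggered by one sample must fit within the inter-sample period 1/f_s. A minimal sketch (illustrative names; the real analysis must of course derive the WCET statically rather than take it as an input):

```python
def synchrony_hypothesis_holds(wcet_per_sample_s, sample_rate_hz):
    """True iff the worst-case processing of one sample finishes before
    the next sample arrives, i.e. within 1 / sample_rate seconds."""
    return wcet_per_sample_s <= 1.0 / sample_rate_hz

def chain_wcet(block_wcets_s):
    """Naive compositional model: the WCET of a chain of processing
    blocks is bounded by the sum of the individual block WCETs."""
    return sum(block_wcets_s)
```

At 44.1 kHz the per-sample budget is about 22.7 µs, so a chain of blocks whose WCETs sum to 10 µs satisfies the hypothesis, while one summing to 30 µs does not.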
This internship is part of the ANR FEEVER project (see http://feever.fr), whose ambition is to offer, via Faust, an innovative ubiquitous solution for the multimodal manipulation of sound.
Outline: the goal of this Master's research internship, lasting a few months with a possible continuation as a PhD, is, starting for example from the Faustine evaluator developed at CRI (see http://feever.fr/Papers/propFaustine2013.pdf):
 - to study current WCET analysis techniques;
 - to design and implement, via Faustine, a WCET analysis algorithm for the Faust language;
 - if time permits, to produce a C++ version to be integrated into the Faust compiler.
Depending on the time available, a machine-checked proof in Coq of part of this process could also be tackled. URL sujet détaillé :
:
Remarques : Requirements: knowledge of functional programming and of compilation techniques is necessary. An interest in music or audio processing is a plus.





SM20722 exploration of computational models for visual cues


Description


Visual cues play an essential role in the human visual system. They ease the reconstruction of 3D scenes from 2D retinal images, the only visual information that human beings possess about the world. Consequently, understanding the underlying mechanisms is essential for better understanding the process of vision. It is also important for computer graphics: taking visual cues into account may dramatically reduce harmful rendering artifacts and also save computational resources. In this exploratory project, the student will first systematically study the perceptual properties of the human visual system and identify the most important visual cues to model. Then, based on both a bibliographical search and his or her own exploration, the student will develop computational models for the most tangible visual cues, implement them in the rendering framework, test them, and evaluate the results. URL sujet détaillé :
:
Remarques :





SM20723 Behaviours as solutions for higherorder programs and processes


Description


This internship proposal focuses on operational methods for studying the operational semantics of programs and processes (process calculi, lambda-calculus). A description can be found at http://perso.enslyon.fr/daniel.hirschkoff/Stages/dhcontr.pdf URL sujet détaillé : http://perso.enslyon.fr/daniel.hirschkoff/Stages/dhcontr.pdf
Remarques : This work is part of a collaboration with D. Sangiorgi (University of Bologna, Italy).





SM20724 Dynamic muscle model for virtual human animation


Description


This project concerns the accurate and real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models involving the anatomy and inner workings of the human body.
The purpose of this project is to design a new model of muscle action lines for the real-time simulation of virtual characters. The muscle model currently in use relies on line segments (called action lines) which only approximately represent the action that a muscle can produce. These limits are reached when muscle attachments are large, when the muscle fibres have a complex arrangement, or when they follow a complex routing (e.g. wrapped around bones or other muscles). Some muscle-based simulation platforms use predefined primitive surfaces around which these action lines can wrap, but these are still rough approximations of the deformations involved in the human body.
Muscle-based simulations are, however, very sensitive to the directions of the muscle forces, and the routings of the action lines define the directions of the applied forces: different action line routings have a significant effect on the simulations. This project will investigate the benefits of using dynamic action lines, i.e. lines that adapt to the posture. These adjustments will be estimated from the geometric deformations of the muscle. As these deformations are expensive to compute, they will be part of an offline preprocess, and only a precalculated table of each muscle's moment arms will be used according to the current posture of the virtual character.
Expectations for this project are threefold. (1) The student will use a geometric method to calculate the deformations of the action lines. (2) The student will then assess the impact of this new model on muscle function (i.e. are the differences in forces significant?). (3) Finally, demonstration scenarios will be developed to validate the model and support the results. URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/DynamicActionLine.pdf
Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA). This internship is paid and may be continued as a PhD.





SM20725 Machine learning techniques for classification of viral genomic sequences


Description


Since the sequencing of the first bacterial genome in 1995, the genomes of thousands of microbial species have been sequenced, which still covers only a tiny portion of known microbes. Genomes can be viewed as linear strings over four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). This enables the treatment of genome sequences as symbolic sequences and their characterization using mathematical models. The mathematical models of genomic sequences of particular interest to us in this project are a class of models called genome signatures. Genome signatures are compact mathematical representations of DNA sequences (see e.g. [5] for a detailed presentation), and in the context of this project we will consider the decomposition of genomic sequences into k-mers, that is, words of size k.
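For instance, a k-mer-based signature can be computed as a simple relative-frequency vector (an illustrative sketch, not the exact signatures of [5]):

```python
from collections import Counter

def kmer_signature(seq, k):
    """Map each k-mer of `seq` to its relative frequency: a compact,
    alignment-free representation of the sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}
```

For example, kmer_signature("ACGTACGT", 2) counts the seven overlapping 2-mers (AC, CG, GT, TA, AC, CG, GT); such frequency vectors are what composition-based classifiers are trained on.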
Until recently, the analysis of microbial communities has been a complicated task due to their high diversity and the fact that many of these organisms cannot be cultured. Metagenomics has emerged as essentially the only way to characterize these unculturable microbes. Metagenomic studies generate sequence data at an even higher rate than classical genomics. Indeed, the number and diversity of microbial communities are not known, but current estimations place the number of species for certain environments in the range of millions [1]. The recent announcement from the Earth Microbiome Project is to generate petabases (10^15 bases) or more of environmental sequences [10]. Unfortunately, the general question of classifying complex sequence data coming from an untargeted sequencing of a whole metagenomic sample is a particularly challenging computational problem. The most blatant difficulty is to recognize novel viral sequences within a set of sequences.
Nonetheless, a number of efficient methods exist to classify the composition of a sample that mainly contains known species. Computational solutions for taxonomic assignment can be broadly organized in two main categories: (1) sequence similarity methods and (2) sequence composition methods.
Methods that rely on sequence similarity can themselves be subdivided into classical alignment-based techniques (mostly attempting to improve BLAST accuracy) and index-based ones. Alignment-based methods suffer from two limitations: speed and lack of sensitivity, see e.g. [2] and [3]. Alternative solutions have been suggested to overcome these limitations. These methods are based on long k-mers (subsequences of size k) and conceptually rely on the fact that when k is sufficiently large, sequences become species-specific. Consequently, the idea is to index the sequences by long k-mers. This is indeed the foundation of MegaBlast (a general-purpose sequence aligner using long seeds), but also of a number of methods specific to taxonomic assignment, such as LMAT [4] and Kraken [3]. The downside of these sequence similarity approaches is over-specificity, which makes the classification of unknown sequences problematic.
A complementary approach is based on sequence composition analysis. Such methods rely on the decomposition of sequences into frequencies of k-mers (with k not too large), and make use of machine learning techniques (e.g. SVM, kNN, Naive Bayes) to train a classifier on a reference database. The taxonomic assignment of novel sequences is then predicted by applying the pre-trained model. These methods are theoretically better suited to the task of novel species classification, as short k-mer distributions are less prone to overfitting. However, even these techniques fail to classify about 50% of species absent from the training set [6]. This is especially salient for viral sequences, as the vast majority of them fail to be uniquely assigned to any domain [7].
In this project we propose to explore the use of spaced k-mers [8] (that is, k-mers where certain letter positions are considered to be wildcards) for the classification of novel viral sequences. Indeed, it has recently been shown that such signatures can be better suited to certain genomic classification tasks, such as the construction of phylogenetic trees [9].
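Extracting spaced words is a small variation on plain k-mer extraction; a sketch, using '1' for match positions and '0' for wildcards in the pattern (following the usual spaced-seed convention [8]):

```python
def spaced_words(seq, pattern):
    """List the spaced words of `seq` under `pattern`: wildcard ('0')
    positions are masked with '*', match ('1') positions are kept."""
    w = len(pattern)
    return ["".join(c if p == "1" else "*"
                    for c, p in zip(seq[i:i + w], pattern))
            for i in range(len(seq) - w + 1)]
```

With pattern "101", the sequence ACGTA yields A*G, C*T and G*A; counting these masked words instead of contiguous k-mers lets the signature tolerate substitutions at the wildcard positions.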
The intern will develop an algorithmic method for viral sequence classification based on spaced k-mers. For this purpose he/she will have to acquire and apply machine learning techniques as well as "Big Data" approaches for dealing with highly voluminous data. The proposed method(s) will have to be implemented and will be tested on public data as well as on novel data acquired by our collaborators at INRA. Good programming skills are most welcome, as well as a good working knowledge of algorithmics and statistics.
Advisors: Macha Nikolski and Hayssam Soueidan Laboratory : Bordeaux Bioinformatics Center (CBiB) and LaBRI (Laboratoire Bordelais de Recherche en Informatique)
Contact: Please send your CV/resume to and
References:
[1] J. Gans, M. Wolinsky, and J. Dunbar. Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science, 309:1387–90, 2005.
[2] A. Bazinet and M. Cummings. A comparative evaluation of sequence classification programs. BMC Bioinformatics, 13(1):92, 2012.
[3] Wood, D. E. and Salzberg, S. L. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology, 15(3), R46.
[4] Ames S.K., Hysom D.A., Gardner S.N., Lloyd G.S., Gokhale M.B. and Allen J.E. (2013). Scalable metagenomic taxonomy classification using a reference genome database. Bioinformatics, vol. 29, p. 2253–2260.
[5] Nalbantoglu O. Computational Genomic Signatures and Metagenomics. PhD dissertation, University of Nebraska-Lincoln, USA, 2011.
[6] Nalbantoglu O., Way S.F., Hinrichs S.H., Sayood K. RAIphy: phylogenetic classification of metagenomics samples using iterative refinement of relative abundance index profiles. BMC Bioinformatics 2011, 12:41.
[7] Rosen G.L., Reichenberger E.R., Rosenfeld A.M. (2010). NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics 27, p. 127–129.
[8] Keich, U., Li, M., Ma, B. and Tromp, J. (2004). On spaced seeds for similarity search. Discrete Appl. Math., vol. 138, p. 253–263.
[9] Leimeister, C.A., Boden, M., Horwege, S., Lindner, S., and Morgenstern, B. (2014). Fast alignment-free sequence comparison using spaced-word frequencies. Bioinformatics, DOI: 10.1093/bioinformatics/btu177.
[10] J. Gilbert, J. Meyer, D. Antonopoulos, P. Balaji, C. Brown, N. Desai, J. Eisen, D. Evers, D. Field, W. Feng, D. Huson, J. Jansson, R. Knight, J. Knight, E. Kolker, K. Konstantindis, J. Kostka, N. Kyrpides, R. Mackelprang, A. McHardy, C. Quince, J. Raes, A. Sczyrba, A. Shade, and R. Stevens. Meeting report: the terabase metagenomics workshop and the vision of an Earth microbiome project. Stand Genomic Science, 3:243–8, 2010. URL sujet détaillé :
:
Remarques : This internship is, of course, paid.





SM20726 Motion control on soft or liquid support


Description


This project concerns the accurate and real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models capable of dealing with complex interactions with the environment. These interactions affect the simulation, and we seek to control the motion of the virtual human with respect to them.
The purpose of this project is to create a new physics-based motion controller for virtual characters. Current controllers are not designed to deal with complex interactions with soft bodies or liquids that can have an influence on the movement. Yet these interactions happen in many real-life situations. For example, we can imagine a virtual human easily moving through 5 cm of water on a beach; the same human will not walk with the same gait pattern at all with water up to the knee. The controller must therefore know how to adapt to such different situations. To do so, we need to define a mathematical formulation describing how we expect humans to move in these situations (e.g. by optimizing a cost function that describes the effectiveness of the gait).
Expectations for this project are twofold. (1) The student will implement a computational method for the interaction between rigid bodies (representing the virtual character) and soft bodies/liquids (water, mud, snow, sand, etc.) that is efficient enough to be applied to real-time animation. (2) The student will use this method (i.e. mainly collision handling) and a physics-based controller to produce plausible locomotion (walking and running). URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/SoftMotionControl.pdf
Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA). This internship is paid and may be continued as a PhD.





SM20727 Parametric verification of real-time systems


Description


Multiprocessor real-time systems have become ubiquitous in recent years. Some of them (automatic piloting of planes and drones, driverless cars) are critical, in the sense that no error may occur. Testing such systems may detect the presence of bugs, but cannot guarantee their absence. It is therefore necessary to use methods such as model checking in order to formally prove the correctness of a system.
Determining a correct scheduling of the tasks executed by the processors is crucial for the safety of the system. These systems are characterised by a set of timing constants, such as the reading period of an altitude sensor on a drone, or the minimum or maximum duration of a computation on a satellite. Although techniques exist to verify the system for one given set of constants, formally verifying the system for many values of these constants can take an extremely long, or even infinite, time if one seeks to verify dense sets of values.
It is therefore interesting to reason parametrically, by considering these timing constants as unknowns, i.e. as parameters, and to synthesise a constraint on these parameters guaranteeing that there exists a scheduling of the tasks on the processors such that the system is correct. Several techniques have been proposed to solve this problem, notably in the scheduling setting.
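As a toy illustration of this parametric reasoning (deliberately much simpler than the multiprocessor setting targeted by the internship): for implicit-deadline periodic tasks under uniprocessor EDF, schedulability is equivalent to sum(C_i / T_i) <= 1, so treating one task's worst-case execution time C as a parameter directly yields the synthesised constraint C <= T * (1 - U_rest).

```python
from fractions import Fraction

def max_parametric_wcet(fixed_tasks, period_of_param_task):
    """`fixed_tasks` is a list of (C_i, T_i) pairs with known timing
    constants; returns the largest WCET C that the parametric task (of
    period `period_of_param_task`) may take while keeping the task set
    EDF-schedulable, or None if the fixed tasks alone are unschedulable."""
    u_rest = sum(Fraction(c, t) for c, t in fixed_tasks)
    slack = 1 - u_rest  # exact rational arithmetic, no rounding
    if slack < 0:
        return None
    return period_of_param_task * slack  # constraint: C <= this bound
```

For instance, with fixed tasks (1, 4) and (2, 8) the residual utilisation is 1/2, so a parametric task of period 10 must satisfy C <= 5. Tools such as IMITATOR synthesise analogous, but far richer, constraints over several parameters at once.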
The objective of the internship is to devise parametric verification techniques for real-time systems dedicated to the multiprocessor scheduling case.
The student will also produce an implementation in order to validate the proposed approach. One possible option is to reuse the IMITATOR tool. URL sujet détaillé : http://lipn.univparis13.fr/~andre/verificationparametreedesystemestempsreel.pdf
Remarques : Joint supervision with Giuseppe Lipari (LIFL, Lille). The internship can take place at LIPN (Paris 13) or at LIFL (Lille), according to the candidate's preference.
Standard pay.





SM20728 Big Data in Neuroscience


Description


Models studied in computational neuroscience are fed with facts obtained from the large literature reporting knowledge gained from biological experimentation. Such knowledge is put into words, leading to phenomenological descriptions of the underlying behaviours. Part of this information is then put into equations and distributed algorithms to check some quantitative or qualitative aspects of the presumed global understanding. Surprisingly enough, this work is mainly carried out in an artisanal way, more or less by intuition.
Other domains of knowledge have taken advantage of recent and powerful developments in semantic representations and introduce an intermediate framework between knowledge-in-words and knowledge-in-equations: knowledge-in-facts. From very large and heterogeneous corpora, such as Wikipedia turned into the DBpedia semantic database, to very specialised fields of knowledge formalised as ontologies, it is clear that powerful methodological tools are now available to manage data at a higher scale.
The goal of the proposed work is to address this issue in three steps:
1/ Considering a well-defined topic (either early-vision processing from the retina to the thalamus, or functional models of the basal ganglia and related structures of action selection, the choice being made with the student), build an ontology that describes our current knowledge of these brain subsystems.
2/ From this real example, propose an intermediate language (say, Turtle) that allows computer scientists to easily annotate publications, building the related ontology. This step will be carried out with colleagues specialising in semantic web formalisms.
3/ Considering existing tools and initiatives, propose the design of a collective platform allowing a scientific community in the field to construct such big data, beyond this seminal work. URL sujet détaillé : http://wwwsop.inria.fr/members/Thierry.Vieville/projects/neurobigdata/
Remarques : Paid internship. A PhD topic can follow. The internship takes place at Inria Sophia Antipolis, in collaboration with the NeuroCampus team in Bordeaux.





SM207-29 Study of P2P Systems for Video Live Streaming


Description


The goal of this internship is to study and propose models to describe P2P systems. These systems are cheap to operate and scale well, but their highly distributed nature raises questions about reliability, durability, availability, confidentiality, and routing of the data. In particular, the focus of this internship will be on streaming in peer-to-peer networks. In such systems, peers relay packets or video frames, and the goal is to play the video the nodes ask for. This is possible only if some delay in the playback is allowed, and there exist trade-offs between the number of customer nodes and the delay that the system can support. Moreover, some scheduling protocol must decide which packets are exchanged and who sends them in order to achieve the best possible behavior. The analysis of P2P systems involves a wide range of techniques between practice and theory, from mathematical analysis to simulations and the deployment of experiments. This will allow us to adapt the orientation of the project to the taste of the interested student.
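The trade-off between the number of nodes and the playback delay can be illustrated with a toy epidemic push model (a minimal sketch for intuition only; the round-based push strategy is an assumption for illustration, not a protocol from this internship):

```python
import random

def diffusion_rounds(n_peers, seed=0):
    """Epidemic push of a single chunk: in each round, every peer that
    already holds the chunk forwards it to one peer chosen uniformly at
    random. Returns the number of rounds until all peers hold it, which
    grows logarithmically with n_peers in expectation."""
    rng = random.Random(seed)
    have = {0}                      # the source initially holds the chunk
    rounds = 0
    while len(have) < n_peers:
        have |= {rng.randrange(n_peers) for _ in range(len(have))}
        rounds += 1
    return rounds
```

Since the set of holders can at most double per round, at least log2(n) rounds are needed: the playback delay necessarily grows with the number of customer nodes.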
Objective: During the internship, we will study and simulate a live video streaming system. We will study its efficiency when there are frequent arrivals and departures of peers. The metrics considered will be delay and bandwidth usage.
The internship can be followed by a Ph.D.
URL sujet détaillé : http://wwwsop.inria.fr/members/Frederic.Giroire/stagestreaming.pdf
Remarques : Possibilité de rémunération.





SM207-30 Energy, Software-Defined Networks and Scalability


Description


Software-Defined Networking (SDN), in particular OpenFlow, is a new networking paradigm enabling innovation through network programmability. Over the past few years, many applications have been built using SDN, such as server load balancing, virtual-machine migration, traffic engineering, and access control. For example, Google recently deployed, over three years, a network connecting its data centers across the planet using SDN.
In this internship, we will focus on using SDN for energy-aware routing (EAR). Energy efficiency of networking infrastructure (telecom, data center, ISP, or enterprise networks) is a growing concern, due to both increasing energy costs and worries about CO2 emissions. Since traffic load has a small influence on the power consumption of routers, EAR allows unused links to be put into sleep mode to save energy. An SDN controller can collect the traffic matrix and then compute routing solutions that satisfy QoS while minimizing energy consumption. However, prior work on EAR has assumed that the forwarding tables can hold an infinite number of rules. In practice, this assumption does not hold, since these tables are implemented with Ternary Content Addressable Memory (TCAM), which is expensive and power-hungry. In [1], we proposed an optimization method to minimize energy consumption for a backbone network while respecting capacity constraints on links and rule-space constraints on routers. In detail, we presented an exact formulation as a Mixed Integer Linear Program (MILP) and introduced an efficient greedy heuristic that was analyzed through simulations.
The objective of this internship is to extend this work in order to take into account the possibility of adding wildcards within the forwarding rules. As an example, assuming that the rules are mappings of [(src, dest): port-to-forward], we can compact several different rules with the same source s and the same port-to-forward p, but different destinations, into: [(s, *): p]. In this case, the forwarding tables will be shrunk, but the general problem becomes more complicated. Therefore, the candidate will study the extension of the linear program to take this into account, study new heuristics, and perform the corresponding simulations on a realistic topology and set of traffic demands.
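The wildcard compaction described above can be made concrete with a small sketch (the dictionary rule representation, the `compress_rules`/`lookup` helpers, and the priority handling are illustrative assumptions, not the formulation of [1]):

```python
from collections import Counter, defaultdict

def compress_rules(rules):
    """Compress exact-match rules {(src, dst): port} with per-source
    wildcards. For each source, the most common output port becomes the
    wildcard rule (src, '*'); destinations forwarded to another port
    keep a specific, higher-priority rule."""
    by_src = defaultdict(dict)
    for (src, dst), port in rules.items():
        by_src[src][dst] = port
    compressed = {}
    for src, dsts in by_src.items():
        default_port, _ = Counter(dsts.values()).most_common(1)[0]
        compressed[(src, '*')] = default_port
        for dst, port in dsts.items():
            if port != default_port:
                compressed[(src, dst)] = port   # exception, matched first
    return compressed

def lookup(compressed, src, dst):
    """Specific rules take priority over the wildcard rule."""
    return compressed.get((src, dst), compressed.get((src, '*')))
```

Compaction never changes the forwarding behavior here; as the description notes, choosing which rules to aggregate while also respecting link-capacity and rule-space constraints is what makes the general optimization problem harder.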
Prerequisites for candidates:
- Knowledge and/or taste for networking
- Knowledge and/or taste for discrete mathematics, graph theory and/or combinatorial optimization
Contact information: For further information, please contact  Frédéric Giroire  frederic.giroire.fr  Joanna Moulierac  joanna.moulierac.fr
Location: COATI - INRIA/I3S (CNRS-UNS) - 2004 route des lucioles, 06906 SOPHIA ANTIPOLIS https://team.inria.fr/coati
References:
[1] Optimizing Rule Placement in Software-Defined Networks for Energy-aware Routing. Frédéric Giroire, Joanna Moulierac and Khoa Truong Phan. In Globecom 2014.
[2] On the Effect of Forwarding Table Size on SDN Network Utilization. R. Cohen, L. Lewin-Eytan, J. S. Naor, D. Raz. In INFOCOM 2014.
[3] Software-Defined Networking: A Comprehensive Survey. Diego Kreutz, Fernando M. V. Ramos, Paulo Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky, Steve Uhlig. In arXiv:1406.0440.
URL sujet détaillé : wwwsop.inria.fr/members/Joanna.Moulierac/SDN_Energy_internship_2014_COATI.pdf
Remarques : Co-supervision: Frédéric Giroire frederic.giroire.fr





SM207-31 Petri net unfolding of biological networks


Description


Biological networks, such as gene regulatory networks or signalling networks, can be modeled as automata networks or Petri nets. This results in concurrent models involving hundreds or thousands of interacting components evolving in parallel. Aiming at capturing the dynamical landscape of those models, and ultimately at proposing mutations to control biological cells, we are interested in computing several properties, such as the attractors (livelocks/deadlocks, characterizing the long-term behaviour) and the sets of mutations that would trigger a redifferentiation of the cell, i.e., that would change the current attractor.
In order to cope with the huge size of the networks, this internship will address the design, implementation, and evaluation of new algorithms based on Petri net unfoldings for computing dynamical properties of discrete models of biological networks. Petri net unfoldings exploit the concurrency between transitions to obtain so-called complete prefixes, which result in compact representations of all the possible behaviours of the net. Tools such as mole (http://www.lsv.enscachan.fr/~schwoon/tools/mole) or cunf (https://code.google.com/p/cunf) allow one to efficiently compute the complete prefix of the unfolding of a Petri net.
This internship will first consist in evaluating and comparing existing algorithms for computing attractors in Petri net models of biological networks. Several variants and extensions of Petri net unfoldings will be explored in order to improve the scalability of the method. This task will involve adapting existing algorithms and implementing them to make them more efficient when applied to large biological networks. Then, new algorithms will be designed for identifying sets of components of the network whose activation or inhibition would (1) prevent the reachability of a particular attractor from a given initial state while preserving the reachability of the other attractors; (2) force a transition from one attractor to another given one.
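For intuition, the following sketch shows the naive alternative that unfoldings avoid: explicit enumeration of the state space of a 1-safe Petri net, collecting deadlock markings along the way (an illustrative baseline only; markings and transitions are encoded as sets of places):

```python
def enabled(marking, pre):
    """A transition is enabled when all its input places are marked."""
    return pre <= marking           # subset test on frozensets (1-safe net)

def fire(marking, pre, post):
    """Fire a transition: consume input places, produce output places."""
    return (marking - pre) | post

def reachable(initial, transitions):
    """Exhaustively explore the state space of a 1-safe Petri net.
    transitions is a list of (pre, post) pairs of frozensets of places.
    Returns (all reachable markings, deadlock markings)."""
    seen, stack, deadlocks = {initial}, [initial], set()
    while stack:
        m = stack.pop()
        succs = [fire(m, pre, post) for pre, post in transitions
                 if enabled(m, pre)]
        if not succs:
            deadlocks.add(m)
        for s in succs:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return seen, deadlocks
```

On the models targeted by the internship, with hundreds or thousands of components, this enumeration blows up combinatorially, which is precisely why the compact complete prefixes produced by unfoldings are attractive.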
This work may be continued with a PhD thesis on developing causality and concurrency theory for the formal analysis of dynamics of discrete biological networks. URL sujet détaillé : http://loicpauleve.name/unfoldingbioM2.pdf
Remarques : Co-supervised by Loïc Paulevé (CNRS / LRI, Université Paris-Sud) - http://loicpauleve.name





SM207-32 Analysis of shared-variable access paths for online anticipation of placement and routing


Description


Nowadays, processors with several hundred cores, such as Intel's Xeon Phi chip (64 cores) or Kalray's MPPA chip (256 cores), promise significant performance gains as well as reduced power consumption. These gains depend heavily on the programmability of the chips, but also on the placement and routing strategies used when the application is deployed. One way to obtain execution performance is to reduce the access time to shared memories. To do so, it suffices to reduce the costly messages of the shared-memory access protocol (e.g., blocking messages in synchronous programming, control messages, etc.) while placing the data as close as possible to the task. In the LaSTRE laboratory, previous work has focused on the development of a multi-protocol compilation chain that makes it possible to associate a given protocol with each shared datum. A protocol is assigned either statically or dynamically. In the context of this internship, we would like to equip this compilation chain with analysis tools that characterize the access paths to shared memory in order to anticipate placement and routing decisions. The intern will have to identify and formalize the types of shared-memory accesses in order to help identify the best placement and routing strategy. The work will proceed in three phases:
1. State of the art on placement and routing strategies
2. Formalization of the dependence analyses that would facilitate placement and routing
3. Implementation of an analysis prototype
The internship will take place on the NanoInnov premises of CEA Saclay, within a research laboratory specialized in code analysis, compilation chains, and embedded systems for massively parallel architectures.
This internship can be continued with a PhD thesis. URL sujet détaillé :
Remarques :





SM207-33 Analysis of shared-data access patterns for choosing coherence protocols for massively parallel architectures


Description


To meet demands for computing power while keeping energy consumption under control, processors with several hundred cores, known as many-cores, are appearing in high-performance computing and embedded systems. Thus, Intel's Xeon Phi chip, with 64 cores, and Kalray's MPPA chip, with 256 cores, have reached the market. The performance of applications running on such architectures depends heavily on the programmability of the chip, and in particular on the mechanisms and techniques for managing on-chip memory. The choice of the data coherence model (the rules that determine the freshness of a datum on access) and of the coherence protocols directly impacts access efficiency. If these choices are ill-suited, they can quickly become a bottleneck for the application. In previous work we proposed a multi-protocol compilation chain that makes it possible to associate a given protocol with each shared datum. In this context, an open research direction is to rely on static analysis of the source code to identify and characterize shared-data accesses (e.g., identification of the application's tasks and shared variables, construction of dependence graphs of shared-data accesses). The work proposed in this internship consists in surveying the state of the art on the analysis of shared-data accesses in parallel application codes, then proposing a solution that addresses the following points:
1. The ability to extract, from the source code, graphs representing the dependences of shared-data accesses,
2. The characterization of memory-access patterns and schemes based on this representation,
3. An automated analysis of the dependence graphs to detect data-access patterns.
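A dependence graph over shared-data accesses can be illustrated on a small access trace; this is a hypothetical sketch (the trace format and the `access_graph` helper are assumptions for illustration, not part of the laboratory's compilation chain):

```python
from collections import defaultdict

def access_graph(trace):
    """Build a dependence graph over shared variables from a trace of
    (task, variable, 'r'|'w') events: add an edge task_a -> task_b when
    task_a writes a variable that task_b later reads."""
    last_writer = {}
    edges = defaultdict(set)
    for task, var, op in trace:
        if op == 'w':
            last_writer[var] = task
        elif op == 'r' and var in last_writer and last_writer[var] != task:
            edges[last_writer[var]].add(task)
    return dict(edges)
```

A static analysis would derive such graphs from the source code rather than from a runtime trace, but the resulting producer/consumer structure is the kind of input a placement strategy could exploit.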
Finally, the proposed solution will have to be implemented as a prototype for the validation of the internship. The internship will take place on the NanoInnov premises of CEA Saclay, within a research laboratory specialized in code analysis, compilation chains, and embedded systems for massively parallel architectures. URL sujet détaillé :
Remarques :





SM207-34 Lagrangian Duality in Online Algorithms


Description


Online algorithms is a domain studying problems with uncertainty about the future. In general, requests arrive over time and one needs to make irrevocable decisions without any information about the future. Such problems appear in many practical settings, for example in decision making, stock investment, and network routing. Hence, it is important to design algorithms/strategies with performance guarantees in uncertain environments.
The most successful technique in online algorithms so far is the potential argument. However, the technique has its limits (it does not give much insight into the problems, etc.). Currently, there is a trend of looking for a principled approach to design and analyze online algorithms. One direction [1] is to exploit the power of mathematical programming and use Lagrangian duality. The method is powerful and flexible, as it is based on natural relaxations (even non-convex ones), and it successfully gives theoretical evidence for practical algorithms.
Our project is to develop this technique and the internship is a part of the project. The goal is to study concrete online problems (online scheduling/matching/paging etc) using the approach. Ideally, a research publication could be derived at the end of the internship.
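As a concrete illustration of irrevocable online decisions with a performance guarantee, here is the classic ski-rental toy problem (given purely as background; it is not claimed to be one of the internship's target problems):

```python
def ski_rental(daily_cost, buy_cost, days_skied):
    """Break-even online strategy: rent until the total rent paid would
    reach the purchase price, then buy. The total cost is at most twice
    the offline optimum, whatever the (unknown) number of days."""
    spent = 0
    for _ in range(days_skied):
        if spent + daily_cost >= buy_cost:
            return spent + buy_cost   # irrevocable decision: buy
        spent += daily_cost           # irrevocable decision: rent one day
    return spent

def offline_opt(daily_cost, buy_cost, days_skied):
    """Cost of the clairvoyant strategy that knows days_skied."""
    return min(daily_cost * days_skied, buy_cost)
```

The factor-2 bound is a typical competitive-ratio guarantee, the kind of statement that duality-based analyses aim to derive in a principled way.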
Requirement: Motivation and basic knowledge on Algorithms, Discrete Mathematics.
[1] Nguyen Kim Thang. Lagrangian Duality in Online Scheduling with Resource Augmentation and Speed Scaling. European Symposium on Algorithms (ESA), 2013. URL sujet détaillé : https://www.ibisc.univevry.fr/~thang/Stages/lagrange.html
Remarques : Standard remuneration: 523.26 euros/month.





SM207-35 Analysis of the Mifare Plus Distance Bounding Protocol


Description


* Keywords: Security, Cryptography, Contactless Smartcards, RFID.
* Objective: Check in practice the efficiency of the distance bounding protocol of the Mifare Plus RFID tag.
* Content: Radio Frequency Identification (RFID) allows objects or subjects to be identified without any physical or optical contact, using transponders (microcircuits with an antenna) queried by readers through a radio frequency channel. This technology is one of the most promising of this decade and is already widely used in applications such as access cards, transportation passes, payment cards, and passports. This success is partly due to the steady decrease in both size and cost of passive transponders called "tags".
The "relay attack"  sometimes referred to as "Mafia fraud"  exhibited by Desmedt, Goutier, and Bengio recently became a major issue of concern for RFID authentication protocols. The adversary pretends to be the legitimate prover by relaying the messages that are exchanged during the execution of the protocol. This is illustrated through the following example. Consider an RFIDbased ticket selling machine in a theater. To buy a ticket, the customer is not required to show his theater pass, an RFID tag. The customer needs to be close enough to the machine (verifier) so that the pass (prover) can communicate with it. The pass can be kept in the customer's pocket during the transaction. Assume there is a line of customers waiting for a ticket. Bob and Charlie masterminded the attack. Charlie is in front of the machine while Bob is far in the queue, close to Alice, the victim. When the machine initiates the transaction with Charlie's card, Charlie forwards the received signal to Bob who transmits it to Alice. The victim's tag automatically answers since a passive RFID tag  commonly used for such applications  responds without requiring the agreement of its holder. The answer is then transmitted back from Alice to the machine through Bob and Charlie who act as relays. The whole communication is transparently relayed and the attack eventually succeeds: Alice pays Charlie's ticket.
When it was first introduced in the late eighties, the relay attack appeared unrealistic. Nowadays, it is one of the most effective and feared attacks against RFID systems. It can be easily implemented, since the reader and the tag communicate wirelessly, and it is not easily detectable by the victim, because queried (passive) tags automatically answer requests without the agreement of their bearers. Relay attacks are, for example, used to steal cars in parking lots.
Distance bounding protocols are promising solutions to prevent such relay attacks. These protocols measure the round-trip time of messages exchanged between the reader (verifier) and the tag (prover): if the round-trip time is too long, then the reader considers that the tag is outside its neighborhood and that an attack is underway. There exist many distance bounding protocols, but most of them are theoretical proposals. The Mifare Plus tag is the only device that benefits from a distance bounding protocol. The implemented protocol is, however, much simpler than the theoretical ones: it mitigates relay attacks but cannot fully prevent them. Consequently, this project will explore the limits of the distance bounding protocol implemented in the Mifare Plus. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_1.pdf
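The round-trip-time principle behind distance bounding can be sketched as follows (an illustrative model only: real protocols run the rapid-bit phase in hardware at nanosecond resolution and cryptographically bind the responses to the challenges, which this sketch omits):

```python
import time
import secrets

def verifier_round(n_challenges, prover, max_rtt_s):
    """Rapid-bit phase of a distance-bounding check: send random
    one-bit challenges, time each response, and reject as soon as one
    round trip exceeds the bound (a long RTT suggests a relay)."""
    for _ in range(n_challenges):
        c = secrets.randbits(1)
        t0 = time.perf_counter()
        prover(c)
        rtt = time.perf_counter() - t0
        if rtt > max_rtt_s:
            return False            # too slow: possible relay attack
    return True
```

A relay necessarily adds propagation and processing latency to every exchange, which is exactly what the per-round timing bound is meant to detect.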
Remarques :





SM207-36 Practical Cryptanalysis of Pseudo-Random Number Generators in RFID


Description


* Keywords: RFID, Contactless Smartcard, Practical Attacks, PRNG
* Objective: Analyze (in practice) PRNGs implemented on tags available in our everyday lives
* Content: Randomness is the keystone of any cryptographic system. Yet generating randomness is far from being an easy task, especially with low-cost devices such as passive RFID tags. Indeed, on the one hand, few logic gates can be "wasted" just for generating randomness; on the other hand, passive RFID tags have no battery, hence no internal clock, and must therefore to some extent trust the reader that powers them. Most current pseudo-random number generators (PRNG) implemented on RFID tags definitely suffer from poor designs. For example, the PRNG implemented on Mifare Classic RFID tags (1 billion such tags have been sold since 1995) is reinitialized each time the tag enters the field of a reader. Consequently, an adversary can easily drive the PRNG of a Mifare Classic tag just by measuring the time between the moment the tag is powered and the moment the PRNG is called.
This internship aims to analyze (in practice) PRNGs implemented on tags available in our everyday lives, e.g., tags for building access control, public transportation, ski lifts, etc. The analysis will depend on whether or not the specifications of the PRNG are known. When they are not known, which is the most usual case, the RFID tag is considered as a black box and two approaches are required: (1) determining whether the PRNG can be influenced by some external events; (2) determining whether the PRNG output is biased. Guidelines and tools provided by the US National Institute of Standards and Technology might be useful for this work. The expected outcome of this project is to eventually break some RFID solutions in practice.
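As an example of approach (2), the frequency (monobit) test from the NIST SP 800-22 guidelines can be sketched in a few lines (a simplified illustration; the full NIST suite contains many more tests):

```python
import math

def monobit_test(bits):
    """NIST SP 800-22 frequency (monobit) test: map bits to +/-1, sum
    them, and normalize by sqrt(n). For an unbiased source the p-value
    erfc(|S|/sqrt(2n)) is uniformly distributed; very small p-values
    indicate a biased output."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(n) / math.sqrt(2))
```

In a black-box analysis, one would collect output bits from the tag under varying conditions (power-up timing, reader commands) and compare the resulting p-values to detect both bias and external influence.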
Depending on the profile of the student, this topic can be tackled from different angles, either practical or theoretical. For example, the PRNG is expected to be analyzed as a black box, but students familiar with reverse engineering techniques may wish to apply their knowledge against some PRNGs. Whatever the approach, the student will be required to implement tests, so programming skills are required. The choice of language is left to the discretion of the candidate. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_2.pdf
Remarques :





SM207-37 Can SSD Memory Improve Cryptanalytic Time-Memory Trade-Offs?


Description


* Keywords: Cryptography, Time-Memory Trade-Off, Password Cracking, Algorithmics, Probability.
* Objective: Check whether SSD memory may improve cryptanalytic time-memory trade-offs.
* Content: Many cryptanalytic problems can be solved in theory using an exhaustive search in the key space, but are still hard to solve in practice because each new instance of the problem requires restarting the process from scratch. Such a problem is for example the cracking of passwords.
The basic idea of a cryptanalytic time-memory trade-off (TMTO) is to carry out an exhaustive search once and for all, such that subsequent instances of the problem become easier to solve. Thus, if there are N possible solutions to a given problem, a time-memory trade-off can solve it with T units of time and M units of memory. In the methods we are looking at, T is proportional to N^2/M^2 and a typical setting is T = M = N^(2/3).
The cryptanalytic time-memory trade-off was introduced in 1980 by Hellman and applied to DES. Given a plaintext P and a ciphertext C, the problem consists in recovering the key K such that C = E_K(P), where E is an encryption function assumed to behave as a random function. Encrypting P under all possible keys and storing each corresponding ciphertext allows for immediate cryptanalysis, but needs N elements of memory. The idea of a trade-off is to use chains of keys, built with a reduction function R that generates a key from a ciphertext. Using E and R, chains of alternating ciphertexts and keys can thus be generated. The key point is that only the first and the last element of each chain are stored. In order to retrieve K, a chain is generated from C. If at some point it yields a stored end of chain, then the entire chain is regenerated from its starting point. However, finding a matching end of chain does not necessarily imply that the key will be found in the regenerated chain: the chain generated from C may merge with a stored chain that does not contain K. This situation is called a false alarm and is very costly.
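The chain mechanism can be made concrete on a toy 16-bit key space (an illustrative sketch only: a hash stands in for K -> E_K(P), the reduction function is trivial, and a real attack would use Hellman's multiple tables or rainbow-table refinements):

```python
import hashlib

N_BITS = 16                         # toy key space of 2^16 keys

def f(key):
    """Stand-in one-way function k -> E_k(P) for a fixed plaintext P."""
    h = hashlib.sha256(key.to_bytes(2, "big")).digest()
    return int.from_bytes(h[:2], "big")

def R(ct):
    """Reduction function: maps a ciphertext back into the key space."""
    return ct % (1 << N_BITS)

def build_table(starts, chain_len):
    """Precompute chains of alternating keys/ciphertexts; store only
    the (endpoint -> starting point) pairs."""
    table = {}
    for k0 in starts:
        k = k0
        for _ in range(chain_len):
            k = R(f(k))
        table[k] = k0
    return table

def attack(ct, table, chain_len):
    """Try to recover a key k with f(k) == ct. A matching endpoint may
    be a false alarm (chain merge), in which case the search goes on."""
    k = R(ct)
    for _ in range(chain_len):
        if k in table:
            cand = table[k]         # regenerate the chain from its start
            for _ in range(chain_len):
                if f(cand) == ct:
                    return cand
                cand = R(f(cand))
        k = R(f(k))
    return None
```

Note how `attack` must regenerate a chain whenever an endpoint matches, and may still fail on a false alarm, exactly the costly situation described above.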
A TMTO is efficient as long as the precomputed chains can be entirely stored in fast-access memory. RAM is currently used, but its rather limited size is the bottleneck of TMTOs. This internship aims to check whether SSD memory may improve cryptanalytic time-memory trade-offs.
During this internship, the student will:
(1) implement the cryptanalytic time-memory trade-off (TMTO) technique, (2) evaluate the efficiency of a TMTO when an SSD is used, (3) find the best approach to combining SSD and RAM, (4) apply the technique to crack passwords. URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_3.pdf
Remarques :





SM207-38 Hardware-software co-design of TLS/SSL support for Zynq embedded systems


Description


Zynq platforms embed both a processor (ARM dual-core Cortex-A9) and an FPGA (Artix-7). We will study and prototype TLS/SSL support on this platform. At the hardware level, an ECC crypto-processor will be used to accelerate some cryptographic primitives.
Complete description at http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014codesignhwswtlsssl.pdf URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014codesignhwswtlsssl.pdf
Remarques : See complete description





SM207-39 Software extensions for an ECC crypto-processor


Description


A dedicated processor for elliptic curve cryptography on FPGA circuits is under development at IRISA-CAIRN in Lannion.
In this work, we will study and prototype software extensions for advanced support of more curves, computation algorithms, and algorithmic protections against side-channel attacks. The proposed solutions will be implemented and tested on the power-analysis attack setup.
Complete description at http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014extensionsoftcryptoprocecc.pdf URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet2014extensionsoftcryptoprocecc.pdf
Remarques : See complete description





SM207-40 Vehicle Sharing Systems


Description


The number of cities equipped with Vehicle Sharing Systems (VSS, BSS, VLS) is increasing rapidly. VSS are viewed as a promising way to reduce personal car usage and improve the quality of life by reducing traffic, noise, and air pollution. Despite the growing number of VSS, they are still poorly understood and often suffer from congestion problems (absence of a vehicle, or of a parking spot at the destination).
The goal of the internship will be to contribute to the understanding of VSS by developing computational and mathematical methods. For completeness, we describe four research directions below. According to their preferences and skills, the student will choose one of them during the internship. The others can serve as a basis for a subsequent PhD proposal.
- Benchmarking the historical data of various real VSS, and developing visualization tools to help humans deal directly with such data.
- Data mining, in order to estimate the underlying intensity of user flows depending on the stations, time, day, season, and weather.
- Modelling and simulation, so as to have a solid basis for investigating the behavior of future systems and optimizing them.
- Mathematical analysis of models of VSS, so as to have an analytic understanding of how VSS behave and to help optimize them. URL sujet détaillé :
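As a flavor of the modelling and simulation direction, congestion at a single station can be captured by a toy birth-death model (a hypothetical sketch; the parameters and dynamics are purely illustrative, not calibrated on real VSS data):

```python
import random

def station_congestion(capacity, p_return, p_take, steps, seed=0):
    """Toy single-station model: at each time step a user returns a
    vehicle with probability p_return or takes one with probability
    p_take. Returns the fractions of steps at which the station was
    empty (no vehicle available) or full (no parking spot available)."""
    rng = random.Random(seed)
    stock = capacity // 2
    empty = full = 0
    for _ in range(steps):
        r = rng.random()
        if r < p_return and stock < capacity:
            stock += 1
        elif p_return <= r < p_return + p_take and stock > 0:
            stock -= 1
        empty += (stock == 0)
        full += (stock == capacity)
    return empty / steps, full / steps
```

Even this crude model exhibits the two congestion events mentioned above (no vehicle, no spot), and shows how an imbalance between arrivals and departures drives a station toward saturation or starvation.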
Remarques : The internship will be co-supervised by Nicolas GAST (CR INRIA Montbonnot) and Vincent JOST (CR CNRS G-SCOP Grenoble).





SM207-41 A dynamical system approach to compute the transient behavior of stochastic distributed systems


Description


Let us consider a random walker over a finite grid (or torus) in dimension d. A natural question is: how long does it take for the walker to have a chance to visit any state in the grid, or, more or less equivalently, how long does it take for the walker's initial position to stop mattering?
These questions are all related to the mixing time of the random walk: the time it takes for the probability distribution at time t to be close (in total variation norm) to the stationary distribution. Computing the mixing time is an important challenge in modern Markov chain theory that has been approached in many ways (spectral decomposition of the transition matrix, coupling theory, perfect simulation).
In this internship, we propose to investigate a new approach based on asymptotic arguments (letting the space dimension d go to infinity). This approach transforms the stochastic random walk into a deterministic dynamical system whose long-run behavior can (sometimes) be well understood.
This new approach can allow one to compute the mixing time of random walks that cannot be tackled using the classical tools. It is also useful for computing the complexity of distributed randomized algorithms, such as cooperative protocols in games. Finally, it can help assess the expected time to extinction of population processes with intricate dynamics (such as populations of particles under general interaction processes). URL sujet détaillé :
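For small chains, the mixing time can of course be computed by brute force, which is precisely what the asymptotic approach aims to avoid; a minimal sketch for a lazy random walk on a cycle (illustrative code, not the internship's method):

```python
def step(dist, P):
    """One step of the chain: multiply the distribution by P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def tv(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Worst-case-start mixing time: iterate the chain from every Dirac
    start until all TV distances to pi drop below eps."""
    n = len(P)
    dists = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(1, t_max + 1):
        dists = [step(d, P) for d in dists]
        if max(tv(d, pi) for d in dists) <= eps:
            return t
    return None

def lazy_cycle(n):
    """Lazy random walk on an n-cycle (stay w.p. 1/2, move 1/4 each way)."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 0.5
        P[i][(i + 1) % n] = 0.25
        P[i][(i - 1) % n] = 0.25
    return P
```

Doubling the cycle size already increases the mixing time markedly, and the state space of a d-dimensional torus grows exponentially in d, which is exactly why this direct computation stops scaling.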
Remarques : Will be cosupervised by Bruno Gaujal (DR Inria, Grenoble).





SM207-42 Numerical Optimal Transport in 3D


Description


This project aims at developing new efficient numerical methods for optimal transport in 3D. The rich mathematical structure of the problem leads to interesting theoretical developments. It also has the potential of leading to new efficient numerical methods to solve some nonlinear PDEs of a certain class (Monge-Ampere); see Benamou, Carlier, Merigot, and Oudet's article http://arxiv.org/abs/1408.4536 on the JKO (Jordan-Kinderlehrer-Otto) scheme. My early experiments in 3D seem promising: the semi-discrete optimal transport between a measure with density and a sum of Dirac masses can be computed for up to 1 million Dirac masses. Since the JKO scheme requires solving multiple instances of optimal transport problems to solve Monge-Ampere PDEs, the goal is to study the possibility of accelerating the convergence of the algorithm, to make it usable in practice. URL sujet détaillé : http://arxiv.org/abs/1409.1279
Remarques : We have several possibilities of funding for continuing with a Ph.D. thesis.





SM207-43 Understanding the pelvic dynamics through a 3D simulation of childbirth


Admin


Encadrant : Fabrice JAILLET 
Labo/Organisme : SAARA team of the LIRIS in Lyon and ER3:MaSTR team of the Laboratory of Mechanics in Lille, involved in the development of a biomechanical model for the simulation of interactions between the pelvic dynamics of the pregnant woman and the fetus during delivery.
URL : https://liris.cnrs.fr/saga/wiki/doku.php 
Ville : Villeurbanne Doua 



Description


Context:
The SAARA team of the LIRIS in Lyon and the ER3:MaSTR team of the Laboratory of Mechanics in Lille are involved in the development of a biomechanical model for the simulation of interactions between pelvic dynamics of the pregnant woman and the fetus during delivery.
The purpose of this project is now to adapt the existing 3D model to allow interaction with the fetus during labor. Ultimately, this implies providing a more or less "patient-specific" geometric model, and setting it up under different scenarios (orientation and shape of the pelvis and of the fetal head, altering the properties of ligaments, etc.).
This 3D model can be used to perform biomechanical simulations, which will aim to:
1) assist in understanding the role of various anatomical structures,
2) observe the physiological damage consecutive to the childbirth, and characterize the possible induced trauma.
Proposed missions:
The physiological model of childbirth will be degraded to move towards more interactive simulations, for example by using lower-resolution meshes or simpler behavior laws (linear, non-linear), etc.
This work will go through a phase of comparison with the full model (taken as "ground truth") and with dynamic 2D MRI images. The challenge will be to control the error while ensuring mechanical validity.
It will be necessary to perform the analysis and the following developments:
- literature review leading to the definition of a relevant error criterion, a posteriori and possibly during the simulation,
- software implementation (C++) of the adopted solution in the CamiTK platform,
- coupling with the simulation software Abaqus.
Depending on the progress, it will be useful to adapt the 3D model (Catia) according to the different scenarios.
Area of expertise:
The candidate will ideally be skilled at the interface between mechanics and computer graphics applied to biomedical simulation. Skills in computational mesh processing and numerical analysis would be a plus. URL sujet détaillé : http://liris.cnrs.fr/fabrice.jaillet/data/M2.ErreurAccouchement201415.pdf
Remarques : Contract Type: Master research or engineering school regulated internship
Équipe SAARA,
LIRIS UMR CNRS 5205,
Domaine scientifique de la Doua,
Bâtiment Nautibus,
2325 Av. Pierre de Coubertin,
F69100 Villeurbanne Cedex.
Contact :
LIRIS : Fabrice JAILLET (fabrice.jaillet (at) liris.cnrs.fr)
Florence ZARA (florence.zara (at) liris.cnrs.fr)
LML : Mathias BRIEU (mathias.brieu (at) eclille.fr)
LIRISCHU Lyon : Géry LAMBLIN (gery.lamblin (at) chulyon.fr)





SM207-44 Bounded graph coloring


Description


A proper vertex-coloring of a graph G = (V, E) is a mapping from the set of vertices V to a set of colors, usually represented by the set of integers {1, ..., k}, in such a way that any two adjacent vertices are assigned distinct colors.
There exist many variations of graph coloring: coloring edges instead of vertices, coloring with lists instead of integers, adding constraints such as forbidding bicolored cycles...
One (quite natural) constraint on the coloring is to bound the number of elements (vertices or edges) that may receive the same color. In that case, we speak of b-bounded coloring, where b denotes the maximum number of elements that may receive the same color. Such colorings have applications in timetable scheduling or job scheduling in multiprocessor systems, for instance.
The goal of this internship is:
- firstly, to review existing results on b-bounded colorings;
- secondly, to study the b-bounded version of other types of graph colorings (such as, for example, oriented vertex-colorings or weighted edge-colorings, two types of colorings well studied at LaBRI), both in their structural and algorithmic aspects.
An oriented coloring of an oriented graph is a proper coloring such that if there exists an arc uv with color(u) = a and color(v) = b, then there is no arc wx with color(w) = b and color(x) = a. A weight edgecoloring of an undirected graph is a mapping from its set of edges to a set of weights W = {1, ..., w} such that the mapping that assigns to each vertex the sum of the weights of its incidents edges is a proper vertexcoloring.
References. P. Hansen, A. Hertz and J. Kuplinsky. Bounded vertex colorings of graphs. Discrete Math. 111:305–312 (1993). M. Jarvis and B. Zhou. Bounded vertex coloring of trees. Discrete Math. 212:145–151 (2001). R. Rizzi and D. Cariolaro. Polynomial time complexity of edge colouring graphs with bounded colour classes. Algorithmica 69:494–500 (2014). URL sujet détaillé :
Remarques : Co-supervisor: Eric Sopena, Professor at LaBRI





SM20745 Types and interpretations for programs of polynomial time complexity


Description


Can one define simple and modular programming disciplines in a high-level language ensuring that the programs produced have bounded complexity, for instance polynomial time? This question is explored by the field of implicit computational complexity: it aims at defining static complexity criteria that do not rely on explicit bounds on resources. This internship proposal focuses on implicit complexity criteria for functional programs, which can use higher-order subprograms. It aims at relating two approaches to this issue, based respectively on type systems and on the semantic notion of interpretation. URL sujet détaillé : http://perso.enslyon.fr/patrick.baillot/STAGES/2015/sujet_interpretations2015.pdf
Remarques :





SM20746 Secure Evaluation of an S-Box


Description


Running a cryptographic algorithm (DES, AES, RSA, ...) on an embedded device induces, in the absence of specific precautions in the implementation, variations in the device's behavior. These variations can be observed (via power consumption, electromagnetic emissions, ...) and combined using statistical methods to deduce the secrets manipulated by the device. Implementing cryptographic functions for embedded applications (smart cards, for instance) therefore requires countermeasures against such attacks, known as side-channel attacks. A classical countermeasure, called masking, consists in splitting the inputs of the cryptographic primitives into several randomly generated shares, and then replacing the sensitive operations (i.e., those operating on secret data) by sequences of computations that each operate only on a limited number of these shares. The way such countermeasures are applied differs for each primitive, relying on techniques from Secure Multiparty Computation, coding theory and/or efficient computation over finite fields.
Securing the evaluation of certain primitives called S-boxes by means of masking is particularly difficult, and raises questions closely related to well-known problems in the efficient hardware implementation of polynomials defined over finite fields of characteristic 2. Recent work has shown that the difficulty of protecting the implementation of an S-box depends on the number of multiplications needed to evaluate the polynomial corresponding to the S-box (multiplications that are squarings being omitted from the complexity count). A series of papers has thus led to the publication of evaluation methods that try to minimize the number of multiplications. The most efficient of them, published in the proceedings of this year's CHES conference, can in several respects be considered ad hoc.
The goal of the proposed internship is to investigate ways to make the method of [CRR14] more formal and, possibly, to improve its efficiency. To do so, the student will first become familiar with the state of the art. He or she will then implement the solution proposed in [CRR14] in a computer algebra system such as Maple or Magma. The second part of the internship will consist in exploring, together with the supervisors, directions to improve the state of the art.
[CRR14] J.-S. Coron, A. Roy, and S. Vivek. "Fast Evaluation of Polynomials over Finite Fields and Application to Side-channel Countermeasures". In CHES, 2014. To appear. URL sujet détaillé : http://wwwsalsa.lip6.fr/~perret/Site/stageEval.pdf
Remarques :





SM20747 FouCo


Description


Isabelle/HOL is a popular proof assistant based on higher-order logic. It has recently been extended with support for codatatypes and corecursion. The goal of this project would be to extend Isabelle so that it can process a larger class of corecursive functions, i.e., functions that produce possibly infinite values. This would enable Isabelle to go beyond the state of the art of its competitors (Agda, Coq, and Matita).
The implementation language is Standard ML (an Anglo-Saxon dialect of OCaml). The implementation would consist of two modules:
1. A command that would parse the specification of a function f, define f in terms of a so-called corecursor, and derive some theorems.
2. A command that would be used to derive the corecursor needed for point 1.
We explicitly do not expect the internship to result in the completion of both modules. The supervisor and one of the co-supervisors would also participate in the coding effort. URL sujet détaillé : http://www21.in.tum.de/~blanchet/corec.pdf
Remarques : Co-supervisors: Andrei POPESCU and Dmitriy TRAYTEL.





SM20748 Relativized Implicit Computational Complexity and Declassification


Description


On one hand, implicit computational complexity is the attempt to characterize complexity classes without referring to specific machines. Statically analyzing a program may give bounds on its complexity. On the other hand, confidentiality and integrity properties can be statically analyzed: each variable is given a security level (for the sake of simplicity, assume two levels: public and private) and various systems have been proposed to ensure that no information may leak from private to public variables. Both of these problems rely heavily on the noninterference property. Roughly, two variables are noninterferent when they are independent, i.e. when changing the value of one variable does not change the value of the other. Several techniques of noninterference analysis have been successfully adapted to the field of implicit computational complexity: the fact that variables are noninterferent makes it possible to stratify computations into tiers.
Noninterference is too strict to cope with many everyday scenarios; consider, for instance, password checking: a secret value is compared to a public value and the result of the test is public. Thus many declassification frameworks have been proposed.
The aim of the internship is to explore the following idea: in some declassification settings, downgrading of information is possible through some fixed functions. Is it possible to reuse these kinds of frameworks from an implicit-complexity point of view, with the idea that these downgrading functions are viewed as oracles? Is it possible to characterize complexity classes relativized to the oracle in this context?
References :
[1] Frédéric Prost: Enforcing Dynamic Interference Policy. SocialCom/PASSAT 2011.
[2] Jean-Yves Marion: A Type System for Complexity Flow Analysis. LICS 2011.
[3] Andrei Sabelfeld, David Sands: Declassification: Dimensions and principles. Journal of Computer Security 17(5): 517–548 (2009). URL sujet détaillé :
Remarques :





SM20749 Preventing SQL Injection Attacks using Information Flow Control


Description


Keywords : Security, SQL Injections, Access/Flow/Dissemination Control, Declassification
Over the last decade, the importance of web security has increased tremendously. Most companies have a web portal, and many collaborative services are nowadays provided by web applications. Software vulnerabilities like buffer overflows and format strings are less and less significant compared to higher-level attacks on web applications such as SQL injections, XSS, CSRF... In this work we focus on SQL injections.
An SQL injection is possible because an application (e.g. in Java), responsible for most of the computation and input/output, interacts with a database application (e.g. in SQL), responsible for most of the data storage and retrieval. The Java application transmits SQL queries to the database without knowing what is being requested: an SQL query is just a string without any semantic meaning for the Java application (we voluntarily exclude parameterized queries, which only solve part of the problem). It is this blind interaction that is largely responsible for transmitting malicious SQL queries (i.e. injections) to the DBMS, which Java applications cannot detect. Moreover, beyond the different languages used to develop these applications, this interaction is also complicated by the different paradigms used in the two programs: imperative for Java and declarative for SQL. In this work, we aim at preventing SQL injections by tracking end-to-end information flow through both the application and the database.
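To make the blind-interaction problem concrete, here is a minimal self-contained sketch (our illustration, using Python's sqlite3 in place of a Java/DBMS pair; the table and function names are ours) of a login check built by string concatenation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_unsafe(name, password):
    # The query is assembled by string concatenation: the application
    # ships an opaque string to the DBMS with no semantic check.
    query = ("SELECT COUNT(*) FROM users WHERE name = '%s' "
             "AND password = '%s'" % (name, password))
    return conn.execute(query).fetchone()[0] > 0

# A wrong password is correctly rejected...
print(login_unsafe("alice", "wrong"))        # False
# ...but a crafted input rewrites the WHERE clause into a tautology:
# ... AND password = '' OR '1'='1'
print(login_unsafe("alice", "' OR '1'='1"))  # True
```

The application never notices anything unusual, which is exactly why end-to-end information flow tracking through both layers is of interest here.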
Many contributions have been proposed to deal with information flow control in imperative or declarative programs, but only a few of them [1,6] have tracked end-to-end information flow through the whole program (both imperative and declarative). The core of this work is to study this interaction and to propose an approach to track data from their input in the imperative system to their storage in the declarative system, and conversely to track data output through the declarative layer and then through the imperative one. For this purpose, the dissemination control model TBAC [2], proposed for declarative systems, will be adapted and used throughout the whole system.
During this internship, the student will conduct a literature review on SQL injection countermeasures [7], decentralized information flow control [3], dissemination control [4] and declassification [5], which is a central point in the TBAC approach. Then the student will propose a common imperative/declarative formalism tracking information flows through the application and the database and taking declassification into account. Finally, a prototype will be developed to validate the proposed approach and to evaluate the performance overhead. Please note that this internship may lead to a PhD thesis.
References : [1] David Schultz, Barbara Liskov. IFDB: decentralized information flow control for databases. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys), 2013.
[2] François Lesueur, Sabina Surdu, Romuald Thion, Yann Gripay, Meriam Ben GhorbelTalbi. Palpable Privacy through Declarative Information Flows Tracking for Smart Buildings. In Proceedings of the International Conference on Availability, Reliability and Security (ARES), Suisse, 2014.
[3] Andrew C. Myers, Barbara Liskov. A Decentralized Model for Information Flow Control. In Proceedings of the ACM Symposium on Operating Systems Principles, France, 1997.
[4] Roshan K. Thomas, Ravi Sandhu. Towards a Multidimensional Characterization of Dissemination Control. Proceedings of the Fifth IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2004.
[5] Andrei Sabelfeld, David Sands. Dimensions and Principles of Declassification. In Proceedings of the IEEE Computer Security Foundations Workshop, France, 2005.
[6] P. Li and S. Zdancewic. Practical information flow control in Web-based information systems. 18th IEEE Workshop on Computer Security Foundations (CSFW), 2005.
[7] William G.J. Halfond, Jeremy Viegas, and Alessandro Orso. A Classification of SQL Injection Attacks and Countermeasures. Proceedings of the IEEE International Symposium on Secure Software Engineering, USA, 2006. URL sujet détaillé :
Remarques : This internship will be paid and co-supervised by Meriam Talbi.





SM20750 Optimizing avatars lifecycle in a Cloud infrastructure


Description


Avatars are virtual abstractions that extend physical objects to the Web. They are a central piece of the ASAWoO project (http://liris.cnrs.fr/asawoo/). A cloud infrastructure is an ideal location for avatars, since they can rely on the almost unlimited resources provided by interconnected cloud infrastructures. We therefore need to study several aspects of the lifecycle of these avatars, ranging from creation and destruction to migration, while considering resource usage (energy consumption, storage space, bandwidth usage, processing power, ...). Thus, we need to study how to optimize avatar instantiation, deployment, execution and hibernation in a cloud infrastructure and, ultimately, how to migrate an avatar between two infrastructures when required.
The cloud software architecture will be built on top of an existing infrastructure currently hosted at the computer science department of Université Claude Bernard Lyon 1. This infrastructure is currently composed of 10 hexa-processor nodes providing, in total, 640 GB of RAM and 52 TB of disk space. The cloud software currently running is the free, open-source OpenStack solution.
Requirements : Some knowledge of OSGi and OpenStack is preferred
Resources and links :
Orgerie, A.-C.; Lefevre, L.; Gelas, J.-P., "Demystifying energy consumption in Grids and Clouds", 2010 International Green Computing Conference, pp. 335–342, 15–18 Aug. 2010. Lefevre, L.; Orgerie, A.-C., "Designing and evaluating an energy efficient Cloud", The Journal of Supercomputing, vol. 51, n. 3, March 2010. Koomey, J., "Growth in data center electricity use 2005 to 2010", Analytics Press, 2011. Orgerie, A.-C.; Dias de Assunção, M.; Lefèvre, L., "A Survey on Techniques for Improving the Energy Efficiency of Large Scale Distributed Systems", ACM Computing Surveys, 2013. Topic: TIWE
URL sujet détaillé :
Remarques : Co-supervision: Jean-Patrick Gelas





SM20751 Asynchronous convergence of monotone Boolean networks


Description


Boolean networks are finite dynamical systems consisting of a finite number of binary variables evolving, in discrete time, through mutual interactions and according to a fixed law. They are used in many disciplines for their ability to model nonlinear behaviors. The most famous applications are in biology, where they provide classical models for the dynamics of neural and gene networks.
This internship consists in studying the asynchronous dynamics of monotone Boolean networks, which present an interesting intermediate level of complexity and have interesting applications in the context of biological networks. It has been proved in [1] that, given a monotone Boolean network and an initial configuration x, the set R of configurations reachable from x by some asynchronous updating has the following properties:
1/ R contains a fixed point reachable from x by updating each component at most once;
2/ R contains a unique maximal fixed point and a unique minimal fixed point, both reachable from x by updating each component at most twice.
These results suggest the following natural question, which is the core of the subject: is every fixed point of R reachable from x by updating each component a polynomial number of times?
The study could start with AND-OR nets, i.e. monotone Boolean networks in which each local transition function is either a disjunction or a conjunction of positive literals. These networks can thus be represented by a directed graph with two types of vertices (and-vertices and or-vertices), and the use of concepts from graph theory could allow rapid and informative progress on the main question, which is difficult in the general case.
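The asynchronous dynamics can be explored on a toy example. The sketch below (our illustration; the three local AND/OR functions are an arbitrary monotone choice) enumerates the set R of configurations reachable from x by single-component updates, then lists the fixed points in R:

```python
# A small AND-OR net on 3 components: each local function is an OR or
# an AND of positive literals, so the network is monotone.
functions = [
    lambda x: x[1] | x[2],   # or-vertex
    lambda x: x[0] & x[2],   # and-vertex
    lambda x: x[0] | x[1],   # or-vertex
]

def async_reachable(x):
    """All configurations reachable from x by asynchronous updates,
    i.e. updating one component at a time."""
    seen, stack = {x}, [x]
    while stack:
        y = stack.pop()
        for i, f in enumerate(functions):
            z = list(y)
            z[i] = f(y)
            z = tuple(z)
            if z not in seen:
                seen.add(z)
                stack.append(z)
    return seen

R = async_reachable((1, 0, 0))
fixed = [y for y in R if all(f(y) == y[i] for i, f in enumerate(functions))]
print(sorted(fixed))  # → [(0, 0, 0), (1, 1, 1)]: the minimal and maximal fixed points
```

On this instance R contains exactly two fixed points, which are indeed the unique minimal and maximal ones predicted by [1].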
[1] T. Melliti, D. Regnault, A. Richard and S. Sené. On the convergence of Boolean automata networks without negative cycles. In J. Kari, M. Kutrib, A. Malcher (eds.), Cellular Automata and Discrete Complex Systems (Proceedings of AUTOMATA 2013), Springer Berlin Heidelberg, LNCS 8155, 124–138. PDF version: http://www.i3s.unice.fr/~richard/
URL sujet détaillé : http://www.i3s.unice.fr/~richard/stageM2/stage_M2_ENS_LYON.pdf
Remarques : Compensation: usual stipend (€436.05)





SM20752 Generation of RDF triples from raw sensor data


Description


The objective of this MSc topic is to implement a tool capable of processing massive raw data gathered from primary sources of all kinds, such as sensor devices, surveillance videos, manually entered inputs, etc., cleaning it up into consistently uniform RDF representations. The generated RDF data is to comprise triplestores encompassing data to be analyzed and further processed into intensional knowledge characterizing the sort of specific information this data relies on. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic1.pdf
Remarques :





SM20753 Derivation of ontologies from RDF triplestores


Description


This MSc topic concerns the analysis of RDF triplestores generated from raw sensor data in order to derive terminological schemas that the generated RDF triples abide by. The objective is to categorize these triplestore data using classes and attributes fitting their structure using Formal Concept Analysis (FCA). The outcome is to constitute a terminological schema for this data that can be used for reasoning about such data. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic2.pdf
Remarques :





SM20754 Use of terminological knowledge for monitoring applications


Description


This topic's objective is to develop a use case exploiting terminological knowledge over monitoring data, with reasoning based on Order-Sorted Feature constraint logic, for smart information retrieval, data analytics, and query processing. The goal is to demonstrate how to leverage the capabilities brought by knowledge-based processing of monitoring data for intelligent applications, with a convincing use case based on actual gathered data. URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic3.pdf
Remarques :





SM20755 DYNFO and Regular Trees


Description


The seminal result of Courcelle et al. states that every MSOL (optimisation/counting) property can be solved in polynomial time on graph classes of bounded clique-width. Even though this is interesting, many real-life objects are dynamic (for instance, databases receive updates), and when a database is modified, recomputing a decomposition from scratch is counterproductive. Instead, we would like to maintain the answers as the database changes. This situation is addressed by the introduction of DynFO, the class of properties that can be maintained and queried in first-order logic. Since its introduction, interest has mainly focused on a small set of graph properties and on regular words. The goal is to extend the study to regular trees and then to structures that are images of trees (graphs of bounded tree-width/clique-width, matroids/hypergraphs of bounded width, etc.)
For more details see the link below. URL sujet détaillé : http://www.isima.fr/~kante/cours/DynFO.pdf
Remarques : funding is possible (around 437 euros)





SM20756 Algorithms for the Traveling Salesman Problem


Description


The well-studied symmetric Traveling Salesman Problem (STSP) is to find a minimum-cost tour in an undirected, weighted complete graph in which the edge weights are nonnegative and obey the triangle inequality. While the problem is NP-hard, there is a simple algorithm with an approximation guarantee of 2 based on doubling a minimum spanning tree. A slightly more sophisticated algorithm due to Christofides shows that the problem can be approximated to within a factor of 3/2. Improving on this factor is a well-known open problem.
In the last few years, there have been many exciting results (e.g. Moemke-Svensson 2011, Mucha 2012, Sebo-Vygen 2012) pertaining to a special case known as Graph-TSP, where the edge weights correspond to the shortest path distances in a given unweighted graph. The current best-known approximation guarantee for this problem stands at 1.4. These new approaches have raised many interesting related questions in graph theory.
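The tree-doubling 2-approximation mentioned above can be sketched in a few lines (our minimal illustration, assuming a symmetric distance matrix obeying the triangle inequality; the instance at the bottom is ours):

```python
def two_approx_tsp(dist):
    """2-approximation for metric TSP: build a minimum spanning tree
    (Prim), then shortcut a preorder walk of the tree into a tour.
    The triangle inequality guarantees shortcuts never increase cost,
    and the walk traverses each tree edge twice, hence the factor 2."""
    n = len(dist)
    # Prim's algorithm; parent pointers define the MST.
    in_tree, parent = {0}, {}
    best = {v: (dist[0][v], 0) for v in range(1, n)}
    while len(in_tree) < n:
        v = min(best, key=lambda u: best[u][0])
        parent[v] = best[v][1]
        in_tree.add(v)
        del best[v]
        for u in best:
            if dist[v][u] < best[u][0]:
                best[u] = (dist[v][u], v)
    # Preorder (DFS) walk of the tree, shortcutting repeated vertices.
    children = {v: [] for v in range(n)}
    for v, p in parent.items():
        children[p].append(v)
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour + [0]          # close the tour

# 4 points on a unit square (Manhattan distances).
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
print(two_approx_tsp(dist))
```

On this tiny instance the shortcut walk already returns an optimal tour of cost 4; in general the guarantee is only a factor 2.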
The goal of this project is to study the recent developments and related questions.
Requirement: interest in Algorithms and Graph Theory. URL sujet détaillé :
Remarques :





SM20757 Formal proofs for realtime analysis using Coq


Description


Scientific context: Realtime systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is nontrivial because tasks executing on the same processor may have complex timing influence on each other due to scheduling. Busy window analysis [1] is a well-established method, based on a fixpoint computation, to compute bounds on worst-case response times in uniprocessor realtime systems. It was originally introduced for the Static Priority Preemptive scheduling policy (SPP), then reused for the non-preemptive case (SPNP) [2] and has since been an inspiration for the analysis of many new scheduling policies. However, these proofs are too often somewhat informal. For example, the original SPNP analysis had a flaw which was discovered only years after its original publication [3].
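The SPP busy window recurrence of [1], R = C_i + sum over higher-priority tasks j of ceil(R / T_j) * C_j, can be sketched as a fixpoint iteration (a minimal illustration; the task set is ours and we assume implicit deadlines equal to periods):

```python
from math import ceil

def response_time(tasks, i):
    """Worst-case response time of task i under Static Priority
    Preemptive scheduling, via the busy window fixpoint
    R = C_i + sum_{j in hp(i)} ceil(R / T_j) * C_j.
    tasks: list of (C, T) pairs, sorted by decreasing priority."""
    C_i, T_i = tasks[i]
    R = C_i
    while True:
        R_next = C_i + sum(ceil(R / T) * C for C, T in tasks[:i])
        if R_next == R:
            return R                 # fixpoint reached
        if R_next > T_i:
            return None              # exceeds the (implicit) deadline
        R = R_next

# Three tasks (C, T), highest priority first.
tasks = [(1, 4), (2, 6), (3, 12)]
print([response_time(tasks, i) for i in range(3)])  # → [1, 3, 10]
```

The iteration is monotone and bounded, so it terminates; proving exactly such properties formally is what the Coq formalization would address.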
Thesis subject: The objective of this thesis is to initiate formal support based on the proof assistant Coq [4] for these analyses. The student will have to: (1) get acquainted with the SPP and SPNP scheduling policies, busy window analysis and its extension to arbitrary event models [5]; (2) identify the proper concepts and lemmas of the original proofs; (3) formalize them in Coq. A particularly interesting side effect of this work would be to identify the key properties of the concept of busy window so that it can be easily adapted for proofs of other scheduling policies.
Bibliography: [1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133–151 (1994) [2] Ken Tindell, Alan Burns: Guaranteeing message latencies on Controller Area Network (CAN). Proceedings of the 1st international CAN conference: 1–11 (1994) [3] Robert I. Davis, Alan Burns, Reinder J. Bril, Johan J. Lukkien: Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised. Real-Time Systems 35(3): 239–272 (2007) [4] http://coq.inria.fr/ [5] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17–20 (2004) URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20758 Incremental proofs for realtime analysis à la SymTA/S


Description


Scientific context: Realtime systems must satisfy timing requirements such as response time bounds, i.e., bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is nontrivial as tasks executing on the same processor may have complex timing influence on each other due to scheduling. Furthermore, such effects propagate through the system, so that tasks executing on different processors may also influence each other. Compositional Performance Analysis (CPA), best known through its commercial implementation SymTA/S [1], is a compositional approach for computing such bounds. It is based on a fixpoint computation at the system level on top of busy window analyses [2] at the processor level, which means that there are nested fixpoint computations. In the current state of the art, any change to the system requires a new analysis from scratch.
Thesis subject: The objective of this master thesis is to establish simple criteria guaranteeing that it is correct to reuse, for a response-time analysis, the fixpoint obtained by a previous analysis. The student will have to: (1) understand the principles of CPA and get acquainted with its free Python-based implementation pyCPA [3]; (2) establish a list of structural changes (adding/removing a task, replacing/remapping a task, etc.) for which previous results may be reused, and formally prove the correctness of the approach; (3) implement the established criteria in pyCPA and quantify the efficiency gain compared to the state of the art.
Bibliography: [1] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17–20 (2004) [2] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133–151 (1994) [3] pycpa.readthedocs.org/ URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis on a related topic. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20759 Formal mechanized models for distributed algorithms


Description


Distributed computing is one of the domains where informal reasoning is not an option, in particular when Byzantine failures are involved. Distributed computing is also characterized by its diversity of models, subtle modifications of which induce radical changes in system behaviour. We consider robot networks, that is, swarms of *autonomous* mobile entities that have to accomplish some task in *cooperation*. In this emerging framework, models are distinguished by the capabilities of the robots, the topology of the considered space, the level of synchrony (that is, the properties of a demon), the type of failures likely to occur, etc.
We are interested in obtaining formal, and moreover mechanized, guarantees of properties for certain protocols, using the Coq proof assistant. A recently proposed Coq framework for robot networks can express quite a few variants of models for such networks, in particular regarding topology or demon properties. This framework is higher-order and based on coinductive types. It allows us to prove in Coq positive results (the task will be fulfilled, using the algorithm embedded in all robots, *for all* initial configurations) as well as negative results (*there cannot be any* embedded algorithm that will work for this task for all initial configurations).
An objective of this proposal is to extend this framework to particular topologies (higher dimensions) so as to propose a framework in which new results could be formally established, in particular regarding the election of a leader or the formation of geometrical patterns. These two problems have been shown equivalent (meaning that there is an algorithm that solves one of them if it has as input a solution to the other) under some particular assumptions. Extending this equivalence to less constrained models is an interesting open problem. URL sujet détaillé : http://pactole.lri.fr/pub/M2_1415.pdf
Remarques : Co-supervision with:  Pierre Courtieu (pierre.courtieu.fr)
 Sébastien Tixeuil (sebastien.tixeuil.fr)
Coq proficiency required.





SM20760 Strategies to deal with deadline misses in a weakly-hard realtime system


Description


Scientific context: Realtime systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. For some systems, called weakly-hard, the specified timing requirements may be violated once in a while, as long as the number of violations can be bounded. Establishing such properties is nontrivial because tasks executing on the same processor may have complex timing influence on each other due to scheduling. Busy window analysis [1] is a well-established method, based on a fixpoint computation, to compute bounds on worst-case response times in uniprocessor realtime systems. In the current state of the art, the analyzed scheduler is deadline-agnostic: whenever a timing requirement is violated because the specified bound on response times has been exceeded, the task experiencing this violation still runs to completion. An alternative would be to kill the belated execution, which would reduce the delay experienced by subsequent executions.
Thesis subject: The objective of this thesis is to adapt busy window analysis for the Static Priority Preemptive (SPP) scheduling policy, assuming that executions are stopped as soon as they miss their deadline. The student will have to: (1) get acquainted with the SPP scheduling policy, busy window analysis and weakly-hard systems analysis; (2) adapt the standard response time analysis to support killing task executions that miss their deadline, as well as the corresponding weakly-hard analysis; (3) implement the proposed method within the free Python-based response time analysis tool pyCPA [2] and test it on existing test cases. An interesting extension of this work would be to extend the approach to arbitrary event models (as opposed to fully periodic tasks) [3]. The long-term goal is to combine these results with an analysis of the impact of deadline misses on the controller receiving the output generated by the analyzed tasks.
Bibliography: [1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133–151 (1994) [2] pycpa.readthedocs.org/ [3] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17–20 (2004) URL sujet détaillé :
Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.
Feel free to contact me if you have questions related to this subject!





SM20761 Efficient symbolic resolution of systems of linear differential and difference equations


Description


In the context of problems involving solutions of linear differential or difference equations, computer algebra often crucially relies on the resolution of auxiliary systems in terms of rational functions. The simplest approach to this problem is to uncouple the system, i.e., to obtain a scalar equation of higher order for each component of the vector solutions. Rational solutions are then computed using algorithms due to Abramov for the resolution of scalar equations. The uncoupling procedure by itself is already time-consuming, so it seems advantageous to use direct solution algorithms that avoid uncoupling. The basis of such an algorithm exists (Barkatou, 1999), but it remains to adapt it to applications and to precisely quantify the expected gain.
The internship is expected to accomplish two goals. First, the algorithm described in (Barkatou, 1999) will be validated by an implementation in Maple, first in its simplest version, then in its parameterized version. This parameterized version will then be integrated into the Mgfun package, which implements algorithms from (Chyzak, 2000) for symbolic summation and integration in Maple.
- Moulay A. Barkatou: On rational solutions of systems of linear differential equations. Journal of Symbolic Computation 28 (1999), 547-567.
- Frédéric Chyzak: An extension of Zeilberger's fast algorithm to general holonomic functions. Discrete Mathematics 217 (2000), 115-134.
Detailed subject URL: http://specfun.inria.fr/internships/
Remarks: Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20762 Efficient computation of power series coefficients


Description


The need to compute a single coefficient of a given formal power series is frequent in combinatorics. Indeed, in many cases, the enumeration of objects subject to certain structural constraints reduces to extracting a coefficient of a series. In more elaborate contexts, several parameters are involved, and the corresponding power series have several variables; the question is then to extract the coefficient of a monomial in several variables. For obvious efficiency reasons, it is important to be able to compute this coefficient without computing all the previous ones. This internship focuses on two fast algorithms, due to Massazza and Radicioni, for extracting such coefficients for classes of power series whose coefficients are related by a set of linear recurrences. The main objective of the internship is to implement these algorithms and compare them on various applications.
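In the special case of a rational series whose coefficients obey a constant-coefficient recurrence, a single coefficient can be reached in O(log N) arithmetic operations by binary powering of the companion matrix, with no need to generate the previous coefficients. The sketch below is a plain-Python illustration of this baseline idea only (the Massazza-Radicioni algorithms target the harder holonomic case, where the recurrences have polynomial coefficients); it extracts the N-th coefficient of x/(1 - x - x^2), i.e. the N-th Fibonacci number:

```python
def mat_mul(X, Y):
    """2x2 integer matrix product."""
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

def mat_pow(M, n):
    """Binary powering: O(log n) matrix multiplications."""
    R = [[1, 0], [0, 1]]  # identity
    while n > 0:
        if n & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        n >>= 1
    return R

def fib_coeff(n):
    """Coefficient of x^n in x/(1 - x - x^2), via the companion
    matrix of the recurrence f(n) = f(n-1) + f(n-2)."""
    if n == 0:
        return 0
    # [[f(n+1), f(n)], [f(n), f(n-1)]] = [[1,1],[1,0]]^n
    M = mat_pow([[1, 1], [1, 0]], n - 1)
    return M[0][0]

print(fib_coeff(10))  # 55
```

In the holonomic setting the recurrence coefficients depend on the index n, so the matrix changes at each step and plain binary powering no longer applies; this is exactly the gap the two algorithms studied in the internship address.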
- P. Massazza and R. Radicioni: On computing the coefficients of rational formal series. In: Formal Power Series and Algebraic Combinatorics, Vancouver, 2004.
- P. Massazza and R. Radicioni: On computing the coefficients of bivariate holonomic formal series. Theoretical Computer Science 346 (2005), no. 2-3, 418-438.
Detailed subject URL: http://specfun.inria.fr/internships/
Remarks: Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20763 Self-modifying code for the security of embedded systems


Description


The CEA is developing a technology for runtime code generation around a tool called deGoal, whose initial motivation is the observation that execution performance can be strongly correlated with the characteristics of the data to be processed (cf. related publications on this website).
Preliminary developments have already been carried out toward the objectives of the project: the deGoal framework has been extended so that code polymorphism can be applied to constrained embedded platforms such as microcontrollers, and a preliminary validation of our approach has been performed on AES, together with experimental measurements.
The objectives of this internship are to continue the implementation of our software solution for code polymorphism, focusing in particular on the performance of runtime code generation in order to reduce the overall overhead of our approach.
The intern will be in charge of:
- identifying the main performance issues in the current implementation;
- analyzing the potential degree of polymorphism achievable with our solution;
- designing and implementing more efficient solutions, assessed with respect to the number of polymorphic instances that can be generated;
- validating the approach experimentally.
Detailed subject URL: http://www.cogitoanr.fr/posts/internshippositionatceagrenoble.html
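The core idea of code polymorphism, generating at run time many functionally equivalent but syntactically different variants of a sensitive routine, can be sketched in a few lines. The toy Python example below is purely illustrative: deGoal generates native machine code on embedded targets, and the key-splitting trick here is a hypothetical stand-in for a real masking countermeasure, not the project's actual scheme:

```python
import random

def gen_variant(key):
    """Generate a fresh, semantically equivalent variant of x -> x ^ key.
    The key is split into two random shares, so each generated variant
    embeds different constants while computing the same function."""
    k1 = random.getrandbits(32)
    k2 = key ^ k1  # (x ^ k1) ^ k2 == x ^ key for any x
    src = f"def f(x):\n    return (x ^ {k1}) ^ {k2}\n"
    ns = {}
    exec(src, ns)  # runtime code generation, in the spirit of polymorphism
    return ns['f'], src

key = 0xDEADBEEF
f1, src1 = gen_variant(key)
f2, src2 = gen_variant(key)
assert f1(12345) == f2(12345) == 12345 ^ key  # same behavior
print(src1 != src2)  # distinct source, almost surely
```

The performance concern stated above maps directly onto this sketch: the cost of producing each new variant (here, string formatting plus `exec`; on a microcontroller, emitting and flushing native instructions) is the overhead that the internship aims to reduce.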
Remarks:





SM20764 Algorithmic proofs for the transcendence of power series


Description


A formal power series is called algebraic if it is a root of a polynomial, and transcendental otherwise. Establishing the transcendence of a generating function is a constant need in enumerative combinatorics. The goals of the internship are the study, comparison, and implementation of algorithmic methods that answer the following question in a uniform manner: given a sequence of rational numbers satisfying a linear recurrence with polynomial coefficients, decide the transcendence of its generating function. The question is nontrivial even for sequences satisfying a first-order recurrence. A classic algorithm is sufficient, in principle, to settle the general case (Singer, 1979). However, this algorithm suffers from too high a complexity to be effective in practice. A more recent method has been proposed, on a nontrivial combinatorial example, in (Bostan et al., 2014). The central objectives of this internship are to understand the new method, translate it into an explicit algorithm working uniformly on a class of inputs to be defined, implement this algorithm, test it, and measure its theoretical and practical efficiency.
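As a concrete point of reference for the definition, the generating function of the central binomial coefficients, f(x) = 1/sqrt(1 - 4x), is algebraic: it is a root of the polynomial (1 - 4x) y^2 - 1. A quick SymPy check (purely illustrative; it only verifies a known algebraicity certificate, whereas the internship's algorithm must decide transcendence, i.e. certify that no such polynomial exists):

```python
import sympy as sp

x = sp.symbols('x')
f = 1 / sp.sqrt(1 - 4*x)

# f is a root of the polynomial (1 - 4x)*y**2 - 1, hence algebraic.
assert sp.simplify((1 - 4*x) * f**2 - 1) == 0

# Its Taylor coefficients are the central binomial coefficients C(2n, n),
# a sequence satisfying the first-order recurrence
# (n + 1) * c(n+1) = (4n + 2) * c(n).
coeffs = sp.Poly(sp.series(f, x, 0, 6).removeO(), x).all_coeffs()[::-1]
print(coeffs)  # [1, 2, 6, 20, 70, 252]
```

Proving that a candidate series is *not* a root of any polynomial is the hard direction, which is why a uniform, effective decision procedure is the objective here.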
- A. Bostan, M. Bousquet-Mélou, M. Kauers, S. Melczer: On 3-dimensional lattice walks confined to the positive octant (2014), http://arxiv.org/abs/1409.3669.
- M. F. Singer: Algebraic Solutions of n-th Order Linear Differential Equations. Queen's Papers in Pure and Applied Mathematics 54 (1979), 379-420.
Detailed subject URL: http://specfun.inria.fr/internships/
Remarks: Co-supervised by
Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)





SM20765 Verification of Distributed Systems with Parameterized Network Topology


Description


We consider distributed systems that can be run on arbitrary network topologies from a given class of architectures (e.g., on all pipelines, all grids, or all ring topologies). Typically, such a program is given by a single sequential process, and a copy of that process can be run on any node in the given network topology.
Parameterized systems are hard to analyze, and their verification problems quickly become undecidable. There have been several orthogonal approaches to regain decidability: (i) restricting the amount of information that can be exchanged (e.g., unary tokens instead of binary ones), (ii) relaxing point-to-point communication towards broadcasting or nondeterministic choice, or (iii) imposing a bound on the number of contexts that a process traverses; a context may, e.g., restrict communication to one fixed neighbor, or to one type of communication such as "send". In this internship, we propose to investigate some extensions of (iii). See also the detailed description of the internship and the references therein.
Detailed subject URL: http://www.lsv.enscachan.fr/~bollig/MPRI/stageM2.pdf
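The word "parameterized" can be made concrete on a toy token-passing ring: an explicit-state model checker can only verify one ring size at a time, whereas the parameterized question asks whether the property holds for every size. A minimal BFS sketch in Python (the protocol and the property are hypothetical illustrations chosen for brevity, not the models of the subject):

```python
from collections import deque

def safe_for_ring(n):
    """Explicit-state check of a token ring with n processes.
    State: (token position, tuple of critical-section flags).
    Transitions: the token holder may enter/leave its critical
    section; when not critical, it may pass the token onward.
    Property checked: at most one process is critical, and only
    the token holder can be."""
    init = (0, (False,) * n)
    seen, queue = {init}, deque([init])
    while queue:
        tok, crit = queue.popleft()
        # safety property, checked on every reachable state
        if sum(crit) > 1 or any(c and i != tok for i, c in enumerate(crit)):
            return False
        succs = []
        # the token holder toggles its critical section
        flipped = list(crit)
        flipped[tok] = not flipped[tok]
        succs.append((tok, tuple(flipped)))
        # the token holder passes the token when not critical
        if not crit[tok]:
            succs.append(((tok + 1) % n, crit))
        for s in succs:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return True

# One finite instance at a time -- the parameterized problem asks
# whether the property holds for *all* n, which BFS cannot settle.
print(all(safe_for_ring(n) for n in range(2, 7)))  # True
```

The restrictions (i)-(iii) above are exactly the kinds of structural limits (unary tokens, broadcast, context bounds) under which the for-all-n question becomes decidable again.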
Remarks: Co-supervised by Benedikt Bollig: http://www.lsv.enscachan.fr/~bollig/





