SM207-1 Performance Analysis of Congestion in Commodity Networks for Exascale Computing Platforms  
Admin  
Supervisor: Arnaud LEGRAND
Lab/Organization: LIG/CNRS/INRIA
URL: http://mescal.imag.fr/membres/arnaud.legrand/
City: Grenoble

Description  

The Mont-Blanc project aims at building a supercomputer based on
commodity low-power hardware such as ARM processors and Ethernet
technology. It will not reach Exascale, but it is a first step in that
direction.

Although there are numerous articles explaining how to tune or modify
TCP to optimize end-to-end bandwidth in an HPC context, most of them
completely ignore flow control and congestion management aspects. Our
initial experiments with TCP/Ethernet revealed non-trivial network
congestion phenomena in the presence of intensive HPC workloads. More
precisely, some MPI collective operations and the NAS Parallel
Benchmarks conjugate gradient can lead, even at moderate scale, to
critical congestion situations that force the whole application to
freeze until a TCP timeout is triggered. Such blocking situations only
last about 200 milliseconds, but this is unacceptable on a platform
where typical communication times are on the order of a few
microseconds. Indeed, while such timeouts are harmless in a wide-area
context over the Internet, they can occur systematically in such HPC
workloads and slow down the whole execution by more than 30%. Beyond a
simple application slowdown, it should be understood that the whole
set of processors, even when forced into idleness, keeps consuming
energy uselessly. Given the scale evolution of HPC platforms, such an
issue is likely to quickly become impossible to overlook.

There are several options to circumvent this issue:
1. Such timeouts seem to be linked to the retransmission timeout (TCP
RTO), which could be decreased. The value of this timeout (200 ms)
is unfortunately hardcoded in the kernel. Although having kernels
specifically tuned for HPC workloads is perfectly legitimate, it is
quite difficult to know how to set such a value. Indeed, aggressively
decreasing this timeout may trigger massive retransmissions that
could make the network collapse even more.
2. Since most TCP variants are likely to run into the same kind of
issues, another option could be to use a completely different
protocol, such as SCTP, that manages congestion in a very different way.
3. Another option could be to have user-space daemons that smartly
inject packets into the system to prevent such timeouts from occurring.
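To see why a 200 ms stall dominates everything else, a back-of-the-envelope estimate can be sketched in Python. All numbers below (collective count, per-collective time, stall fraction) are illustrative assumptions, not measurements from the experiments described above:

```python
# Back-of-the-envelope estimate of application slowdown when a small
# fraction of collective operations stalls on a 200 ms TCP retransmission
# timeout. All parameters are illustrative, not measurements.

def slowdown(n_collectives, t_collective_s, rto_s, stall_fraction):
    """Ratio of stalled communication time to ideal communication time."""
    ideal = n_collectives * t_collective_s
    stalled = ideal + n_collectives * stall_fraction * rto_s
    return stalled / ideal

# 10,000 collectives of 100 microseconds each; only 0.02% of them hit
# the 200 ms RTO, yet communication time grows by 40%.
print(slowdown(10_000, 100e-6, 200e-3, 0.0002))
```

Even two timeouts over ten thousand collectives already cost more than the entire ideal communication time budget, which is consistent with the >30% slowdowns mentioned above.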

The goal of this project is to evaluate these different
options. Experiments will be conducted on the Grid5000 platform
(probably on Nancy's Graphene cluster). As a first step, the student
will become familiar with the experimental procedures and with the
techniques needed to control the experimental environment (e.g.,
recompiling the kernel, preparing custom images, etc.).

If time allows, similar experiments will be conducted on the
Mont-Blanc prototype. Such a study should provide invaluable feedback
to Exascale platform architects, in particular with respect to network
and CPU provisioning.

Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_tcp_analysis.pdf

Remarks: This internship is, of course, paid...



SM207-2 Modeling and Simulation of Dynamic Applications for Exascale Computing Platforms  
Admin  
Supervisor: Arnaud LEGRAND
Lab/Organization: LIG/CNRS/INRIA
URL: http://mescal.imag.fr/membres/arnaud.legrand/
City: Grenoble

Description  

Multi-core architectures comprising several GPUs have become
mainstream in the field of High-Performance Computing. However,
obtaining the maximum performance of such heterogeneous machines is
challenging, as it requires carefully offloading computations and
managing data movements between the different processing units. The most
promising and successful approaches so far rely on task-based runtimes
that abstract the machine and rely on opportunistic scheduling
algorithms. As a
consequence, the problem gets shifted to choosing the task
granularity, task graph structure, and optimizing the scheduling
strategies. Trying the different combinations of these alternatives is
itself a challenge. Indeed, getting accurate measurements requires
reserving the target system for the whole duration of the
experiments. Furthermore, observations are limited to the few
available systems at hand and may be difficult to generalize.
Finally, since execution times on real machines exhibit variability,
dynamic schedulers tend to make varying scheduling decisions, and the
obtained performance is thus far from deterministic. This makes
performance comparisons more questionable and the debugging of
non-deterministic deadlocks inside such runtimes even harder.

Simulation is a technique that has proven extremely useful for
studying complex systems and would be a very powerful way to address
these issues. Performance models can be collected for a wide range of
target architectures, and then used for simulating different
executions, all running on a single commodity platform. Since the
execution can be made deterministic, experiments become completely
reproducible, which also makes debugging a lot easier. Additionally,
it is possible to extrapolate target architectures, for instance by
increasing the available PCI bandwidth, the number of GPU devices,
etc., and thus even estimate the performance that would be obtained on
hypothetical platforms.
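The kind of study this enables can be illustrated with a minimal deterministic task-graph simulator. This toy is not StarPU or SimGrid; the graph, costs and greedy policy are invented for the sketch:

```python
# Minimal deterministic simulator of a task graph on heterogeneous
# processing units -- a toy illustration of simulation-based evaluation,
# not the StarPU/SimGrid machinery. Graph and costs are made up.

def simulate(tasks, deps, cost, n_units):
    """Greedy earliest-finish-time schedule; returns the makespan.
    tasks: task ids in topological order
    deps:  task -> list of predecessor tasks
    cost:  (task, unit) -> execution time on that unit
    """
    unit_free = [0.0] * n_units      # when each unit becomes idle
    finish = {}                      # task -> finish time
    for t in tasks:
        ready = max((finish[p] for p in deps.get(t, [])), default=0.0)
        # pick the unit giving the earliest finish time (deterministic)
        u = min(range(n_units),
                key=lambda u: max(unit_free[u], ready) + cost[(t, u)])
        start = max(unit_free[u], ready)
        finish[t] = start + cost[(t, u)]
        unit_free[u] = finish[t]
    return max(finish.values())

# A 4-task diamond graph on 2 units; unit 1 is a faster "GPU-like" unit.
cost = {("a", 0): 2, ("a", 1): 1, ("b", 0): 2, ("b", 1): 1,
        ("c", 0): 2, ("c", 1): 1, ("d", 0): 2, ("d", 1): 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(simulate(["a", "b", "c", "d"], deps, cost, 2))  # prints 4
```

Because the scheduling decisions are a pure function of the inputs, two runs of this simulator always produce the same schedule, which is exactly the reproducibility property discussed above.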

In a recent Euro-Par article, we showed how we crafted a coarse-grain
hybrid simulation/emulation of StarPU, a dynamic runtime system for
heterogeneous multi-core architectures, on top of SimGrid, a
simulation toolkit specifically designed for distributed system
simulation. This approach obtains performance predictions accurate
within a few percent on classical dense linear algebra kernels in a
matter of seconds, which allows both runtime and application designers
to quickly decide which optimizations to enable or whether it is worth
investing in higher-end GPUs.

It is thus currently possible to precisely evaluate the performance of
*dynamic* HPC applications running on a single hybrid node, but
evaluating such applications at larger scale remains quite
challenging. StarPU was recently extended to exploit clusters of
hybrid machines by relying on MPI [starpu-mpi]. Since SimGrid's
ability to accurately simulate MPI applications has already been
demonstrated (PMBS'13), combining both works should make it possible
to investigate performance predictions of complex applications on
large-scale high-end HPC infrastructures.

The goal of this internship is thus to experiment with dynamic HPC
applications based on StarPU-MPI and to evaluate their performance in
simulation. As a first step, medium-size clusters of hybrid nodes will
be used. If time allows, such a validation could be conducted on the
Mont-Blanc prototypes, which rely on ARM processors and GPUs similar
to the ones that may be used in future Exascale platforms. Such a
performance study should provide invaluable feedback to Exascale
platform architects, in particular with respect to network and CPU
provisioning.

Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_starpu_mpi.pdf

Remarks: This internship is, of course, paid...



SM207-3 Analysis and Modeling of Cache Performance for Exascale Computing Platforms  
Admin  
Supervisor: Arnaud LEGRAND
Lab/Organization: LIG/CNRS/INRIA
URL: http://mescal.imag.fr/membres/arnaud.legrand/
City: Grenoble

Description  

The Mont-Blanc project aims at building a supercomputer based on ARM
technology. It will not reach Exascale, but it is a first step in that
direction. ARM processors have almost no memory hierarchy and may thus
be easier to model than the more recent multicore architectures used
in PetaScale machines.

The goal of this internship is thus to experiment with multi-core ARM
processors similar to the ones that may be used in future Exascale
platforms, and to derive from such experimentation a performance and
power consumption model. Measuring the parameters of such a model will
require designing specific micro-benchmarks that reach the peak
performance of these processors, which in turn requires exploring a
wide parameter space of compilation options, loop unrolling and
vectorization options. To ease such code generation, we will rely on
the BOAST framework (https://forge.imag.fr/projects/boast/), which
allows meta-programming computing kernels and their optimizations.
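The meta-programming idea can be sketched as follows: one source template, many tuned variants. This is not the BOAST API, just a hypothetical generator illustrating how a parameter space of unroll factors maps to concrete C kernels:

```python
# Generating C kernel variants with different unroll factors, in the
# spirit of meta-programmed kernels. This is NOT the BOAST API, only a
# sketch of the idea it implements.

def gen_saxpy(unroll):
    """Emit a C saxpy loop unrolled by `unroll` (n assumed divisible)."""
    body = "\n".join(
        f"        y[i + {k}] += a * x[i + {k}];" for k in range(unroll))
    return (f"void saxpy_u{unroll}(int n, float a, const float *x, float *y) {{\n"
            f"    for (int i = 0; i < n; i += {unroll}) {{\n"
            f"{body}\n"
            f"    }}\n"
            f"}}\n")

# Explore a small parameter space: one compilable variant per unroll factor.
variants = {u: gen_saxpy(u) for u in (1, 2, 4, 8)}
print(variants[4])
```

Each generated variant can then be compiled with different option sets and timed, so the micro-benchmark campaign becomes a systematic sweep over (variant, compiler options) pairs.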

Such a model will then be implemented in the SimGrid toolkit
(http://simgrid.gforge.inria.fr/) to evaluate the performance of
simple benchmarks such as LINPACK on such platforms. Such a study
should provide invaluable feedback to Exascale platform architects, in
particular with respect to network and CPU provisioning.
Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_cache_analysis.pdf

Remarks: This internship is, of course, paid...



SM207-4 Response Time Optimization of Multi-User BOINC Projects  
Admin  
Supervisor: Arnaud LEGRAND
Lab/Organization: LIG/CNRS/INRIA
URL: http://mescal.imag.fr/membres/arnaud.legrand/
City: Grenoble

Description  

Recent evolutions in inter-networking technology and the decreasing
cost-performance ratio of computing components have enabled Volunteer
Computing (VC) systems. Among these platforms, BOINC (Berkeley Open
Infrastructure for Network Computing, http://boinc.berkeley.edu/) is
the most representative, with over 580,000 hosts that deliver over
2,300 TeraFLOPS and many deployed projects. In such platforms,
volunteers usually donate their machines' idle CPU time to scientific
projects and indicate some preferences on how to share their
processing power among them. BOINC projects usually have hundreds of
thousands of independent CPU-bound tasks and are interested in overall
throughput, i.e., maximizing the total number of tasks completed per
day. Each project has its own server, which is responsible for
distributing work units to clients, recovering results and validating
them.

To handle this complexity, BOINC implements a simple and robust
distributed protocol which ensures that local shares and volunteer
priorities are respected. This protocol has been used for many years
now, and its efficiency and fairness have been assessed in the context
of throughput-oriented projects.

Lately, the BOINC workload has evolved in two noticeable ways:
- First, to ease the burden of deploying and maintaining BOINC
servers, multi-user projects (also called "umbrella" projects) have
appeared. Such BOINC projects host several scientific projects (in the
following, we will talk about users rather than scientific projects to
avoid confusion with BOINC projects), each with their own application,
which provides a much larger audience to every user than if they had
deployed their own server. For example, the World Community Grid
(http://www.worldcommunitygrid.org/), funded by IBM, has been one of
the first such BOINC projects, helping research on drug search for
Leishmaniasis, clean water localization, childhood cancer, etc. More
recently, CAS (http://casathome.ihep.ac.cn/), a Chinese project, has
been set up to help protein structure, nanotechnology, cancer genomics
and high-energy physics research. Likewise, Ibercivis
(http://www.ibercivis.es/), a Spanish-Portuguese initiative, has been
set up to provide a huge BOINC umbrella project for scientists in
those countries, and Yoyo (http://rechenkraft.net/yoyo/) helps
research on harmonious trees, elliptic curve factorization, and DNA
evolution, among others. In such BOINC projects, demand for computing
power generally exceeds supply, hence the need for a mechanism to
divide the supply among users.
- Second, although in the past most users had an unlimited supply of
jobs to process (throughput-oriented users), the popularization of
BOINC has attracted other types of users. In CAS, many users have
campaigns (batches) made of fewer tasks and are interested in the
response time of campaigns. For such users, throughput is not
meaningful and response time should be minimized (response-time-oriented
users).
In such multi-user projects, each user has a numeric quota
representing the fraction of the computing resource they should get.
What are the precise semantics of quotas? Ideally, the mechanism
should accommodate the previous two classes of users. The GridBot
project [gridbot, Silberstein11, Silberstein12] recently made use of a
hybrid computing platform composed of grids, clusters and VC to
execute workloads resulting from a mix of throughput- and
response-time-oriented applications. Such initiatives are very
encouraging but somewhat ad hoc, and they consider neither the problem
of streams of batches nor the resource-sharing issue. We think that
regular BOINC servers can be improved to seamlessly handle such
workloads.
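One plausible semantics for quotas is proportional sharing with redistribution: users whose demand is below their quota get exactly their demand, and the leftover is split among the others in proportion to their quotas. The sketch below is one hypothetical interpretation, not the mechanism BOINC implements:

```python
# Toy allocation of a fixed computing supply among users with quotas.
# Users whose demand is below their quota-share are fully satisfied,
# and their unused share is redistributed -- one plausible semantics
# for quotas, not BOINC's actual mechanism.

def allocate(supply, quotas, demands):
    """quotas: user -> weight; demands: user -> requested amount."""
    alloc = {u: 0.0 for u in quotas}
    active = set(quotas)
    remaining = supply
    while active and remaining > 1e-12:
        total_q = sum(quotas[u] for u in active)
        share = {u: remaining * quotas[u] / total_q for u in active}
        satisfied = {u for u in active if demands[u] - alloc[u] <= share[u]}
        if not satisfied:                  # everyone is demand-limited
            for u in active:
                alloc[u] += share[u]
            break
        for u in satisfied:                # grant full demand, free the rest
            remaining -= demands[u] - alloc[u]
            alloc[u] = demands[u]
        active -= satisfied
    return alloc

quotas = {"wcg": 2, "cas": 1, "yoyo": 1}
demands = {"wcg": 100.0, "cas": 10.0, "yoyo": 100.0}
print(allocate(100.0, quotas, demands))
```

Here "cas" only needs 10 units, so its unused share flows to the two saturated users in a 2:1 ratio, which is the kind of behavior a quota semantics would need to pin down precisely.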

In the last BOINC workshops (2011:
http://boinc.berkeley.edu/trac/wiki/WorkShop11, 2012:
http://boinc.berkeley.edu/trac/wiki/WorkShop12, 2013:
http://boinc.berkeley.edu/trac/wiki/WorkShop13), members from the
Mescal team, the BOINC team, the World Community Grid project and the
OurGrid project have brainstormed to design a new server architecture
that would enable fair and efficient handling of complex workloads
such as the ones previously described. Yet, many parts of this
architecture (batch prioritization, batch completion estimation,
client reliability estimation, etc.) remain to be specified.

The goal of this internship is thus to first implement a simulation of
such a prototype and to evaluate its performance under realistic
workloads from the World Community Grid or from CAS. This evaluation
will then guide the design of different scheduling and estimation
mechanisms, which will later be implemented in the BOINC server and
deployed on the World Community Grid and on CAS. The project could
involve visits to Berkeley, California to present results.
Detailed subject URL: http://mescal.imag.fr/membres/arnaud.legrand/research/M2R_boinc.pdf

Remarks: This internship is, of course, paid...



SM207-5 Learning algorithms for optimal routing in large scale systems  
Admin  
Supervisor: Bruno GAUJAL
Lab/Organization: Inria
URL: http://mescal.imag.fr/membres/bruno.gaujal/
City: Saint-Ismier

Description  


Consider the following multiflow routing problem: N flows (each given
by a source, a destination and a volume) share a communication network
with limited throughput on the links. We want to design a distributed
algorithm that discovers the best routes for all flows with only
partial knowledge of the network topology and of the routes chosen by
other flows.

Classical algorithms are based on game theory: they see the flows as
players and all the routes from source to destination as their
possible strategies. The best algorithms are shown to scale well with
the number of flows, but not so well with the number of possible
routes per flow. The goal here is to explore alternative approaches
where the nodes of the communication network become the active players
and their strategies are limited to their output links. These new
approaches have the potential to solve very large routing problems
very fast.
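The classical flow-as-player formulation can be sketched with best-response dynamics on a tiny invented network (two flows, two candidate routes each, unit link capacities). This is a caricature of the game-theoretic framing, not the new node-as-player approach the internship targets:

```python
# Toy best-response dynamics for the classical formulation: flows are
# players choosing among their candidate routes; route cost is the sum
# of link utilizations. Network and flows are invented for illustration.

def best_response_routing(routes, volumes, capacity, rounds=20):
    """routes: flow -> list of routes (each a tuple of link names)."""
    choice = {f: 0 for f in routes}            # start on each first route
    for _ in range(rounds):
        changed = False
        for f in routes:
            def cost(r_idx):                   # cost of f playing r_idx
                load = {}
                for g in routes:
                    r = routes[g][r_idx if g == f else choice[g]]
                    for link in r:
                        load[link] = load.get(link, 0.0) + volumes[g]
                return sum(load[link] / capacity[link]
                           for link in routes[f][r_idx])
            best = min(range(len(routes[f])), key=cost)
            if best != choice[f]:
                choice[f], changed = best, True
        if not changed:                        # Nash equilibrium reached
            break
    return choice

routes = {"f1": [("ab", "bc"), ("ac",)], "f2": [("ab", "bc"), ("ac",)]}
volumes = {"f1": 1.0, "f2": 1.0}
capacity = {"ab": 1.0, "bc": 1.0, "ac": 1.0}
print(best_response_routing(routes, volumes, capacity))
```

Note that each player must enumerate all of its routes at every step, which is exactly the scalability bottleneck in the number of routes per flow that motivates the node-as-player alternative.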
Detailed subject URL:

Remarks:



SM207-6 Learning and mean field approximations.  
Admin  
Supervisor: Bruno GAUJAL
Lab/Organization: Inria
URL: http://mescal.imag.fr/membres/bruno.gaujal/
City: Saint-Ismier

Description  

Q-learning is an algorithmic technique that learns optimal behavior in
a Markov decision process whose transition probabilities are unknown,
while maintaining an optimal reward over an infinite horizon. While
this makes the approach particularly attractive from a conceptual
point of view, it suffers acutely from the curse of dimensionality,
which makes it inapplicable in practice. In this internship, we
propose to explore the possibility of aggregating the state space of a
large Markov decision process made of several identical objects, and
to use Q-learning on the resulting mean-field approximation so as to
avoid the state-space explosion.
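The aggregation idea can be sketched as follows: a system of n identical two-state objects has 2^n global states, but its mean-field description (how many objects are "up") has only n+1 states, on which tabular Q-learning is feasible. The dynamics and rewards below are invented for the toy:

```python
import random

# Sketch: aggregate n identical two-state objects into their mean-field
# description (the count of objects up), then run tabular Q-learning on
# the aggregated chain of n+1 states instead of 2^n global states.
# Dynamics and rewards are made up for illustration.

def env_step(count, action, n, rng):
    """Toy aggregated dynamics: action 1 repairs one object; at most one
    object fails per step, with probability count/n. Reward = objects up."""
    if action == 1 and count < n:
        count += 1
    if count > 0 and rng.random() < count / n:
        count -= 1
    return count, count                       # (next state, reward)

def q_learning(n, steps=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n + 1)]    # n+1 aggregated states
    s = 0
    for _ in range(steps):
        # epsilon-greedy action selection over the two actions
        a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: q[s][a])
        s2, r = env_step(s, a, n, rng)
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        s = s2
    return q

q = q_learning(10)
print(q[0])   # learned values of "idle" vs "repair" in the empty state
```

The Q-table here has 11 states and 2 actions; the unaggregated table would have 1024 states, and the gap widens exponentially with n.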

This project involves both mathematical modeling and algorithmic techniques.
No background knowledge is needed for this project but some familiarity with probability theory (or a taste for it) will help.

Detailed subject URL:

Remarks:



SM207-7 Dynamic solution of the memory affinity problem.  
Admin  
Supervisor: Bruno GAUJAL
Lab/Organization: Inria
URL: http://mescal.imag.fr/membres/bruno.gaujal/
City: Saint-Ismier

Description  

One of the main problems faced by HPC applications running on modern
multicore NUMA architectures is the memory affinity problem: place
each task on a processor close to the memory where the data used by
the task is stored. Considered statically, optimal placement is
NP-complete. One way to approach the problem is to change the task
allocation dynamically, using an adaptive algorithm based on the
aggregate performance of all the allocations used since the start of
the program rather than on the current allocation only, thus learning
from past mistakes. This approach has never been used in this context
but should prove very efficient compared with the static heuristics
based on graph partitioning that are currently used.
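The adaptive idea can be caricatured as a bandit-style loop: keep the aggregate runtime of every placement tried since the start, exploit the best average, and keep exploring to learn from past mistakes. The placements and timings below are synthetic assumptions, not an actual NUMA mechanism:

```python
import random

# Bandit-style sketch of the adaptive idea: choose among candidate task
# placements based on the AGGREGATE performance of each placement since
# the start of the program, not just the latest run. Placements and
# timings are synthetic; this is not an actual NUMA tool.

def adaptive_placement(placements, measure, steps=200, eps=0.1, seed=0):
    rng = random.Random(seed)
    total = {p: 0.0 for p in placements}   # aggregate runtime per placement
    count = {p: 0 for p in placements}
    for p in placements:                   # measure everything once
        total[p] += measure(p); count[p] += 1
    for _ in range(steps):
        if rng.random() < eps:             # keep exploring (learn from mistakes)
            p = rng.choice(placements)
        else:                              # exploit the best average so far
            p = min(placements, key=lambda p: total[p] / count[p])
        total[p] += measure(p); count[p] += 1
    return min(placements, key=lambda p: total[p] / count[p])

# On this synthetic machine, the "local" placement is the fastest.
times = {"local": 1.0, "interleaved": 1.5, "remote": 3.0}
print(adaptive_placement(list(times), lambda p: times[p]))  # prints local
```

A real solution would of course have noisy measurements and a placement space far too large to enumerate, which is precisely where the learning aspect of the internship comes in.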
Detailed subject URL:

Remarks:



SM207-8 Assembly and fitness estimation of Dengue virus genomes
Admin  
Supervisor: Bastien BOUSSAU
Lab/Organization: Laboratoire de Biométrie et Biologie Evolutive (LBBE), UMR 5558, BGE team
URL: https://lbbe.univ-lyon1.fr/-Boussau-Bastien-.html
City: Lyon

Description  

390 million people worldwide are infected yearly by Dengue virus, through a mosquito bite. No therapeutic is currently available. To control the epidemics, the most promising strategies target the mosquito host to stop viral transmission. However, little is known about how the virus interacts with its mosquito host, which makes it very hard to predict whether those strategies could be successful.
To understand how the virus interacts with the mosquito cell, a team of virologists headed by Marlène Dreux runs viral evolution experiments in mosquito cells and sequences the viral population throughout its evolution. The analysis of the resulting sequences will help us identify the sites of the viral genome that interact with the mosquito cell machinery.
The master internship aims at developing bioinformatic and statistical methods to i/ assemble complete viral genomes from the sequences and ii/ predict the fitness of those assembled genomes to identify important sites. These tasks require parallel algorithms to handle massive amounts of sequence data, as well as use of the quasispecies equation to infer genome fitness.
The intern will have to implement the above-described algorithms. Good programming skills are most welcome, as is a good working knowledge of probability and statistics. Moreover, an interest in evolution, virology or public health issues is essential.
The host laboratory (LBBE) is a stimulating and pleasant place to work, where one can meet biologists, physicians, computer scientists, mathematicians and statisticians working on problems that range from ecology to medicine, through genomics and evolution.

Detailed subject URL: https://lbbe.univ-lyon1.fr/IMG/pdf/project_m2_student_bastien_boussau.pdf?1229/4b82a603f0cc81c9cba9f30b14690da6e18d2747

Remarks:



SM207-9 Formal Verification of Security Properties of E-voting Systems  
Admin  
Supervisor: Gilles BARTHE
Lab/Organization: IMDEA Software Institute
URL: https://www.easycrypt.info/Team
City: Madrid

Description  

Designing and implementing cryptographic systems that offer effective security guarantees is known to be a complicated task. These are widely deployed applications involving complex information flows, which makes code reviews and testing less effective. They use modern cryptographic primitives that must be carefully designed: the history of cryptography (including the last decade) is full of examples of flawed cryptographic systems that were long thought secure (and sometimes even proved secure). Furthermore, their security should hold in hostile environments.

Electronic voting schemes are one example of such systems. They are meant to offer at least the same security guarantees as traditional paper-based systems. They make use of cryptographic primitives and rely on protocols that are notoriously difficult to design and verify.

In this internship, we propose to formally define and verify security properties of Helios, one of the best established academic protocols. The development will be performed in EasyCrypt, a proof-assistant jointly developed by the IMDEA Software Institute and Inria, for verifying the security of cryptographic constructions in the computational model.

We invite applicants to read the following paper for more information:

https://eprint.iacr.org/2013/235
Detailed subject URL: https://www.easycrypt.info/Internships

Remarks: Remuneration possible

Co-supervisors: F. Dupressoir, P-Y. Strub




SM207-10 Formal Verification of a Compiler for Masking-Based Countermeasures  
Admin  
Supervisor: Gilles BARTHE
Lab/Organization: IMDEA Software Institute
URL: https://www.easycrypt.info/Team
City: Madrid

Description  

In the real world, attackers have access to side-channels that may reveal information about intermediate variables of a computation, even those that depend on secrets. For instance, recent and very efficient attacks have allowed adversaries to recover the secret key using only AES power traces, and the design and implementation of countermeasures is now of central importance.

Masking is such a countermeasure. It consists in using a secret-sharing scheme to split each sensitive variable, and the computations on it, into (t + 1) shares such that observing any t of them does not reveal any information about the secrets. Intuitively, an implementation that is split over (t + 1) shares should be able to resist the leakage of t of its intermediate variables.
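The sharing step itself can be sketched with Boolean (XOR) masking of a single byte; a real masking compiler must additionally transform every operation of the program to compute directly on shares, which the sketch omits:

```python
import secrets

# Minimal sketch of Boolean masking: a sensitive byte is split into
# t+1 XOR shares; any t of them are uniformly random and reveal nothing
# about the secret. (A real masking compiler also transforms the
# computation itself to operate on shares.)

def share(x, t):
    """Split byte x into t+1 shares whose XOR equals x."""
    shares = [secrets.randbelow(256) for _ in range(t)]  # t random masks
    last = x
    for s in shares:
        last ^= s                       # fold the masks into the last share
    return shares + [last]

def unshare(shares):
    """Recombine: XOR of all shares recovers the secret."""
    out = 0
    for s in shares:
        out ^= s
    return out

secret = 0x2A
shares = share(secret, t=3)             # 4 shares; any 3 leak nothing
print(len(shares), unshare(shares) == secret)
```

The security argument is exactly the one stated above: any t shares are independent uniform bytes, so only the full set of t+1 shares determines the secret.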

In this internship, we propose to formally verify, in the Coq proof-assistant, a masking compiler, i.e. a compiler that transforms an input program into a functionally equivalent program that is resistant to t intermediate observations. Both the implementation, the functional correctness and the security proofs will be performed in Coq.

We invite applicants to read the following paper for more information:

http://www.cs.berkeley.edu/~daw/papers/privcirc-crypto03.pdf
Detailed subject URL: https://www.easycrypt.info/Internships

Remarks: Remuneration possible

Co-supervisors: F. Dupressoir, P-Y. Strub




SM207-11 Energy-efficient cloud elasticity for data-driven applications  
Admin  
Supervisor: Anne-Cécile ORGERIE
Lab/Organization: IRISA / Myriads team
URL: http://www.irisa.fr/myriads
City: Rennes

Description  

Distributed and parallel systems offer users tremendous computing capacity. They rely on distributed computing resources linked by networks, and require algorithms and protocols to manage these resources transparently for users. Recently, the maturity of virtualization techniques has allowed the emergence of virtualized infrastructures (Clouds). These infrastructures provide resources to users dynamically, adapted to their needs. By benefiting from economies of scale, Clouds can efficiently manage and offer virtually unlimited numbers of resources, reducing costs for users.

However, the rapid growth of Cloud demand leads to a worrying and uncontrolled increase in electric consumption. In this context, we will focus on data-driven applications, which require processing large amounts of data. These applications have elastic needs in terms of computing resources, as their workload varies over time. Since reducing energy consumption and improving performance are conflicting goals, this internship aims at studying possible trade-offs for energy-efficient data processing without performance impact. As elasticity comes at the cost of reconfigurations, these trade-offs will take into account the time and energy required by the infrastructure to dynamically adapt the resources to the application's needs.

The validation of the proposed algorithms may rely on the French experimental platform Grid'5000. This platform comprises about 8,000 cores geographically distributed over 10 sites linked by a dedicated gigabit network. Some of these sites have wattmeters which provide the consumption of the computing nodes in real time. This validation step is essential, as it will ensure that the selected criteria are well observed: energy-efficiency, performance and elasticity. This work will be done in collaboration with the Lawrence Berkeley National Laboratory (USA) in the context of the DALHIS associate team.


References:
[RDG11] N. Roy, A. Dubey and A.S. Gokhale. "Efficient Autoscaling in the Cloud Using Predictive Models for Workload Forecasting". In Proceedings of IEEE CLOUD, 2011.
[Koo11] J. Koomey, "Growth in data center electricity use 2005 to 2010", Analytics Press, 2011.
[ADE10] D. Agrawal, S. Das and A. El Abbadi, "Big data and cloud computing: new wine or just new bottles?", Journal of the Very Large Data Base Endowment, vol 3 no. 1-2, pp. 1647-1648, Sept. 2010.
[ODL13] A.-C. Orgerie, M. Dias de Assunção, L. Lefèvre, "A Survey on Techniques for Improving the Energy Efficiency of Large Scale Distributed Systems", ACM Computing Surveys, 2013.


Advisors:
Anne-Cécile Orgerie anne-cecile.orgerie.fr
Christine Morin christine.morin.fr
Detailed subject URL: http://www.irisa.fr/myriads/open-positions/internships1314/document.2013-09-25.9133673154

Remarks: Contact: Anne-Cécile Orgerie anne-cecile.orgerie.fr
Christine Morin christine.morin.fr



SM207-12 Large-Scale Graph Processing Platform on Supercomputers  
Admin  
Supervisor: Toyotaro SUZUMURA
Lab/Organization: UCD School of Computer Science and Informatics
URL: www.scalegraph.org
City: Dublin, Ireland

Description  

Large-scale graph analytics is an important approach to big data processing for a variety of purposes such as web and social network mining, life science, cyber security, and so forth. We have been developing an open-source billion-scale graph analytics library called ScaleGraph (URL: http://www.scalegraph.org) on top of a parallel and distributed programming language named X10. Over the last 3 years, we have been successful in demonstrating its high-performance nature and handling the billion-scale whole Twitter follower-follower network on supercomputers with more than 3000 CPU cores.

In this project, research project members will work on various research themes depending on their interests and skills, using a supercomputer named TSUBAME 2 located at the Tokyo Institute of Technology in Japan. The project can be mainly divided into three teams:
A) Graph Algorithms Team: The team designs and implements a set of parallel and distributed graph algorithms that the ScaleGraph library is currently missing such as community detection, clustering, link prediction, bipartite matching, and their incremental algorithms. After finishing them, they will test the algorithms over our crawled Twitter network and conduct the performance evaluation on TSUBAME 2.
B) System Software Team: The team tackles various problems such as optimizing the underlying graph processing framework based on the Pregel model, designing and implementing asynchronous graph processing models, and implementing out-of-core graph processing with external memory. We could broaden the scope if you have good ideas.
C) Application Team: By working together with other teams and also domain experts, the application team will explore new application areas such as human brain network, protein-protein interaction, linked data such as RDF, cyber security, and so forth with the ScaleGraph library.
Each member is expected to deliver software assets as well as publications for international conferences. For further information, see the ScaleGraph web page (http://www.scalegraph.org).

Recommended Skills:
- OS, Programming Languages, Tools: Linux, C/C++, Java, MPI, performance profilers, Git/Subversion, etc.
- Knowledge of graph analytics, high-performance computing, parallel and distributed systems
- Skill in writing and reading publications for international conferences

Duration: 6 months (negotiable)
Salary: €1000-€1300 (depending on skills)
Work Location: UCD School of Computer Science and Informatics, Dublin, Ireland

Contact:
Please send your CV/resume (detailed information such as programming experience is preferable) and your preferred research theme to crest.research.com. If selected, you will be contacted and interviewed by Prof. Toyotaro Suzumura (University College Dublin / IBM Research) and other team members.
Detailed subject URL:

Remarks:



SM207-13 Integrated blackbox+whitebox performance modeling of Cloud applications  
Admin  
Supervisor: Guillaume PIERRE
Lab/Organization: IRISA / Université de Rennes 1
URL: http://www.globule.org/~gpierre/
City: Rennes

Description  

Context:

Cloud computing offers unprecedented levels of flexibility to efficiently deploy demanding applications. Cloud users have access to a large variety of computing resources with various combinations of configurations and prices. This allows them in principle to make use of the exact types and numbers of resources the applications need. However, this flexibility also comes as a curse as choosing the best resource configuration for an application becomes extremely difficult: cloud users often find it very hard to accurately estimate the requirements of complex applications. It is therefore useful to model the performance of Cloud applications, such that we can use the model to choose the best resource configuration for an application.

The HARNESS European project (http://www.harness-project.eu) is currently working on two complementary approaches to model the performance of cloud applications. First, a "blackbox" method can build a performance model of arbitrary applications. This approach is very general, but it may require a significant number of profiling executions before capturing all the relevant specificities of a new application. We therefore also propose to develop one or more "whitebox" performance modeling components. Each such component may be specialized for a particular class of applications, for example MapReduce applications. Whitebox performance modeling may be faster than blackbox modeling in identifying key information thanks to its domain-specific knowledge of a specific class of applications. When a whitebox component can be applied to a user's application, it can thus considerably speed up the blackbox performance modeling process.

Research topic:

The objective of this master project is to design and develop a unified performance modeling component which combines the best aspects of both models: a combined blackbox/whitebox model would be able to handle any type of application; but for a number of common application types it would make use of domain-specific information. The idea is to make the blackbox model be in charge of exploring possible resource configurations, measuring the performance and cost of the application using these resources, and deriving a performance model from these experiments. Measured metrics are also given to all the whitebox models. If one of the whitebox models does generate useful information, it can inform the blackbox model about configurations that should be avoided.

For example, we have a useful whitebox model for MapReduce applications which can quickly decide if using GPUs or FPGAs for executing the map or reduce task is likely to provide performance benefits.
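The interplay described above can be sketched as a pruned exploration loop: the blackbox explorer profiles candidate configurations, while a whitebox rule vetoes configurations it predicts to be useless. All configurations, runtimes and the veto rule below are invented for illustration:

```python
# Sketch of the blackbox/whitebox combination: the blackbox explorer
# measures candidate resource configurations, while a whitebox model
# (here a trivial MapReduce-style rule) vetoes configurations it
# predicts to be useless. All models and numbers are invented.

def explore(configs, measure, whitebox_veto):
    results = {}
    for c in configs:
        if whitebox_veto(c):
            continue                      # skip profiling runs ruled out by the whitebox
        results[c] = measure(c)           # blackbox: profile and record
    return min(results, key=results.get)  # fastest surviving configuration

configs = [("cpu", 4), ("cpu", 8), ("gpu", 1), ("fpga", 1)]

def measure(c):                           # stand-in for a real profiling run
    return {("cpu", 4): 40.0, ("cpu", 8): 22.0,
            ("gpu", 1): 15.0, ("fpga", 1): 90.0}[c]

def whitebox_veto(c):                     # "this job won't benefit from FPGA"
    return c[0] == "fpga"

print(explore(configs, measure, whitebox_veto))
```

The benefit is exactly the one argued above: every vetoed configuration is one fewer costly profiling execution for the blackbox model.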

The student will be in charge of proposing concrete ways to make the two performance models work in conjunction, following these principles. Developments will be based on ConPaaS (www.conpaas.eu), where the blackbox model is already implemented. The validation of the proposed algorithms may rely on the French experimental platform Grid'5000 (www.grid5000.fr). This platform comprises about 8,000 cores geographically distributed across 10 sites linked by a dedicated gigabit network.

This work will be realized in collaboration with Imperial College London, which is developing whitebox performance models for MapReduce and data-streaming applications.

References:

[1] S. Pllana, I. Brandic, and S. Benkner. A survey of the state of the art in performance modeling and prediction of parallel and distributed computing systems. International Journal of Computational Intelligence Research, 4(1), 2008.

[2] E. Ipek, B. de Supinski, M. Schulz, and S. McKee. An approach to performance prediction for parallel applications. In Proceedings of the Euro-Par Conference, 2005.

[3] Guillaume Pierre and Corina Stratan. ConPaaS: a Platform for Hosting Elastic Cloud Applications. IEEE Internet Computing 16(5), September-October 2012.

[4] Imperial-led project aims to unlock processor diversity in the cloud. InformationAge, April 2013. http://bit.ly/1mu926F

URL sujet détaillé :

Remarques :



SM207-14 Using constraint programming to solve  
Admin  
Encadrant : Marine MINIER
Labo/Organisme : CITI Laboratory / INSA de Lyon
URL : http://perso.citi.insa-lyon.fr/mminier/
Ville : Villeurbanne

Description  

Constraint programming is a powerful tool for solving problems that can be modeled with decision variables and constraints, where a constraint is a relationship between one or more variables that limits the values simultaneously taken by the variables involved in it. Search algorithms in constraint programming typically rely on constraint propagation to reduce the number of candidate solutions to explore, combined with a systematic search among the possible assignments of each variable. Such algorithms are guaranteed to find a solution when one exists, and can prove that a problem has no solution if none has been found by the end of the exhaustive search.
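The search scheme just described can be sketched in a few lines. The toy solver below performs systematic backtracking with constraint checking; full propagation and variable-ordering heuristics are omitted for brevity, so this is only a sketch of the technique, not a production solver:

```python
# Minimal constraint solver: backtracking search over variable domains,
# checking binary constraints as soon as both endpoints are assigned.

def solve(domains, constraints, assignment=None):
    """Return one assignment satisfying all constraints, or None."""
    assignment = dict(assignment or {})
    if len(assignment) == len(domains):
        return assignment                      # every variable assigned
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        ok = all(c(assignment[x], assignment[y])
                 for (x, y), c in constraints.items()
                 if x in assignment and y in assignment)
        if ok:
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]                    # backtrack
    return None                                # exhaustive search: no solution
```

For example, colouring a 3-node path with 2 colours: `solve({"x": [1, 2], "y": [1, 2], "z": [1, 2]}, {("x", "y"): lambda a, b: a != b, ("y", "z"): lambda a, b: a != b})` returns a valid colouring.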

The AES has been the block cipher standard since 2001. There are currently no attacks on the full encryption process, although recent results have improved the understanding of its security. The purpose of this internship is to use constraint programming to reproduce the attack results against the AES obtained by A. Biryukov et al. in [1] and P.-A. Fouque et al. in [2], and to attempt to extend the method to the case of the block cipher Rijndael.

The attack by Biryukov et al. proposed in 2009 is a so-called chosen-key attack against the AES. The main idea is to look for truncated differentials that form valid paths through both the encryption process and the key schedule itself. The authors proposed a branch-and-bound algorithm to solve this problem.
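The branch-and-bound technique mentioned above has the following general shape. The sketch below solves a toy minimum-cost path problem through a sequence of rounds; it only illustrates the pruning mechanism, not the actual search over truncated differential paths and their probabilities:

```python
# Toy branch and bound: find the minimum-cost sequence of choices,
# one choice per round, pruning any branch whose partial cost already
# matches or exceeds the best complete solution found so far.

def branch_and_bound(rounds):
    """rounds: list of lists of (choice, cost) pairs, one list per round."""
    best = [float("inf"), None]        # [best cost, best path]

    def explore(i, cost, path):
        if cost >= best[0]:
            return                     # bound: this branch cannot improve
        if i == len(rounds):
            best[0], best[1] = cost, list(path)
            return
        for choice, c in rounds[i]:
            path.append(choice)
            explore(i + 1, cost + c, path)
            path.pop()

    explore(0, 0, [])
    return best[0], best[1]
```

In the real attack the "cost" would be the (log-)probability of a truncated differential transition, and the rounds would cover both the state and the key schedule.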

In this internship, we propose to try to solve this problem using constraint programming and the results we have already obtained in [3]. The first step will be to validate the results of Biryukov et al. and of Fouque et al. (Crypto 2013); the second will be to extend the approach to the Rijndael cipher.


References:
[1] Alex Biryukov, Ivica Nikolic: Automatic Search for Related-Key Differential Characteristics in Byte-Oriented Block Ciphers: Application to AES, Camellia, Khazad and Others. EUROCRYPT 2010: 322-344
[2] Pierre-Alain Fouque, Jérémy Jean, Thomas Peyrin: Structural Evaluation of AES and Chosen-Key Distinguisher of 9-Round AES-128. CRYPTO (1) 2013: 183-203
[3] Marine Minier, Christine Solnon and Julia Reboul: Solving a Symmetric Key Cryptographic Problem with Constraint Programming. MODREF 2014, workshop of the CP 2014 conference


URL sujet détaillé :

Remarques : This internship will be co-supervised by Christine Solnon from the LIRIS laboratory.



SM207-15 An overlay network for a distributed cloud within the backbone  
Admin  
Encadrant : Cédric TEDESCHI
Labo/Organisme : INRIA Rennes Bretagne Atlantique
URL : http://beyondtheclouds.github.io
Ville : Rennes

Description  

While the Internet was built in a decentralised fashion, we are currently witnessing a strong re-centralisation of computing facilities such as those run by Amazon, Google or Facebook. The intrinsic limitations of this model are increasingly visible, chiefly due to reliability, privacy, and energy issues, as users have recently experienced [1].

The Discovery project [2] is driven by the objective of proposing a decentralised cloud architecture grounded in local resources located within the core of the network. Indeed, the infrastructures set up by Internet providers are most of the time overprovisioned relative to actual needs. Installing computing resources along the paths of the backbone could therefore reduce the costs related to the construction, maintenance, and energy supply of a computing centre by capitalising on the existing infrastructure. It would also lead to a natural distribution of resources while favouring the use of resources that are geographically close to the user.

Operating such an infrastructure calls for means to abstract away the details of the complex, heterogeneous and unreliable underlying network, so as to offer a simple API that users can leverage to deploy their applications easily. This requires designing an overlay network, deployed between the infrastructure and the applications, on which the API will rely. One identified promising approach is the Vivaldi protocol [3], which places nodes in a Cartesian space according to their network distance.
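The core of Vivaldi is a spring-relaxation update: each node nudges its synthetic coordinate toward (or away from) a peer so that the Euclidean distance approaches the measured round-trip time. A minimal 2D sketch of one update step (the damping constant and the fixed fallback direction are illustrative choices, not taken from the Vivaldi paper):

```python
import math

def vivaldi_update(xi, xj, rtt, delta=0.25):
    """Move node i's coordinate xi so that dist(xi, xj) approaches rtt.

    xi, xj: 2D coordinates as (x, y) tuples; rtt: measured latency.
    """
    dist = math.dist(xi, xj)
    error = rtt - dist                    # > 0: nodes are too close in space
    if dist == 0:                         # coincident nodes: pick a direction
        unit = (1.0, 0.0)
    else:                                 # unit vector from xj toward xi
        unit = ((xi[0] - xj[0]) / dist, (xi[1] - xj[1]) / dist)
    return (xi[0] + delta * error * unit[0],
            xi[1] + delta * error * unit[1])
```

Iterating this update over successive RTT measurements makes coordinate distances converge toward network latencies, which is what lets the overlay locate "close" nodes cheaply.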

The work will consist in designing such an overlay and defining the routing strategy to be adopted within it. In particular, one needs to be able to efficiently locate close nodes and retrieve objects stored in the network, while minimising the maintenance cost of the overlay. The work will be firstly algorithmic and secondly experimental, through the validation of the proposed overlay network and its routing and maintenance algorithms in a software prototype to be deployed on the Grid'5000 national platform [4].
URL sujet détaillé :

Remarques : - Advisor 1: Cédric Tedeschi, INRIA / Univ. Rennes 1 (http://people.irisa.fr/Cedric.Tedeschi)

- Advisor 2: Marin Bertier, INRIA / INSA Rennes (http://www.irisa.fr/asap/?page_id=195)

- Pay: ~500 €




SM207-16 Dynamic distributed adaptation of data stream processing platforms  
Admin  
Encadrant : Cédric TEDESCHI
Labo/Organisme : INRIA Rennes Bretagne Atlantique
URL : http://www.irisa.fr/myriads
Ville : Rennes

Description  

The amount of data generated by social media such as Twitter makes it appealing to be able to extract relevant information from these data in real time. Storm [1,2], initially developed at Twitter, was designed to offer a simple API for this problem. Developing a Storm application chiefly consists in defining the different steps a data stream will go through so as to extract information from it (the application's topology). Storm then provides a way to ease the deployment of this topology over a computing platform.

In other words, Storm automates the deployment of the set of processes composing the extraction application on the physical computation nodes available, for instance, in a cluster. To achieve and maintain such a deployment, Storm relies on a hierarchical structure that monitors worker nodes so as to replace them in case of failure. One strong limitation of this approach is its hierarchical shape and its inability to cope with the dynamic nature of computing platforms. Indeed, the top of the hierarchy is occupied by a single agent responsible for applying fault-tolerance [3] and load-balancing policies. Moreover, dynamically remapping the processes onto the computing nodes, as the velocity of the stream can change quickly, is still an open problem in Storm and, more generally, in stream processing platforms [4].

The work will consist in modelling a Storm topology and proposing distributed algorithms (without the need for a monitoring hierarchy) to adapt to dynamic changes in the stream and in the platform's conditions. More precisely, the computing power dedicated to supporting each process needs to be dynamically adjusted, and this needs to be done in a decentralised fashion, each node deciding, based on local information about its workload, whether to duplicate itself or to terminate.
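The decentralised decision rule sketched above can be as simple as local thresholding on each operator's own input queue. The function and thresholds below are illustrative assumptions, not part of the Storm API:

```python
def local_decision(queue_len, capacity, high=0.8, low=0.2):
    """Each operator instance decides alone, from its own input queue:
    duplicate itself when overloaded, terminate when underused.

    queue_len: tuples currently waiting; capacity: queue size limit.
    Thresholds (high/low) are illustrative, not tuned values.
    """
    load = queue_len / capacity
    if load > high:
        return "duplicate"   # spawn a sibling to share the incoming stream
    if load < low:
        return "terminate"   # release this node; siblings absorb the load
    return "keep"
```

A hysteresis gap between `low` and `high` avoids oscillation: a node that just duplicated should not immediately decide to terminate when the load halves.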

Validating this work will rely on the development of a prototype including these algorithms and its deployment over the Grid'5000 platform [5].
URL sujet détaillé :

Remarques : - The team:
- Marin Bertier (Associate prof., INRIA / INSA Rennes)
- Matthieu Simonin (Engineer, INRIA)
- Cédric Tedeschi (Associate prof., INRIA / U. Rennes 1)
- Christine Morin (Senior Researcher, INRIA)

- Pay: ~500 €




SM207-17 Secure multi-party computation using the bitcoin protocol  
Admin  
Encadrant : Omar HASAN
Labo/Organisme : LIRIS / INSA Lyon
URL : http://liris.cnrs.fr/omar.hasan/publications.html
Ville : Villeurbanne / Lyon

Description  

General context:

The problem studied in the field of Secure Multi-Party Computation (SMPC) is as follows: There are n parties or nodes in a network. Each node has an input which is private. The node does not wish to reveal the input to any other node in the network. A secure multi-party computation protocol enables the nodes to collaboratively compute a function over their inputs such that all inputs remain private, i.e., known only to their owners. An example of such a protocol is private maximum, i.e. given n nodes each with a private input, how to determine the maximum without revealing who has the maximum or any of the other private inputs. Other examples of possible private computations may include geometric functions, set operations, reputation [1], etc.
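To give the flavour of SMPC, here is the classic additive-secret-sharing private sum, a toy protocol in the honest-but-curious model. It is far simpler than the private-maximum protocol mentioned above and ignores the communication layer, but it shows how a joint result can be computed while each input stays with its owner:

```python
import random

P = 2**61 - 1  # public modulus; all arithmetic is done mod P

def share(secret, n):
    """Split a secret into n random shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def private_sum(inputs):
    """Each party splits its input into shares, one per party.
    Party j sums the j-th shares it receives and publishes only that
    subtotal; the subtotals reveal the total but no individual input."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]
    subtotals = [sum(s[j] for s in all_shares) % P for j in range(n)]
    return sum(subtotals) % P
```

Any single share (or subtotal) is uniformly random, so no coalition smaller than all n parties learns anything about an individual input in this toy setting.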

Objectives of the internship project:

The objective of this project would be to study secure multi-party computation protocols. Such protocols are often resource-heavy, which limits their practical application. The specific objective of the project would be to analyze and propose methods for rendering these protocols resource-efficient and thus practical. A study of cryptographic techniques and privacy enhancing technologies such as zero-knowledge proofs, homomorphic encryption, electronic cash, etc. will be part of the project.

Recent advancements in cryptographic electronic currencies (e.g. bitcoin) are also impacting how secure multi-party computation protocols are constructed. In particular, the bitcoin protocol may be used for building fully decentralized secure multi-party protocols that do not need to rely on trusted third parties. The project will thus also include a study of cryptographic electronic currency protocols.

References:

[1] A Decentralized Privacy Preserving Reputation Protocol for the Malicious Adversarial Model. Omar Hasan, Lionel Brunie, Elisa Bertino, and Ning Shang. IEEE Transactions on Information Forensics and Security (TIFS). April 2013.

URL sujet détaillé :

Remarques :



SM207-18 A spectral approach to symmetries generated by self-similarity transformations  
Admin  
Encadrant : Victor OSTROMOUKHOV
Labo/Organisme : LIRIS UMR 5205
URL : http://liris.cnrs.fr/victor.ostromoukhov
Ville : Lyon

Description  

Symmetry plays an important role in our lives, providing our brain with a compact representation of the complex realities of the outside world. This is why symmetry processing is a very active branch of computer vision and of image processing and synthesis. Very recent work has made it possible to characterize the most widespread symmetries (bilateral, rotational, or mirror symmetry) through the spectral study of so-called symmetric correspondence matrices. In this internship, we propose to study the possibility of applying spectral analysis to symmetries of a more advanced type: those generated by a self-similarity transformation. This kind of symmetry is found throughout the plant and animal kingdoms: in the arrangement of branches and leaves of plants, in the arrangements of leaves, flowers and seeds in phyllotaxis, in the layout of the chambers of mollusc shells, etc. This internship will require the student to show imagination and considerable mathematical knowledge, as well as a command of spectral analysis tools: advanced matrix computation, the study of eigenvalues of large sparse matrices, etc.

URL sujet détaillé :

Remarques :



SM207-19 A study of the spectral properties of compact packings with several sphere sizes  
Admin  
Encadrant : Victor OSTROMOUKHOV
Labo/Organisme : LIRIS UMR 5205
URL : http://liris.cnrs.fr/victor.ostromoukhov
Ville : Lyon

Description  

Sampling theory plays a prominent role in image synthesis. Advanced multidimensional sampling systems with controllable spectral properties can also find applications in other fields that rely on multidimensional Monte Carlo and Quasi-Monte Carlo methods, for example in simulation systems. The student will examine the spectral properties of compact packings with several sphere sizes, as well as fast methods for generating such arrangements in 2D, in 3D, and in higher dimensions. This project is exciting and very ambitious. To understand how non-trivial the compact packing problem is, read the article by Pfender and Ziegler: you will see that some answers to problems posed centuries ago were only provided in 2003 by a little-known mathematician. Or read the books by Thompson or by Conway & Sloane to see how closely the compact packing problem is connected to other branches of science, for example group theory or the theory of error-correcting codes. References: Pfender, Florian; Ziegler, Günter M. (September 2004), "Kissing numbers, sphere packings, and some unexpected proofs", Notices of the American Mathematical Society: 873–883, http://www.ams.org/notices/200408/fea-pfender.pdf. Thomas M. Thompson (1983), "From Error-Correcting Codes through Sphere Packings to Simple Groups". Conway, J.H. & Sloane, N.J.A. (1998), "Sphere Packings, Lattices and Groups".

URL sujet détaillé :

Remarques :



SM207-20 Exploration of computational models for visual cues  
Admin  
Encadrant : Victor OSTROMOUKHOV
Labo/Organisme : LIRIS UMR 5205
URL : http://liris.cnrs.fr/victor.ostromoukhov
Ville : Lyon

Description  

Visual cues play an essential role in the human visual system.
They ease the process of reconstruction of 3D scenes from 2D retinal images,
the only visual information that the human being possesses about the world.
Consequently, understanding the underlying mechanisms is essential for better
understanding the process of vision. It is also important for computer graphics: in fact,
taking into account visual cues may dramatically reduce harmful rendering artifacts
and also save computational resources.
In this exploratory project, the student will first systematically study perceptual
properties of the human visual system and identify the most important visual cues to model.
Then, based on both bibliographical search and his own exploration, he will develop
computational models for the most tangible visual cues.
He will implement them in the rendering framework, test them and evaluate the results.

URL sujet détaillé :

Remarques :



SM207-21 Static performance analysis for the Faust audio language  
Admin  
Encadrant : Pierre JOUVELOT
Labo/Organisme : MINES ParisTech
URL : http://www.cri.mines-paristech.fr/~pj
Ville : Fontainebleau

Description  

Subject:
Development of an automatic WCET (Worst-Case Execution Time) analyzer for the Faust audio and music signal processing language.

Project:
The Faust language, initially developed at Grame (Lyon), allows signal processing operations to be specified in a functional style; it is particularly well suited to audio and music applications (http://faust.grame.fr). Sound processing is naturally performed synchronously within Faust programs, and the question of the language's performance arises. Indeed, a synchronous language must conform to the so-called "synchronous hypothesis", which stipulates that all processing can be executed between the arrivals of two successive sound samples. This requires a static cost analysis, of the WCET (Worst-Case Execution Time) kind, at compile time, to verify that the hypothesis holds, thereby ensuring the correctness of the Faust code to be executed.
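The synchronous hypothesis boils down to a simple budget check: the worst-case cost of processing one sample must fit in the inter-sample period. A back-of-the-envelope sketch (the numbers in the usage are illustrative, not measured Faust costs):

```python
def respects_synchronous_hypothesis(wcet_s, sample_rate_hz):
    """True iff the worst-case processing time of one sample fits
    between the arrivals of two successive samples."""
    period_s = 1.0 / sample_rate_hz   # e.g. ~22.7 microseconds at 44.1 kHz
    return wcet_s <= period_s
```

A real WCET analyzer would compute `wcet_s` statically from the Faust program; this sketch only shows the acceptance criterion that the analysis must establish.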

This internship is part of the ANR FEEVER project (see http://feever.fr), whose ambition is to offer, via Faust, an innovative, ubiquitous solution for multimodal sound manipulation.

Outline:
The goal of this Master's research internship, lasting a few months with a possible continuation as a PhD, is, starting for example from the Faustine evaluator developed at CRI (see http://feever.fr/Papers/propFaustine2013.pdf):

- to study current WCET analysis techniques;
- to design and implement, via Faustine, a WCET analysis algorithm for the Faust language;
- if time permits, to produce a C++ version to be integrated into the Faust compiler.

Also depending on the available time, a machine-checked proof, in Coq, of part of this process could be tackled.
URL sujet détaillé :

Remarques : Requirements:
Knowledge of functional programming and of compilation techniques is required. An interest in music or audio processing is a plus.



SM207-22 exploration of computational models for visual cues  
Admin  
Encadrant : Victor OSTROMOUKHOV
Labo/Organisme : LIRIS UMR 5205
URL : http://liris.cnrs.fr/victor.ostromoukhov
Ville : Lyon

Description  

Visual cues play an essential role in the human visual system.
They ease the process of reconstruction of 3D scenes from 2D retinal images,
the only visual information that the human being possesses about the world.
Consequently, understanding the underlying mechanisms is essential for better
understanding the process of vision. It is also important for computer graphics: in fact,
taking into account visual cues may dramatically reduce harmful rendering artifacts
and also save computational resources.
In this exploratory project, the student will first systematically study perceptual
properties of the human visual system and identify the most important visual cues to model.
Then, based on both bibliographical search and his own exploration, he will develop
computational models for the most tangible visual cues.
He will implement them in the rendering framework, test them and evaluate the results.
URL sujet détaillé :

Remarques :



SM207-23 Behaviours as solutions for higher-order programs and processes  
Admin  
Encadrant : Daniel HIRSCHKOFF
Labo/Organisme : LIP - ENS Lyon
URL : http://perso.ens-lyon.fr/daniel.hirschkoff
Ville : Lyon

Description  

This internship proposal focuses on operational methods for studying the semantics of programs and processes (process calculi, lambda-calculus). A description can be found at
http://perso.ens-lyon.fr/daniel.hirschkoff/Stages/dh-contr.pdf

URL sujet détaillé : http://perso.ens-lyon.fr/daniel.hirschkoff/Stages/dh-contr.pdf

Remarques : This work is part of a collaboration with D. Sangiorgi (University of Bologna, Italy).



SM207-24 Dynamic muscle model for virtual human animation  
Admin  
Encadrant : Nicolas PRONOST
Labo/Organisme : LIRIS - SAARA
URL : https://liris.cnrs.fr/nicolas.pronost/
Ville : Lyon

Description  

This project concerns the accurate, real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models involving the anatomy and inner workings of the human body.

The purpose of this project is to design a new model of muscle action lines for the real-time simulation of virtual characters. The muscle model currently in use represents a muscle by line segments (called action lines), which only approximate the action a muscle can produce. These approximations break down when muscle attachments are large, when the muscle fibers have a complex arrangement, or when they follow a complex routing (e.g. wrapped around bones or other muscles). Some muscle-based simulation platforms use predefined primitive surfaces around which the action lines can wrap, but these are still rough approximations of the deformations involved in the human body.

Nevertheless, muscle-based simulations are very sensitive to the directions of the muscle forces, and the routings of the action lines define the directions of the applied forces: different action-line routings have a significant effect on the simulations. This project will investigate the benefits of using dynamic action lines, i.e. lines that adapt to the posture. The adjustments will be estimated from the geometric deformations of the muscle. As these deformations are expensive to compute, they will be computed in an off-line preprocess, and only a pre-calculated table of each muscle's moment arms will be consulted, according to the current posture of the virtual character.
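The pre-computed table lookup described above amounts to interpolating moment arms by posture. A minimal 1D sketch, where the joint-angle grid and the moment-arm values are made up for illustration:

```python
import bisect

def moment_arm(angle, angles, arms):
    """Linearly interpolate a muscle's moment arm from a precomputed
    table sampled at strictly increasing joint angles.

    angle: current joint angle (degrees); angles/arms: table columns
    (degrees -> metres). Values outside the grid are clamped.
    """
    if angle <= angles[0]:
        return arms[0]
    if angle >= angles[-1]:
        return arms[-1]
    i = bisect.bisect_right(angles, angle)          # first grid point > angle
    t = (angle - angles[i - 1]) / (angles[i] - angles[i - 1])
    return arms[i - 1] + t * (arms[i] - arms[i - 1])
```

At runtime the controller would call this per muscle and per frame, which is cheap compared to recomputing the geometric deformation itself.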

Expectations for this project are threefold. (1) The student will use a geometric method to calculate the deformations of the action lines. (2) The student will then assess the impact of this new model on muscle function (i.e. are the differences in forces significant?). (3) Finally, demonstration scenarios will be developed to validate the model and support the results.
URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/DynamicActionLine.pdf

Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA)
This internship is paid and may be continued as a PhD.



SM207-25 Machine learning techniques for classification of viral genomic sequences  
Admin  
Encadrant : Macha NIKOLSKI
Labo/Organisme : Bordeaux Bioinformatics Center (CBiB) and LaBRI (Laboratoire Bordelais de Recherche en Informatique)
URL : http://www.labri.fr/perso/macha/
Ville : Bordeaux

Description  

Since the sequencing of the first bacterial genome in 1995, the genomic sequences of thousands of microbial species have been determined, which still covers only a tiny portion of known microbes. Genomes can be viewed as linear strings over four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). This enables the treatment of genome sequences as symbolic sequences and their characterization using mathematical models. The mathematical models of genomic sequences of particular interest in this project are a class of models called genome signatures. Genome signatures are compact mathematical representations of DNA sequences (see e.g. [5] for a detailed presentation); in the context of this project we will consider the decomposition of genomic sequences into k-mers, that is, words of size k.
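A k-mer signature of the kind described above can be computed in a few lines. A minimal sketch of a k-mer frequency profile (plain Python; no claim is made about the exact normalisation used in the cited work):

```python
from collections import Counter

def kmer_signature(seq, k):
    """Frequency vector of all k-mers (words of size k) in a DNA string,
    using a sliding window of width k over the sequence."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}
```

For example, `kmer_signature("ATAT", 2)` sees the windows AT, TA, AT and returns frequencies 2/3 for AT and 1/3 for TA.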

Analysis of microbial communities has until recently been a complicated task due to their high diversity and the fact that many of these organisms cannot be cultured. Metagenomics has emerged as essentially the only way to characterize these unculturable microbes. Metagenomic studies generate sequence data at an even higher rate than classical genomics. Indeed, the number and diversity of microbial communities are not known, but current estimations place the number of species for certain environments in the range of millions [1]. The Earth Microbiome Project recently announced its intention to generate petabases (10^15 bases) or more of environmental sequences [10]. Unfortunately, the general question of classifying complex sequence data coming from untargeted sequencing of a whole metagenomic sample is a particularly challenging computational problem. Moreover, the most blatant difficulty is to recognize novel viral sequences within a set of sequences.

Nonetheless, a number of efficient methods exist to classify the composition of a sample that mainly contains known species. Computational solutions for taxonomic assignment can be broadly organized in two main categories: (1) sequence similarity methods and (2) sequence composition methods.

Methods that rely on sequence similarity can themselves be subdivided into classical alignment-based techniques (mostly attempting to improve BLAST accuracy) and index-based ones. Alignment-based methods suffer from two limitations: speed and lack of sensitivity, see e.g. [2] and [3]. Alternative solutions have been suggested to overcome these limitations. These methods are based on long k-mers (subsequences of size k) and conceptually rely on the fact that when k is sufficiently large, sequences become species-specific. Consequently, the idea is to index the sequences by long k-mers. This is indeed the foundation of MegaBlast (a general-purpose sequence aligner using long seeds), but also of a number of methods specific to taxonomic assignment such as LMAT [4] and Kraken [3]. The downside of these sequence-similarity approaches is over-specificity, which makes classification of unknown sequences problematic.

A complementary approach is based on sequence composition analysis. Such methods rely on the decomposition of sequences into frequencies of k-mers (with k not too large), and make use of machine learning techniques (e.g. SVM, kNN, Naive Bayes) to train a classifier on the reference database. The taxonomic assignment of novel sequences is then predicted by applying the pre-trained model. These methods are theoretically better suited to the task of novel species classification, as short k-mer distributions are less prone to over-fitting. However, even these techniques fail to classify about 50% of species absent from the training set [6]. This is especially salient for viral sequences, as the vast majority of them fail to be uniquely assigned to any domain [7].

In this project we propose to explore the use of spaced k-mers [8] (that is k-mers where certain letter positions are considered to be wildcards) for classification of novel viral sequences. Indeed, it has been recently shown that such signatures can be better suited to certain genomic classification tasks such as construction of phylogenetic trees [9].
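A spaced k-mer is conveniently described by a binary mask ('1' = keep the letter, '0' = wildcard, the usual spaced-seed convention); a minimal extraction sketch:

```python
def spaced_kmers(seq, mask):
    """Extract spaced words from seq: for each window of len(mask),
    keep only the letters at positions where mask is '1'."""
    k = len(mask)
    keep = [i for i, m in enumerate(mask) if m == "1"]
    return ["".join(seq[i + j] for j in keep)
            for i in range(len(seq) - k + 1)]
```

For example, with mask "101" the windows of "ACGT" (ACG and CGT) yield the spaced words "AG" and "CT"; a spaced signature would then count these words instead of contiguous k-mers.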

The intern will develop an algorithmic method for viral sequence classification based on spaced k-mers. For this purpose he/she will have to acquire and apply machine learning techniques as well as "Big Data" approaches for dealing with highly voluminous data. The proposed method(s) will have to be implemented and will be tested both on public data as well as on novel data acquired by our collaborators at INRA. Good programming skills are most welcome, as well as a good working knowledge of algorithmics and statistics.

Advisors: Macha Nikolski and Hayssam Soueidan
Laboratory : Bordeaux Bioinformatics Center (CBiB) and LaBRI (Laboratoire Bordelais de Recherche en Informatique)

Contact: Please send your CV/resume to and

References:

[1] J. Gans, M. Wolinsky, and J. Dunbar. Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science, 309:1387–1390, 2005.
[2] A. Bazinet and M. Cummings. A comparative evaluation of sequence classification programs. BMC Bioinformatics, 13(1):92, 2012.
[3] Wood, D. E. and Salzberg, S. L. (2014). Kraken: Ultrafast metagenomic sequence classification using exact alignments. Genome Biology, 15(3), R46.
[4] Ames S.K., Hysom D.A., Gardner S.N., Lloyd G.S., Gokhale M.B. and Allen J.E. (2013), Scalable metagenomic taxonomy classification using a reference genome database. Bioinformatics, vol. 29, p. 2253–2260
[5] Nalbantoglu O., Computational Genomic Signatures and Metagenomics, PhD dissertation, University of Nebraska-Lincoln, USA, 2011
[6] Nalbantoglu O., Way S.F., Hinrichs S.H., Sayood K.: RAIphy: Phylogenetic classification of metagenomics samples using iterative refinement of relative abundance index profiles. BMC Bioinformatics 2011, 12:41.
[7] Rosen G.L., Reichenberger E.R., Rosenfeld A.M. (2010), NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics 27, p. 127–129
[8] Keich, U., Li, M., Ma, B. and Tromp, J. (2004), On spaced seeds for similarity search, Discrete Appl. Math. vol. 138, p. 253–263
[9] Leimeister, C.-A., Boden, M., Horwege, S., Lindner, S., and Morgenstern, B. (2014). Fast alignment-free sequence comparison using spaced-word frequencies.
Bioinformatics, DOI: 10.1093/bioinformatics/btu177.
[10] J. Gilbert, J. Meyer, D. Antonopoulos, P. Balaji, C. Brown, N. Desai, J. Eisen, D. Evers, D. Field, W. Feng, D. Huson, J. Jansson, R. Knight, J. Knight, E. Kolker, K. Konstantindis, J. Kostka, N. Kyrpides, R. Mackelprang, A. McHardy, C. Quince, J. Raes, A. Sczyrba, A. Shade, and R. Stevens. Meeting report: the terabase metagenomics workshop and the vision of an earth microbiome project. Standards in Genomic Sciences, 3:243–248, 2010.
URL sujet détaillé :

Remarques : This internship is, of course, paid




SM207-26 Motion control on soft or liquid support  
Admin  
Encadrant : Nicolas PRONOST
Labo/Organisme : LIRIS - SAARA
URL : https://liris.cnrs.fr/nicolas.pronost/
Ville : Lyon

Description  

This project concerns the accurate, real-time simulation of real human movement. To perform such a simulation, we need a model of the real human in the virtual world of the simulation. Different models can be used depending on the intended application. In this project, we focus on models capable of dealing with complex interactions with the environment. These interactions affect the simulation, and we seek to control the motion of the virtual human in the presence of these interactions.

The purpose of this project is to create a new physics-based motion controller for virtual characters. Current controllers are not designed to deal with complex interactions with soft bodies or liquids that can influence the movement, yet such interactions happen in many real-life situations. For example, a virtual human easily moves through 5 cm of water on a beach; the exact same human will not walk with the same gait pattern at all with water up to the knee. This means the controller must know how to adapt to such different situations. To do so, we need to define a mathematical formulation describing how we expect humans to move in these situations (e.g. by optimizing a cost function that describes the effectiveness of the gait).
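The optimisation view mentioned above can be made concrete with a toy cost function trading actuation effort against forward progress. The terms and weights below are illustrative assumptions, not an established gait model:

```python
def gait_cost(joint_torques, achieved_speed, target_speed,
              w_effort=1e-4, w_track=10.0):
    """Toy gait objective: penalise actuation effort (sum of squared
    joint torques) and deviation from the desired walking speed.
    Lower is better; the weights are arbitrary illustrative values."""
    effort = sum(t * t for t in joint_torques)
    tracking = (achieved_speed - target_speed) ** 2
    return w_effort * effort + w_track * tracking
```

A controller wading through knee-deep water, slowed from its target speed, scores worse than one walking freely at the target; optimising such a cost is what drives the adapted gait.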

Expectations for this project are twofold. (1) The student will implement a computational method for interaction between rigid bodies (representing the virtual character) and soft bodies/liquids (water, mud, snow, sand, etc.) that is efficient enough to be applied to real-time animation. (2) The student will use this method (i.e. mainly collision handling) and a physics-based controller to produce plausible locomotion (walking and running).
URL sujet détaillé : https://liris.cnrs.fr/nicolas.pronost/Papers/StudentProject/SoftMotionControl.pdf

Remarques : Secondary advisor: Saida Bouakaz (LIRIS-SAARA)
This internship is paid and may be continued with a PhD.



SM207-27 Parametric verification of real-time systems  
Admin  
Encadrant : Étienne ANDRÉ
Labo/Organisme : LIPN, Université Paris 13, Sorbonne Paris Cité
URL : http://lipn.univ-paris13.fr/~andre/
Ville : Villetaneuse, France

Description  

Multiprocessor real-time systems have become ubiquitous in recent years.
Some of them (autopilots for planes and drones, driverless cars) are critical, in the sense that no error may occur.
Testing such systems may detect the presence of bugs, but cannot guarantee their absence.
It is therefore necessary to use methods such as model checking in order to formally prove the correctness of a system.

Determining a correct scheduling of the tasks executed by the processors is crucial for the safety of the system.
These systems are characterized by a set of timing constants,
such as the reading period of an altitude sensor on a drone,
or the minimum or maximum duration of a computation on a satellite.
Although techniques exist for verifying the system for one given set of constants, formally verifying the system for many values of these constants can take an extremely long, or even infinite, time when dense sets of values must be covered.

It is then appealing to reason parametrically, by considering these timing constants as unknowns, i.e., as parameters, and to synthesize a constraint on these parameters guaranteeing that there exists a scheduling of the tasks on the processors such that the system is correct.
Several techniques have been proposed to solve this problem, notably in the scheduling setting.


The goal of the internship is to devise parametric verification techniques for real-time systems, dedicated to the case of multiprocessor scheduling.

The student will also implement the proposed approach in order to validate it.
One possible option is to reuse the IMITATOR tool.
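
As a heavily simplified, hypothetical illustration of what "synthesizing a constraint on the parameters" means: take two periodic tasks where one period is unknown, and derive the feasible parameter region from a utilization test.

```python
# Two periodic tasks with known execution times; the period P2 of the second
# task is the unknown parameter. We "synthesize" the feasible values of P2
# under the classical utilization test U = C1/P1 + C2/P2 <= 1.

from fractions import Fraction

C1, P1 = 2, 5  # task 1: 2 time units of work every 5
C2 = 3         # task 2: 3 time units of work every P2 (parameter)

def schedulable(P2):
    return Fraction(C1, P1) + Fraction(C2, P2) <= 1

feasible = [P2 for P2 in range(1, 21) if schedulable(P2)]
# The synthesized constraint is simply P2 >= 5 (i.e., C2/P2 <= 3/5).
assert feasible == list(range(5, 21))
```

Tools such as IMITATOR perform this kind of synthesis symbolically on parametric timed automata, over dense parameter domains, rather than by enumeration.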
URL sujet détaillé : http://lipn.univ-paris13.fr/~andre/verification-parametree-de-systemes-temps-reel.pdf

Remarques : Co-supervised with Giuseppe Lipari (LIFL, Lille).
The internship can take place at LIPN (Paris 13) or at LIFL (Lille), depending on the candidate's preference.

Standard remuneration.



SM207-28 BigData in NeuroScience  
Admin  
Encadrant : Thierry VIÉVILLE
Labo/Organisme : mnemosyne, Inria
URL : https://team.inria.fr/mnemosyne/
Ville : Sophia-Antipolis

Description  

Models studied in computational neuroscience are fed with facts drawn from the large literature reporting knowledge obtained from biological experimentation. Such knowledge is put into words, leading to phenomenological descriptions of the underlying behaviors. Then, part of this information is put into equations and distributed algorithms to check some quantitative or qualitative aspects of the presumed global understanding. Surprisingly enough, this work is mainly carried out in a craftsman-like fashion, more or less by intuition.

Other domains of knowledge have taken advantage of recent and powerful developments in semantic representations and introduced an intermediate framework between knowledge-in-words and knowledge-in-equations: knowledge-in-facts. From very large and heterogeneous corpora such as Wikipedia, turned into the DBpedia semantic database, to very specialized fields of knowledge formalized as ontologies, it is clear that powerful methodological tools are now available to manage data at a higher scale.

The goal of the proposed work is to address this issue in three steps:

1/ Considering a well-defined topic (either early-vision processing from the retina to the thalamus, or functional models of the basal ganglia and related action-selection structures, the choice being made with the student), build an ontology that describes our current knowledge of these brain sub-systems.

2/ From this real example, propose an intermediate language (say, Turtle) that allows computer scientists to easily annotate publications, building the related ontology. This step will be carried out with colleagues specialized in semantic web formalisms.

3/ Considering existing tools and initiatives, propose the design of a collective platform allowing a scientific community in the field to build such big data, beyond this seminal work.

URL sujet détaillé : http://www-sop.inria.fr/members/Thierry.Vieville/projects/neurobigdata/

Remarques : The internship is paid. A PhD topic can follow. The internship takes place at Inria Sophia-Antipolis, in collaboration with the NeuroCampus team in Bordeaux.



SM207-29 Study of P2P Systems for Video Live Streaming  
Admin  
Encadrant : Frederic GIROIRE
Labo/Organisme : INRIA Sophia Antipolis
URL : http://www-sop.inria.fr/members/Frederic.Giroire/
Ville : Sophia Antipolis

Description  

The goal of this internship is to study and propose models describing P2P systems. These systems are cheap to operate and scale well, but their highly distributed nature raises questions about reliability, durability, availability, confidentiality, and routing of the data. In particular, the focus of this internship will be on streaming in peer-to-peer networks. In such systems, peers relay packets or video frames, and the goal is to play the video the nodes ask for. This is possible only if some delay in the playback is allowed, and there exist trade-offs between the number of customer nodes and the delay that the system can support. Moreover, scheduling protocols must decide which packets are exchanged, and which peers send them, in order to achieve the best possible behavior. The analysis of P2P systems involves a wide range of techniques between practice and theory, from mathematical analysis to simulations and the deployment of experiments. This will allow us to adapt the orientation of the project to the taste of the interested student.

Objective: During the internship, we will study and simulate a live video streaming system. We will study its efficiency when there are frequent arrivals and departures of peers. The metrics considered will be delay and bandwidth usage.
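
A back-of-the-envelope sketch of the delay/population trade-off mentioned above (the diffusion model is a deliberate simplification, not part of the subject):

```python
# If every peer can forward the stream to at most d other peers, the number
# of peers holding the stream can at best multiply by (1 + d) per relay hop,
# so reaching n peers takes a number of hops logarithmic in n.

def min_relay_hops(n_peers, upload_degree):
    """Depth of the shallowest diffusion tree reaching n_peers."""
    hops, reached = 0, 1  # the source already has the stream
    while reached < n_peers:
        reached += reached * upload_degree  # every holder serves d new peers
        hops += 1
    return hops

assert min_relay_hops(1000, 1) == 10  # 2^10 = 1024 >= 1000
assert min_relay_hops(2000, 1) == 11  # doubling the audience adds one hop
```

Playback delay is thus at least proportional to this hop count; churn and scheduling make the real system's delay larger, which is precisely what the internship studies.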

The internship can be followed by a Ph.D.


URL sujet détaillé : http://www-sop.inria.fr/members/Frederic.Giroire/stage-streaming.pdf

Remarques : Remuneration possible.



SM207-30 Energy, Software-Defined Networks and Scalability  
Admin  
Encadrant : Joanna MOULIERAC
Labo/Organisme : COATI - INRIA/I3S (CNRS-UNS)
URL : https://team.inria.fr/coati
Ville : SOPHIA ANTIPOLIS

Description  

Software-Defined Networking (SDN), and in particular OpenFlow, is a new networking paradigm enabling innovation through network programmability. Over the past few years, many applications have been built using SDN, such as server load balancing, virtual-machine migration, traffic engineering, and access control. For example, Google has recently been running, for three years, a network connecting its data centers across the planet using SDN.

In this internship, we will focus on using SDN for energy-aware routing (EAR). Energy efficiency of networking infrastructure (telecom, data center, ISP, or enterprise networks) is a growing concern, due to both increasing energy costs and worries about CO2 emissions. Since traffic load has a small influence on the power consumption of routers, EAR makes it possible to put unused links into sleep mode to save energy. An SDN controller can collect the traffic matrix and then compute routing solutions that satisfy QoS while minimizing energy consumption. However, prior work on EAR has assumed that forwarding tables can hold an infinite number of rules. In practice, this assumption does not hold, since these tables are implemented with Ternary Content Addressable Memory (TCAM), which is expensive and power-hungry. In [1], we proposed an optimization method to minimize the energy consumption of a backbone network while respecting capacity constraints on links and rule-space constraints on routers. In detail, we presented an exact formulation as a Mixed Integer Linear Program (MILP) and introduced an efficient greedy heuristic, both analyzed through simulations.

The objective of this internship is to extend this work to take into account the possibility of adding wildcards to the forwarding rules. As an example, assuming that rules are mappings [(src, dest): port-to-forward], several rules with the same source s and the same port-to-forward p, but different destinations, can be compacted into [(s, *): p]. In this case, the forwarding tables shrink, but the general problem becomes more complicated. The candidate will therefore study the extension of the linear program to take this into account, study new heuristics, and perform the corresponding simulations on a realistic topology and set of traffic demands.
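
The wildcard aggregation described above can be sketched as follows (the rule format is a toy simplification for illustration, not the OpenFlow match format):

```python
# Forwarding rules map (src, dst) -> port; rules sharing the same source and
# output port are merged into a single (src, '*') -> port rule, shrinking the
# TCAM table at the price of coarser forwarding.

from collections import defaultdict

def compress(rules):
    """rules: dict {(src, dst): port}. Returns an equivalent table using '*'."""
    by_src = defaultdict(set)
    for (src, dst), port in rules.items():
        by_src[src].add(port)
    compressed = {}
    for (src, dst), port in rules.items():
        if len(by_src[src]) == 1:          # every dst of src uses one port:
            compressed[(src, "*")] = port  # safe to aggregate
        else:
            compressed[(src, dst)] = port  # keep the exact rule
    return compressed

rules = {("a", "x"): 1, ("a", "y"): 1, ("a", "z"): 1,
         ("b", "x"): 2, ("b", "y"): 3}
assert compress(rules) == {("a", "*"): 1, ("b", "x"): 2, ("b", "y"): 3}
```

The internship's problem is harder than this sketch: wildcards interact with routing choices, so which rules can be aggregated depends on the paths the optimization selects.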

Prerequisites for candidates:
- Knowledge and/or taste for Networking
- Knowledge and/or taste for discrete mathematics, graph theory and/or combinatorial optimization

Contact information:
For further information, please contact
- Frédéric Giroire -
frederic.giroire.fr
- Joanna Moulierac - joanna.moulierac.fr

Location:
COATI - INRIA/I3S (CNRS-UNS) - 2004 route des lucioles, 06906 SOPHIA ANTIPOLIS https://team.inria.fr/coati

References:
[1] Optimizing Rule Placement in Software-Defined Networks for Energy-aware Routing. Frédéric Giroire, Joanna Moulierac and Khoa Truong Phan. In Globecom 2014.

[2] On the effect of forwarding table size on SDN network utilization. R. Cohen, L. Lewin-Eytan, J. S. Naor, D. Raz In INFOCOM, 2014

[3] Software-Defined Networking: A Comprehensive Survey. Diego Kreutz, Fernando M. V. Ramos, Paulo Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky, Steve Uhlig In arXiv:1406.0440


URL sujet détaillé : www-sop.inria.fr/members/Joanna.Moulierac/SDN_Energy_internship_2014_COATI.pdf

Remarques : Co-supervised by Frédéric Giroire - frederic.giroire.fr



SM207-31 Petri net unfolding of biological networks  
Admin  
Encadrant : Stefan HAAR
Labo/Organisme : LSV, École Normale Supérieure de Cachan
URL : http://lsv.ens-cachan.fr/~haar
Ville : Cachan

Description  

Biological networks, such as gene regulatory networks or signalling networks, can be modeled as automata networks or Petri nets. This results in concurrent models involving hundreds or thousands of interacting parallel components. Aiming at capturing the dynamical landscape of those models, and ultimately at proposing mutations to control biological cells, we are interested in computing several properties, such as the attractors (livelocks/deadlocks, characterizing the long-term behaviour) and the sets of mutations that would trigger a re-differentiation of the cell, i.e., that would change the current attractor.

In order to cope with the huge size of the networks, this internship will address the design, implementation, and evaluation of new algorithms based on Petri net unfoldings for computing dynamical properties of discrete models of biological networks. Petri net unfoldings exploit the concurrency between transitions to obtain so-called complete prefixes, which are compact representations of all the possible behaviours of the net. Tools such as mole (http://www.lsv.ens-cachan.fr/~schwoon/tools/mole) or cunf (https://code.google.com/p/cunf) can efficiently compute the complete prefix of the unfolding of a Petri net.

This internship will first consist in evaluating and comparing existing algorithms for computing attractors in Petri net models of biological networks. Several variants and extensions of Petri net unfoldings will be explored in order to improve the scalability of the method. This task will involve adapting existing algorithms and implementing them to make them more efficient when applied to large biological networks. Then, new algorithms will be designed for identifying sets of components of the network whose activation or inhibition would (1) prevent the reachability of a particular attractor from a given initial state while preserving the reachability of the other attractors; (2) force a transition from one attractor to another given one.
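
To make the notion of attractor concrete, here is a minimal sketch on a toy Boolean network (enumerative, synchronous and deterministic; the internship's unfolding-based algorithms avoid precisely this exhaustive state-space construction):

```python
# Attractors of a discrete model are the terminal behaviours of its
# state-transition graph. With one (synchronous) successor per state, every
# trajectory ends on a cycle; we collect the states lying on such cycles.

from itertools import product

def update(state):
    a, b = state
    return (b, a)  # toy rules: each gene copies the other

def attractors(n_genes, update):
    succ = {s: update(s) for s in product((0, 1), repeat=n_genes)}
    on_cycle = set()
    for s in succ:
        seen = []
        while s not in seen:       # walk until the trajectory loops
            seen.append(s)
            s = succ[s]
        on_cycle.update(seen[seen.index(s):])  # keep the cyclic part
    return on_cycle

# fixed points (0,0) and (1,1), plus the 2-cycle (0,1) <-> (1,0)
assert attractors(2, update) == {(0, 0), (1, 1), (0, 1), (1, 0)}
```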

This work may be continued with a PhD thesis on developing causality and concurrency theory for the formal analysis of dynamics of discrete biological networks.
URL sujet détaillé : http://loicpauleve.name/unfolding-bio-M2.pdf

Remarques : Co-supervised by Loïc Paulevé (CNRS / LRI, Université Paris-Sud) - http://loicpauleve.name



SM207-32 Analysis of shared-variable access paths for online anticipation of placement and routing  
Admin  
Encadrant : Selma AZAIEZ
Labo/Organisme : CEA, LIST, Laboratoire des Fondements des Systèmes Temps-Réels Embarqués
URL : http://www-list.cea.fr/
Ville : Saclay

Description  

Nowadays, processors with several hundred cores, such as Intel's Xeon Phi chip (64 cores) or Kalray's MPPA chip (256 cores), promise significant performance gains as well as reduced power consumption. These gains strongly depend on the programmability of the chips, but also on the placement-and-routing strategies used when deploying the application.

One way to obtain execution performance is to reduce the access time to shared memories. To do so, it suffices to reduce the costly messages of the shared-memory access protocol (e.g., blocking messages in synchronous programming, control messages, etc.) while placing the data as close as possible to the task.

In the LaSTRE laboratory, previous work has focused on developing a multi-protocol compilation chain that can associate a given protocol with each shared data item. A protocol is assigned either statically or dynamically.

In the context of this internship, we would like to equip this compilation chain with analysis tools able to characterize the shared-memory access paths in order to anticipate placement and routing. The intern will have to identify and formalize the types of shared-memory accesses, so as to help identify the best placement-and-routing strategy.

The work will proceed in three phases:
1. State of the art on placement-and-routing strategies
2. Formalization of the dependence analyses that would facilitate placement and routing
3. Implementation of an analysis prototype

The internship will take place on the Nano-Innov premises of CEA Saclay, within a research laboratory specialized in particular in code analysis, compilation chains, and embedded systems for massively parallel architectures.

This internship can be continued with a PhD.
URL sujet détaillé : :

Remarques :



SM207-33 Analysis of shared-data access patterns for choosing coherence protocols on massively parallel architectures  
Admin  
Encadrant : Loïc CUDENNEC
Labo/Organisme : CEA, LIST, Laboratoire des Fondements des Systèmes Temps-Réels Embarqués
URL : http://www-list.cea.fr/
Ville : Saclay

Description  

To meet the demand for computing power while keeping energy consumption under control, processors with several hundred cores, known as many-cores, are appearing in high-performance computing systems and embedded systems. Thus, Intel's Xeon Phi chip, which contains 64 cores, and Kalray's MPPA chip, composed of 256 cores, have reached the market.

The performance of applications running on such architectures strongly depends on the programmability of the chip, and in particular on the mechanisms and techniques for managing on-chip memory. The choice of the data consistency model (the rules that determine the freshness of a data item when it is accessed) and of the coherence protocols directly impacts access efficiency. Ill-suited choices can quickly become a bottleneck for the application.
In previous work, we proposed a multi-protocol compilation chain that can associate a given protocol with each shared data item.

In this context, an open research direction consists in relying on static analysis of the source code to identify and characterize shared-data accesses (e.g., identification of the application's tasks and shared variables, construction of the dependence graphs of shared-data accesses).

The work proposed in this internship consists in compiling a state of the art on the analysis of shared-data accesses in parallel application code, and then proposing a solution that addresses the following points:
1. The ability to extract, from the source code, graphs representing the dependences of shared-data accesses,
2. The characterization of memory access patterns and schemes, based on this representation,
3. An automated analysis of the dependence graphs to detect data access patterns.
Finally, the proposed solution will have to be prototyped to validate the internship.
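
A hypothetical mini-example of what such automated pattern detection could look like (the trace format, pattern names, and protocol mapping are invented for illustration):

```python
# Given a trace of shared-variable accesses (task, variable, read/write),
# classify each variable: one writer with distinct readers suggests a
# producer-consumer pattern, for which a write-update coherence protocol
# would typically be preferred over an invalidation-based one.

from collections import defaultdict

def classify(trace):
    writers, readers = defaultdict(set), defaultdict(set)
    for task, var, op in trace:
        (writers if op == "w" else readers)[var].add(task)
    patterns = {}
    for var in set(writers) | set(readers):
        if len(writers[var]) == 1 and readers[var] - writers[var]:
            patterns[var] = "producer-consumer"
        else:
            patterns[var] = "unclassified"
    return patterns

trace = [("t0", "x", "w"), ("t1", "x", "r"), ("t2", "x", "r"),
         ("t1", "y", "w"), ("t2", "y", "w")]
assert classify(trace) == {"x": "producer-consumer", "y": "unclassified"}
```

The internship would work from static dependence graphs rather than from execution traces, but the classification goal is the same.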

The internship will take place on the Nano-Innov premises of CEA Saclay, within a research laboratory specialized in particular in code analysis, compilation chains, and embedded systems for massively parallel architectures.
URL sujet détaillé : :

Remarques :



SM207-34 Lagrangian Duality in Online Algorithms  
Admin  
Encadrant : Kim Thang NGUYEN
Labo/Organisme : IBISC, Universite Evry et LIP6, Universite Paris 6
URL : https://www.ibisc.univ-evry.fr/~thang
Ville : Evry et Paris

Description  

Online algorithms form a domain that studies problems involving uncertainty about the future. In general, requests arrive over time, and one needs to make irrevocable decisions without any information about the future. Such problems appear in many practical settings, for example decision making, stock investment, and network routing. Hence, it is important to design algorithms/strategies with performance guarantees in uncertain environments.

The most successful technique in online algorithms so far is the potential argument. However, this technique has its limits (it does not give much insight into the problems, etc.). Currently, there is a trend toward a principled approach to designing and analyzing online algorithms. One direction [1] is to exploit the power of mathematical programming and use Lagrangian duality. The method is powerful and flexible, as it is based on natural relaxations (even non-convex ones), and it has successfully given theoretical evidence for practical algorithms.

Our project is to develop this technique, and the internship is part of it. The goal is to study concrete online problems (online scheduling, matching, paging, etc.) using this approach. Ideally, a research publication could be derived by the end of the internship.
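
To see what a "performance guarantee in an uncertain environment" means, here is the classic ski-rental example (a textbook illustration, not taken from [1]): an online algorithm that rents for B-1 days and then buys pays at most twice the clairvoyant optimum, whatever the future turns out to be. Duality-based analyses establish such ratios more systematically.

```python
B = 10  # purchase price, expressed in days of rent

def online_cost(days, B):
    """Rent for B-1 days, then buy on day B (if skiing lasts that long)."""
    return days if days < B else (B - 1) + B

def offline_cost(days, B):
    return min(days, B)  # clairvoyant choice: rent throughout, or buy at once

for days in range(1, 100):
    assert online_cost(days, B) <= 2 * offline_cost(days, B)  # 2-competitive
```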

Requirements: motivation and basic knowledge of algorithms and discrete mathematics.

[1] Nguyen Kim Thang. Lagrangian Duality in Online Scheduling with Resource Augmentation and Speed Scaling. European Symposium on Algorithms (ESA), 2013.
URL sujet détaillé : https://www.ibisc.univ-evry.fr/~thang/Stages/lagrange.html

Remarques : Standard remuneration: 523.26 euros/month.



SM207-35 Analysis of the Mifare Plus Distance Bounding Protocol  
Admin  
Encadrant : Gildas AVOINE
Labo/Organisme : IRISA
URL : www.avoine.net
Ville : Rennes

Description  

* Keywords: Security, Cryptography, Contactless Smartcards, RFID.

* Objective: Check in practice the efficiency of the distance bounding protocol of the Mifare Plus RFID tag.

* Content: Radio Frequency Identification (RFID) makes it possible to identify objects or subjects without any physical or optical contact, using transponders - micro-circuits with an antenna - queried by readers through a radio-frequency channel. This technology is one of the most promising of this decade and is already widely used in applications such as access cards, transportation passes, payment cards, and passports. This success is partly due to the steady decrease in both the size and cost of passive transponders called "tags".

The "relay attack" - sometimes referred to as "Mafia fraud" - exhibited by Desmedt, Goutier, and Bengio has recently become a major concern for RFID authentication protocols. The adversary pretends to be the legitimate prover by relaying the messages exchanged during the execution of the protocol. This is illustrated by the following example. Consider an RFID-based ticket-selling machine in a theater. To buy a ticket, the customer does not need to take out his theater pass, an RFID tag: he only needs to be close enough to the machine (verifier) so that the pass (prover) can communicate with it. The pass can stay in the customer's pocket during the transaction. Now assume there is a line of customers waiting for tickets, and that Bob and Charlie have masterminded an attack. Charlie is in front of the machine while Bob is far back in the queue, close to Alice, the victim. When the machine initiates the transaction with Charlie's card, Charlie forwards the received signal to Bob, who transmits it to Alice. The victim's tag automatically answers, since a passive RFID tag - commonly used for such applications - responds without requiring the agreement of its holder. The answer is then transmitted back from Alice to the machine through Bob and Charlie, who act as relays. The whole communication is transparently relayed and the attack eventually succeeds: Alice pays for Charlie's ticket.

When it was first introduced in the late eighties, the relay attack appeared unrealistic. Nowadays, it is one of the most effective and feared attacks against RFID systems. It can be easily implemented, since the reader and the tag communicate wirelessly, and it is not easily detectable by the victim, because queried (passive) tags automatically answer requests without the agreement of their bearers. Relay attacks are, for example, used to steal cars in parking lots.

Distance bounding protocols are promising solutions to prevent such relay attacks. These protocols measure the round-trip time of messages exchanged between the reader (verifier) and the tag (prover): if the round-trip time is too long, the reader considers that the tag is outside its neighborhood and that an attack is underway. Many distance bounding protocols exist, but most of them are theoretical proposals. The Mifare Plus tag is the only device that benefits from a distance bounding protocol. The implemented protocol is, however, much simpler than the theoretical ones: it mitigates the problem but cannot fully prevent relay attacks. Consequently, this project will investigate the limits of the distance bounding protocol implemented in the Mifare Plus.
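
The verifier-side logic can be sketched schematically as follows (timings, round count, and the time bound are hypothetical; a real protocol derives the challenge/response bits cryptographically):

```python
import secrets

def accept(round_trip_ns, n_rounds=8, bound_ns=200):
    """Time n_rounds of 1-bit challenge-response exchanges; reject slow ones."""
    for _ in range(n_rounds):
        challenge = secrets.randbits(1)  # fresh unpredictable challenge bit
        response = challenge             # an honest prover echoes correctly
        if response != challenge or round_trip_ns > bound_ns:
            return False                 # wrong answer, or answered too late
    return True

assert accept(round_trip_ns=50)       # genuine nearby tag: accepted
assert not accept(round_trip_ns=900)  # relayed via two accomplices: rejected
```

The security rests on physics: a relay through two accomplices necessarily adds forwarding delay, which the round-trip bound is meant to expose.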

URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_1.pdf

Remarques :



SM207-36 Practical Cryptanalysis of Pseudo-Random Number Generators in RFID  
Admin  
Encadrant : Gildas AVOINE
Labo/Organisme : IRISA
URL : www.avoine.net
Ville : Rennes

Description  

* Keywords: RFID, Contactless Smartcard, Practical Attacks, PRNG

* Objective: Analyze (in practice) PRNGs implemented on tags available in our everyday lives

* Content: Randomness is the keystone of any cryptographic system. Generating randomness is, however, far from an easy task, especially with low-cost devices such as passive RFID tags. Indeed, on the one hand, few logic gates can be "wasted" just on generating randomness; on the other hand, passive RFID tags have no battery - hence no internal clock - and must therefore trust, in some way, the reader that powers them. Most current pseudo-random number generators (PRNGs) implemented on RFID tags definitely suffer from poor designs. For example, the PRNG implemented on Mifare Classic RFID tags - 1 billion such tags have been sold since 1995 - is re-initialized each time the tag enters the field of a reader. Consequently, an adversary can easily drive the PRNG of a Mifare Classic tag just by measuring the time between the moment the tag is powered and the moment the PRNG is called.

This internship aims to analyze, in practice, PRNGs implemented on tags found in our everyday lives, e.g., tags for building access control, public transportation, ski lifts, etc. The analysis will depend on whether or not the specifications of the PRNG are known. When they are not known, which is the most common case, the RFID tag is treated as a black box and two approaches are required: (1) determining whether the PRNG can be influenced by external events; (2) determining whether the PRNG output is biased. Guidelines and tools provided by the US National Institute of Standards and Technology may be useful for this work. The expected outcome of this project is to eventually break some RFID solutions in practice.
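
As a taste of approach (2), here is the simplest of the NIST SP 800-22 statistical tests, the frequency (monobit) test, applied to a black-box bit stream:

```python
import math

def monobit_pvalue(bits):
    """NIST SP 800-22 frequency (monobit) test on a list of 0/1 bits."""
    s = sum(1 if b else -1 for b in bits)  # +/-1 partial sum
    return math.erfc(abs(s) / math.sqrt(2 * len(bits)))

balanced = [i % 2 for i in range(1000)]  # perfectly alternating stream
biased = [1] * 900 + [0] * 100           # 90% ones
assert monobit_pvalue(balanced) >= 0.01  # passes at the 1% level
assert monobit_pvalue(biased) < 0.01     # fails: clearly biased
```

Note that passing this test is only a necessary condition: the (balanced but trivially predictable) alternating stream above passes it, which is why the full NIST battery, and attack attempts, are needed.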

Depending on the profile of the student, this topic can be tackled from different angles, either practical or theoretical. For example, the PRNG is expected to be analyzed as a black box, but students familiar with reverse-engineering techniques may wish to apply their knowledge to some PRNGs. Whatever the approach, the student will be required to implement tests, so programming skills are required. The choice of language is left to the discretion of the candidate.

URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_2.pdf

Remarques :



SM207-37 Can SSD Memory Improve Cryptanalytic Time-Memory Trade-Off?  
Admin  
Encadrant : Gildas AVOINE
Labo/Organisme : IRISA
URL : www.avoine.net
Ville : Rennes

Description  

* Keywords: Cryptography, Time-Memory Trade-Off, Password Cracking, Algorithmics, Probability.

* Objective: Check whether SSD memory may improve cryptanalytic time-memory trade-offs.

* Content: Many cryptanalytic problems can be solved in theory using an exhaustive search in the key space, but are still hard to solve in practice because each new instance of the problem requires restarting the process from scratch. Such a problem is for example the cracking of passwords.

The basic idea of a cryptanalytic time-memory trade-off (TMTO) is to carry out an exhaustive search once and for all, so that subsequent instances of the problem become easier to solve. Thus, if there are N possible solutions to a given problem, a time-memory trade-off can solve it with T units of time and M units of memory. In the methods we are looking at, T is proportional to N^2/M^2, and a typical setting is T = M = N^(2/3).

The cryptanalytic time-memory trade-off was introduced in 1980 by Hellman and applied to DES. Given a plaintext P and a ciphertext C, the problem consists in recovering the key K such that C=E_K(P), where E is an encryption function assumed to behave like a random function. Encrypting P under all possible keys and storing each corresponding ciphertext allows for immediate cryptanalysis, but needs N elements of memory. The idea of a trade-off is to use chains of keys, built with a reduction function R that generates a key from a ciphertext. Using E and R, chains of alternating ciphertexts and keys can thus be generated. The key point is that only the first and the last element of each chain are stored. To retrieve K, a chain is generated from C. If at some point it yields a stored end of chain, the entire chain is regenerated from its starting point. However, finding a matching end of chain does not necessarily imply that the key will be found in the regenerated chain: the chain generated from C may merge with a stored chain that does not contain K. This situation is called a false alarm and is very costly.
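
The chain mechanics described above can be demonstrated on a deliberately tiny toy instance (16-bit "keys", with a truncated SHA-256 standing in for the composition of E and R; parameters are arbitrary):

```python
import hashlib

CHAIN_LEN = 64

def f(key):
    """One chain step: 'encrypt' the 16-bit key, then reduce back to a key."""
    digest = hashlib.sha256(key.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")

def build_table(n_chains):
    table = {}
    for start in range(n_chains):
        k = start
        for _ in range(CHAIN_LEN):
            k = f(k)
        table[k] = start               # store only (chain end -> chain start)
    return table

def invert(target, table):
    """Find k with f(k) == target, if some stored chain covers it."""
    k = target
    for _ in range(CHAIN_LEN):
        if k in table:                 # reached a stored chain end
            c = table[k]               # regenerate that chain from its start
            for _ in range(CHAIN_LEN):
                if f(c) == target:
                    return c           # success: preimage found in the chain
                c = f(c)
            # otherwise: a false alarm (merged chain), keep walking
        k = f(k)
    return None

table = build_table(1000)
end = 42                               # follow the chain starting at key 42
for _ in range(CHAIN_LEN):
    end = f(end)
pre = invert(end, table)               # look up a preimage of that endpoint
assert pre is not None and f(pre) == end
```

The lookup phase is exactly where memory speed matters: each table probe is a random access, which is why replacing RAM with SSD changes the trade-off's practical cost model.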

A TMTO is efficient as long as the precomputed chains can be entirely stored in a fast-access memory. RAM is currently used, but its rather limited size is the bottleneck of TMTOs. This internship aims to determine whether SSD memory can improve cryptanalytic time-memory trade-offs.

During this internship, the student will:

(1) implement the cryptanalytic time-memory trade-off (TMTO) technique,
(2) evaluate the efficiency of a TMTO when a SSD is used,
(3) find the best approach to combine SSD and RAM,
(4) apply the technique to crack passwords.

URL sujet détaillé : http://www.avoine.net/irisa_emsec_avoine_3.pdf

Remarques :



SM207-38 Hardware-software codesign of TLS/SSL support for Zynq embedded systems  
Admin  
Encadrant : Arnaud TISSERAND
Labo/Organisme : IRISA-CAIRN
URL : http://people.irisa.fr/Arnaud.Tisserand/
Ville : LANNION

Description  

Zynq platforms embed both a processor (ARM dual-core Cortex-A9) and an FPGA (Artix-7). We will study and prototype TLS/SSL support on this platform. At the hardware level, an ECC cryptoprocessor will be used to accelerate some cryptographic primitives.

Complete description at
http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet-2014-codesign-hw-sw-tls-ssl.pdf
URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet-2014-codesign-hw-sw-tls-ssl.pdf

Remarques : See complete description



SM207-39 Software extensions for an ECC cryptoprocessor  
Admin  
Encadrant : Arnaud TISSERAND
Labo/Organisme : IRISA-CAIRN
URL : http://people.irisa.fr/Arnaud.Tisserand/
Ville : LANNION

Description  

A dedicated processor for elliptic curve cryptography on FPGA circuits is under development at IRISA-CAIRN in Lannion.

In this work we will study and prototype software extensions for advanced support of more curves, computation algorithms, and algorithmic protections against side-channel attacks. The proposed solutions will be implemented and tested on the power-analysis attack setup.
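
One classic algorithmic side-channel protection such extensions could target is the Montgomery ladder, whose branches perform the same operation sequence for every key bit, unlike naive double-and-add, whose irregular branches leak the scalar through power traces. The sketch below uses ladder exponentiation in a toy prime field as a stand-in for EC scalar multiplication (parameters are illustrative only):

```python
P = 2**13 - 1  # small prime modulus standing in for the curve's base field

def ladder_pow(base, scalar, mod=P):
    """Montgomery ladder: same operation sequence whatever the key bits."""
    r0, r1 = 1, base % mod
    for bit in bin(scalar)[2:]:                        # scan bits MSB to LSB
        if bit == "1":
            r0, r1 = (r0 * r1) % mod, (r1 * r1) % mod
        else:
            r0, r1 = (r0 * r0) % mod, (r0 * r1) % mod
        # note: one multiplication and one squaring in both branches
    return r0

assert ladder_pow(3, 100) == pow(3, 100, P)
```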

Complete description at
http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet-2014-extension-soft-crypto-proc-ecc.pdf
URL sujet détaillé : http://people.irisa.fr/Arnaud.Tisserand/jobs/sujet-2014-extension-soft-crypto-proc-ecc.pdf

Remarques : See complete description



SM207-40 Vehicle Sharing Systems  
Admin  
Encadrant : Nicolas GAST
Labo/Organisme : INRIA et G-SCOP
URL : http://mescal.imag.fr/membres/nicolas.gast/news/2014/10/10/intership-BSS/index.html
Ville : Grenoble

Description  

The number of cities equipped with Vehicle Sharing Systems (VSS, BSS, VLS) is increasing rapidly. VSS are viewed as a promising way to reduce the usage of personal cars and to improve the quality of life by reducing traffic, noise, and air pollution. Despite their growing number, VSS are still poorly understood and often suffer from congestion problems (absence of a vehicle, or of a parking spot at the destination).

The goal of the internship will be to contribute to the understanding of VSS by developing computational and mathematical methods. For completeness, we describe four research directions below. Depending on their preferences and skills, the student will choose one of them during the internship. The others can serve as a basis for a subsequent PhD proposal.

- Benchmark the historical data of various real VSS, and develop visualization tools, to help humans deal directly with such data.
- Data mining, in order to estimate the underlying intensity of user flows depending on the stations, time, day, season, and weather.
- Modelling and simulation, so as to have a solid basis for investigating the behavior of future systems and optimizing them.
- Mathematical analysis of models of VSS, so as to have an analytic understanding of how VSS behave and to help optimize them.
URL sujet détaillé : :

Remarques : The internship will be co-supervised by Nicolas GAST (CR INRIA, Montbonnot) and Vincent JOST (CR CNRS, G-SCOP Grenoble)



SM207-41 A dynamical system approach to compute the transient behavior of stochastic distributed systems  
Admin  
Encadrant : Nicolas GAST
Labo/Organisme : Inria
URL : http://mescal.imag.fr/membres/nicolas.gast/news/2014/10/10/intership-hittingTime/index.html
Ville : Grenoble

Description  

Let us consider a random walker over a finite grid (or torus) in dimension d. A natural question is how long it takes for the walker to have a chance to visit any state in the grid or, more or less equivalently, how long it takes for the walker's initial position to no longer matter.

These questions are all related to the "mixing time" of the random walk: the time it takes for the probability distribution at time t to be close (in total variation norm) to the stationary distribution. Computing the mixing time is an important challenge in modern Markov chain theory that has been approached in many ways (spectral decomposition of the transition matrix, coupling theory, perfect simulation).

In this internship, we propose to investigate a new approach based on asymptotic arguments (by letting the space dimension d, go to infinity). This approach transforms the stochastic random walk into a deterministic dynamical system whose long run behavior can be well understood (sometimes).

This new approach can allow one to compute the mixing time of random walks that cannot be tackled using the classical tools. It is also useful to compute the complexity of distributed randomized algorithms such as cooperative protocols in games. Finally, it can help assess the expected time to extinction of population processes with intricate dynamics (such as populations of particles under general interaction processes).
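
For a small chain, the quantity in question can be computed exactly. The sketch below measures the total-variation distance to the uniform (stationary) distribution for a lazy random walk on an n-cycle; it is a brute-force illustration of the definition, not the asymptotic method proposed for the internship.

```python
def tv_to_uniform(n, t):
    """Total-variation distance between the distribution of a lazy
    random walk on an n-cycle after t steps (started at state 0)
    and the uniform distribution."""
    dist = [0.0] * n
    dist[0] = 1.0
    for _ in range(t):
        new = [0.0] * n
        for i, p in enumerate(dist):
            new[i] += p / 2            # lazy: stay put w.p. 1/2
            new[(i - 1) % n] += p / 4  # step left w.p. 1/4
            new[(i + 1) % n] += p / 4  # step right w.p. 1/4
        dist = new
    return sum(abs(p - 1.0 / n) for p in dist) / 2
```

The mixing time is then the first t at which this distance falls below a chosen threshold; this exact computation becomes infeasible as the state space grows, which motivates the asymptotic approach.
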
URL sujet détaillé : :

Remarques : Will be co-supervised by Bruno Gaujal (DR Inria, Grenoble).




SM207-42 Numerical Optimal Transport in 3d  
Admin  
Encadrant : Bruno LEVY
Labo/Organisme : Inria Nancy Grand-Est / Loria
URL : http://www.loria.fr/~levy
Ville : Nancy

Description  

This project aims at developing new efficient numerical methods for optimal transport in 3d.
The rich mathematical structure of the problem leads to interesting theoretical developments. It also has the potential of leading to new efficient numerical methods for solving certain non-linear PDEs (of Monge-Ampère type);
see the article by Benamou, Carlier, Mérigot and Oudet on the JKO (Jordan-Kinderlehrer-Otto) scheme: http://arxiv.org/abs/1408.4536.
My early experiments in 3d seem promising: the semi-discrete optimal transport between a measure with density and a sum of Dirac masses can be computed for up to 1 million Dirac masses. For solving Monge-Ampère PDEs, the JKO scheme requires solving multiple instances of optimal transport problems; the goal is to study the possibility of accelerating the convergence of the algorithm, to make it usable in practice.
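
As a one-dimensional warm-up only (not the semi-discrete 3d setting of the project): on the line, with a convex ground cost and two uniform discrete measures of equal size, the optimal transport plan is the monotone matching obtained by sorting.

```python
def transport_cost_1d(xs, ys):
    """Optimal quadratic transport cost between two equal-size
    point sets on the line, each point carrying mass 1/n.
    For a convex ground cost the optimal plan is the monotone
    (sorted-to-sorted) matching."""
    assert len(xs) == len(ys)
    n = len(xs)
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n
```

In higher dimensions no such total ordering exists, which is part of what makes the 3d problem genuinely hard.
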
URL sujet détaillé : http://arxiv.org/abs/1409.1279

Remarques : We have several possibilities of funding for continuing with a Ph.D. thesis.



SM207-43 Understanding the pelvic dynamics through a 3D simulation of childbirth  
Admin  
Encadrant : Fabrice JAILLET
Labo/Organisme : LIRIS (SAARA team, Lyon) / Laboratory of Mechanics in Lille (ER3:MaSTR team)
URL : https://liris.cnrs.fr/saga/wiki/doku.php
Ville : Villeurbanne Doua

Description  

Context:

The SAARA team of the LIRIS in Lyon and the ER3:MaSTR team of the Laboratory of Mechanics in Lille are involved in the development of a bio-mechanical model for the simulation of interactions between pelvic dynamics of the pregnant woman and the fetus during delivery.

The purpose of this project is now to adapt the existing 3D model to allow interaction with the fetus during labor. This ultimately implies providing a more or less "patient-specific" geometric model, and setting it under different scenarios (orientation and shape of the pelvis, the fetal head, altering the properties of ligaments, etc.).

This 3D model can be used to perform bio-mechanical simulations, which will aim to:

1) assist in understanding the role of various anatomical structures,

2) observe the physiological damage resulting from childbirth, and characterize the possible induced trauma.

Proposed missions:

The physiological model of childbirth will be degraded to move towards more interactive simulations, for example by using lower-resolution meshes or simpler constitutive laws (linear, nonlinear).

This work will include a phase of comparison with the full model (taken as "ground truth") and with dynamic 2D MRI images. The challenge will be to control the error while ensuring mechanical validity.

It will be necessary to perform the analysis and the following developments:

- literature review leading to the definition of a relevant error criterion, a posteriori and possibly during the simulation,

- software implementation (C++) of the adopted solution in the CamiTK platform,

- coupling with the simulation software Abaqus.

Depending on the progress, it will be useful to adapt the 3D model (Catia) according to the different scenarios.

Area of expertise:

The candidate will ideally be skilled at the interface between mechanics and computer graphics applied to bio-medical simulation. Skills in computational mesh processing and numerical analysis would be a plus.

URL sujet détaillé : http://liris.cnrs.fr/fabrice.jaillet/data/M2.ErreurAccouchement-2014-15.pdf

Remarques : Contract type: regulated internship (Master's research or engineering school)

Équipe SAARA,
LIRIS UMR CNRS 5205,
Domaine scientifique de la Doua,
Bâtiment Nautibus,
23-25 Av. Pierre de Coubertin,
F-69100 Villeurbanne Cedex.
Contact :

LIRIS : Fabrice JAILLET (fabrice.jaillet (at) liris.cnrs.fr)
Florence ZARA (florence.zara (at) liris.cnrs.fr)

LML : Mathias BRIEU (mathias.brieu (at) ec-lille.fr)

LIRIS-CHU Lyon : Géry LAMBLIN (gery.lamblin (at) chu-lyon.fr)




SM207-44 Bounded graph coloring  
Admin  
Encadrant : Olivier BAUDON
Labo/Organisme : Laboratoire Bordelais de Recherche en Informatique
URL : http://www.labri.fr
Ville : Talence

Description  

A proper vertex-coloring of a graph G = (V, E) is a mapping from the set of vertices V to a set of colors, usually represented by the set of integers {1, ..., k}, in such a way that every two adjacent vertices are assigned distinct colors.

There exist many variations of graph coloring: coloring edges instead of vertices, coloring with lists instead of integers, adding constraints such as forbidding bi-colored cycles...

One quite natural constraint on the coloring is to bound the number of elements (vertices or edges) that may receive the same color. In that case, we speak of b-bounded coloring, where b denotes the maximum number of elements that may receive the same color. Such colorings have applications in timetable scheduling or job scheduling in multiprocessor systems, for instance.

The goal of this internship is:
- firstly, to review existing results on b-bounded colorings,
- secondly, to study the b-bounded version of other types of graph colorings (such as, for example, oriented vertex-colorings or weighted edge-colorings, two types of colorings well studied at LaBRI), in both their structural and algorithmic aspects.

An oriented coloring of an oriented graph is a proper coloring such that if there exists an arc uv with color(u) = a and color(v) = b, then there is no arc wx with color(w) = b and color(x) = a.
A weighted edge-coloring of an undirected graph is a mapping from its set of edges to a set of weights W = {1, ..., w} such that the mapping assigning to each vertex the sum of the weights of its incident edges is a proper vertex-coloring.
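
To make the bounded constraint concrete, a b-bounded proper vertex-coloring can be produced greedily; the sketch below is illustrative only and offers no guarantee on the number of colors used.

```python
def bounded_greedy_coloring(adj, b):
    """Greedy proper vertex-coloring in which each color class
    holds at most b vertices. adj maps each vertex to the list of
    its neighbours. Returns a dict vertex -> color (0, 1, 2, ...)."""
    color = {}
    size = {}  # color -> number of vertices currently using it
    for v in adj:
        forbidden = {color[u] for u in adj[v] if u in color}
        c = 0
        # Skip colors used by a neighbour or already at capacity b.
        while c in forbidden or size.get(c, 0) >= b:
            c += 1
        color[v] = c
        size[c] = size.get(c, 0) + 1
    return color
```

For example, on the path a-b-c with b = 1 every vertex must get its own color, whereas with b = 2 two colors suffice.
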

References.
P. Hansen, A. Hertz and J. Kuplinsky. Bounded vertex colorings of graphs. Discrete Math. 111:305-312 (1993).
M. Jarvis and B. Zhou. Bounded vertex coloring of trees. Discrete Math. 212:145-151 (2001).
R. Rizzi and D. Cariolaro. Polynomial time complexity of edge colouring graphs with bounded colour classes. Algorithmica 69:494-500 (2014).

URL sujet détaillé : :

Remarques : Co-supervisor: Eric Sopena, Professor at LaBRI



SM207-45 Types and interpretations for programs of polynomial time complexity  
Admin  
Encadrant : Patrick BAILLOT
Labo/Organisme : LIP, ENS Lyon
URL : http://perso.ens-lyon.fr/patrick.baillot/
Ville : Lyon

Description  

Can one define simple and modular programming disciplines in a high-level language ensuring that the programs produced have a bounded complexity, for instance of polynomial time? This question is explored by the field of implicit computational complexity: it aims at defining static complexity criteria, which do not rely on explicit bounds on resources.

This internship proposal focuses on implicit complexity criteria for functional programs, which can use higher-order subprograms. It aims at relating two approaches for this issue, based respectively on type systems and on the semantic notion of interpretation.
URL sujet détaillé : http://perso.ens-lyon.fr/patrick.baillot/STAGES/2015/sujet_interpretations2015.pdf

Remarques :



SM207-46 Secure Evaluation of an S-Box  
Admin  
Encadrant : Jean-charles FAUGÈRE
Labo/Organisme : INRIA/UPMC/LIP6/PolSys et ANSSI/LSC
URL : http://www-polsys.lip6.fr/~jcf/
Ville : Paris

Description  

Running a cryptographic algorithm (DES, AES, RSA, ...) on an embedded device causes, in the absence of specific precautions in the implementation, variations in the device's behavior. These variations can be observed (via power consumption, electromagnetic emissions, ...) and combined using statistical methods to recover the secrets manipulated by the device. Implementing cryptographic functions for embedded applications (e.g. smart cards) therefore requires countermeasures against such attacks, known as side-channel attacks. A classical countermeasure, called masking, consists in splitting the inputs of the cryptographic primitives into several randomly generated shares, and then replacing the sensitive operations (i.e. those operating on secret data) by sequences of computations that each operate on a limited number of these shares only. How such countermeasures are applied differs for each primitive; they rely on techniques from Secure Multi-party Computation, coding theory and/or efficient computation over finite fields.

Securing the evaluation, by means of masking, of certain primitives called S-boxes is particularly difficult and raises questions closely related to well-known problems in the efficient hardware implementation of polynomials defined over finite fields of characteristic 2. Recent work has shown that the difficulty of protecting the implementation of an S-box depends on the number of multiplications needed to evaluate the polynomial corresponding to the S-box (multiplications that are squarings being omitted from the complexity count). A series of articles has thus led to the publication of evaluation methods that try to minimize the number of multiplications. The most efficient of them, published in the proceedings of this year's CHES conference, can in several respects be considered ad hoc.

The goal of the proposed internship is to investigate ways of making the method of [CRR14] more formal and, possibly, of improving its efficiency. To do so, the student will first become familiar with the state of the art. He or she will then implement, in a computer algebra system such as Maple or Magma, the solution proposed in [CRR14]. The second part of the internship will consist in exploring, with the supervisors, directions for improving the state of the art.


[CRR14] J.-S. Coron, A. Roy, and S. Vivek. "Fast Evaluation of Polynomials over Finite Fields and Application to Side-channel Countermeasures".
In CHES, 2014. To appear.
URL sujet détaillé : http://www-salsa.lip6.fr/~perret/Site/stageEval.pdf

Remarques :



SM207-47 FouCo  
Admin  
Encadrant : Jasmin BLANCHETTE
Labo/Organisme : Inria or MPI Informatik
URL : http://www21.in.tum.de/~blanchet/
Ville : Nancy or Saarbrücken

Description  

Isabelle/HOL is a popular proof assistant based on higher-order logic. It has recently been extended with support for codatatypes and corecursion. The goal of this project is to extend Isabelle so that it can process a larger class of corecursive functions, i.e., functions that produce possibly infinite values. This would enable Isabelle to go beyond the state of the art represented by its competitors (Agda, Coq, and Matita).

The implementation language is Standard ML (an Anglo-Saxon dialect of OCaml). The implementation would consist of two modules:

1. A command that would parse the specification of a function f, define f in terms of a so-called corecursor, and derive some theorems.

2. A command that would be used to derive the corecursor needed for point 1.

We explicitly do not expect the internship to result in the completion of both modules. The supervisor and one of the co-supervisors would also participate in the coding effort.
URL sujet détaillé : http://www21.in.tum.de/~blanchet/corec.pdf

Remarques : Co-supervisors: Andrei POPESCU and Dmitriy TRAYTEL.



SM207-48 Relativized Implicit Computational Complexity and Declassification  
Admin  
Encadrant : Frédéric PROST
Labo/Organisme : LIP - ENS Lyon
URL : http://perso.ens-lyon.fr/frederic.prost/
Ville : Lyon

Description  

On one hand, implicit computational complexity is the attempt to
characterize complexity classes without referring to specific
machines. Statically analyzing a program may give bounds on the
complexity of this program. On the other hand, confidentiality and
integrity properties can be statically analyzed: each variable is
given a security level (for the sake of simplicity, assume two
levels: public and private) and various systems have been proposed
to ensure that no information may leak from private to public
variables. Both of these problems rely heavily on the
non-interference property. Roughly, two variables are
non-interferent when they are independent, i.e. when changing the
value of one variable does not change the value of the other one.
Several techniques of non-interference analysis have been
successfully adapted to the field of implicit computational
complexity: the fact that variables are non-interferent makes it
possible to stratify computations into tiers.

Non-interference is too strict to cope with many everyday
scenarios; think for instance of password checking: a secret
value is compared to a public value and the result of the test is
public. Thus many declassification frameworks have been proposed.

The aim of the internship is to explore the following idea: in
some declassification settings, downgrading of information is
possible through some fixed functions. Is it possible to reuse
this kind of framework from an implicit-complexity point of view,
viewing these downgrading functions as oracles? Is it possible to
characterize complexity classes relativized to the oracle in this
context?

References :

[1] Frédéric Prost: Enforcing Dynamic Interference Policy. SocialCom/PASSAT 2011.

[2] Jean-Yves Marion: A Type System for Complexity Flow Analysis. LICS
2011.

[3] Andrei Sabelfeld, David Sands: Declassification: Dimensions and
principles. Journal of Computer Security 17(5): 517-548 (2009).

URL sujet détaillé : :

Remarques :



SM207-49 Preventing SQL Injection Attacks using Information Flow Control  
Admin  
Encadrant : Francois LESUEUR
Labo/Organisme : LIRIS/INSA Lyon
URL : http://liris.cnrs.fr/~flesueur/
Ville : Lyon

Description  

Keywords : Security, SQL Injections, Access/Flow/Dissemination Control, Declassification


Over the last decade, the importance of web security has increased tremendously. Most companies have a web portal, and many collaborative services are nowadays provided by web applications. Software vulnerabilities like buffer overflows and format strings are less and less significant compared to higher-level attacks on web applications such as SQL injections, XSS, CSRF... In this work we focus on SQL injections.

An SQL injection stems from the fact that an application (e.g. in Java), responsible for most of the computation and input/output, interacts with a database application (e.g. in SQL), responsible for most of the data storage and retrieval. The Java application thus transmits SQL queries to the database without knowing what is being requested: an SQL query is just a string without any semantic meaning for the Java application (we voluntarily exclude parameterized queries, which only solve part of the problem). It is this blind interaction that is largely responsible for transmitting malicious SQL queries (i.e. injections) to the DBMS, queries that the Java application cannot detect. Moreover, beyond the different languages used to develop these applications, this interaction is also complicated by the different paradigms used in the two programs: imperative for Java and declarative for SQL. In this work, we aim at preventing SQL injections by tracking end-to-end information flow through both the application and the database.
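
The blind interaction can be reproduced in a few lines with SQLite (the schema and data below are hypothetical, and the example only illustrates the injection mechanism itself; parameterized queries block this particular attack but, as noted above, solve only part of the problem):

```python
import sqlite3

# Hypothetical minimal schema, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def lookup_unsafe(name):
    # String concatenation: the DBMS sees only an opaque string.
    q = "SELECT secret FROM users WHERE name = '" + name + "'"
    return db.execute(q).fetchall()

def lookup_safe(name):
    # Parameterized query: the input cannot alter the query shape.
    return db.execute("SELECT secret FROM users WHERE name = ?",
                      (name,)).fetchall()

# Classic payload: it turns the WHERE clause into a tautology,
# so lookup_unsafe leaks every row while lookup_safe returns none.
payload = "' OR '1'='1"
```

Information flow tracking, as proposed here, aims instead at following the tainted input end-to-end through both layers.
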

Many contributions have been proposed to deal with information flow control in imperative or declarative programs, but only a few of them [1,6] track end-to-end information flow through the whole program (both imperative and declarative). The main goal of this work is to study this interaction and to propose an approach to track data from their input in the imperative system to their storage in the declarative system, and conversely to track data output through the declarative layer and then through the imperative one. For this purpose, the dissemination control model TBAC [2], proposed for declarative systems, will be adapted and used throughout the whole system.

During this internship, the student will have to conduct a literature review on SQL injection countermeasures [7], on decentralized information flow control [3], dissemination control [4] and also on declassification [5] which is a central point in the TBAC approach. Then the student will have to propose a common imperative/declarative formalism tracking information flows through the application and the database and taking declassification into account. Finally, a prototype will be developed to validate the proposed approach and to evaluate the performance overhead. Please note that this internship may lead to a PhD thesis.

References :
[1] David Schultz, Barbara Liskov. IFDB: decentralized information flow control for databases. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys), 2013.

[2] François Lesueur, Sabina Surdu, Romuald Thion, Yann Gripay, Meriam Ben Ghorbel-Talbi. Palpable Privacy through Declarative Information Flows Tracking for Smart Buildings. In Proceedings of the International Conference on Availability, Reliability and Security (ARES), Suisse, 2014.

[3] Andrew C. Myers Barbara Liskov. A Decentralized Model for Information Flow Control. in the Proceedings of the ACM Symposium on Operating Systems Principles, France, 1997.

[4] Roshan K. Thomas, Ravi Sandhu. Towards a Multi-dimensional Characterization of Dissemination Control. Proceedings of the Fifth IEEE International Workshop on Policies for Distributed Systems and Networks, POLICY 2004.

[5] Andrei Sabelfeld, David Sands. Dimensions and Principles of Declassification. In Proceedings of the IEEE Computer Security Foundations Workshop, France, 2005.

[6] Li, Peng, et S. Zdancewic. Practical information flow control in Web-based information systems. 18th IEEE Workshop In Computer Security Foundations CSFW, 2005.

[7] William G.J. Halfond, Jeremy Viegas, and Alessandro Orso. A Classification of SQL Injection Attacks and Countermeasures. Proceedings of the IEEE International Symposium on Secure Software Engineering, USA, 2006.

URL sujet détaillé : :

Remarques : This internship will be paid and will be co-supervised by Meriam Talbi.



SM207-50 Optimizing avatars life-cycle in a Cloud infrastructure  
Admin  
Encadrant : Michael MRISSA
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/asawoo/
Ville : Lyon

Description  



Avatars are virtual abstractions that extend physical objects onto the Web. They are a central piece of the ASAWoO project (http://liris.cnrs.fr/asawoo/). A cloud infrastructure is an ideal location for avatars, since they can rely on the almost unlimited resources provided by interconnected cloud infrastructures. We therefore need to study several aspects of the lifecycle of these avatars, ranging from creation and destruction to migration, while considering resource usage (energy consumption, storage space, bandwidth usage, processing power, ...). Thus, we need to study how to optimize avatar instantiation, deployment, execution and hibernation in a cloud infrastructure, and ultimately how to migrate an avatar between two infrastructures when required.

The cloud software architecture will be built on top of an existing infrastructure currently hosted at the computer science department of Université Claude Bernard Lyon 1. This infrastructure is currently composed of 10 nodes based on hexa-processor machines, globally providing 640 GB of RAM and 52 TB of disk space. The cloud software currently running is the free, open-source OpenStack solution.

Requirements : Some knowledge of OSGi and OpenStack is preferred

Resources and links :

Orgerie, A.-C.; Lefevre, L.; Gelas, J.-P., "Demystifying energy consumption in Grids and Clouds", 2010 International Green Computing Conference, pp.335,342, 15-18 Aug. 2010.
Lefevre, L.; Orgerie, A.-C., "Designing and evaluating an energy efficient Cloud", The Journal of Supercomputing, vol.51, n.3, March 2010.
Koomey, J. "Growth in data center electricity use 2005 to 2010", Analytics Press, 2011.
Orgerie, A.-C.; Dias de Assunção, M.; Lefèvre, L., "A Survey on Techniques for Improving the Energy Efficiency of Large Scale Distributed Systems", ACM Computing Surveys, 2013.
Topic: TIWE


URL sujet détaillé : :

Remarques : Co-supervision: Jean-Patrick Gelas






SM207-51 Asynchronous convergence of monotone Boolean networks  
Admin  
Encadrant : Adrien RICHARD
Labo/Organisme : Laboratoire I3S, CNRS & Université de Nice-Sophia Antipolis
URL : http://www.i3s.unice.fr/~richard/
Ville : Nice / Sophia Antipolis

Description  

Boolean networks are finite dynamical systems consisting of a finite number of binary variables that evolve, in discrete time, through mutual interactions and according to a fixed law. They are used in many disciplines for their ability to model non-linear behaviors. The most famous applications are in biology, where they are classical models for the dynamics of neural and gene networks.

This internship consists in studying the asynchronous dynamics of monotone Boolean networks, which present an interesting intermediate level of complexity and have interesting applications in the context of biological networks. It has been proved in [1] that, given a monotone Boolean network and an initial configuration x, the set R of configurations reachable from x by some asynchronous updating has the following properties:

1/ R contains a fixed point reachable from x by updating each component at most once;

2/ R contains a unique maximal fixed point and a unique minimal fixed point, both reachable from x by updating each component at most twice.

These results suggest the following natural question, which is the core of the subject: is every fixed point of R reachable from x by updating each component a polynomial number of times?

The study could start with AND-OR nets, i.e. monotone Boolean networks in which each local transition function is either a disjunction or a conjunction of positive literals. These networks can thus be represented by a directed graph with two types of vertices (AND-vertices and OR-vertices), and the use of concepts from graph theory could allow rapid and informative progress on the main question, which is difficult in the general case.
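
On small instances, the asynchronous dynamics can simply be explored by brute force. The sketch below fixes one particular asynchronous schedule (repeated sweeps over the components, updating one component at a time) and runs it on a toy AND-OR net; the network and schedule are illustrative only, and termination of this naive loop is not guaranteed for arbitrary networks.

```python
def async_converge(funcs, x):
    """Apply asynchronous (single-component) updates of a Boolean
    network, sweeping over the components until no update changes
    anything. funcs[i] maps a configuration (tuple of 0/1) to the
    new value of component i. Returns (fixed point, update count)."""
    x = tuple(x)
    updates = 0
    changed = True
    while changed:
        changed = False
        for i, f in enumerate(funcs):
            v = f(x)
            if v != x[i]:
                x = x[:i] + (v,) + x[i + 1:]
                updates += 1
                changed = True
    return x, updates

# Toy AND-OR net: x0 = x1 OR x2, x1 = x0 AND x2, x2 = x2.
and_or_net = [
    lambda x: x[1] | x[2],
    lambda x: x[0] & x[2],
    lambda x: x[2],
]
```

Counting the updates performed per component on such examples is exactly the quantity that the main question asks to bound polynomially.
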

[1] T. Melliti, D. Regnault, A. Richard and S. Sené. On the convergence of Boolean automata networks without negative cycles. In J. Kari, M. Kutrib, A. Malcher (eds.), Cellular Automata and Discrete Complex Systems (Proceedings of AUTOMATA 2013), Springer Berlin Heidelberg, LNCS 8155, 124-138. PDF version: http://www.i3s.unice.fr/~richard/


URL sujet détaillé : http://www.i3s.unice.fr/~richard/stageM2/stage_M2_ENS_LYON.pdf

Remarques : Remuneration: the usual internship gratification (436.05 €)



SM207-52 Generation of RDF triples from raw sensor data  
Admin  
Encadrant : Hassan AÏT-KACI
Labo/Organisme : PALSE Project
URL : http://cedar.liris.cnrs.fr/
Ville : Villeurbanne

Description  

The objective of this MSc topic is to implement a tool capable of processing massive raw data gathered from primary sources of all kinds, such as sensor devices, surveillance videos and manually entered inputs, and cleaning it up into consistent, uniform RDF representations. The generated RDF data is to form triplestores containing the data to be analyzed and further processed into intensional knowledge characterizing the specific kind of information this data conveys.
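
As a rough sketch of the kind of tool intended, sensor readings represented as dictionaries can be serialized to RDF in the N-Triples syntax; the vocabulary URIs below are placeholders, not an existing ontology.

```python
def sensor_to_ntriples(readings, base="http://example.org/"):
    """Serialize raw sensor readings (one dict per reading, with an
    'id' key) as N-Triples lines. The subject and predicate URIs
    are built from a placeholder base namespace."""
    lines = []
    for r in readings:
        s = "<%ssensor/%s>" % (base, r["id"])
        for key, value in r.items():
            if key == "id":
                continue
            p = "<%sproperty/%s>" % (base, key)
            o = '"%s"' % value  # all values serialized as plain literals
            lines.append("%s %s %s ." % (s, p, o))
    return lines
```

A real tool would additionally have to normalize units and timestamps, type the literals, and handle the heterogeneous, noisy inputs mentioned above.
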
URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic-1.pdf

Remarques :



SM207-53 Derivation of ontologies from RDF triplestores  
Admin  
Encadrant : Hassan AÏT-KACI
Labo/Organisme : PALSE Project LivEMUSIC
URL : http://cedar.liris.cnrs.fr/
Ville : Villeurbanne

Description  

This MSc topic concerns the analysis of RDF triplestores generated from raw sensor data in order to derive terminological schemas that the generated RDF triples abide by. The objective is to categorize these triplestore data using classes and attributes fitting their structure using Formal Concept Analysis (FCA). The outcome is to constitute a terminological schema for this data that can be used for reasoning about such data.
URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic-2.pdf

Remarques :



SM207-54 Use of terminological knowledge for monitoring applications  
Admin  
Encadrant : Hassan AÏT-KACI
Labo/Organisme : PALSE Project LivEMUSIC
URL : http://cedar.liris.cnrs.fr/
Ville : Villeurbanne

Description  

This topic's objective is to develop a use case exploiting terminological knowledge over monitoring data, with reasoning based on Order-Sorted Feature constraint logic, for smart information retrieval, data analytics, and query processing. The aim is to demonstrate how to leverage the capabilities brought by knowledge-based processing of monitoring data for intelligent applications, with a convincing use case based on actual gathered data.
URL sujet détaillé : http://cedar.liris.cnrs.fr/papers/mstopic-3.pdf

Remarques :



SM207-55 DYN-FO and Regular Trees  
Admin  
Encadrant : Mamadou KANTÉ
Labo/Organisme : LIMOS
URL : http://www.isima.fr/~kante
Ville : Clermont-Ferrand

Description  

The seminal result of Courcelle et al. states that every MSOL (optimisation/counting) property can be solved in polynomial time on graph classes of bounded clique-width. Interesting as this is, many real-life objects are dynamic (for instance, databases receive updates), and recomputing a decomposition whenever a database is modified is counter-productive. Instead, we would like to maintain the answers as the database changes. This situation is addressed by the introduction of Dyn-FO, the class of properties that can be maintained and queried in first-order logic. Since its introduction, interest has mainly focused on a small set of graph properties and on regular words. The goal is to extend the study to regular trees, and then to structures that are images of trees (graphs of bounded tree-width/clique-width, matroids/hypergraphs of bounded width, etc.)

For more details see the link below.
URL sujet détaillé : http://www.isima.fr/~kante/cours/DynFO.pdf

Remarques : Funding is possible (around 437 euros)



SM207-56 Algorithms for the Traveling Salesman Problem  
Admin  
Encadrant : Alantha NEWMAN
Labo/Organisme : G-SCOP
URL : http://www.g-scop.grenoble-inp.fr/
Ville : Grenoble

Description  


The well-studied symmetric Traveling Salesman Problem (STSP) asks for a minimum-cost tour in an undirected, weighted complete graph in which
the edge weights are non-negative and obey the triangle inequality. While the problem is NP-hard, there is a simple algorithm with an approximation guarantee of 2 based on doubling a minimum spanning tree. A slightly more sophisticated algorithm due to Christofides shows that the problem can be approximated to within a factor of 3/2. Improving on this factor is a well-known open problem.
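
The doubled-MST algorithm mentioned above can be sketched as follows: build a minimum spanning tree, then shortcut a preorder traversal of it; by the triangle inequality the shortcuts never increase the cost, giving a tour of cost at most twice the optimum. (Illustrative sketch only, not tuned.)

```python
def tsp_double_tree(dist):
    """2-approximation for metric TSP: minimum spanning tree
    (Prim's algorithm) followed by a shortcut preorder walk.
    dist is a symmetric distance matrix; returns (tour, cost)."""
    n = len(dist)
    # Prim's algorithm, recording the tree as children lists.
    in_tree = [False] * n
    in_tree[0] = True
    parent = [0] * n
    best = dist[0][:]                    # cheapest edge into the tree
    children = {i: [] for i in range(n)}
    for _ in range(n - 1):
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i])
        in_tree[u] = True
        children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    # Preorder walk of the tree = doubled tree with shortcuts.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return tour, cost
```

Christofides' 3/2 factor is obtained by replacing the doubling step with a minimum-weight perfect matching on the odd-degree tree vertices.
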

In the last few years, there have been many exciting results (e.g. Moemke-Svensson 2011, Mucha 2012, Sebo-Vygen 2012) pertaining to a special case known as Graph-TSP, where the edge weights correspond to the shortest path distances in a given unweighted graph. The current best-known approximation guarantee for this problem stands at 1.4. These new approaches have resulted in many interesting related questions in graph theory.

The goal of this project is to study the recent developments and related questions.

Requirement: interest in Algorithms and Graph Theory.

URL sujet détaillé : :

Remarques :



SM207-57 Formal proofs for real-time analysis using CoQ  
Admin  
Encadrant : Sophie QUINTON
Labo/Organisme : Inria Grenoble - Rhône-Alpes
URL : https://team.inria.fr/spades/quinton/
Ville : Grenoble

Description  

Scientific context: Real-time systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is not a trivial thing because tasks executing on the same processor may have complex timing influence on each other due to scheduling.

Busy window analysis [1] is a well-established method to compute bounds on worst-case response times in uniprocessor real-time systems that is based on a fixpoint computation. It was originally introduced for the Static Priority Preemptive scheduling policy (SPP), then reused for the non-preemptive case (SPNP) [2] and has since been an inspiration for the analysis of many new scheduling policies. However, these proofs are too often somewhat informal. For example, the original SPNP analysis had a flaw which was discovered only years after its original publication [3].
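
For the preemptive SPP case, the busy window analysis computes the worst-case response time of a task as the least fixpoint of a recurrence accounting for higher-priority interference. A minimal sketch of that fixpoint iteration follows; the task set in the usage example is hypothetical.

```python
import math

def response_time(C, T, i, limit=10000):
    """Worst-case response time of task i under static priority
    preemptive scheduling, via busy-window fixpoint iteration:
    R = C[i] + sum over higher-priority tasks j of ceil(R/T[j])*C[j].
    C[j], T[j] are execution time and period; tasks 0..i-1 have
    higher priority. Returns None if R exceeds `limit` (no fixpoint
    found within the bound)."""
    R = C[i]
    while R <= limit:
        interference = sum(math.ceil(R / T[j]) * C[j] for j in range(i))
        nxt = C[i] + interference
        if nxt == R:           # fixpoint reached
            return R
        R = nxt
    return None
```

For instance, with C = [1, 2, 3] and T = [4, 6, 12], the lowest-priority task converges to a response time of 10. Formalizing why this iteration is monotone and why its least fixpoint bounds the response time is exactly the kind of argument a CoQ development would have to capture.
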

Thesis subject: The objective of this thesis is to initiate formal support based on the proof assistant CoQ [4] for these analyses. The student will have to:
- get acquainted with the SPP and SPNP scheduling policies, busy window analysis and its extension to arbitrary event models [5];
- identify the proper concepts and lemmas of the original proofs;
- and formalize them in CoQ.
A particularly interesting side-effect of this work would be to identify the key properties of the concept of busy window so that it can be easily adapted for proofs of other scheduling policies.

Bibliography:
[1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[2] Ken Tindell, Alan Burns: Guaranteeing message latencies on Controller Area Network (CAN). Proceedings of 1st international CAN conference: 1-11 (1994)
[3] Robert I. Davis, Alan Burns, Reinder J. Bril, Johan J. Lukkien: Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised. Real-Time Systems 35(3): 239-272 (2007)
[4] http://coq.inria.fr/
[5] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
URL sujet détaillé :

Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.

Feel free to contact me if you have questions related to this subject!



SM207-58 Incremental proofs for real-time analysis à la SymTA/S  
Admin  
Encadrant : Sophie QUINTON
Labo/Organisme : Inria Grenoble - Rhône-Alpes
URL : https://team.inria.fr/spades/quinton/
Ville : Grenoble

Description  

Scientific context: Real-time systems must satisfy timing requirements such as response time bounds, i.e., bounds on the maximum time it may take a software task to complete its execution once it has been activated. Establishing such properties is nontrivial as tasks executing on the same processor may have complex timing influence on each other due to scheduling. Furthermore, such effects propagate through the system so that tasks executing on different processors may also influence each other.

Compositional Performance Analysis (CPA), best known through its commercial implementation SymTA/S [1], is a compositional approach for computing such bounds. It is based on a fixpoint computation at the system level on top of busy window analyses [2] at the processor level, which means that there are nested fixpoint computations. In the current state of the art, any change to the system requires performing a new analysis from scratch.

Thesis subject: The objective of this master thesis is to establish simple criteria guaranteeing that it is correct to reuse, in a response-time analysis, the fixpoint obtained by a previous analysis. The student will have to:
- understand the principles of CPA and get acquainted with its free Python-based implementation pyCPA [3];
- establish a list of structural changes (adding/removing a task, replacing/remapping a task etc.) for which previous results may be reused and formally prove the correctness of the approach;
- implement the established criteria in pyCPA and quantify the efficiency gain compared to the state of the art.
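As a rough illustration of the reuse idea (not the actual pyCPA implementation), the processor-level fixpoint can be warm-started from a previously computed value: since the busy-window function is monotone, restarting from the old fixpoint remains correct whenever the modification can only increase interference, because the old fixpoint then still lies below the new least fixpoint. A hypothetical sketch:

```python
from math import ceil

def rta(C, T, i, R0=None):
    """Busy-window fixpoint for task i (indices < i have higher priority).
    R0 optionally warm-starts the iteration from the fixpoint of a previous
    analysis run.  Correctness requirement (the kind of criterion the
    internship would establish): the change may only increase interference,
    so that R0 is still below the new least fixpoint."""
    R = C[i] if R0 is None else max(R0, C[i])
    while True:
        nxt = C[i] + sum(ceil(R / T[j]) * C[j] for j in range(i))
        if nxt == R:
            return R
        if nxt > T[i]:
            return None
        R = nxt

# Initial system: response time of task 2 is 10.
old = rta([1, 2, 3], [4, 6, 12], 2)
# Task 2's execution time grows from 3 to 4: warm-starting from the old
# fixpoint reaches the new fixpoint in fewer iterations, with the same result.
cold = rta([1, 2, 4], [4, 6, 12], 2)
warm = rta([1, 2, 4], [4, 6, 12], 2, R0=old)
```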

Bibliography:
[1] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
[2] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[3] pycpa.readthedocs.org/
URL sujet détaillé :

Remarques : This thesis could result in a publication and/or a PhD thesis on a related topic. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.

Feel free to contact me if you have questions related to this subject!



SM207-59 Formal mechanical models for distributed algorithms  
Admin  
Encadrant : Xavier URBAIN
Labo/Organisme : LRI, Université Paris-Sud
URL : http://pactole.lri.fr
Ville : ORSAY

Description  

Distributed computing is one of the domains where informal reasoning is not an option, in particular when Byzantine failures are involved. Distributed computing is also characterised by its diversity of models, subtle modifications of which induce radical changes in system behaviour. We consider Robot Networks, that is, swarms of *autonomous* mobile entities that have to accomplish some task in *cooperation*. In this emerging framework, models can be distinguished by the capabilities of the robots, the topology of the considered space, the level of synchrony (that is, the properties of a demon), the type of failures likely to occur, etc.

We are interested in obtaining formal, and moreover mechanical, guarantees of properties for certain protocols, using the Coq proof assistant. A recently proposed Coq framework for robot networks can express quite a few variants of models for such networks, in particular regarding topology or demon properties. This framework is higher order and based on coinductive types. It allows us to prove in Coq positive results (the task will be fulfilled using the algorithm embedded in all robots, *for all* initial configurations) as well as negative results (*there cannot be any* embedded algorithm that will work for this task for all initial configurations).

An objective of this proposal is to extend this framework to some particular topologies (higher dimensions) so as to obtain a setting in which new results could be formally established, in particular regarding the election of a leader, or the formation of geometric patterns. These two problems have been shown equivalent (meaning that there is an algorithm solving one of them given, as input, a solution to the other) under some particular assumptions. Extending this equivalence to less constrained models is an interesting open problem.

URL sujet détaillé : http://pactole.lri.fr/pub/M2_1415.pdf

Remarques : Co-supervision with:
- Pierre Courtieu (pierre.courtieu.fr)
- Sébastien Tixeuil (sebastien.tixeuil.fr)


Proficiency in Coq required.



SM207-60 Strategies to deal with deadline misses in a weakly-hard real-time system  
Admin  
Encadrant : Sophie QUINTON
Labo/Organisme : Inria Grenoble - Rhône-Alpes
URL : https://team.inria.fr/spades/quinton/
Ville : Grenoble

Description  

Scientific context: Real-time systems are systems that must satisfy timing requirements such as response time bounds, that is, bounds on the maximum time it may take a software task to complete its execution once it has been activated. For some systems, called weakly-hard, the specified timing requirements may be violated once in a while, as long as the number of violations can be bounded. Establishing such properties is nontrivial, as tasks executing on the same processor may have complex timing influence on each other due to scheduling.

Busy window analysis [1] is a well-established method to compute bounds on worst-case response times in uniprocessor real-time systems that is based on a fixpoint computation. In the current state of the art, the analyzed scheduler is deadline-agnostic: whenever a timing requirement is violated because the specified bound on response times has been exceeded, the task that is experiencing this violation still runs to completion. An alternative would be to kill the belated execution, which would reduce the delay experienced by subsequent executions.

Thesis subject: The objective of this thesis is to adapt busy window analysis for the Static Priority Preemptive (SPP) scheduling policy assuming that executions are stopped as soon as they miss their deadline. The student will have to:
- get acquainted with the SPP scheduling policy, busy window analysis and weakly-hard systems analysis;
- adapt the standard response time analysis to support killing of task executions missing their deadline as well as the corresponding weakly-hard analysis;
- implement the proposed method within the free Python-based response time analysis tool pyCPA [2] and test it on existing test cases.
A natural extension of this work would be to support arbitrary event models (as opposed to fully periodic tasks) [3]. The long-term goal is to combine these results with an analysis of the impact of deadline misses on the controller receiving the output generated by the analyzed tasks.
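The behavioural difference between the two policies can be shown with a small, purely illustrative discrete-time simulator (not the analysis to be developed; the task parameters and the killing rule are assumptions chosen for the example):

```python
def simulate_spp(tasks, horizon, kill_at_deadline):
    """Toy discrete-time simulation of Static Priority Preemptive scheduling.
    tasks: list of (period, wcet, relative_deadline), index = priority
    (0 = highest); jobs are released at t = 0, T, 2T, ...
    Returns, per task, the list of observed response times, with None
    recorded for jobs that were killed at their deadline."""
    jobs = []                                  # [priority, release, deadline, remaining]
    resp = [[] for _ in tasks]
    for t in range(horizon):
        for i, (T, C, D) in enumerate(tasks):  # job releases
            if t % T == 0:
                jobs.append([i, t, t + D, C])
        if kill_at_deadline:                   # discard jobs past their deadline
            for j in list(jobs):
                if t >= j[2]:
                    resp[j[0]].append(None)
                    jobs.remove(j)
        if jobs:                               # highest-priority pending job runs
            j = min(jobs, key=lambda job: job[0])
            j[3] -= 1
            if j[3] == 0:
                resp[j[0]].append(t + 1 - j[1])
                jobs.remove(j)
    return resp
```

In an overloaded scenario, the run-to-completion policy lets a late low-priority job finish long after its deadline, whereas the killing policy discards it at the deadline; accounting for this truncation is precisely what the adapted analysis would have to do.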

Bibliography:
[1] Ken Tindell, Alan Burns, Andy J. Wellings: An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks. Real-Time Systems 6(2): 133-151 (1994)
[2] pycpa.readthedocs.org/
[3] Arne Hamann, Rafik Henia, Razvan Racu, Marek Jersak, Kai Richter and Rolf Ernst: SymTA/S - Symbolic Timing Analysis for Systems. WiP Proceedings of the Euromicro Conference on Real-Time Systems: 17-20 (2004)
URL sujet détaillé :

Remarques : This thesis could result in a publication and/or a PhD thesis. There is a strong collaboration with the group of Rolf Ernst at TU Braunschweig (Germany) which could give the student the opportunity to travel there if she/he is interested.

Feel free to contact me if you have questions related to this subject!



SM207-61 Efficient symbolic resolution of systems of linear differential and difference equations  
Admin  
Encadrant : Frédéric CHYZAK
Labo/Organisme : SpecFun, INRIA
URL : http://specfun.inria.fr
Ville : Palaiseau

Description  

In the context of problems involving solutions of linear differential or difference equations, computer algebra is often crucially based on the resolution of auxiliary systems in terms of rational functions. The simplest approach to this problem is to uncouple the system, i.e., to obtain a scalar equation of higher order for each component of the vector solutions. Rational solutions are then computed using algorithms due to Abramov for the resolution of scalar equations. The uncoupling procedure by itself is already time consuming, so it seems advantageous to use direct solution algorithms that avoid uncoupling. The basis of such an algorithm exists (Barkatou, 1999), but it remains to adapt it to applications and to precisely quantify the expected gain.

The internship has two goals. First, the algorithm described in (Barkatou, 1999) will be validated by an implementation in Maple, first in its simplest version, then in its parameterized version. Second, this parameterized version will be integrated into the Mgfun package, which implements the algorithms of (Chyzak, 2000) for symbolic summation and integration in Maple.

- Moulay A. Barkatou: On rational solutions of systems of linear differential equations. Journal of Symbolic Computation 28 (1999), 547–567.
- Frédéric Chyzak: An extension of Zeilberger's fast algorithm to general holonomic functions. Discrete Mathematics 217 (2000), 115–134.
URL sujet détaillé : http://specfun.inria.fr/internships/

Remarques : Co-supervised by

Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)



SM207-62 Efficient computation of power series coefficients  
Admin  
Encadrant : Frédéric CHYZAK
Labo/Organisme : SpecFun, INRIA
URL : http://specfun.inria.fr
Ville : Palaiseau

Description  

The need to compute a single coefficient of a given formal power series is frequent in combinatorics. Indeed, in many cases, the enumeration of objects subject to certain structural constraints reduces to extracting a coefficient of a series. In more elaborate contexts, several parameters are involved, and the corresponding power series have several variables. The question is then to extract the coefficient of a monomial in several variables. For obvious efficiency reasons, it is important to be able to compute this coefficient without computing all previous coefficients. This internship focuses on two fast algorithms, due to Massazza and Radicioni, for extracting such coefficients for classes of power series whose coefficients are related by a set of linear recurrences. The main objective of the internship is to implement these algorithms and compare them on various applications.
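In the special case of constant-coefficient (C-finite) recurrences, a single coefficient can indeed be computed without the previous ones, by binary powering of the companion matrix. This toy Python sketch illustrates the principle; the holonomic algorithms of Massazza and Radicioni targeted by the internship are more general.

```python
def kth_coefficient(init, rec, k):
    """k-th coefficient of a power series whose coefficients satisfy the
    constant-coefficient linear recurrence
        a(n+d) = rec[0]*a(n) + ... + rec[d-1]*a(n+d-1),
    with initial values init = [a(0), ..., a(d-1)].  Binary powering of
    the d x d companion matrix costs O(d^3 log k) arithmetic operations,
    without computing the coefficients 0 .. k-1."""
    d = len(init)
    if k < d:
        return init[k]
    # companion matrix sending (a(n), ..., a(n+d-1)) to (a(n+1), ..., a(n+d))
    M = [[0] * d for _ in range(d)]
    for j in range(1, d):
        M[j - 1][j] = 1
    M[d - 1] = list(rec)

    def matmul(X, Y):
        return [[sum(X[i][l] * Y[l][j] for l in range(d)) for j in range(d)]
                for i in range(d)]

    P = [[int(i == j) for j in range(d)] for i in range(d)]  # identity
    B, e = M, k
    while e:                       # binary powering: P = M^k
        if e & 1:
            P = matmul(P, B)
        B = matmul(B, B)
        e >>= 1
    return sum(P[0][j] * init[j] for j in range(d))
```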

- P. Massazza and R. Radicioni: On computing the coefficients of rational formal series. In: Formal Power Series and Algebraic Combinatorics, Vancouver, 2004.
- P. Massazza and R. Radicioni: On computing the coefficients of bivariate holonomic formal series. Theoretical Computer Science 346 (2005), no. 2-3, 418–438.
URL sujet détaillé : http://specfun.inria.fr/internships/

Remarques : Co-supervised by

Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)



SM207-63 Self-modifying code for embedded systems security  
Admin  
Encadrant : Damien COUROUSSE
Labo/Organisme : CEA/DACLE
URL : www.cea.fr
Ville : Grenoble

Description  

The CEA is developing a technology for runtime code generation around a tool called deGoal, whose initial motivation is the observation that execution performance may be strongly correlated with the characteristics of the data to be processed (cf. related publications on this website).

Preliminary developments have been carried out to achieve the objectives of the project: the deGoal framework has been extended so that code polymorphism could be applicable to embedded constrained platforms such as microcontrollers; a preliminary validation of our approach has been performed on AES and experimental measures have been carried out.

The objectives of this internship are to continue the implementation of our software solution for code polymorphism, and to focus in particular on the performance of runtime code generation to reduce the overall overhead of our approach.

The intern will be in charge of:

- identifying the main performance issues in the current implementation;
- analysing the potential degree of polymorphism achievable with our solution;
- designing and implementing more efficient solutions, evaluated with respect to the number of polymorphic instances that could be generated;
- validating the approach experimentally.
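As a rough, high-level analogue of what deGoal does at the machine-code level, runtime specialization can be illustrated in Python: generating, at run time, a routine specialized to one concrete data instance. The function `specialize_dot` and its unrolled output are invented for this example.

```python
def specialize_dot(weights):
    """Generate, at run time, Python source for a dot product specialized
    to one fixed weight vector, skipping zero entries entirely.  A toy
    analogue of data-driven runtime code generation, far removed from
    deGoal's actual machine-code kernels."""
    terms = [f"{w}*v[{i}]" for i, w in enumerate(weights) if w != 0]
    src = "def dot(v):\n    return " + (" + ".join(terms) or "0")
    namespace = {}
    exec(src, namespace)          # compile the specialized routine now
    return namespace["dot"]
```

Generating many syntactically different but semantically equivalent variants of such a routine is, roughly, what code polymorphism refers to in this context.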

URL sujet détaillé : http://www.cogito-anr.fr/posts/internship-position-at-cea-grenoble.html

Remarques :



SM207-64 Algorithmic proofs for the transcendence of power series  
Admin  
Encadrant : Frédéric CHYZAK
Labo/Organisme : SpecFun, INRIA
URL : http://specfun.inria.fr
Ville : Palaiseau

Description  

A formal power series is called algebraic if it is a root of a polynomial, and transcendental otherwise. Establishing the transcendence of a generating function is a constant need in enumerative combinatorics. The goals of the internship are the study, comparison and implementation of algorithmic methods allowing one to answer the following question in a uniform manner: given a sequence of rational numbers satisfying a linear recurrence with polynomial coefficients, decide the transcendence of its generating function. The question is nontrivial even for sequences satisfying a recurrence of first order. A classic algorithm is sufficient, in principle, to handle the general case (Singer, 1979). However, this algorithm suffers from too high a complexity to be effective in practice. A more recent method has been proposed, on a nontrivial combinatorial example, in (Bostan et al., 2014). The central objectives of this internship are to understand the new method, to turn it into an explicit algorithm working uniformly on a class of inputs to be defined, to implement this algorithm, and to test it and measure its theoretical and practical efficiency.

- A. Bostan, M. Bousquet-Mélou, M. Kauers, S. Melczer: On 3-dimensional lattice walks confined to the positive octant (2014), http://arxiv.org/abs/1409.3669.
- M. F. Singer: Algebraic Solutions of nth Order Linear Differential Equations. Queen's Papers in Pure and Applied Mathematics, 54 (1979), 379–420.
URL sujet détaillé : http://specfun.inria.fr/internships/

Remarques : Co-supervised by

Alin Bostan (alin.bostan((at))inria.fr) and
Frédéric Chyzak (frederic.chyzak((at))inria.fr)



SM207-65 Verification of Distributed Systems with Parameterized Network Topology  
Admin  
Encadrant : Paul GASTIN
Labo/Organisme : LSV, ENS Cachan
URL : http://www.lsv.ens-cachan.fr/~gastin/
Ville : Cachan

Description  

We consider distributed systems that can be run on arbitrary network topologies from a given class of architectures (e.g., on all pipelines, all grids, or all ring topologies). Typically, such a program is given by a single sequential process, and a copy of that process can be run on any node in the given network topology.

Parameterized systems are hard to analyze, and their verification problems quickly turn undecidable. There have been several orthogonal approaches to regain decidability: (i) restricting the amount of information that can be exchanged (e.g., unary tokens instead of binary ones), (ii) relaxing point-to-point communication towards broadcasting or nondeterministic choice, or (iii) imposing a bound on the number of contexts that a process traverses; a context may, e.g., restrict communication to one fixed neighbor, or to one type of communication such as "send". In the internship, we propose to investigate some extensions of (iii). See also the detailed description of the internship and the references therein.
URL sujet détaillé : http://www.lsv.ens-cachan.fr/~bollig/MPRI/stage-M2.pdf

Remarques : Co-supervised by Benedikt Bollig:
http://www.lsv.ens-cachan.fr/~bollig/



SM207-66 Chords for the dynamic analysis of 2D and 3D shapes  
Admin  
Encadrant : David COEURJOLLY
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/david.coeurjolly
Ville : Villeurbanne

Description  

Whether for 2D/3D shape analysis or for interactive modelling, it is important to be able to characterise locally the thickness of a geometric object. A classical approach consists, at a point of the surface, in tracing a polyline inside the shape so as to reach a point located on the other side. From this polyline, called a chord, several pieces of geometric information can be extracted: its length as an estimate of the local thickness, its centre as a point belonging to an approximation of the medial axis of the shape, etc.

If we consider a set of chords issued from a set of points on the surface, we can also extract global descriptors of the 2D or 3D shape (thickness distribution). Moreover, in an interactive modelling context where the user deforms the surface, these chords are crucial for detecting topological changes (self-intersections...).

Various works in the literature have addressed these structures. The objectives of the internship are the following:

- From an algorithmic point of view, several approaches exist for constructing these chords (along the normal direction at the point, first contact point by ball growing, filtering of the medial-axis balls...). A first step will be to implement these approaches in order to evaluate their costs.
- In a dynamic context, how does the set of chords evolve, and to which events is this set subject? This analysis will in particular be used to maintain a chord structure while the object's surface is being deformed.
- For shape analysis via the thickness function, how robust/stable are the different approaches in the presence of noise?
- Can this chord system serve as the basis of a 3D shape indexing and recognition process?

The internship will take place at the LIRIS-Lyon1 laboratory (Nautibus building).
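As a minimal illustration of the chord construction (assuming a binary mask, a starting point on the shape and a fixed marching direction; this is not code from the internship), one can march inside the shape until exiting, and read off the local thickness and the chord midpoint:

```python
import math

def chord(mask, p, d, step=0.5):
    """From a point p on (or in) the shape, march along direction d while
    staying inside the binary mask; return the chord length and midpoint.
    Naive sketch of the 'chord along a fixed direction' construction;
    rounding to the nearest pixel is an arbitrary sampling choice."""
    (x, y), (dx, dy) = p, d
    n = math.hypot(dx, dy)
    dx, dy = dx / n, dy / n                      # unit direction
    t = 0.0
    while True:
        nx, ny = x + (t + step) * dx, y + (t + step) * dy
        i, j = int(round(nx)), int(round(ny))
        inside = 0 <= i < len(mask) and 0 <= j < len(mask[0]) and mask[i][j]
        if not inside:
            break
        t += step
    return t, (x + t / 2 * dx, y + t / 2 * dy)   # length and chord midpoint
```

The chord length estimates the local thickness, and the midpoint approximates a medial-axis point, as described above.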

References:
- D. Coeurjolly: Fast and Accurate Approximation of Digital Shape Thickness Distribution in Arbitrary Dimension. Computer Vision and Image Understanding (CVIU), 116(12), pp. 1159–1167, 2012.
- Shapira, L., Shamir, A., & Cohen-Or, D. (2008). Consistent mesh partitioning and skeletonisation using the shape diameter function. The Visual Computer, 24(4), 249-259.
- Rolland-Neviere, X., Doërr, G., & Alliez, P. (2013). Robust diameter-based thickness estimation of 3D objects. Graphical Models, 75(6), 279-296.
- R. Allègre, R. Chaine, and S. Akkouche: A Dynamic Surface Reconstruction Framework for Large Unstructured Point Sets. Computers & Graphics, 31(2).
URL sujet détaillé :

Remarques : Co-supervised by Raphaëlle Chaine.
Internship allowance provided.



SM207-67 Semi-computable geometry  
Admin  
Encadrant : Mathieu HOYRUP
Labo/Organisme : LORIA/Inria Nancy Grand-Est
URL : http://www.loria.fr/~hoyrup/
Ville : Nancy

Description  

The internship is about computability theory over the real numbers (and not algorithmic geometry). More specifically, it is about computability of simple geometrical figures of the plane. A subset of the plane is computable if there is a program that draws a pixelized version of the set at any scale on a screen (it colors pixels that intersect the set in white, the others in black). A subset of the plane is semi-computable if there is a program that at each scale progressively fills the set on the screen (but we never know when the coloration is finished). For instance some sets appearing in complex analysis (Julia sets) are semi-computable but not computable.

The goal is to understand what it means for a simple geometrical figure, such as a filled triangle, to be semi-computable. A filled polygon is computable iff its coordinates are computable real numbers. However such a characterization is lacking in the case of semi-computable geometrical figures. Many questions arise and will be studied during the internship:

- How is the semi-computability of a triangle reflected in its coordinates?

- Is it possible to have a parametrization of triangles in such a way that the semi-computable triangles are exactly the triangles with semi-computable parameters (a real number x is semi-computable from below if there is a program semi-deciding the set of rational numbers below x)?

- If it is possible, what is the minimal number of parameters?

- What about other geometrical figures, such as disks?

Concepts from advanced computability theory will naturally come in when studying this problem.
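The parenthetical definition above can be made concrete with a small Python sketch of a semi-decision procedure for "q < x", where x is given by an increasing sequence of rational approximations. The step cut-off is only there to make the example terminate; a true semi-decision procedure simply diverges on negative instances.

```python
def semi_decide_below(approx, q, max_steps=10**6):
    """Semi-decision procedure for 'q < x', where the real x is
    semi-computable from below, given as an increasing sequence
    approx(0) <= approx(1) <= ... converging to x.  Halts with True
    iff q < x; a genuine semi-decision procedure diverges otherwise,
    here replaced by a cut-off for demonstration."""
    for n in range(max_steps):
        if approx(n) > q:
            return True           # witnessed q < approx(n) <= x
    return None                   # no answer yet (never a definite 'no')
```

Semi-computing a filled figure is the two-dimensional analogue: pixels are progressively coloured, but there is no stage at which the drawing is known to be complete.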

URL sujet détaillé : http://www.loria.fr/~hoyrup/geometry.pdf

Remarques :



SM207-68 Geodesics on digital surfaces  
Admin  
Encadrant : David COEURJOLLY
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/david.coeurjolly/
Ville : Villeurbanne

Description  

In many 3D shape analysis and processing applications, computing geodesics on the surface of objects is a very important preliminary problem. Being able to construct, on the surface, the shortest path between two points is a variational problem that is complex both theoretically and algorithmically.

Recently, approaches based on vector field optimisation have been proposed with an even more general objective: no longer constructing the shortest path between two points, but the optimal transport between two measures on the surface.

The applicative context of this internship is the analysis of fibrous structures (cf. image), for which the construction of a metric intrinsic (geodesic) to the structure is crucial for the graph-based reconstruction of these objects, or for the simulation of thermo-electric diffusion in the material.

In the case of digital surfaces (boundaries of objects in Z^3), first experiments with these tools have been carried out (within the open-source DGtal library). The work will therefore consist in:

- pursuing the experiments, and in particular optimising the computations (in time and in memory);
- investigating whether, in the case of digital surfaces, convergence properties can be obtained for these structures;
- using these tools for the characterisation of fibrous materials.
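As a simple baseline for geodesic distances on a digital surface, one can run Dijkstra's algorithm on the surface graph (an illustrative sketch only; the heat-flow approach of Crane et al. cited in the references is quite different):

```python
import heapq

def geodesic_distances(adj, src):
    """Single-source distances on a surface graph by Dijkstra's algorithm,
    a natural baseline on a digital surface (vertices = surface elements,
    edge weights = local distances).  adj: dict vertex -> [(neighbor, length)]."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue              # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Graph distances of this kind only approximate true geodesic distances, which is precisely where the convergence questions raised above come in.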

References:
- Solomon, J., Rustamov, R., Guibas, L., & Butscher, A. (2014). Earth mover's distances on discrete surfaces. ACM Transactions on Graphics (TOG), 33(4), 67.
- Crane, K., Weischedel, C., & Wardetzky, M. (2013). Geodesics in heat: a new approach to computing distance based on heat flow. ACM Transactions on Graphics (TOG), 32(5), 152.
- DGtal: http://dgtal.org
URL sujet détaillé : http://liris.cnrs.fr/david.coeurjolly/sujets/

Remarques : Internship allowance provided.



SM207-69 Study of algorithms for different classes of stochastic games; implicitly represented games  
Admin  
Encadrant : Yann STROZECKI
Labo/Organisme : PRiSM
URL : http://www.prism.uvsq.fr/
Ville : Versailles

Description  


We have a team at PRiSM working on algorithmic game theory. We are interested in games with two players and randomness. The first task will be to compare the different methods for computing optimal strategies in all possible variations of these games (the most classical examples being parity games and simple stochastic games). The aim is to understand what really makes a stochastic game hard (these games are not NP-complete, but no polynomial algorithm is known to find optimal strategies for all but the most contrived classes of games).

Then we will try to give a relevant notion of implicitly represented games (for instance, a graph whose states are given in some simple implicit manner). We will try to adapt the methods useful for classical stochastic games to these implicit games. We plan to use this kind of modelling for an application to traffic management algorithms.

We also plan to devise new methods to compute optimal strategies in classical games. We will try to adapt learning algorithms, such as regret minimization, to compute a sequence of strategies converging to the optimal strategy.
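One of the classical methods that would be compared is value iteration; here is a minimal sketch for simple stochastic games, assuming sink payoffs in [0, 1] and uniformly random average vertices (the game instance in the usage below is invented for the example):

```python
def value_iteration(max_v, min_v, avg, sinks, n_iter=200):
    """Approximate the value of a simple stochastic game by value iteration.
    max_v / min_v: dict vertex -> successor list (MAX resp. MIN chooses);
    avg: dict vertex -> successor list (uniformly random move);
    sinks: dict vertex -> payoff in [0, 1].  Illustrative sketch only."""
    val = {u: 0.0 for u in list(max_v) + list(min_v) + list(avg) + list(sinks)}
    for _ in range(n_iter):
        new = dict(sinks)                      # sinks keep their payoff
        for u, succ in max_v.items():
            new[u] = max(val[s] for s in succ)
        for u, succ in min_v.items():
            new[u] = min(val[s] for s in succ)
        for u, succ in avg.items():
            new[u] = sum(val[s] for s in succ) / len(succ)
        val = new
    return val
```

Value iteration converges to the game value but may need many iterations; comparing it against strategy improvement and other methods is part of the task described above.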



See the more detailed French description of the subject at http://www.prism.uvsq.fr/~ystr/stages.html
URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/stages.html

Remarques : Co-supervised by David Auger and Pierre Coucheney.



SM207-70 Complexity of enumeration  
Admin  
Encadrant : Yann STROZECKI
Labo/Organisme : PRiSM
URL : http://www.prism.uvsq.fr/
Ville : Versailles

Description  

This internship is part of the ANR project AGGREG, in which several teams from Lille, Marseille, Caen and Paris 7 participate. In complexity theory, we are usually interested in the time needed to decide whether a problem has a solution. In enumeration complexity, we are interested in the time needed to produce all solutions, which may be very numerous. In this context, one of the relevant complexity measures is the time elapsed between two consecutively produced solutions.

Quite a lot of complexity classes for enumeration have been designed, their major weakness being that they have no complete problem. However, it is possible to separate some of these classes modulo complexity hypotheses such as P != NP. The aim of this internship will be to separate some of these classes, in particular incremental delay from polynomial delay. To do this, we will try to classify the problems whose solutions can be generated by simple saturation rules. We hope to prove, as has been done for CSP, that some rules yield a polynomial-delay algorithm while the others are hard (in a sense to be determined) for incremental delay.
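The notion of polynomial delay can be illustrated by a classic backtracking ("flashlight") enumerator, here for binary strings avoiding two adjacent 1s (a toy example, unrelated to the saturation rules studied in the internship):

```python
def avoid_11(n, prefix=""):
    """Backtracking ('flashlight') enumeration of all length-n binary
    strings with no two adjacent 1s.  Every partial string explored
    extends to at least one solution, so the time elapsed between two
    consecutive outputs is polynomial in n: the defining property of
    polynomial delay."""
    if len(prefix) == n:
        yield prefix
        return
    yield from avoid_11(n, prefix + "0")      # appending 0 is always safe
    if not prefix.endswith("1"):              # appending 1 only after a 0
        yield from avoid_11(n, prefix + "1")
```

The total number of solutions grows exponentially (here as a Fibonacci number), yet each one is emitted after only polynomial work.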


URL sujet détaillé : http://www.prism.uvsq.fr/~ystr/stages.html

Remarques : Possibility of remuneration and of PhD funding.



SM207-71 Connectivity inference in structural proteomics  
Admin  
Encadrant : David COUDERT
Labo/Organisme : Inria Sophia Antipolis - Méditerranée
URL : https://team.inria.fr/coati/
Ville : Sophia Antipolis

Description  

Context and multi-disciplinary nature:
Molecular assemblies involving from tens to hundreds of macro-molecules (proteins or nucleic acids) are commonplace, yet very little is known about their structure. To fill this gap, a number of biophysical experiments (mass spectrometry, tandem affinity purification, etc.) providing information on the composition of sub-systems called oligomers are being developed [SR07]. More precisely, given an assembly, one such experiment gives access to the composition of one sub-system, called an oligomer, so that experiments for multiple overlapping oligomers convey information on the overall connectivity. For example, if one oligomer reduces to two proteins, these proteins touch in the assembly. Connectivity inference is the problem concerned with the elucidation of this connectivity. Instances of the connectivity inference problem can be modeled as a graph in which each vertex corresponds to a protein, and an assembly (oligomer or set of proteins) is a connected subset of vertices.

The problem is of multi-disciplinary nature, since developing solutions requires a virtuous circle between bio-physics on the one hand, and algorithmic - optimization - graph theory on the other hand.


Goals:
The connectivity inference problem in the particular case where each protein species is represented by a single copy has recently been studied using tools from graph theory and integer linear programming [AAC+13, ACCC]. Roughly, the minimum connectivity inference problem is to find the smallest subset of edges of a complete graph such that each subset of vertices associated with an oligomer induces a connected subgraph. This optimization problem has been proved NP-hard and hard to approximate in general. Nonetheless, when the number of proteins is moderate (less than 20), the problem can be solved efficiently using mixed integer linear programming (MILP) and a greedy approximation algorithm.
These two types of algorithms generate ensembles of (optimal) solutions, from which consensus solutions were defined. Tests conducted on systems with up to 20 proteins show an almost perfect agreement between the predicted contacts and the experimentally determined ones.

The goal of this internship is to go beyond this state of the art, in the following directions:
* Understand the relationship between the structure of the oligomers and the reported solutions.
* Exploit hypothesis on the geometry of the proteins/oligomers, to speed up the optimization process.
* Generalize the modeling in order to handle the cases where multiple copies of a protein are present.
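A simple greedy baseline in the spirit of the approximation algorithm mentioned above can be sketched as follows (illustrative only; the oligomer instance in the test is invented for the example, and the exact method of [AAC+13, ACCC] relies on MILP instead):

```python
from itertools import combinations

def n_components(vertices, edges):
    """Number of connected components of the graph induced on `vertices`
    by the edge set `edges` (edges given as sorted vertex pairs)."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for a, b in edges:
        if a in vertices and b in vertices:
            adj[a].add(b)
            adj[b].add(a)
    seen, comps = set(), 0
    for v in sorted(vertices):
        if v not in seen:
            comps += 1
            stack = [v]
            while stack:
                u = stack.pop()
                if u not in seen:
                    seen.add(u)
                    stack.extend(adj[u] - seen)
    return comps

def greedy_inference(oligomers):
    """Add edges one by one, each time picking the edge that minimises the
    total number of connected components summed over all oligomers, until
    every oligomer induces a connected subgraph."""
    edges = set()
    candidates = {e for o in oligomers for e in combinations(sorted(o), 2)}

    def cost(es):
        return sum(n_components(o, es) for o in oligomers)

    while cost(edges) > len(oligomers):
        best = min(sorted(candidates - edges), key=lambda e: cost(edges | {e}))
        edges.add(best)
    return edges
```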


References:
[AAC+13] D. Agarwal, J. Araujo, C. Caillouet, F. Cazals, D. Coudert, and S. Perennes. Connectivity inference in mass spectrometry based structure determination. In H.L. Bodlaender and G.F. Italiano, editors, European Symposium on Algorithms (LNCS 8125), pages 289–300, Sophia-Antipolis, France, 2013. Springer.

[ACCC] D. Agarwal, C. Caillouet, F. Cazals, and D. Coudert. Unveiling contacts within macro-molecular assemblies by solving minimum weight connectivity inference problems. submitted.

[DDC13] T. Dreyfus, V. Doye, and F. Cazals. Probing a continuum of macro-molecular assembly models with graph templates of sub-complexes. Proteins: structure, function, and bioinformatics, 81(11):2034–2044, 2013.

[SR07] M. Sharon and C.V. Robinson. The role of mass spectrometry in structure elucidation of dynamic protein complexes. Annu. Rev. Biochem., 76:167–193, 2007.

URL sujet détaillé : ftp://ftp-sop.inria.fr/coati/Stages/coati-abs-1415.pdf

Remarques : Co-supervisors of the internship:
* Christelle Caillouet (COATI)
* Frédéric Cazals (ABS, https://team.inria.fr/abs/)

Remuneration according to the Inria grid.

Possibility to follow-up with a PhD thesis.



SM207-72 Formal mechanical models for distributed algorithms (probabilistic behaviours)  
Admin  
Encadrant : Xavier URBAIN
Labo/Organisme : LRI, Université Paris-Sud
URL : http://pactole.lri.fr
Ville : ORSAY

Description  

Distributed computing is one of the domains where informal reasoning
is not an option, in particular when Byzantine failures are
involved. Distributed computing is also characterised by its diversity
of models, subtle modifications of which induce radical changes in
system behaviour. We consider Robot Networks, that is, swarms of
*autonomous* mobile entities that have to accomplish some task in
*cooperation*. In this emerging framework, models can be distinguished
by the capabilities of the robots, the topology of the considered
space, the level of synchrony (that is, the properties of a demon),
the type of failures likely to occur, etc.

We are interested in obtaining formal, and moreover mechanical,
guarantees of properties for certain protocols, using the Coq proof
assistant. A recently proposed Coq framework (http://pactole.lri.fr)
for robot networks can express quite a few variants of models for such
networks, in particular regarding topology or demon properties. This
framework is higher order and based on coinductive types. It allows
one to prove in Coq positive results (the task will be fulfilled using
the algorithm embedded in all robots, *for all* initial
configurations) as well as negative results (there cannot be *any*
embedded algorithm that will work for this task for all initial
configurations).

An objective of this proposal is to extend the formal framework so as
to express, and allow one to work on, robot behaviours that are
*probabilistic*. Indeed, randomised algorithms are most convenient for
handling situations where symmetry makes any deterministic protocol
useless; see for instance a recent article by Yamauchi & Yamashita
(DISC 14). The Coq library ALEA will be an interesting starting point
towards this goal.
URL sujet détaillé : http://pactole.lri.fr/pub/M2_1415_prob.pdf

Remarques : Co-supervised with:
- Pierre Courtieu (pierre.courtieu.fr)
- Sébastien Tixeuil (sebastien.tixeuil.fr)


Coq proficiency required.



SM207-73 Information flow type preserving compilation in the CompCert verified compiler  
Admin  
Encadrant : David PICHARDIE
Labo/Organisme : IRISA
URL : http://www.irisa.fr/celtique/pichardie/
Ville : Rennes

Description  

Cache-based attacks are a class of side-channel attacks that are particularly effective in virtualized or cloud-based environments, where they have been used to recover secret keys from cryptographic implementations. One common approach to thwart cache-based attacks is to use constant-time implementations, i.e., implementations that do not branch on secrets and do not perform memory accesses that depend on secrets.
In [1], Barthe et al. propose a new information-flow analysis [2] that checks whether an x86 application executes in constant time. The analysis is formally verified in the Coq proof assistant and integrated into the verified CompCert compiler [3]. The effectiveness of the tool has been demonstrated on several medium-size cryptographic implementations such as AES, DES and SHA256.
The current tool rejects programs that may branch on secrets. The goal of this internship is to lift this limitation and allow such a branch when we can guarantee that both branches have the same execution time. We would like the approach to be developed at the RTL representation of the CompCert compiler and propagated along the CompCert compilation chain down to assembly code.
This internship requires a strong interest and background in formal semantics, static analysis and proof assistants.
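
The flavor of such an analysis can be made concrete with a toy, flow-insensitive taint check over a hypothetical three-address mini-IR (the `assign`/`branch`/`load` instruction names are invented for this sketch and are unrelated to CompCert's RTL): a program is rejected as soon as a branch condition or a memory index depends on a secret.

```python
def constant_time(prog, secrets):
    """Reject programs that branch on, or index memory with, secret data."""
    tainted = set(secrets)
    for op, *args in prog:
        if op == "assign":              # ("assign", dst, src1, src2, ...)
            dst, *srcs = args
            if tainted & set(srcs):
                tainted.add(dst)        # taint propagates through assignments
        elif op == "branch":            # ("branch", cond)
            if args[0] in tainted:
                return False            # control flow depends on a secret
        elif op == "load":              # ("load", dst, index)
            if args[1] in tainted:
                return False            # memory address depends on a secret
    return True
```

For instance, `constant_time([("assign", "t", "key"), ("branch", "t")], {"key"})` is rejected, since the branch condition is derived from the secret `key`.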

References :
[1] G. Barthe, G. Betarte, J. D. Campo, C. Luna and D. Pichardie. System-level non-interference for constant-time cryptography. 21st Conference on Computer and Communications Security (CCS 2014).
[2] D. M. Volpano, C. E. Irvine, G. Smith: A Sound Type System for Secure Flow Analysis. Journal of Computer Security 1996
[3] X. Leroy. A Formally Verified Compiler Back-end. Journal of Automated Reasoning 2009.
URL sujet détaillé :

Remarques : The internship will take place at IRISA Rennes or IMDEA Software in Madrid. David Pichardie (ENS Rennes) and Gilles Barthe (IMDEA Software Madrid) will supervise the internship.




SM207-74 Internet of Things  
Admin  
Encadrant : Bernard TOURANCHEAU
Labo/Organisme :
URL : http://drakkar.imag.fr
Ville : Grenoble

Description  

The Internet of Things (IoT) is the latest phase in the expansion of the Internet, where billions of smart sensors and actuators will be reachable by their IPv6 addresses. A consequence of the large number of these devices is their low-cost requirement. This results in hardware that is typically very constrained in terms of processing power, memory and, particularly, energy. Nevertheless, 15 years of research have resulted in optimal distributed algorithms and protocols that allow "smart objects" to form multi-hop wireless sensor networks (WSN) in order to facilitate data exchange.

One of the most widely used mechanisms for the timing and suppression of messages in WSN is the Trickle algorithm [1]. For example, Trickle regulates the emission of control messages of the principal routing protocol for the Internet of Things, RPL (RFC 6550), and therefore directly affects the lifetime of the network and of individual nodes. Owing to its widespread use, Trickle has been standardized separately in RFC 6206, and many other popular protocols build upon it, making an understanding of its behavior crucial for optimizing control overhead.

Recent research, however, suggests that Trickle may suffer from unfairness: it distributes the transmission load unequally among the nodes of a network. Given the limited energy supply of IoT devices, this leads to premature battery depletion and a shortened lifetime for individual nodes, but also for the whole network. The root cause of this unfairness is that a node's transmission probability under Trickle depends on the network topology. Due to the distributed nature of these networks, each node only has a local view of its surroundings and therefore a different transmission probability under Trickle.

The goal of the internship is to study this transmission probability and to derive a distributed algorithm that equalizes it among the nodes of the network, based on existing analytical models of Trickle [2] [3] [4]. The algorithm may be coupled with a simple communication protocol allowing nodes to exchange their current transmission probabilities and converge to the optimal one, yielding an equalized transmission load in a Trickle network. The derived algorithm will be implemented and validated by experiments on state-of-the-art IoT devices and the Contiki operating system.
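
For reference, the Trickle timer that the internship builds on can be sketched in a few lines. This is a discrete, per-interval toy model of RFC 6206, not the Contiki implementation; the parameter names Imin, Imax and k follow the RFC.

```python
import random

class Trickle:
    """Discrete toy model of the Trickle timer (RFC 6206)."""

    def __init__(self, imin=1, imax=16, k=1, seed=0):
        self.imin, self.imax, self.k = imin, imax, k
        self.rng = random.Random(seed)
        self.i = imin                  # current interval length I
        self.transmissions = 0
        self._new_interval()

    def _new_interval(self):
        self.c = 0                                     # consistency counter
        self.t = self.rng.uniform(self.i / 2, self.i)  # fire time in [I/2, I)

    def hear_consistent(self):
        self.c += 1                    # k such messages suppress our own

    def end_of_interval(self):
        if self.c < self.k:            # not suppressed: transmit at time t
            self.transmissions += 1
        self.i = min(2 * self.i, self.imax)            # double I up to Imax
        self._new_interval()

    def inconsistency(self):
        if self.i != self.imin:        # reset I on inconsistent information
            self.i = self.imin
            self._new_interval()
```

An isolated node doubles its interval up to Imax and keeps transmitting, while a node that hears k consistent messages per interval stays silent: which nodes end up suppressed is exactly the topology-dependent effect behind the unfairness studied here.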

Requirements:
- Solid math background and affinity for analytical models (see references), Basic Probability, Distributed Algorithmics
- C Programming language
- Networking (UDP, IPv6)

Openness to learn and/or exploit knowledge in:
- IoT systems
- Wireless Communications
- Approximation Algorithms
- Wireless Sensor Networks
- Python Programming Language
- Embedded Systems

References:
[1] P. A. Levis, N. Patel, D. Culler, and S. Shenker, "Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks," in Proceedings of the First USENIX/ACM Symposium on Networked Systems Design and Implementation, 2004.

[2] H. Kermajani, C. Gomez, and M. Arshad, "Modeling the message count of the trickle algorithm in a steady-state, static wireless sensor network," IEEE Communications Letters, vol. 16, no. 12, pp. 1960-1963, December 2012.

[3] T. M. Meyfroyt, S. C. Borst, O. J. Boxma, and D. Denteneer, "Data Dissemination Performance in Large-scale Sensor Networks," in The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '14. New York, NY, USA: ACM, 2014, pp. 395-406.


[4] T. Coladon, M. Vucinic, and B. Tourancheau, "Modeling the Trickle Algorithm and Increasing its Fairness with Variable Redundancy Constants," Technical Report, Grenoble Informatics Laboratory, July 2014, pp. 1-9.

URL sujet détaillé :

Remarques : Co-encadrant: Malisa Vucinic
Remuneration: ~400 eur/month



SM207-75 Optimal complexity bounds for real root refinement for polynomial systems  
Admin  
Encadrant : Elias TSIGARIDAS
Labo/Organisme : Inria & UPMC (LIP6 - Paris 6)
URL : http://www-polsys.lip6.fr/~elias
Ville : Paris

Description  

Given isolating intervals for all the real roots of a univariate polynomial of degree d with coefficients of bitsize at most t, we can refine all the intervals by a factor of k = 2^L in almost optimal Boolean complexity, O~(d^2 t + d L). The algorithm uses a combination of bisections and Newton iterations.
We are interested in an analogous optimal bound for the multivariate case.


The problem is as follows.
We are given a 0-dimensional polynomial system
f_1(x) = ... = f_n(x) = 0,
where x = (x_1, ..., x_n).
The polynomials f_i have degree at most d and integer coefficients of maximum bitsize t. For each isolated real root z of the system,
we are given an (isolating) hyperbox
B_0 = [a_1, b_1] x ... x [a_n, b_n],
where the a_k, b_k are rationals, which contains z and no other root of the system.
Let w_0 = width(B_0) = max{|b_1 - a_1|, ..., |b_n - a_n|}.
We target an algorithm, with optimal Boolean complexity, that obtains a refined hyperbox B_t such that width(B_t) = w_0/k, where k = 2^L, based on the Newton operator.
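
For the univariate case recalled at the top of this description, the bisection/Newton combination can be sketched as follows. This is a naive exact-arithmetic version for a simple root isolated by a sign change, not the almost optimal algorithm: each iteration cuts at the midpoint and, when possible, also at a Newton iterate, so the width at least halves per iteration while Newton provides fast shrinking near the root.

```python
from fractions import Fraction

def horner(p, x):
    """Evaluate polynomial p (coefficient list, low to high degree) at x."""
    r = Fraction(0)
    for c in reversed(p):
        r = r * x + c
    return r

def refine(p, a, b, L):
    """Shrink an isolating interval [a, b] of a simple root of p
    (sign change at the endpoints) to width <= (b - a) / 2**L."""
    a, b = Fraction(a), Fraction(b)
    dp = [i * c for i, c in enumerate(p)][1:]   # derivative of p
    target = (b - a) / 2**L
    fa = horner(p, a)
    while b - a > target:
        mid = (a + b) / 2
        cuts = [mid]
        d = horner(dp, mid)
        if d != 0:
            n = mid - horner(p, mid) / d        # Newton step from the midpoint
            if a < n < b:
                cuts.append(n)                  # extra cut for fast shrinking
        for c in sorted(cuts):
            fc = horner(p, c)
            if fc == 0:
                return c, c                     # hit the root exactly
            if (fa < 0) != (fc < 0):            # sign change in [a, c]
                b = c
                break
            a, fa = c, fc                       # root lies to the right of c
    return a, b
```

For example, `refine([-2, 0, 1], 1, 2, 20)` shrinks the isolating interval [1, 2] of x^2 - 2 to width at most 2^-20.
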
URL sujet détaillé :

Remarques :



SM207-76 Automatic speech recognition  
Admin  
Encadrant : Irina ILLINA
Labo/Organisme : LORIA/INRIA
URL : www.loria.fr
Ville : Nancy

Description  

Framework of ANR project ContNomina

The technologies involved in information retrieval in large audio/video databases are often based on the analysis of large, but closed, corpora, and on machine learning techniques and statistical modeling of the written and spoken language. The effectiveness of these approaches is now widely acknowledged, but they nevertheless have major flaws, particularly concerning proper names, which are crucial for the interpretation of content.
In the context of diachronic data (data which change over time), new proper names appear constantly, requiring dynamic updates of the lexicons and language models used by the speech recognition system.
As a result, the ANR project ContNomina (2013-2017) focuses on the problem of proper names in automatic audio processing systems, exploiting the context of the processed documents as efficiently as possible. To this end, the student will address the contextualization of the recognition module through dynamic adjustment of the language model, in order to make it more accurate.

Subject

Current systems for automatic speech recognition are based on statistical approaches. They require three components: an acoustic model, a lexicon and a language model. This internship will focus on the language model. The language model of our recognition system is based on a neural network learned from a large corpus of text. The problem is to re-estimate the language model parameters for a new proper name depending on its context and on a small amount of adaptation data. Several tracks can be explored: adapting the language model, using a class model, or studying the notion of analogy.
Our team has developed a fully automatic speech recognition system that transcribes a radio broadcast from the corresponding audio file. The intern will develop a new module whose function is to integrate new proper names into the language model.
URL sujet détaillé :

Remarques : Candidates should be familiar with tools for automatic speech recognition, and have a background in statistics and programming skills (C, object-oriented programming and Perl). Candidates should email a detailed CV and diploma.



SM207-77 Formal proof certificates using a zero knowledge protocol  
Admin  
Encadrant : Thomas GENET
Labo/Organisme : IRISA
URL : http://www.irisa.fr/celtique/genet
Ville : Rennes

Description  



Formal verification makes it possible to prove properties of
programs. Once a property of a piece of software has been proven, it
is still necessary to convince users that the proof exists. While
users generally trust big software companies, this may not be the case
for independent developers or for free software. In that case, the
developer has to demonstrate to users that the proof has been carried
out; this is the role of a proof certificate. Such certificates have
been studied in the "Proof Carrying Code" (PCC) framework defined in
[1]. In PCC, the developer gives the user sufficient information to
rebuild and replay the proof on the software. The proof is replayed by
a checker that the user has to trust blindly.

In this internship, we want to go further than what PCC proposes.
We want the developer to give the user a certificate that only shows
that the proof exists. In particular, the user will neither have to
replay the proof nor to trust a checker. This is possible by combining
two areas of computer science that are, apparently, disconnected: the
mechanisms used in formal proofs on one side and cryptography on the
other. The idea is to see a logical property to prove as a ciphering
key and the proof of this property as the secret key used to
decipher. The user can then send a random value (a nonce) ciphered
with the property. If the developer has proven the property then he
holds the secret key, and will thus be able to recover the nonce in
clear. To prove to the user that he has the proof, he only has to send
the nonce back: this is the certificate.

This is a form of zero-knowledge protocol [2], commonly used in
cryptography. It is also related to the Curry-Howard isomorphism used
in the Coq proof assistant, a tool that can be used to produce the
proof of a property of a program. In Coq, a property is represented by
a type and a proof by any lambda-term having this type. Thus, if we
consider a type (a property) as a ciphering key, then a lambda-term (a
proof of this property) will be the deciphering key. This new facet of
the isomorphism has already been studied in [3]. The aim of this
internship is to start from this work and to produce a complete
certification solution for any property of propositional logic. In
particular, it will be necessary to investigate which cryptographic
functions are needed and to prove that the solution is robust, using
the results of [4]. On the cryptographic side, it will also be
interesting to compare this approach with [5] and [6].
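
As a purely illustrative toy, the challenge-response protocol above
can be mimicked with textbook RSA over tiny primes standing in for the
property/proof key pair. This sketch is insecure by construction and
uses Python's modular inverse (`pow(e, -1, m)`, Python 3.8+); the
internship targets real cryptographic constructions:

```python
import random

# Textbook RSA over tiny primes: purely illustrative and insecure.
P, Q = 1009, 1013
N, PHI = P * Q, (P - 1) * (Q - 1)
E = 17                        # plays the role of the property (ciphering key)
D = pow(E, -1, PHI)           # plays the role of the proof (deciphering key)

def challenge(rng=None):
    """User side: cipher a fresh nonce with the property."""
    rng = rng or random.Random(0)
    nonce = rng.randrange(2, N)
    return nonce, pow(nonce, E, N)

def respond(ciphertext):
    """Developer side: only a holder of the proof can recover the nonce."""
    return pow(ciphertext, D, N)

nonce, ct = challenge()
assert respond(ct) == nonce   # the returned nonce is the certificate
```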

[1] George Necula. Proof carrying code. POPL'97.

[2] http://en.wikipedia.org/wiki/Zero-knowledge_proof

[3] Amrit Kumar, Pierre-Alain Fouque, Thomas Genet, Mehdi Tibouchi. Proofs
as Cryptography: a new interpretation of the Curry-Howard isomorphism for software certificates.
http://hal.inria.fr/hal-00715726

[4] Martin Abadi, Véronique Cortier. Deciding knowledge in security
protocols under equational theories. TCS 367(1-2):2-32 (2006).
http://users.soe.ucsc.edu/~abadi/Papers/AbadiCortier.pdf


[5] Yael Tauman Kalai, Ran Raz, and Ron D. Rothblum
Delegation for bounded space
In Proceedings of the 45th annual ACM symposium on Theory of computing (STOC), pp. 565-574, 2013.
Available at: http://research.microsoft.com/en-us/um/people/yael/publications/2013-delegation-bounded-space.pdf

[6] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, Madars Virza, SNARKs for C: verifying program executions succinctly and in zero knowledge , proc. CRYPTO 2013, Part II, LNCS 8043, 90-108, Springer, 2013.
Available at http://www.tau.ac.il/~tromer/papers/csnark-20131007.pdf
URL sujet détaillé :

Remarques : Pierre-Alain Fouque (http://www.di.ens.fr/~fouque/)



SM207-78 Complexity of homology computation methods on cellular structures  
Admin  
Encadrant : Pascal LIENHARDT
Labo/Organisme : XLIM-SIC
URL : http://xlim-sic.labo.univ-poitiers.fr/membres/lienhard/
Ville : Poitiers

Description  

Many combinatorial structures exist to encode subdivisions of an n-dimensional space. They are, for instance, used in topology-based geometric modeling. For such applications, it is particularly important to be able to compute topological characteristics of the constructed object, such as connected components or holes in any dimension. These properties can be obtained through the computation of the so-called homology groups of the object.

We are particularly interested in the computation of such groups on structures derived from combinatorial maps. Several approaches have been theoretically explored, mainly incremental computation during the construction of an object, and direct computation on the whole object.

We are now studying the complexity of these approaches, and partial results have been obtained. The goal of the work is to explore the open questions related to complexity, from both theoretical and practical points of view.

The student will have to understand at least one of these methods, provide an efficient implementation of it, check its complexity experimentally on different classes of objects, and also study open problems related to the evaluation of its theoretical complexity.
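
As a baseline for the "direct computation on the whole object" approach, here is a naive sketch computing Betti numbers over Z/2 from boundary matrices by Gaussian elimination (bare matrices, not the combinatorial-map structures of [1,2]); the methods under study aim to beat exactly this kind of generic matrix computation.

```python
def rank_gf2(rows):
    """Rank of a 0/1 matrix over GF(2), rows given as lists."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def betti(dims, boundaries):
    """dims[k]: number of k-cells; boundaries[k-1]: matrix of the k-th
    boundary operator.  Returns the Z/2 Betti numbers b_0, ..., b_n."""
    ranks = [0] + [rank_gf2(m) for m in boundaries] + [0]
    return [dims[k] - ranks[k] - ranks[k + 1] for k in range(len(dims))]
```

For the hollow triangle (three vertices, three edges), `betti([3, 3], [d1])` with `d1 = [[1,0,1],[1,1,0],[0,1,1]]` gives `[1, 1]`: one connected component and one 1-dimensional hole.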

Do not hesitate to contact us for further information.

References:
[1] N-dimensional generalized combinatorial maps and cellular quasi-manifolds, Lienhardt P., International Journal on Computational Geometry and Applications, Vol. 4, no. 3, pp. 275-324, 1994
[2] A Boundary Operator for Computing the Homology of Cellular Structures, Alayrangues S., Damiand G., Lienhardt P., Peltier S.
[hal-00683031, version 2]

URL sujet détaillé :

Remarques : co-advisors:
_ Sylvie Alayrangues : sylvie.alayrangues-poitiers.fr
_ Samuel Peltier : samuel.peltier-poitiers.fr



SM207-79 Logique temporelle multi-agents pour la modélisation formelle des exigences  
Admin  
Encadrant : Julien BRUNEL
Labo/Organisme : ONERA - DTIM
URL : http://www.onera.fr/staff/julien-brunel/
Ville : Toulouse

Description  

Requirements engineering consists of methods and tools for providing a clear, unambiguous specification of a system to be designed. It can be seen as the step from the informal to the formal. Agent- and goal-oriented methods, such as KAOS and i*/Tropos, are particularly convincing academic proposals. They emphasize different concepts, in particular (1) agents, which represent the different parts/components of the system that play a role in its evolution and collaborate in order to fulfill goals; and (2) goals, which represent high-level needs and are refined into more and more precise goals until the requirements of the system are obtained. These methods have already been related to formal methods. In particular, many goals can be seen as temporal logic formulas.

Onera has proposed a language, Khi, for analyzing the relations between goals and agents. To this end, the semantics of Khi relies on a multi-agent temporal logic called Updatable Strategy Logic (USL). USL extends temporal logic with the ability to reason about the strategies that agents follow in order to fulfill their goals.

In this internship, one of the following directions will be pursued, depending on the applicant:
- a verification direction: the objective is to obtain a reasonable way to check that a formula is satisfied by a model (the model-checking problem). Indeed, the theoretical complexity of USL model checking is non-elementary. One goal is to define a fragment of (or a variation on) USL that enjoys a better complexity, and/or to propose an efficient implementation of the USL model-checking algorithm.

- a requirements engineering direction: the objective is to apply Khi to a concrete case study and to assess the limits and advantages of Khi. In particular, a focus can be put on the way data structures can be formalized (with first-order logic, for instance).
URL sujet détaillé : http://sites.onera.fr/stages/sites/sites.onera.fr.stages/files/DTIM-2015-22_0.pdf

Remarques : Remuneration: 500-600 euros/month



SM207-80 Lightweight formal verification for OCaml  
Admin  
Encadrant : Thomas GENET
Labo/Organisme : IRISA
URL : http://www.irisa.fr/celtique/genet
Ville : Rennes

Description  


Keywords: Semantics, OCaml, Static analysis, Rewriting, Regular tree
languages, Tree automata, Abstract interpretation.

Some strongly typed programming languages (like Haskell, OCaml, Scala
and F#) have a type inference mechanism. This mechanism makes it
possible to automatically detect certain kinds of errors in a
program. To prove richer properties of programs (stronger than being
correctly typed), people usually use proof assistants, which require a
high level of expertise both to state the property and to carry out
the proof.



We propose to define a lightweight formal verification tool for
functional programming languages, relying on a representation finer
than what simple types offer. Several proposals in this direction have
been made using type systems extended with arithmetic constraints
[1,2] or intersection types [3]. Here, we propose to use regular
languages to represent the (infinite) set of possible inputs of a
function. We already know that it is possible to compute an
over-approximation of the image of a regular language by a term
rewriting system using a form of abstract interpretation [4]. Thus,
using this technique, if we transform a functional program into a term
rewriting system, it is possible to get an over-approximation of the
(infinite) set of results of the function for a given (infinite) set
of input values [5].



The objective of this internship is to define a translation of an
interesting subset of OCaml into term rewriting systems. This
translation should produce systems that can be approximated by the
techniques defined in [4,5]. One of the challenges will be dealing
with higher-order functions. Translating a higher-order functional
program into a first-order term rewriting system is possible [6] and
gives encouraging results [7]. The results obtained during this
internship will be compared with state-of-the-art analysis techniques
for higher-order programs [8,9,10].
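
To make the object being approximated concrete, here is a naive
rewriting engine that computes the exact image of a finite set of
inputs through a small term rewriting system (the `double` function on
Peano numerals, encoded as nested tuples; variables are bare strings).
The techniques of [4,5] replace this finite enumeration by tree
automata completion over infinite regular sets of terms:

```python
def match(pat, term, sub):
    """Match pattern against term, extending substitution sub, or None."""
    if isinstance(pat, str):                   # pat is a variable
        if pat in sub:
            return sub if sub[pat] == term else None
        sub[pat] = term
        return sub
    if pat[0] != term[0] or len(pat) != len(term):
        return None
    for p, t in zip(pat[1:], term[1:]):
        if match(p, t, sub) is None:
            return None
    return sub

def subst(pat, sub):
    """Instantiate a pattern with a substitution."""
    if isinstance(pat, str):
        return sub[pat]
    return (pat[0],) + tuple(subst(p, sub) for p in pat[1:])

def normalize(term, rules):
    """Innermost rewriting of term to its normal form."""
    term = (term[0],) + tuple(normalize(t, rules) for t in term[1:])
    for lhs, rhs in rules:
        sub = match(lhs, term, {})
        if sub is not None:
            return normalize(subst(rhs, sub), rules)
    return term

RULES = [
    (("double", ("0",)), ("0",)),                             # double(0) -> 0
    (("double", ("s", "x")), ("s", ("s", ("double", "x")))),  # double(s(x)) -> s(s(double(x)))
]

def peano(n):
    return ("0",) if n == 0 else ("s", peano(n - 1))

# Image of the input set {0, s(0), ..., s^3(0)} through `double`.
images = {normalize(("double", peano(n)), RULES) for n in range(4)}
```

Here `images` is exactly {0, 2, 4, 6} in Peano notation; the point of
[4,5] is to over-approximate such image sets when the input set is an
infinite regular tree language.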




[1] P. Rondon, M. Kawaguci, R. Jhala: Liquid Types. PLDI'08.

[2] D. Xu, S. P. Jones, K. Claessen: Static contract checking for Haskell. POPL'09.

[3] G. Castagna, K. Nguyen, Z. Xu, H. Im, S. Lenglet, L. Padovani: Polymorphic functions with set-theoretic types: part 1: syntax, semantics, and evaluation. POPL'14.

[4] T. Genet, V. Rusu: Equational approximations for tree automata completion. J. Symb. Comput. 2010.

[5] T. Genet: Towards Static Analysis of Functional Programs using Tree Automata Completion. WRLA'14.

[6] N. D. Jones, N. Andersen: Flow analysis of lazy higher-order functional programs. TCS 2007.

[7] T. Genet, Y. Salmon: Tree Automata Completion for Static Analysis of Functional Programs. Technical Report. http://hal.inria.fr/hal-00780124

[8] C. H. Broadbent, A. Carayol, M. Hague, O. Serre: C-SHORe: a collapsible approach to higher-order verification. ICFP'13.

[9] N. Kobayashi: Model Checking Higher-Order Programs. J. ACM 60. 2013.

[10] C.-H. L. Ong, S. J. Ramsay: Verifying higher-order functional programs with pattern-matching algebraic data types. POPL'11.


URL sujet détaillé :

Remarques : Barbara Kordy (http://satoss.uni.lu/members/barbara/)



SM207-81 Formal proof certificates using a zero knowledge protocol.  
Admin  
Encadrant : Thomas GENET
Labo/Organisme : IRISA
URL : http://www.irisa.fr/celtique/genet
Ville : Rennes

Description  

Keywords: Lambda-calculus, Typing, Curry-Howard isomorphism,
Zero-knowledge proofs, cryptography.



Formal verification makes it possible to prove properties of
programs. Once a property of a piece of software has been proven, it
is still necessary to convince users that the proof exists. While
users generally trust big software companies, this may not be the case
for independent developers or for free software. In that case, the
developer has to demonstrate to users that the proof has been carried
out; this is the role of a proof certificate. Such certificates have
been studied in the "Proof Carrying Code" (PCC) framework defined in
[1]. In PCC, the developer gives the user sufficient information to
rebuild and replay the proof on the software. The proof is replayed by
a checker that the user has to trust blindly.



In this internship, we want to go further than what PCC proposes.
We want the developer to give the user a certificate that only shows
that the proof exists. In particular, the user will neither have to
replay the proof nor to trust a checker. This is possible by combining
two areas of computer science that are, apparently, disconnected: the
mechanisms used in formal proofs on one side and cryptography on the
other. The idea is to see a logical property to prove as a ciphering
key and the proof of this property as the secret key used to
decipher. The user can then send a random value (a nonce) ciphered
with the property. If the developer has proven the property then he
holds the secret key, and will thus be able to recover the nonce in
clear. To prove to the user that he has the proof, he only has to send
the nonce back: this is the certificate.



This is a form of zero-knowledge protocol [2], commonly used in
cryptography. It is also related to the Curry-Howard isomorphism used
in the Coq proof assistant, a tool that can be used to produce the
proof of a property of a program. In Coq, a property is represented by
a type and a proof by any lambda-term having this type. Thus, if we
consider a type (a property) as a ciphering key, then a lambda-term (a
proof of this property) will be the deciphering key. This new facet of
the isomorphism has already been studied in [3]. The aim of this
internship is to start from this work and to produce a complete
certification solution for any property of propositional logic. In
particular, it will be necessary to investigate which cryptographic
functions are needed and to prove that the solution is robust, using
the results of [4]. On the cryptographic side, it will also be
interesting to compare this approach with [5] and [6].



[1] George Necula. Proof carrying code. POPL'97.



[2] http://en.wikipedia.org/wiki/Zero-knowledge_proof



[3] Amrit Kumar, Pierre-Alain Fouque, Thomas Genet, Mehdi Tibouchi. Proofs
as Cryptography: a new interpretation of the Curry-Howard isomorphism for software certificates.
http://hal.inria.fr/hal-00715726



[4] Martin Abadi, Véronique Cortier. Deciding knowledge in security
protocols under equational theories. TCS 367(1-2):2-32 (2006).
http://users.soe.ucsc.edu/~abadi/Papers/AbadiCortier.pdf




[5] Yael Tauman Kalai, Ran Raz, and Ron D. Rothblum
Delegation for bounded space
In Proceedings of the 45th annual ACM symposium on Theory of computing (STOC), pp. 565-574, 2013.
Available at: http://research.microsoft.com/en-us/um/people/yael/publications/2013-delegation-bounded-space.pdf



[6] Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, Madars Virza, SNARKs for C: verifying program executions succinctly and in zero knowledge , proc. CRYPTO 2013, Part II, LNCS 8043, 90-108, Springer, 2013.
Available at http://www.tau.ac.il/~tromer/papers/csnark-20131007.pdf


URL sujet détaillé :

Remarques : Pierre-Alain Fouque (http://www.di.ens.fr/~fouque/)



SM207-82 Towards an optimal data locality in MapReduce  
Admin  
Encadrant : Shadi IBRAHIM
Labo/Organisme : Inria, Rennes Bretagne Atlantique Research Center
URL : https://www.irisa.fr/kerdata/doku.php?id=start
Ville : Rennes

Description  

MapReduce [1] is emerging as a prominent tool for large-scale data analysis: it is often advocated as an easier-to-use, efficient and reliable replacement for the traditional programming model of moving the data to the computation. The popular open-source implementation of MapReduce, Hadoop [2], was developed primarily by Yahoo!, where it processes hundreds of terabytes of data on at least 10,000 cores; it is now also used by other companies, including Facebook, Amazon, Last.fm, and the New York Times.

The Hadoop MapReduce system runs on top of the Hadoop Distributed File System (HDFS), where files are partitioned into chunks and each chunk is replicated to tolerate failures. Data processing is co-located with data storage: when a node is available to process a map task, the job scheduler consults the storage metadata service to find a hosted chunk as close as possible, in this order: on the same node, on another node within the same rack, and on another node outside the rack. The goal is to leverage data locality. Many studies have been dedicated to improving the locality of task executions in Hadoop, through static data placement [3] or dynamic task scheduling [4]. Comparatively little attention has been devoted to investigating the impact of the several interleaved factors that contribute to data-local executions in Hadoop (i.e., data distribution, job size, node configuration, environment dynamicity), or to investigating the impact of data locality on the performance of MapReduce applications.
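
The three-level preference just described can be sketched as follows (a hypothetical stand-alone model; `pick_replica` and `rack_of` are invented names for this sketch, not Hadoop APIs):

```python
NODE_LOCAL, RACK_LOCAL, OFF_RACK = 0, 1, 2

def locality(task_node, replica_node, rack_of):
    """Distance of a replica from the node that will run the map task."""
    if task_node == replica_node:
        return NODE_LOCAL
    if rack_of[task_node] == rack_of[replica_node]:
        return RACK_LOCAL
    return OFF_RACK

def pick_replica(task_node, chunk_replicas, rack_of):
    """Choose the closest replica, mirroring the node/rack/off-rack order."""
    return min(chunk_replicas, key=lambda n: locality(task_node, n, rack_of))
```

The factors studied in this project (data distribution, replica placement, rack topology) all enter this model through `chunk_replicas` and `rack_of`.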

The proposed project aims at investigating the correlation between the aforementioned factors, as a step towards optimal data-local execution in Hadoop and therefore towards improving the performance of MapReduce applications. This work will provide an important opportunity to learn how to conduct rigorous, large-scale experiments with the Hadoop system on a scientific testbed (i.e., Grid'5000). Depending on the completion of the work, it could lead to the publication of a research article.

References

[1] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[2] Hadoop Apache homepage: http://www.hadoop.org
[3] J. Xie, S. Yin, X.J. Ruan, Z.Y. Ding, Y. Tian, M. J. Manzanares and X. Qin, "Improving MapReduce performance through data placement in heterogeneous Hadoop clusters," Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on, pp. 1-9, 19-23 April 2010
[4] S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu, S. Wu, "Maestro: Replica-aware map scheduling for MapReduce," in CCGrid 2012.
URL sujet détaillé : https://www.irisa.fr/kerdata/doku.php?id=master

Remarques : Co-supervisors:
Gabriel Antoniu
Luc Bougé



SM207-83 Exploring the Limits of Distributed Computation  
Admin  
Encadrant : Adrian KOSOWSKI
Labo/Organisme : Laboratoire d'Informatique Algorithmique
URL : http://www.liafa.fr/web9/equiprech/listequipe_en.php?ideq=2
Ville : Paris

Description  

In the setting of distributed computing, the graph coloring problem is of fundamental practical importance, and also serves as a sort of "benchmark problem" for testing our understanding of the power and limitations of local network-based computational models. Despite more than 20 years of intensive study, many fascinating research questions remain open.

This internship is intended to focus on a theoretical study of the interplay between time and quality of distributed graph coloring algorithms, taking into account aspects of determinism/randomization and properties of special graph classes.
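
As one concrete example of the algorithms in this area, a classic randomized round for (Delta+1)-coloring can be sketched as follows (a centralized simulation of the distributed rounds, for illustration only): every uncolored node proposes a color not used by already-colored neighbours, and keeps it if no neighbour proposed the same.

```python
import random

def coloring(adj, delta, seed=0):
    """Color the graph adj (dict: node -> neighbour list), with max degree
    delta, using colors 0..delta, by repeated randomized proposal rounds."""
    rng = random.Random(seed)
    color = {v: None for v in adj}
    while any(c is None for c in color.values()):
        proposal = {}
        for v in adj:                            # each uncolored node proposes
            if color[v] is None:
                used = {color[u] for u in adj[v] if color[u] is not None}
                proposal[v] = rng.choice(
                    [c for c in range(delta + 1) if c not in used])
        for v, c in proposal.items():            # keep it if no clash
            if all(proposal.get(u) != c for u in adj[v]):
                color[v] = c
    return color
```

Such rounds succeed for every node with constant probability, which yields the well-known O(log n) round bound with high probability; the interplay between round count, color count, and randomization is exactly the subject of the internship.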

For more details, please consult the PDF file of the proposal.

URL sujet détaillé : http://www.liafa.fr/~kosowski/stage-coloring.pdf

Remarques : Co-encadrement : Pierre FRAIGNIAUD, Adrian KOSOWSKI.



SM207-84 Real-time online emulation of real applications on SimGrid with Simterpose  
Admin  
Encadrant : Martin QUINSON
Labo/Organisme : LORIA
URL : http://www.loria.fr/~quinson
Ville : Nancy

Description  

The goal of this project is to design an evaluation environment for distributed applications (e.g., P2P applications) in which instances of the real application (a standard, unmodified BitTorrent client) are executed in a
virtual environment simulated by the SimGrid simulator.
URL sujet détaillé : http://www.loria.fr/~quinson/Research/Students/2015-simterpose.pdf

Remarques : Remuneration provided.



SM207-85 State Space Reduction for the Dynamic Formal Verification of Legacy Distributed Applications  
Admin  
Encadrant : Martin QUINSON
Labo/Organisme : LORIA
URL : http://www.loria.fr/~quinson
Ville : Nancy

Description  

The state space explosion problem greatly hinders the applicability of
exhaustive verification techniques. The goal of this work is to explore new
reduction techniques specifically tailored to the dynamic verification of large
legacy MPI applications.
URL sujet détaillé : http://www.loria.fr/~quinson/Research/Students/2015-reduction.pdf

Remarques : Remuneration provided.



SM207-86 Optimizing Incremental MapReduce Computations for Live Data Streams  
Admin  
Encadrant : Alexandru COSTAN
Labo/Organisme : Inria / IRISA
URL : http://www.irisa.fr/kerdata/
Ville : Rennes

Description  

Big Data analytics on cloud infrastructures has seen a rapid rise in recent years, thanks to the promise of delivering valuable insight in an affordable fashion for the masses. However, so far most attention has focused on optimizing the performance and cost-effectiveness of the computations, while largely neglecting an important aspect: users need to upload massive datasets to the cloud for their computations.

The goal of this internship is to study the problem of how to run Big Data analytics applications based on MapReduce [1] while jointly optimizing the performance and cost of the data upload and its corresponding computation taken together. To this end, one approach would be to analyze the feasibility of incremental MapReduce frameworks (e.g., HourGlass [2] from LinkedIn) to advance the computation as much as possible during the data upload, by using already-transferred data to calculate intermediate results.
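The incremental idea can be illustrated with a toy word count that merges partial results as each uploaded chunk becomes available, instead of waiting for the full dataset (illustrative sketch only, not HourGlass code):

```python
from collections import Counter
from typing import Iterable

def map_phase(chunk: Iterable[str]) -> Counter:
    """Map + local combine: count words in one uploaded chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts

def incremental_wordcount(chunks: Iterable[Iterable[str]]) -> Counter:
    """Merge partial results as chunks arrive during the upload,
    so most of the computation is done before the upload finishes."""
    total = Counter()
    for chunk in chunks:           # each chunk becomes available mid-upload
        total += map_phase(chunk)  # incremental reduce of intermediate results
    return total
```

When the final chunk lands, only one last merge remains, which is the point of overlapping upload and computation.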

An interesting research direction is to investigate the possibility of dynamically adapting the computation to account for fluctuations and variability of the cloud environment. Also, we plan to tackle the cost optimization aspect by making use of a complex model that takes both computation and storage into consideration, following the new Resource-as-a-Service paradigm [3] introduced for the next generation clouds. In this context, we plan to study the viability of several storage elasticity features introduced by our previous work [4].

Experiments to validate this approach will be carried out on the Grid'5000 platform [5], in the context of the H2020 BigStorage project.


URL sujet détaillé : http://www.irisa.fr/kerdata/doku.php?id=master#optimizing_incremental_mapreduce_computations_for_live_data_streams

Remarques : This internship will be co-advised by Bogdan Nicolae (IBM Research) and will be remunerated.



SM207-87 A Theory of Complexity, Condition and Round-off  
Admin  
Encadrant : Felipe CUCKER
Labo/Organisme : Dept. of Mathematics, City University of Hong Kong
URL : http://www6.cityu.edu.hk/ma/
Ville : Hong Kong

Description  

In 1989 Blum, Shub and Smale presented a theory of complexity over the real numbers that was meant to model numerical computations as usually performed in numerical analysis. At the end of their paper they noted that their theory did not account for iterative algorithms or for the effects of finite precision, and explicitly stated as an open problem to incorporate these features into the theory.

In the last year I developed a theory that does precisely this. The paper can be downloaded at

http://arxiv.org/abs/1403.6241v5

It presents versions of the classes P and NP in terms of notions of "input size" and "cost of a computation" that capture the usual features of finite-precision computations.

The subject of the stage is to explore the extension of other common complexity classes within this framework.
URL sujet détaillé :

Remarques : Joint supervision is possible (Pascal Koiran would be an appropriate supervisor at the LIP). Financial support is also possible.



SM207-88 Study and implementation of an intent recognition model in the context of the Web of Things  
Admin  
Encadrant : Michael MRISSA
Labo/Organisme : LIRIS
URL : http://liris.cnrs.fr/mmrissa/
Ville : Villeurbanne

Description  


This work is part of the ASAWoO ANR project (http://liris.cnrs.fr/asawoo/), which proposes a software infrastructure based on avatars [1,2] to build applications for the Web of Things. An avatar is a logical abstraction of a physical object: it extends the behavior and knowledge of the physical object, exploiting Semantic Web and Web service technologies and reasoning about its contextual environment. In addition, avatars support inter-avatar cooperation mechanisms.

An avatar can participate in the realization of several objectives related to different applications driven by users and/or organizations.
To do so, it must cooperate with other avatars to compensate for its lack of knowledge or functionality. In order to collaborate with entities in its environment that have compatible objectives, an avatar must be able to identify the intents of other avatars from its observation of their activities. We use a plan and intent recognition model [3] based on an artificial corpus to perform this observation.

It is necessary to demonstrate the applicability of this model in the context of the Web. In particular, a resource-oriented solution must be designed and developed that supports the different interactions required by our intent recognition model.

References:

[1] J.-P. Jamont, L. Médini, M. Mrissa: A Web-Based Agent-Oriented Approach to Address Heterogeneity in Cooperative Embedded Systems. PAAMS (Special Sessions) 2014: 45-52

[2] M. Mrissa, L Médini, J.P. Jamont , Semantic Discovery and Invocation of Functionalities for the Web of Things, IEEE International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises, Parma. pp. 281-286. 2014.

[3] G. Sukthankar, R. P. Goldman, C. Geib, D. V. Pynadath , H. H. Bui (editors), Plan, Activity and Intent Recognition, Morgan Kaufmann, 2013.

[4] Ferber. J. Les systèmes multi-agents. Vers une intelligence collective, InterEditions, Paris, 1995.


URL sujet détaillé :

Remarques :
Encadrement :
Jean-Paul Jamont (LCIS, Université de Grenoble Alpes),
Michael Mrissa (LIRIS CNRS, Université de Lyon)





SM207-89 A Security Property and Constraint Description Language  
Admin  
Encadrant : Brahim HAMID
Labo/Organisme : Institut de Recherche en Informatique de Toulouse (IRIT)
URL : www.semcomdt.org
Ville : Toulouse

Description  

The internship is part of the SEMCO project (www.semcomdt.org), a framework to assist system and software developers in the domain of security- and safety-critical systems to capture, implement and document distributed system applications. SEMCO supports model- and pattern-based development approaches by defining and providing several artifact types representing the different related engineering concerns (security, dependability, safety and resources) and architectural information. These artifacts are stored in a model-based repository and provided in the form of modeling languages (SEMCO-ML), tools (SEMCO-MDT) and methodologies (SEMCO-PM).

Goal:
The objective of this work is to design a modeling language for the specification of security properties and constraints in the context of model and pattern-based security engineering. This language is closely related to a modeling, formal specification and validation framework of security pattern studied in the context of the SEMCO project.

Tasks:
1) Identify the set of roles that a security pattern description language should fulfill to support a security engineering process, a reuse process, and the representations commonly used in industry and academia.
2) Define a set of design requirements that should be considered when building a security property and constraint description language.
3) Design a security property and constraint description language that satisfies these requirements and supports the specification of security pattern properties and constraints.
4) Develop an Xtext-based editor and suggest appropriate transformations in order to integrate this tool into the SEMCO-MDT tool suite.
URL sujet détaillé :

Remarques : The internship is remunerated.



SM207-90 Faster, Faster, Faster  
Admin  
Encadrant : Akim DEMAILLE
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : https://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weighting into account. The weights are taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions).
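As an illustration of weighted evaluation (a generic sketch, not Vcsn code): in the tropical semiring ⟨ℤ∪{∞}, min, +⟩, the weight of a word is the minimum, over all accepting paths, of the sum of transition weights. With transition matrices, this is just the matrix product with (+, ×) replaced by (min, +):

```python
INF = float("inf")  # semiring zero of (min, +)

def mat_mul_tropical(A, B):
    """Matrix 'product' in the tropical semiring:
    (A*B)[i][j] = min over k of A[i][k] + B[k][j]."""
    n, m, p = len(A), len(B), len(B[0])
    return [[min(A[i][k] + B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def word_weight(matrices, init, final):
    """Weight of a word = init ⊗ M(a1) ⊗ ... ⊗ M(ak) ⊗ final,
    i.e. the minimal total cost over all accepting paths."""
    row = [init]
    for M in matrices:            # one transition matrix per letter
        row = mat_mul_tropical(row, M)
    return mat_mul_tropical(row, [[f] for f in final])[0][0]
```

For example, with a two-state automaton where `init = [0, INF]`, `final = [INF, 0]` and one letter matrix `[[1, 2], [INF, 0]]`, the cheapest accepting path determines the word's weight; swapping the semiring back to ⟨𝔹, ∨, ∧⟩ recovers plain acceptance.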
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is the heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
Vcsn aims at speed and user friendliness. To achieve this goal, it is built in three layers:
- static: a templated C++ library with efficient data structures and algorithms
- dyn: a C++ API that supports type erasure and shields the user from the complex API of static while preserving performance. It also supports code generation and linking on the fly
- Python: on top of dyn, a Python binding allows pleasant use of Vcsn, with no C++ knowledge required.

The objective of this internship is to improve the performance of Vcsn in every possible way. This includes:
- design of a binary format to save and restore automata (possibly with mmap)
- improvements to the core data structures
- design of efficient alternatives to classical data structures such as those found in the C++ standard library
- improvement of the existing algorithms from automata theory, such as epsilon-transition removal
- optimization of the dyn type-erased dynamic dispatch
- development of benchmarks and comparison with other platforms
- etc.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_AD_Faster,_Faster,_Faster

Remarques :



SM207-91 Symmetric Difference Automata  
Admin  
Encadrant : Akim DEMAILLE
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : https://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weighting into account. The weights are taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions).
The field 𝔽₂ is of particular interest and exhibits quite a few unique properties, as demonstrated by current active research under various names (e.g., "symmetric difference automata").
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is the heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).

The objective of this internship is to exploit the existing features of Vcsn, applying them to 𝔽₂ in a first step, and then, in a second step, to use Vcsn as a tool to explore novel results.
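A defining quirk of 𝔽₂-weighted automata is that a word is accepted exactly when the *number* of accepting runs is odd: sums become XOR, products become AND. A small matrix sketch (the example automaton is made up for illustration, this is not Vcsn code):

```python
def gf2_mul(A, B):
    """Matrix product over F2: the sum is XOR, the product is AND."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            v = 0
            for k in range(m):
                v ^= A[i][k] & B[k][j]
            C[i][j] = v
    return C

def accepts_parity(matrices, init, final):
    """1 iff the number of accepting runs over the word is odd."""
    row = [init]
    for M in matrices:                       # one 0/1 matrix per letter
        row = gf2_mul(row, M)
    return gf2_mul(row, [[f] for f in final])[0][0]
```

For instance, an NFA with exactly two accepting runs on some word accepts it as a Boolean automaton but rejects it over 𝔽₂, since the two runs cancel out; this cancellation is the source of the unusual properties mentioned above.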
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Symmetric_Difference_Automata

Remarques :



SM207-92 Syntactic Monoids  
Admin  
Encadrant : Akim DEMAILLE
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

Many properties and characteristics of an automaton can easily be computed from its syntactic monoid. Such properties are of particular importance to theoreticians.
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is the heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
Vcsn has a sound base of data structures and algorithms for automata and rational expressions. However, it currently offers no support for syntactic monoids.
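Since the syntactic monoid of a regular language is the transition monoid of its minimal automaton, a natural first building block is computing the transition monoid: close the set of letter-induced state maps under composition. A minimal sketch (the `delta` encoding is illustrative, not Vcsn's representation):

```python
def transition_monoid(states, delta):
    """delta maps each letter to a tuple f where f[i] is the successor
    of state i. Returns the set of all state maps induced by words,
    i.e. the transition monoid of a complete deterministic automaton."""
    identity = tuple(range(len(states)))     # the empty word
    monoid = {identity}
    frontier = [identity]
    generators = list(delta.values())
    while frontier:
        f = frontier.pop()
        for g in generators:
            h = tuple(g[f[i]] for i in range(len(states)))  # apply f, then g
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid
```

For a two-state automaton where letter a swaps the states and letter b sends both to state 0, the closure yields four elements (identity, swap, and the two constant maps), a finite monoid from which properties such as aperiodicity can then be checked.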

The objective of this internship is to develop support for syntactic monoids in Vcsn, and to implement recent research results in Automata Theory that use the syntactic monoid.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Syntactic_Monoids

Remarques :



SM207-93 Vcsn for Linguists  
Admin  
Encadrant : Akim DEMAILLE
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The classical theory of automata, transducers and rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes the concept of weighting into account. The weights are taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions).
Automata are heavily used in computational linguistics, and conversely, automata used in computational linguistics are "heavy".
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is the heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
Vcsn has a sound base of data structures and algorithms for automata and rational expressions. It is already able to deal with many of the typical needs of linguists. However, some specific semirings have not been implemented, and some well-known algorithms are still needed.

The objective of this internship is to develop a complete computational linguistics toolchain on top of Vcsn, for instance the "SMS to French" project from François Yvon (sms.limsi.fr). To this end, Vcsn will have to be completed with the needed data structures and algorithms, and the existing implementation will possibly have to be revised to cope with the extremely demanding size of these real-life automata.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_AD_2015_Vcsn_for_Linguists

Remarques :



SM207-94 Development of an IPython interface, and a couple of web interfaces for the Spot library  
Admin  
Encadrant : Alexandre DURET-LUTZ
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The Spot library (http://spot.lip6.fr/), written in C++, offers several algorithms and data structures to build model checkers. It contains many algorithms for handling linear-time temporal logic (LTL) formulas, or for different variants of Büchi automata. Spot also distributes several command-line tools, built on top of the library. Finally, Spot includes some Python bindings, as well as a web interface (http://spot.lip6.fr/ltl2tgba.html) for easy access to one of the features of the library.

The goal of the internship is to improve the last two points (Python bindings and web interface): we would like to access all features of Spot from Python (and in particular from IPython), and on top of this Python interface, we would like more on-line services that people can use without having to install Spot.
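One standard mechanism for such notebook integration is IPython's rich-display protocol: when an object defines `_repr_svg_` (or `_repr_html_`), IPython renders it graphically whenever it is the result of a cell. A toy illustration of the hook (the `Automaton` class is made up, this is not Spot's actual API):

```python
class Automaton:
    """Toy object demonstrating the IPython rich-display protocol:
    the notebook calls `_repr_svg_` automatically to render the object
    as a picture instead of a plain text repr."""

    def __init__(self, name, states):
        self.name, self.states = name, states

    def _repr_svg_(self):
        # Draw one circle per state; a real binding would emit a
        # proper automaton rendering (e.g. via Graphviz).
        circles = "".join(
            f'<circle cx="{40 + 60 * i}" cy="30" r="15" '
            f'fill="none" stroke="black"/>'
            for i in range(self.states))
        return (f'<svg xmlns="http://www.w3.org/2000/svg" '
                f'width="{60 * self.states + 20}" height="60">{circles}</svg>')
```

With such hooks in place, every Spot object exposed through the Python bindings could display itself natively inside IPython notebooks.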

URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_Python_Interfaces_for_Spot

Remarques :



SM207-95 Minimization of Büchi automata using SAT-solving  
Admin  
Encadrant : Alexandre DURET-LUTZ
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The Spot library (http://spot.lip6.fr/) contains many algorithms for translating LTL formulas into Büchi automata, and for simplifying these formulas and automata. A recently added technique (in Spot version 1.2) allows us to minimize deterministic Büchi automata using a SAT solver.
To minimize an n-state deterministic Büchi automaton (an NP-complete problem), we encode its equivalence with an (n-1)-state deterministic Büchi automaton as a SAT problem, and let a SAT solver do the work of finding a solution. If such a solution is found, we try again, looking for an (n-2)-state automaton, etc.
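The shrink loop itself is simple; abstracting the SAT encoding and the solver call behind a hypothetical `has_equivalent` oracle (names invented for illustration), it looks like:

```python
def minimize_with_oracle(n, has_equivalent):
    """Outer loop of the SAT-based minimization sketched above.

    `has_equivalent(k)` stands for: encode "there exists a k-state
    deterministic Büchi automaton equivalent to the input" as a SAT
    problem and ask the solver whether it is satisfiable."""
    while n > 1 and has_equivalent(n - 1):
        n -= 1   # a smaller equivalent automaton exists; try smaller still
    return n     # no (n-1)-state equivalent: n is minimal
```

Because each iteration adds constraints for a smaller automaton, an incremental SAT solver (one of the improvements proposed above) could keep the solver state alive across iterations instead of restarting from a fresh problem file.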

Presently, our first implementation saves the SAT problem as a huge file before calling the SAT solver, and it does this at each iteration. The goal of this internship is to improve this situation in multiple ways:
- implement purely technical optimizations (such as direct communication with the SAT solver, bypassing temporary files)
- take advantage of features offered by modern SAT solvers (for instance, an "incremental" SAT solver can be given new constraints after it has found a solution, without restarting from scratch)
- implement optimizations to the encoding of the SAT problem, based on structural properties of the automaton (we have some ideas).

In a second step, we would like to generalize the existing technique to non-deterministic automata, or to different types of acceptance conditions (i.e., not Büchi).
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_SAT-based_Minimization

Remarques :



SM207-96 Algorithms for automata over infinite words using a general acceptance condition  
Admin  
Encadrant : Alexandre DURET-LUTZ
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

The formal method community uses several types of automata over infinite words (called "omega-automata"): Büchi automata, Streett automata, Rabin automata... These automata differ in their acceptance condition, i.e., the condition under which an infinite run of the automaton is accepted. For each of these acceptance conditions there exist specialized algorithms, for instance to decide whether an automaton accepts any word, to change one acceptance condition into another, or to determinize an automaton...
We have recently defined a file format for exchanging omega-automata between tools. This format can represent all the above types of automata thanks to a very general way to represent the acceptance condition. In fact, the acceptance condition specification is so general that it allows us to represent combinations of existing conditions.

The goal of this internship is to survey existing algorithms that differ from one acceptance condition to another, and to develop algorithms that work with the generalized acceptance condition. The first algorithms we would like to have are:
- emptiness check (is the language of the automaton empty?)
- conversion to Büchi acceptance
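For the first item, the classical Büchi case reduces to searching for an accepting lasso: an accepting state that is reachable and lies on a cycle. A textbook nested-DFS sketch in Python (the `succ`/`accepting` interface is invented for illustration; generalized acceptance would replace the single accepting-state test):

```python
def buchi_nonempty(initial, succ, accepting):
    """Nested-DFS emptiness check for a Büchi automaton.

    The language is nonempty iff some accepting state is reachable
    from `initial` and lies on a cycle (an accepting lasso exists)."""
    visited, flagged = set(), set()

    def inner(s, seed):
        # Second DFS: look for a path back to the accepting seed state.
        if s == seed:
            return True
        if s in flagged:
            return False
        flagged.add(s)
        return any(inner(t, seed) for t in succ(s))

    def outer(s):
        # First DFS; inner searches are launched in postorder, which is
        # what makes sharing the `flagged` set across them sound.
        visited.add(s)
        for t in succ(s):
            if t not in visited and outer(t):
                return True
        return s in accepting and any(inner(t, s) for t in succ(s))

    return outer(initial)
```

For the generalized acceptance conditions of the new format, this simple test on a single state no longer suffices, which is precisely why dedicated algorithms are needed.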
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_ADL_General_Acceptance_Condition

Remarques :



SM207-97 Lisp for the design and the development of DSLs  
Admin  
Encadrant : Didier VERNA
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

DSL design and implementation is booming. There are many approaches to this problem; one of them uses the extensibility capabilities of a general-purpose programming language to meet specific requirements. Many multi-paradigm approaches (e.g., functional languages / compile-time meta-programming) also offer benefits. Lisp is a language particularly well suited to this type of problem, but it is almost totally absent from the literature on DSLs. Indeed, most researchers are not (or are no longer) familiar with this language. They prefer a language based on a static approach, or have simply forgotten everything Lisp is capable of.

The purpose of this internship is to (begin to) fill this gap by adding Lisp to the comparative literature. The work will consist in studying literature demonstrating useful approaches for the design of DSLs, proposing an alternative approach with Lisp, and comparing the results. The steps of this internship will include:
- Familiarization with Common Lisp and in particular its extensibility capabilities.
- Getting acquainted with the literature on the design and implementation of DSLs in other languages.
- Choice of an application-centric or a comparative approach based on this literature, and implementation of a Lisp equivalent.
- Finally, analysis of the results along aspects such as ease of implementation, expressiveness, performance, etc.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_DV_Conception_DSL

Remarques :



SM207-98 Lisp for statistics  
Admin  
Encadrant : Didier VERNA
Labo/Organisme : Laboratoire de Recherche et Développement de l'EPITA (LRDE)
URL : http://www.lrde.epita.fr
Ville : Le Kremlin-Bicêtre

Description  

R is a programming language for statistics. Because of its historical development, it has many similarities with Scheme (a language from the Lisp family), but also many problems: in particular, it is not a compiled language. The biggest complaint of its users is therefore its lack of performance. For this (bad) reason, a number of totally esoteric features have been added to it.

The goal of this internship is to study how to port a particular R application to an equivalent one in Common Lisp. The first objective, which we are confident about, is to achieve a much higher level of performance than that of the original application. The long-term goal is to consider a new implementation of R (or of a subset of R) on top of Common Lisp, with the intuition that this will result in a much simpler and better system that is, in addition, much more efficient.
URL sujet détaillé : https://www.lrde.epita.fr/wiki/Jobs/M2_2015_DV_Lisp_For_Statistics

Remarques :



SM207-99 Verified graph transformations with attributes  
Admin  
Encadrant : Martin STRECKER
Labo/Organisme : LIG ou IRIT
URL : http://www.irit.fr/~Martin.Strecker/
Ville : Grenoble ou Toulouse

Description  

Within the ANR Climt project, we are working on graph transformations, and in particular the verification of graph transformations.

We are currently developing an imperative language for graph transformations, and a family of Hoare-style logics for reasoning about programs of this language. The logics are variants of Description Logics. A distinctive feature of our work is that both the programming language and the logic have been formalized in a proof assistant.

The graph transformations we have considered so far only take into account structural aspects of graphs, but not attributes, i.e. numeric values, strings and the like. Our graph transformation language and its associated program logic have been designed to yield decidable proof problems (with an effective proof procedure). The internship aims at exploring what happens when adding attributes: conceptually, how to preserve decidability; practically, how to implement corresponding proof procedures.
URL sujet détaillé : http://www.irit.fr/~Martin.Strecker/Teaching/Master/

Remarques : Co-supervision with Rachid Echahed (LIG, Grenoble)



SM207-100 Parsing Algorithms for Abstract Categorial Grammars  
Admin  
Encadrant : Philippe DE GROOTE
Labo/Organisme : LORIA/INRIA Nancy-Grand Est
URL : http://www.loria.fr/~degroote/
Ville : Nancy

Description  

Abstract Categorial Grammar (ACG) is a framework dedicated to the description of the syntax and the semantics of natural languages. It is based on the implicative fragment of linear logic and uses linear λ-terms. The latter allow for the modeling of both strings and trees, two data structures that are central in computational linguistics. From a formal language-theoretic point of view, ACGs can encode several classes of languages. In particular, second-order ACGs encode mildly context-sensitive languages (mCSL) such as Tree-Adjoining Grammars (TAG) or multiple context-free grammars (MCFG). M. Kanazawa showed how to reduce the parsing problem for this class of grammars to Datalog querying.

In order to model natural language phenomena, higher-order grammars have also proved useful. The goal of the proposed research topic is to adapt the results available for second-order grammars to higher-order grammars, using the reduction of parsing to a proof search problem in multiplicative exponential linear logic (MELL). The decidability of the latter fragment being an open problem, we do not know whether the parsing problem is decidable for arbitrary higher-order ACGs. For lexicalized ACGs, the problem is known to be decidable but NP-complete. For these reasons, we aim at identifying suitable fragments (particular lexicalizations) or principles (bounded typing contexts) that would provide decidable parsing algorithms with controlled complexity. The ACG toolkit, developed in the Polymnie ANR project, will provide a development and experimentation environment.
URL sujet détaillé : http://semagramme.loria.fr/lib/exe/fetch.php?media=projects

Remarques : The internship will be co-supervised with Sylvain Pogodalla (http://www.loria.fr/~pogodall).

The internship position is funded (remunerated) by the ANR project Polymnie (http://semagramme.loria.fr/doku.php?id=projects:polymnie).



SM207-101 Version management for big-data storage  
Admin  
Encadrant : Marc SHAPIRO
Labo/Organisme :
URL : http://lip6.fr/Marc.Shapiro/
Ville : Paris

Description  

Many modern applications access very large objects, such as the
friendship graph of a social network, or the directory used for storing
a large email collection. The cost of copying such a large object,
either to or from disk or over the network, may overwhelm the
computational cost of using the object. However, copying is a frequent
operation, for instance when in-place update is undesirable, e.g., to
accommodate concurrent updates, to ensure atomicity, or for fault
tolerance.

Techniques for addressing this problem include copy-on-write (CoW) page
management (as used in operating system kernels), and so-called
persistent objects (as used by the implementation of functional
languages). Unfortunately, existing CoW techniques are low-level and
hard to use, and furthermore are not applicable in modern managed
language environments such as Java or Erlang. Conversely, persistent
object techniques are general but have not been applied to a
distributed big-data setting. One recent result is the LSM-Tree, used
in the LevelDB database, for the special case of a large index tree.

The above techniques all follow a common pattern. A frozen snapshot of
the main (large) data structure is created. A new version consists of
both a pointer to the previous version and some kind of buffer. An
update is stored in the buffer, and does not modify the snapshot.
Reading the object state first checks the buffer, then (possibly) the
snapshot. Versions can be stacked or chained, several possibly sharing
the same base snapshot. When the chain of recent-update buffers becomes
too large or too complex, they are merged (according to the object's
semantics) into a new frozen snapshot.
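The snapshot-plus-buffer pattern just described can be sketched for a map-like object as follows (a toy illustration under the stated pattern, not the proposed library):

```python
class Version:
    """One version = a pointer to a frozen base snapshot plus a small
    buffer of recent updates; reads check the buffer before the snapshot."""

    MERGE_THRESHOLD = 1024   # when the buffer grows past this, fold it

    def __init__(self, snapshot=None):
        self.snapshot = dict(snapshot or {})   # frozen: never mutated
        self.buffer = {}                       # recent updates only

    def write(self, key, value):
        self.buffer[key] = value               # does not touch the snapshot
        if len(self.buffer) > self.MERGE_THRESHOLD:
            self._merge()

    def read(self, key):
        if key in self.buffer:                 # buffer first...
            return self.buffer[key]
        return self.snapshot.get(key)          # ...then the frozen snapshot

    def fork(self):
        """A new version sharing the same base snapshot: copying is cheap
        because only the small buffer is duplicated."""
        child = Version()
        child.snapshot = self.snapshot         # shared, not copied
        child.buffer = dict(self.buffer)
        return child

    def _merge(self):
        """Fold the buffer into a new frozen snapshot (the object's
        merge semantics; here, plain last-writer-wins on a map)."""
        merged = dict(self.snapshot)
        merged.update(self.buffer)
        self.snapshot, self.buffer = merged, {}
```

The internship's CRDT angle would replace the last-writer-wins `_merge` with type-specific merge functions whose commutativity and idempotence make concurrent buffers safely mergeable.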

This pattern has been used so far mostly in an ad-hoc way and has not
been studied scientifically. We wish to study its theoretical and
practical foundations and limitations, and to leverage the mergeability,
commutativity, associativity and idempotence properties of Conflict-Free
Replicated Data Types (CRDTs) to optimise and generalise it in a
principled way. The specific objectives of the internship are the
following:
- Develop a library of big-data versionable data types based on this
technique. Data types of interest include maps (directories), trees
and directed graphs.
- Implement them efficiently in the context of large multicore
computers and/or clusters.
- Deploy them in some representative applications.
- Perform experiments and measure performance, compared to alternative
implementations.
- Publish the findings in the scientific literature.

URL sujet détaillé :

Remarques : Requirements. The intern must have:
- An excellent academic record.
- A strong interest in and good knowledge of big-data applications,
distributed systems, distributed algorithms, or distributed
databases.
Knowledge of node.js is a plus.

The internship is fully funded, and will take place in the Regal group,
at Laboratoire d'Informatique de Paris-6 (LIP6), in Paris. A successful
intern will be invited to apply for a PhD.

To apply, contact Marc Shapiro, with the
following information:
- A resume or Curriculum Vitæ
- A transcript of studies and grades of the last two years of study
(an informal transcript is OK).
- Names and contact details of two references (people who can recommend
you). We will contact these people directly.




SM207-102 A distributed log for scalable big-data storage  
Admin  
Encadrant : Marc SHAPIRO
Labo/Organisme :
URL : http://lip6.fr/Marc.Shapiro/
Ville : Paris

Description  

A well-known technique from databases and file systems to support fast
writes and to be able to recover from failures, is to track all update
activity into a log. A log is an append-only, write-mostly file,
organised as an array of update records; this ensures that writes are
fast. If all activity is logged to disk, this supports recovery and
rebuilding state.

Another well-known database technique is to partition the storage, i.e.,
to logically divide it into disjoint parts (also known as shards). This
improves scalability, manageability, performance, availability and
distribution. Shards are kept independent. For fault tolerance, a shard
is replicated over multiple nodes.
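Hash partitioning plus replica placement can be sketched as follows (a deliberately simplistic placement for illustration; real systems use consistent hashing or explicit placement tables):

```python
import hashlib

def shard_of(key: str, num_shards: int) -> int:
    """Deterministically map a key to one of the disjoint shards."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def replicas_of(shard: int, nodes: list, factor: int = 3) -> list:
    """Place each shard on `factor` consecutive nodes so that its data
    survives individual node failures."""
    return [nodes[(shard + i) % len(nodes)] for i in range(factor)]
```

Every write for a key is then routed to the replicas of its shard, which is where the totally ordered per-shard log described below comes in.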

The subject of this internship is to explore the design of a
distributed, partitioned and replicated log. To support both performance
and fault-tolerance, log records are replicated among different servers
(instead of writing to disk). Each shard's log must remain totally
ordered across replicas, in order to provide consistent reads.

Big-data applications manipulate data structures such as maps, trees or
general graphs. The aim of the log is to combine fast and persistent
writes to such structures, delaying the actual application of the
operation to later asynchronous processing. This enables maintaining
different views and versions of the database efficiently, which is
important for transaction processing. Upper layers play the log and
build their state on demand, off of the write path.
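The append-only log with on-demand replay can be sketched as follows (class and record layout are illustrative only):

```python
class ShardLog:
    """Append-only log of update records for one shard. Writes only
    append; upper layers replay a prefix on demand to rebuild a view,
    off the write path."""

    def __init__(self):
        self.records = []           # append-only: writes are fast

    def append(self, op, key, value=None):
        """Record an update; the position gives the total order
        within the shard."""
        self.records.append((op, key, value))
        return len(self.records)

    def replay(self, upto=None):
        """Rebuild a state (here, a map view) by playing the first
        `upto` records in order; different `upto` values yield
        different versions of the database."""
        state = {}
        for op, key, value in self.records[:upto]:
            if op == "put":
                state[key] = value
            elif op == "del":
                state.pop(key, None)
        return state
```

Replaying to different positions is what makes maintaining multiple views and versions cheap: the log is the single source of truth, and materialized states are derived lazily.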

The work of the intern will be to design and implement this log, studying:
- A replication scheme supporting high performance.
- Mapping read and write operations to replicas.
- Comparing to existing solutions, such as disk- or SSD-based logs.

The intern's findings shall be written up for publication in the
scientific literature.

URL sujet détaillé :

Remarques : Requirements. The intern must have:
- An excellent academic record.
- A strong interest in and good knowledge of big-data applications,
distributed systems, distributed algorithms, or distributed
databases. Knowledge of the Erlang programming language is a plus.

The internship is fully funded and will take place in the Regal group, at Laboratoire d'Informatique de Paris-6 (LIP6), in Paris. A successful intern will be invited to apply for a PhD.

To apply, contact Alejandro Tomsic, with the
following information:
- A resume or Curriculum Vitæ
- A transcript of studies and grades of the last two years of study (an
informal transcript is OK).
- Names and contact details of two references, people who can recommend
you. We will contact these people directly.




SM207-103 Types and interpretations for programs of polynomial time complexity  
Admin  
Encadrant : Patrick BAILLOT
Labo/Organisme : LIP ENS Lyon
URL : http://perso.ens-lyon.fr/patrick.baillot/
Ville : Lyon

Description  


Can one define simple and modular programming disciplines in a high-level language ensuring that the programs produced have a bounded complexity, for instance of polynomial time?
This question is explored by the field of implicit computational complexity: it aims at defining static complexity criteria, which do not rely on explicit bounds on resources.

This internship proposal focuses on implicit complexity criteria for functional programs, which can use higher-order subprograms. It aims at relating two approaches for this issue, based respectively on type systems and on the semantic notion of interpretation.
URL sujet détaillé : http://perso.ens-lyon.fr/patrick.baillot/STAGES/2015/sujet_interpretations2015.pdf

Remarques :



SM207-104 Modelling and structural analysis of preference relations in the qualitative approach to multicriteria decision making  
Admin  
Encadrant : Miguel COUCEIRO
Labo/Organisme : LORIA (CNRS - Inria Nancy Grand Est - Universite de Lorraine)
URL : http://www.loria.fr/~mcouceir/
Ville : Nancy

Description  

*Basic setting*

In decision theory there are two main problems: the modelling of preferences and the axiomatization of such models, which allows, in particular, a justification of the chosen model. A third problem adds to these two: the elicitation, mining and learning of preferences. For references see, e.g., [BDPP,DeBaets,PRSR09].

Aggregation functions [GMMP] constitute an important tool for preference modelling. Preference relations can be represented by global utility functions that score the different alternatives, thus providing a simple decision procedure: alternatives with higher scores are preferred to those with lower scores. In this procedure, aggregation functions play a central role, as they can be used to aggregate (merge) the different criteria (describing each alternative) into a single scale. This refined model for representing preference relations R over a set of alternatives
X = X_1 x ... x X_n (e.g., apartments to buy) described by different attributes X_i (such as price, surface, location, etc.) can be formalised by the rule: a R a' iff U(a) <= U(a'), where U : X -> Y is a global utility function defined as the composition

U(a) = A(g_1(a_1), ..., g_n(a_n))    (*)

of an aggregation function A : Y^n -> Y (e.g., a weighted sum or a majority function) with local utility functions g_i : X_i -> Y
that express our (local) preferences on each attribute X_i. For instance, concerning the price it is reasonable to assume that cheaper apartments are preferred to more expensive ones, whereas for surface, bigger apartments are usually preferred to smaller ones.
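Rule (*) can be made concrete with a small numeric instance. All weights and local utilities below are invented for illustration: apartments are scored by composing local utilities g_i with a weighted-sum aggregation A.

```python
def A(scores, weights=(0.5, 0.3, 0.2)):
    """Aggregation function A : Y^n -> Y, here a weighted sum."""
    return sum(w * s for w, s in zip(weights, scores))

# Local utilities on price (cheaper is better), surface (bigger is better),
# and distance to the centre (closer is better), each mapped into [0, 1].
def g_price(p):    return max(0.0, 1.0 - p / 500000)
def g_surface(s):  return min(1.0, s / 100)
def g_distance(d): return max(0.0, 1.0 - d / 20)

def U(apartment):
    """Global utility U(a) = A(g_1(a_1), ..., g_n(a_n)) as in rule (*)."""
    price, surface, distance = apartment
    return A((g_price(price), g_surface(surface), g_distance(distance)))

a1 = (250000, 60, 5)    # mid-price, medium surface, central
a2 = (400000, 90, 15)   # expensive, large, far away
# a R a' iff U(a) <= U(a'): here a1 is preferred to a2.
assert U(a1) > U(a2)
```

With a weighted sum, criteria compensate each other; the qualitative setting below replaces A by a Sugeno integral, where min and max take the place of product and sum.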

*Qualitative setting and goals*

This proposal explores a qualitative approach to multicriteria decision making in which the aggregation procedure A of choice is the Sugeno integral [DubPraSab01,DubMarPraRouSab01]. Axiomatizations of model (*) were proposed in [CW1,CW2]. As it turned out, different aggregation procedures may culminate in the same global preference relation for suitable rankings over attributes. This called for algorithmic procedures to obtain all possible factorisations of form (*), that is, all Sugeno integrals A and all local utility functions g_i whose composition results in the same global utility function U. Such procedures were provided in [CW2]; they led to logical characterizations of preference relations in this qualitative setting [CW3] and gave tools for analysing the effect of local preferences on global preferences modelled by (*). The first objective is to familiarise the student with the emerging domain of aggregation theory and its applications in decision making and preference learning. The student will then be asked to implement the algorithmic procedures given in [CW1,CW2], with the following objectives:

1. Study the behaviour of global preferences with respect to local changes (variations of local preferences),

2. Analyse the computational complexity and the effect of considering simplified models (e.g., by limiting the interactions between criteria),

3. Explore methods of elicitation and learning of preferences in this qualitative setting.
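For reference, the aggregation function at the heart of the proposal can be computed directly. The sketch below is the textbook formula for the discrete Sugeno integral (not the factorisation algorithms of [CW1,CW2]); the capacity `mu` is an invented example.

```python
def sugeno(x, mu):
    """Discrete Sugeno integral of scores x w.r.t. capacity mu.

    mu maps frozensets of criteria indices into [0, 1]; it must be monotone,
    with mu(empty set) = 0 and mu(all criteria) = 1. Here mu is a plain dict.
    Formula: max over i of min(x_(i), mu({criteria scoring at least x_(i)})).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])   # ascending by score
    best = 0.0
    for k, i in enumerate(order):
        upper = frozenset(order[k:])               # criteria scoring >= x[i]
        best = max(best, min(x[i], mu[upper]))
    return best

# An example capacity on 2 criteria where criterion 0 alone already weighs a lot.
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.8,
    frozenset({1}): 0.3,
    frozenset({0, 1}): 1.0,
}
assert sugeno([0.9, 0.2], mu) == 0.8   # min(0.9, mu({0})) dominates
assert sugeno([0.2, 0.9], mu) == 0.3
```

Note how swapping the two scores changes the result asymmetrically: the capacity, not a weight vector, encodes the interaction between criteria.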


*Desired skills*

The candidate should be motivated and show initiative. Basic knowledge of discrete mathematics and good programming skills (for algorithmic implementations over ordered structures) are desirable.


*Bibliography*


[BDPP] D. Bouyssou, D. Dubois, H. Prade, M. Pirlot (eds). Decision-Making Process -- Concepts and Methods, ISTE/John Wiley, 2009.

[DeBaets] K. Cao-Van, B. De Baets, S. Lievens. A probabilistic framework for the design of instance-based supervised ranking algorithms in an ordinal setting, Annals Operations Research 163 (2008) 115--142.

[CW1] M. Couceiro, T. Waldhauser, Axiomatizations and factorizations of Sugeno utility functions, Internat. J. Uncertain. Fuzziness Knowledge-Based Systems, 19(4) (2011) 635--658.

[CW2] M. Couceiro, T. Waldhauser. Pseudo-polynomial functions over finite distributive lattices, Fuzzy Sets and Systems 239 (2014) 21--34

[CW3] M. Couceiro, D. Dubois, H. Prade, T. Waldhauser. Decision making with Sugeno integrals: DMU vs. MCDM. 20th European Conference on Artificial Intelligence (ECAI2012), 288--293, 2012.

[DubMarPraRouSab01] D. Dubois, J.-L. Marichal, H. Prade, M. Roubens, R. Sabbadin. The use of the discrete Sugeno integral in decision making: a survey. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9:5 (2001) 539--561.

[DubPraSab01] D. Dubois, H. Prade, R. Sabbadin. Decision-theoretic foundations of qualitative possibility theory. European Journal of Operational Research 128 (2001) 459--478.

[GMMP] M. Grabisch, J.-L. Marichal, R. Mesiar, and E. Pap. Aggregation Functions. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2009.

[FH] J. Fürnkranz, E. Hüllermeier (eds). Preference Learning. Springer, 2010.

[PRSR09] H. Prade, A. Rico, M. Serrurier, E. Raufaste. Eliciting Sugeno integrals: Methodology and a case study, in Proceedings of European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty 2009, pp. 712--723.

URL sujet détaillé : :

Remarques :



SM207-105 XPath-like query logics  
Admin  
Encadrant : Sylvain SCHMITZ
Labo/Organisme : U. of Warwick / LSV
URL : http://www.lsv.ens-cachan.fr/~schmitz/
Ville : Coventry / Cachan

Description  

XPath is a language for describing XML trees. It can be seen as a logic with navigation modalities over trees, possibly extended with primitives for comparing the data attached to tree nodes. The satisfiability problem for XPath queries becomes very hard as soon as data manipulation is allowed, since data values are drawn from an infinite set.
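As a concrete reminder of what such queries look like, here is a small example using the limited XPath subset supported by Python's standard `xml.etree.ElementTree` (the internship concerns the full logic with data comparisons, which goes well beyond this subset):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library>"
    "<book lang='fr'><title>A</title></book>"
    "<book lang='en'><title>B</title></book>"
    "</library>"
)

# Navigation with an attribute (data) test: titles of English books.
titles = [t.text for t in doc.findall("./book[@lang='en']/title")]
assert titles == ["B"]
```

The satisfiability question studied in the internship asks the converse: given such a query, does there exist *some* data tree on which it returns a result?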

Recent work suggests that a unifying viewpoint can be developed to better understand the decidability and complexity results for XPath with data. This viewpoint relies on proof theory, and more precisely on the use of substructural logics.

The internship is part of this programme. We will consider a modal logic corresponding to a simplified fragment of XPath with data, and will aim to characterise the complexity of its satisfiability problem, approached as a proof-search problem in a well-structured sequent calculus that will first have to be developed.

See the PDF below for more details.

URL sujet détaillé : http://www.lsv.ens-cachan.fr/Stages/Fichier/m2-15-ssdb.pdf

Remarques : The internship will be co-supervised by Sylvain Schmitz and David Baelde. It can take place either entirely at the University of Warwick (where Sylvain Schmitz will be a visiting professor during the second semester of 2014/2015) or half-time at LSV. The internship can lead to a PhD at LSV.



SM207-106 Coinductive and symbolic algorithms for non-standard automata  
Admin  
Encadrant : Damien POUS
Labo/Organisme : LIP, ENS Lyon
URL : http://perso.ens-lyon.fr/damien.pous/stages.html
Ville : Lyon

Description  

We recently proposed algorithms for checking language equivalence of non-deterministic finite automata (NFA), relying: 1/ on techniques from concurrency and process algebra theory: ``bisimulations up to'' [1], and 2/ on a symbolic representation of automata using BDDs [2].
Thanks to these techniques, our algorithms appear to be more efficient than all previously known algorithms (Hopcroft & Karp, partition refinement, antichains). We propose to study the possibility of extending these algorithms to other kinds of automata that are widely used in practice: Büchi automata, alternating automata, tree automata, etc.
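To fix ideas, here is the naive baseline that the algorithms of [1] improve on (this is not the "up to congruence" algorithm itself): language equivalence of two NFAs is checked by exploring pairs of subset states on the fly, looking for a reachable pair that disagrees on acceptance.

```python
def nfa_equiv(delta1, init1, final1, delta2, init2, final2, alphabet):
    """delta: dict (state, letter) -> set of successor states."""
    def step(states, a, delta):
        return frozenset(q for s in states for q in delta.get((s, a), ()))

    start = (frozenset(init1), frozenset(init2))
    todo, seen = [start], {start}
    while todo:
        s1, s2 = todo.pop()
        if bool(s1 & final1) != bool(s2 & final2):
            return False                      # one accepts, the other rejects
        for a in alphabet:
            nxt = (step(s1, a, delta1), step(s2, a, delta2))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# (a|b)*b as a 2-state NFA vs. an equivalent DFA written as an NFA.
d1 = {(0, 'a'): {0}, (0, 'b'): {0, 1}}
d2 = {(0, 'a'): {0}, (0, 'b'): {1}, (1, 'a'): {0}, (1, 'b'): {1}}
assert nfa_equiv(d1, {0}, {1}, d2, {0}, {1}, 'ab')
assert not nfa_equiv(d1, {0}, {1}, d2, {0}, {0}, 'ab')
```

The point of "bisimulations up to congruence" is precisely to prune this exploration: many subset pairs can be discharged from previously visited ones without ever being constructed.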

[1] F. Bonchi and D. Pous, Checking NFA equivalence with bisimulations up to congruence, in Proc. POPL 2013.
[2] D. Pous, Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests. To appear in Proc. POPL 2015.
URL sujet détaillé : :

Remarques :



SM207-107 Formalisation and certification of ownership in a specialised Java virtual machine  
Admin  
Encadrant : Damien POUS
Labo/Organisme : LIP, ENS Lyon
URL : http://perso.ens-lyon.fr/damien.pous/stages.html
Ville : Lyon

Description  

(Joint work with Fabienne Boyer and Olivier Gruber, Grenoble)
One major challenge of reconfigurable component-based systems is to maintain the consistency of the system despite failures occurring at reconfiguration time. Fabienne Boyer and Olivier Gruber recently proposed a low-level model allowing one to guarantee that a given system is always consistent. This model was implemented as a Java virtual machine and we proved in Coq that the reconfiguration algorithms are correct [1]. The model includes many other features for component-based programming, including a well-defined notion of ownership [2].
We propose to formalise this aspect of the model in Coq: the corresponding algorithms deal with potentially cyclic graphs of objects, making it a really challenging task to certify them.

[1] F. Boyer, O. Gruber, and D. Pous, Robust Reconfigurations of Component Assemblies. In Proc. ICSE'13, IEEE / ACM, 2013.
Associated Coq development:
http://perso.ens-lyon.fr/damien.pous/rrca/

[2] F. Boyer, O. Gruber, Ownership-Based Isolation for Concurrent Actors on Multi-core Machines. In Proc. ECOOP'13, 2013.

URL sujet détaillé : :

Remarques : Co-supervised with Fabienne Boyer and Olivier Gruber (LIG, Grenoble)




SM207-108 PiCoq  
Admin  
Encadrant : Damien POUS
Labo/Organisme : LIP, ENS Lyon
URL : http://perso.ens-lyon.fr/damien.pous/stages.html
Ville : Lyon

Description  

We would like to apply recently introduced proof techniques to help with the formalisation in Coq of the operational theory of pi-calculus-like programming languages.
The pi-calculus is a formalism introduced by R. Milner [1] (see also [2]) for the specification and analysis of concurrent systems, or processes. Reasoning about these processes usually involves behavioural equivalence relations: two processes are considered equal when they exhibit the same behaviour. While the theory of these equivalences is well established, formally proving that two concrete processes are equal can be hard and tedious. This is where a proof assistant would really be useful, both to make sure that all cases are covered and to automatically discharge the numerous sub-proof obligations.
The internship will consist both in applying known techniques within the Coq proof assistant to prove various results about the pi-calculus, and in developing new techniques or new abstractions to overcome the difficulties inherent to the mechanised formalisation of such proofs.
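The central notion of behavioural equivalence can be illustrated outside Coq. The sketch below is a naive strong-bisimilarity check on finite labelled transition systems (computing the greatest bisimulation by refinement), applied to the classic pair a.b + a.c vs. a.(b + c); it is an illustration of the concept, not of the "up to" techniques the internship would use.

```python
def bisimilar(lts, p, q):
    """lts: dict state -> set of (label, successor). True iff p ~ q."""
    states = list(lts)
    rel = {(a, b) for a in states for b in states}   # start from everything

    def matches(a, b):
        # every move of a is answered by a move of b with the same label
        return all(any(lb == la and (ta, tb) in rel for lb, tb in lts[b])
                   for la, ta in lts[a])

    changed = True
    while changed:                                   # refine until fixpoint
        changed = False
        for pair in list(rel):
            a, b = pair
            if not (matches(a, b) and matches(b, a)):
                rel.discard(pair)
                changed = True
    return (p, q) in rel

# P = a.b.0 + a.c.0  vs.  Q = a.(b.0 + c.0): trace-equivalent, not bisimilar.
lts = {
    "P":   {("a", "Pb"), ("a", "Pc")},
    "Pb":  {("b", "0")},
    "Pc":  {("c", "0")},
    "Q":   {("a", "Qbc")},
    "Qbc": {("b", "0"), ("c", "0")},
    "0":   set(),
}
assert not bisimilar(lts, "P", "Q")
assert bisimilar(lts, "Pb", "Pb")
```

For the pi-calculus the state space is infinite and names can be communicated, which is exactly why such brute-force checks fail and mechanised coinductive proofs are needed.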

[1] R. Milner, Communicating and Mobile Systems: the Pi-Calculus, Cambridge University Press.
[2] M. Hennessy, A Distributed pi-calculus, Cambridge University Press.

URL sujet détaillé : :

Remarques :



SM207-109 Finite approximations of Markov chains modelled by probabilistic regular graphs  
Admin  
Encadrant : Christophe MORVAN
Labo/Organisme : INRIA
URL : http://www-igm.univ-mlv.fr/~cmorvan/index.php
Ville : Rennes

Description  

Markov chains are a simple and powerful tool for modelling uncertainty in systems. They make it possible to define a probability measure on the set of executions, and hence to quantify various properties of the system itself.

Probabilistic pushdown automata are certainly the most abundantly studied infinite-state Markov chains [1]. Such an automaton is defined like a classical pushdown automaton, but its transitions are also labelled with probabilities. More recently, similar systems defined by deterministic graph grammars (with probabilistic labelling functions) have been considered [2]. These grammars define structures in which finitely many patterns are repeated according to a finite set of rules.

In this work we plan to study finite abstractions of such infinite systems. In particular, we expect to characterise systems whose finite restrictions contain a subset of executions of strictly positive measure. In such cases we also want to produce algorithms constructing finite restrictions that contain a subset of executions of measure p, for an arbitrary p.
We also want to compare this approach with the "probability of termination" of Etessami [3].
We would also like to establish the precise complexity of each algorithm. Finally, it would be interesting to connect this work with that of Brázdil and Kučera, who define approximations for more general chains [4].
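A minimal instance of the termination-probability computation mentioned above: for a one-counter random walk (a very simple probabilistic pushdown system) that decrements with probability p and increments with probability 1-p, the termination probability x is the least solution of x = p + (1-p)·x², obtainable by Kleene fixed-point iteration from 0. This is a standard textbook example, not the internship's graph-grammar setting.

```python
def termination_probability(p, iterations=10000):
    """Least fixed point of x = p + (1 - p) * x**2, iterated from 0."""
    x = 0.0
    for _ in range(iterations):
        x = p + (1 - p) * x * x
    return x

# Biased downward (p = 0.6): the walk terminates almost surely.
assert abs(termination_probability(0.6) - 1.0) < 1e-3
# Biased upward (p = 0.3): the least fixed point is p / (1 - p) = 3/7.
assert abs(termination_probability(0.3) - 0.3 / 0.7) < 1e-9
```

In the upward-biased case the non-terminating executions form a set of measure 4/7: this is exactly the kind of mass that no finite restriction of the chain can capture, which motivates the characterisation sought in the internship.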

[1] J. Esparza, A. Kučera and R. Mayr, Model Checking Probabilistic Pushdown Automata, LMCS 2006
[2] N. Bertrand and C. Morvan, Probabilistic regular graphs, Infinity 2010
[3] K. Etessami, The complexity of analyzing infinite-state Markov chains, Markov decision processes, and stochastic games; Stacs 2013
[4] T. Brázdil and A. Kučera, Computing the Expected Accumulated Reward and Gain for a Subclass of Infinite Markov Chains

URL sujet détaillé : http://master.irisa.fr/index.php/fr/370?id_stage=39

Remarques :



SM207-110 Mitigation of harmful Network Interactions for Extreme Scale Computing Platform  
Admin  
Encadrant : Denis TRYSTRAM
Labo/Organisme : LIG/CNRS/INRIA
URL : http://moais.imag.fr/membres/denis.trystram/
Ville : Grenoble

Description  

Context
-------

Most current supercomputers have a single, multi-purpose
interconnection network. Such a network carries both I/O traffic and
the internal communications of jobs. Blue Waters, with its 3D torus
network, is affected by such issues. The network is shared by all the
jobs running on the machine. This leads to harmful effects such as
congestion and hinders application performance.

Two orthogonal lines of research have already been identified:

1. Enhanced allocation strategies:

- Constructing allocations with specific shapes for jobs (e.g.,
convex or contiguous jobs) would allow most of the network
traffic generated by a job to be contained within its physical
geometric boundaries. With such shapes we can optimize the
network load induced by an application. Solving this problem is
however not enough, as the I/O nodes are spread across the
machine. It may also be specific to the 3D torus topology.
- Another strategy would be to over-allocate and then wisely use the
spare nodes allocated to a job. Still, there is a clear trade-off
to determine.

2. I/O pacing:

Network use tends to be bursty, hence leading to congestion.
By pacing or delaying the I/O of some nodes, we may avoid
congestion and thus gain in performance. Instead of detecting such
patterns through packet drops, we could be proactive by using the
gathered metrics.
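The pacing idea in point 2 is commonly realised with a token-bucket rate limiter. The sketch below is purely illustrative (not part of the proposal's infrastructure): bursts of I/O requests are spread out so that the sustained rate never exceeds a configured budget.

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst     # units/second, burst capacity
        self.tokens, self.last = burst, 0.0

    def delay_for(self, now, cost=1.0):
        """Seconds the caller should wait before sending `cost` units at `now`."""
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return 0.0
        wait = (cost - self.tokens) / self.rate  # time until enough tokens exist
        self.tokens = 0.0
        return wait

bucket = TokenBucket(rate=10.0, burst=2.0)       # 10 units/s, bursts of 2
assert bucket.delay_for(0.0) == 0.0              # burst capacity absorbs it
assert bucket.delay_for(0.0) == 0.0
assert bucket.delay_for(0.0) > 0.0               # third immediate send is paced
```

A proactive scheme, as suggested above, would set `rate` from the gathered network metrics rather than reacting to packet drops.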

Tough points
------------

As of today, we know with a time granularity of about a minute the
global behaviour of the Blue Waters machine. However, such granularities
(in space and time) seem too coarse to deal efficiently with network
congestion and other related issues.

Program
-------

1. Improve data collection to get per job information: which time/space
granularity trade-off is the right one?
2. Identify and characterize applications that could benefit from one
or both approaches. This step entails the identification of the
relevant metrics and the corresponding temporal scales to
characterize such applications.
3. Design algorithms with time complexity compatible with the
identified scales. These algorithms aim at mitigating jobs' harmful
network interactions.
URL sujet détaillé : :

Remarques : Duration: 4 to 6 months



SM207-111 Solving combinatorial games using graphs  
Admin  
Encadrant : Eric DUCHENE
Labo/Organisme : LIRIS
URL : liris.cnrs.fr/~educhene/ :
Ville : Lyon

Description  

Combinatorial games are a relatively recent research area. They are two-player games with perfect information and no element of chance; among the best-known are chess, Go, and Hex. Research in this area aims at obtaining a winning strategy for one of the two players. This remains difficult in many cases, given the combinatorial complexity involved. Moreover, the mathematical theory for "exactly" solving such games is still under construction, and draws in particular on concepts from graph theory and number theory. During this internship, the student will first become familiar with the main concepts of combinatorial game theory by solving, on his or her own, some games whose winning strategies are known. The notions of game sums and of the Grundy function will be covered. The student will then study so-called "miserable" games (introduced by Gurvich in 2012), i.e. those for which the "normal" version (the player making the last move wins) and the "misère" version (the player making the last move loses) have nearly identical solutions. Such a study is a first step towards solving misère games, which constitute the most ambitious topic in the field.
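The Grundy function mentioned above can be computed for a standard worked example (not part of the internship subject itself): in the subtraction game where a move removes 1, 2 or 3 tokens, a position is losing for the player to move exactly when its Grundy value is 0.

```python
def grundy(n, moves=(1, 2, 3)):
    """Grundy value of a heap of n tokens in the subtraction game."""
    reachable = {grundy(n - m, moves) for m in moves if m <= n}
    g = 0
    while g in reachable:        # mex: minimum excludant of reachable values
        g += 1
    return g

# For subtraction {1,2,3} the Grundy value is n mod 4: multiples of 4 lose.
assert [grundy(n) for n in range(8)] == [0, 1, 2, 3, 0, 1, 2, 3]
```

Grundy values are what make game sums tractable: the value of a sum of games is the XOR of the values of its components (the Sprague-Grundy theorem).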
URL sujet détaillé : :

Remarques : Co-supervised with Aline Parreau.



SM207-112 A logical study of scheduling in process algebras  
Admin  
Encadrant : Emmanuel BEFFARA
Labo/Organisme : LIP - ENS Lyon
URL : http://iml.univ-mrs.fr/~beffara/
Ville : Lyon

Description  

The Curry-Howard correspondence establishes an isomorphism between intuitionistic proofs and functional programs and provides fruitful methods for studying both proofs and programs. In its extension to classical logic by means of sequential control operators, the well-known continuation-passing-style translations appeared as the operational counterpart of translations from classical to intuitionistic logic; this extension has been very fruitful both in logic and in programming theory. The transposition of these methods to concurrent models of computation is still problematic because crucial features of concurrent systems, like non-determinism and explicit temporal constraints between events, have not yet found their logical counterpart.

The point of this internship is to explore a new approach to this question in which proofs are put in correspondence not with concurrent programs but with schedulers. The technical objective is to reinterpret known transformations on processes from synchronous formalisms to asynchronous ones as translations between logical systems, as a concurrent analogue of CPS translations.
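Since the internship treats scheduler extraction as "a concurrent analogue of CPS translations", here is a sequential reminder of what a CPS translation does (a standard example, in Python for concreteness): the direct-style factorial versus its continuation-passing version, where the evaluation order is made explicit by the continuation argument.

```python
def fact(n):
    """Direct style: evaluation order is implicit in the host language."""
    return 1 if n == 0 else n * fact(n - 1)

def fact_cps(n, k):
    """CPS: `k` receives the result; sequencing is now an explicit object."""
    if n == 0:
        return k(1)                               # hand the result to the continuation
    return fact_cps(n - 1, lambda r: k(n * r))    # "then multiply by n, then k"

assert fact_cps(5, lambda r: r) == fact(5) == 120
```

In the concurrent setting envisioned above, the role of the continuation would be played by a scheduler deciding which event fires next, rather than by a single "rest of the computation".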
URL sujet détaillé : http://iml.univ-mrs.fr/~beffara/stage-m2if.pdf

Remarques :



SM207-113 Termination of programs with complex data structures  
Admin  
Encadrant : Laure GONNORD
Labo/Organisme : University of Lyon / LIP- ENS Lyon
URL : laure.gonnord.org/pro :
Ville : Lyon

Description  

Proving the termination of a flowchart program can be done by
exhibiting a ranking function, i.e., a function from the program
states to a well-founded set, which strictly decreases at each
program step. In [compsysTermination:sas2010] we proposed a
general algorithm for the computation of linear ranking functions,
which is an adaptation of the linear-programming-based scheduling
method of [DarteVivien97].
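The definition above can be checked by brute force on a toy loop (a hypothetical example, unrelated to the article's algorithm): for the gcd-style loop `while x != y: if x > y: x -= y else: y -= x` over positive integers, the linear function rho(x, y) = x + y maps states to the non-negative integers and strictly decreases at each step.

```python
def step(x, y):
    """One iteration of the gcd-style loop body (assumes x != y)."""
    return (x - y, y) if x > y else (x, y - x)

def rho(x, y):
    """Candidate linear ranking function."""
    return x + y

# Exhaustively check the ranking property on a grid of initial states.
for x0 in range(1, 30):
    for y0 in range(1, 30):
        x, y = x0, y0
        while x != y:                             # loop guard
            nx, ny = step(x, y)
            assert 0 <= rho(nx, ny) < rho(x, y)   # bounded below, strict decrease
            x, y = nx, ny
```

Each run above is itself guaranteed to stop because rho decreases within a well-founded set; loops over inductively defined data, the internship's target, need ranking functions on richer well-founded orders than the integers.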

However, the limits of our approach appear as soon as we face
recursive programs or irregular data. In fact, the underlying
program model, namely the polyhedral model, has intrinsic
restrictions that have only been partially lifted.

In this internship, we will focus on the relationship between
program-termination techniques and scheduling/termination techniques,
aiming at a way to prove the termination of sequential programs with
data structures directly, i.e., without translating them into term
rewriting systems (TRSs).

The candidate is expected to:
* Read the state-of-the-art research articles on sequential
termination and rewriting termination.
* Propose a framework able to deal with sequential programs with
inductively defined data structures and with recursive programs.
* Implement a prototype.
URL sujet détaillé : http://laure.gonnord.org/pro/research/termdata_2015.pdf

Remarques : Possible co-supervision with D. Monniaux (Vérimag) and Carsten Fuhs (UCLondon)



SM207-114 Affine Transformations of Digital Images using Arbitrary Precision Computations  
Admin  
Encadrant : Nicolas MAGAUD
Labo/Organisme : Icube - UMR 7357 CNRS Université de Strasbourg
URL : http://dpt-info.u-strasbg.fr/~magaud
Ville : STRASBOURG

Description  

Using affine or linear transformations such as rotations, symmetries, homotheties and shear mappings is commonplace in digital image processing. These transformations are usually modelled as matrices of real numbers. Because floating-point numbers cannot represent all real numbers, but only a finite set of rational numbers, the implementations of affine transformations often have properties that differ from those of the real transformations they are supposed to represent.

Quasi-Affine Applications (a.k.a. QAA) are obtained by the discretization of affine applications. They can be represented as matrices of rational numbers and allow one to describe linear or affine transformations such as discrete rotations. An Ω-QAA can be viewed as a sequence of Quasi-Affine Applications whose successive values (rational numbers) come ever closer to the actual real application. This representation both captures the discrete features of the plane (or space) and allows exact computation within a discrete model of the continuum based only on integers: the Harthong-Reeb line.

In this internship, we shall state and prove the mathematical properties of the Ω-QAA. Proofs shall be carried out using the Coq proof assistant. We shall also study some algorithms based on the Ω-QAA and formally prove their properties. A possible extension consists in adapting these algorithms and their correctness proofs to a three-dimensional setting.
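A hypothetical instance of a Quasi-Affine Application: a discretized rotation computed with integers only, using the Pythagorean approximation cos(theta) = 4/5, sin(theta) = 3/5. Floor division plays the role of the discretization; the parameters are an illustration, not the Ω-QAA construction itself.

```python
def qaa_rotation(x, y, p=4, q=3, r=5):
    """Discrete rotation (x, y) -> (floor((p*x - q*y)/r), floor((q*x + p*y)/r)),
    i.e. a rational rotation matrix [[p/r, -q/r], [q/r, p/r]] followed by floor."""
    return ((p * x - q * y) // r, (q * x + p * y) // r)

# Exact on grid points where the rational rotation lands on the integer grid:
assert qaa_rotation(5, 0) == (4, 3)
assert qaa_rotation(0, 5) == (-3, 4)
# Elsewhere the result is the floor of the real rotation, so unlike the real
# rotation the discrete map need not be injective:
assert qaa_rotation(1, 0) == (0, 0)
assert qaa_rotation(0, 0) == (0, 0)
```

Properties like the failure of injectivity above are exactly the kind of discrete behaviour that the formal Coq study of the Ω-QAA is meant to characterise precisely.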


URL sujet détaillé : http://dpt-info.u-strasbg.fr/~magaud/TrADiCont/SujetStageM2English.pdf

Remarques : Co-supervised with Marie-Andrée Da Col-Jacob and Loïc Mazo (MIV team, ICube). The internship is funded by the ICube internal project "TrADiCont".



SM207-115 In situ analysis of molecular simulation data on very large supercomputers  
Admin  
Encadrant : Bruno RAFFIN
Labo/Organisme : CEA
URL : http://moais.imag.fr/membres/bruno.raffin
Ville : Bruyères-le-Chatel

Description  

The computing capacity of the next generations of machines will allow very large numerical simulations producing massive amounts of data. Managing these data, from the moment they are produced until their analysis and visualisation, is a major challenge. This internship takes place in this context, focusing on molecular dynamics simulations. When data processing (compression, indexing, analysis, visualisation) is performed while the data are still in the main memory of the machine running the simulation, it is called "in situ" processing. When it takes place after the results have been stored on disk, once the simulation is over, one speaks of "post-processing". This research area, at the frontier between Big Data and Exascale concerns, is identified as critical for very large-scale numerical simulation on the next generations of machines [5]. It interests a broad community that goes well beyond molecular dynamics.
Our team at CEA develops the classical molecular dynamics code ExaStamp, designed for very large-scale simulations on future Exaflop machines. ExaStamp can already run on several thousand hybrid nodes combining multi-core CPUs and accelerators such as GPUs or Xeon Phi [1], and already handles simulations with several billion particles (atoms and molecules). Without appropriate management of the data produced, it will be very difficult, or even impossible, to fully exploit the potentially very voluminous results of these simulations. At INRIA, we have developed a software environment for parallel in situ and post-mortem data processing [2,3]. This very flexible environment makes it possible to evaluate a wide variety of scenarios, but it is only a first step towards efficient support of Exascale machines.

The goal of this internship is to study the parallelisation of an in situ processing algorithm operating on the atom positions provided by the simulation. From these data, the algorithm will compute a grid dividing the simulation space into voxels, each carrying pressure, temperature and electromagnetic-field information. It will generate computations and communications whose impact on the performance of the concurrently running simulation must be minimised. To this end, the candidate will study several strategies for interleaving the simulation and the algorithm (e.g., asynchronous processing on a dedicated core). The computed grid will then be used for further processing, such as extracting a pressure isosurface that the user can visualise while the simulation is running.
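The core in situ processing step (binning atom positions into a voxel grid carrying per-voxel quantities) can be sketched serially as follows. This is an illustrative sketch, not ExaStamp code; the per-atom "temperature" value is an invented stand-in for the physical quantities mentioned above.

```python
def voxelize(atoms, box=1.0, n=4):
    """atoms: list of (x, y, z, value) in a cubic box of side `box`.
    Returns a dict mapping voxel index (i, j, k) to the mean value inside it."""
    h = box / n                                   # voxel edge length
    sums, counts = {}, {}
    for x, y, z, v in atoms:
        key = (min(int(x / h), n - 1),            # clamp points on the far face
               min(int(y / h), n - 1),
               min(int(z / h), n - 1))
        sums[key] = sums.get(key, 0.0) + v
        counts[key] = counts.get(key, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

atoms = [(0.10, 0.10, 0.10, 300.0),
         (0.15, 0.05, 0.20, 310.0),
         (0.90, 0.90, 0.90, 500.0)]
grid = voxelize(atoms)
assert grid[(0, 0, 0)] == 305.0     # two atoms averaged in the first voxel
assert grid[(3, 3, 3)] == 500.0
```

The parallelisation question studied in the internship is precisely how to run such a pass (distributed across nodes, with ghost regions at domain boundaries) without perturbing the simulation that produces the positions.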

For the experiments, the candidate will have access to very large supercomputers. He or she will join the team developing ExaStamp, located at the CEA site of Bruyères-le-Châtel, while being co-advised by Bruno Raffin, researcher at INRIA Grenoble and specialist in in situ analysis and visualisation. The internship can be followed by a PhD, for which funding is already secured.

[1] A. Heinecke, M. Klemm and H.-J. Bungart, "From GPGPUs to Many-Core: NVIDIA Fermi and Intel Many Integrated Core Architecture," Computing in Science and Engineering, vol. 14, no. 2, pp. 78-83, 2012.
[2] Matthieu Dreher, Bruno Raffin. "A Flexible Framework for Asynchronous In Situ and In Transit Analytics for Scientific Simulations", 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (2014). (http://hal.inria.fr/hal-00941413/en)
[3] Matthieu Dreher, Jessica Prevoteau-Jonquet, Mikael Trellet, Marc Piuzzi, Marc Baaden, Bruno Raffin, Nicolas Férey, Sophie Robert, Sébastien Limet. "ExaViz: a Flexible Framework to Analyse, Steer and Interact with Molecular Dynamics Simulations", Faraday Discussions of the Chemical Society (2014) (http://hal.inria.fr/hal-00942627/en)
[4] C. Docan, M. Parashar, and S. Klasky, "DataSpaces: an Interaction and Coordination Framework for Coupled Simulation Workflows," Cluster Computing, vol. 15, no. 2, pp. 163-181, 2012.
[5] Synergistic Challenges in Data-Intensive Science and Exascale Computing. US Department of Energy, March 2013 (http://science.energy.gov/~/media/40749FD92B58438594256267425C4AD1.ashx).
[6] F. Zheng, H. Zou, G. Eisnhauer, K. Schwan, M. Wolf, J. Dayal, T. A. Nguyen, J. Cao, H. Abbasi, S. Klasky, N. Podhorszki, and H. Yu, "FlexIO: I/O middleware for Location-Flexible Scientific Data Analytics," in IPDPS'13, 2013.
[7] M. Dorier, G. Antoniu, F. Cappello, M. Snir, and L. Orf. "Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O," in CLUSTER - IEEE International Conference on Cluster Computing. IEEE, Sep. 2012.
[8] M. Dorier, S. Ibrahim, G. Antoniu, R. Ross. Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction. Accepted, to appear in Proc. of ACM/IEEE SC 2014 (New Orleans, November 2014).
[9] Fang Zheng, Hongfeng Yu, Can Hantas, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Scott Klasky. GoldRush: Resource Efficient In Situ Scientific Data Analytics Using Fine-Grained Interference Aware Execution. Proceedings of ACM/IEEE Supercomputing Conference (SC'13), November, 2013.

URL sujet détaillé : :

Remarques : Location: CEA, Bruyères-le-Châtel, France.
Salary: about 1000 euros/month.
This internship may be followed by a PhD (funding already secured).



SM207-116 In situ analysis and large-scale parallel simulations in fluid mechanics  
Admin  
Encadrant : Bruno RAFFIN
Labo/Organisme : INRIA Grenoble
URL : http://moais.imag.fr/membres/bruno.raffin
Ville : Montbonnot

Description  

The computing capacity of the next generations of machines will allow very large numerical simulations producing massive amounts of data. Managing these data, from the moment they are produced until their analysis, is a major challenge. This internship takes place in this context, focusing on fluid mechanics simulations. When data processing (compression, indexing, analysis, visualisation) is performed while the data are still in the main memory of the machine running the simulation, it is called "in situ" processing. When it takes place after the results have been stored on disk, once the simulation is over, one speaks of "post-processing". This research area, at the frontier between Big Data and Exascale concerns, is identified as critical for very large-scale numerical simulation on the next generations of machines [5].

EDF's R&D department develops parallel simulation codes such as the fluid-mechanics code SATURNE, which runs on several thousand compute cores. Studies performed with such codes produce large volumes of data, mostly described as scalar fields (e.g., the temperature at each position in space in a thermodynamics problem) and vector fields (e.g., the velocity of water particles). Figure 1 shows a flow visualisation in an industrial environment. The amount of data produced per simulation is very large, and each simulation is very often repeated several times with different parameter sets.

A first in situ data-analysis prototype has been developed with SATURNE and the Catalyst/ParaView library [1]. The goal of this internship is to study strategies for efficiently sharing resources between analysis and simulation. Some approaches dedicate one compute core per simulation node to analysis (a "helper core"); one then speaks of in situ processing. Others interleave analysis and simulation on the same cores, triggering the analyses during the simulation's idle phases. Other nodes can be dedicated entirely to analysis, in which case one speaks of "in transit" nodes. After a survey of existing work, the intern will develop his or her own strategy in the context of analysing SATURNE simulations. The work will comprise development and experiments on large supercomputers. The internship will take place at INRIA Grenoble (with the possibility of working in Clamart, in agreement with the advisors), in a team specialised in parallel computing and in situ analysis, in close collaboration with the EDF R&D teams in charge of SATURNE development and of data analysis.

URL sujet détaillé : http://moais.imag.fr/membres/bruno.raffin/MISC/stageinsituINRIA+EDF.pdf

Remarques : Internship in collaboration with EDF R&D (a few trips to Clamart to be expected).
Salary: depends on situation (minimum 430 euros/month).




SM207-117 Social network and language change  
Admin  
Encadrant : Eric FLEURY
Labo/Organisme :
URL : https://team.inria.fr/dante/
Ville : lyon

Description  

Context
Research on modeling and analyzing human language with complex networks has been on the rise in recent years [1]. An important subfield focuses on multi-agent modeling of the influence of the social network on language use and language change [2-10]. Indeed, understanding the links between society and language benefits from the insights of dynamic modeling of populations of agents interacting with one another and sharing simulated linguistic variants under social and cognitive constraints [11]. Many attempts have been made in this direction, each of them achieving local goals and accounting for certain linguistic, social, and cognitive parameters: structure and size of the social network, type of learning (statistical vs. categorical), functional value and frequency of the alternating linguistic variants [3], social status of the agents and social distance between them [5], spatial distribution of the population [10], birth and death of the agents as well as preference for interacting with other agents using the same dialect [7], trend toward social distinction [4], delay of decay of the memorized instances of the variants [6]. However, surveying the field leaves one disappointed by the huge diversity of models implementing overlapping sets of parameters. We lack a general view on what is to be modeled and how to model it. Answering these questions is important because modeling is the only way to test hypotheses in a field where experimentation in the real environment is not possible, and it implies interdisciplinary work bringing together Network Science and Linguistics.
In this context, the work proposed for the internship will be carried out in an interdisciplinary environment. It will follow two stages corresponding to two objectives.



Objectives
1/ Taking an overview of the issues at stake
- What parameters are included in the available models? Into what categories could they be classified? (Cognitive parameters, social parameters, linguistic parameters, etc.)
- In the light of the work on language change in the field of linguistics, which parameters should be prioritized?
- What implementation tools have been used in the different modeling attempts?
2/ Taking into account this overview, the internship should then aim at:
- Formulating a general framework for this type of modeling in comparison with similar models (e.g. opinion models)
- Implementing a model inspired by opinion models and variationist linguistics with reals between 0 and 1 accounting for the weight of the linguistic variants ([12] for a first attempt)
- Observing how the spread of the linguistic variants throughout the social network depends on the structure of the communities within the network and the overlapping between them.
- Laying the foundation for implementing the reverse influence, that is the influence of the linguistic states of the agents on the structure and the frequency of their interactions.
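A minimal sketch of the kind of model evoked above: agents hold a real-valued variant weight in [0, 1] and accommodate to their interlocutors over a fixed social network. The network, update rule and parameters below are illustrative assumptions, not the model of [12].

```python
import random

# Illustrative sketch only: agents on a social network each hold a weight
# in [0, 1] for a linguistic variant and drift toward their interlocutors,
# in the spirit of simple opinion models.
def simulate(edges, n_agents, steps, rate=0.1, seed=0):
    rng = random.Random(seed)
    # Assumption: one community starts with the variant, the other without.
    w = [1.0 if i < n_agents // 2 else 0.0 for i in range(n_agents)]
    for _ in range(steps):
        a, b = rng.choice(edges)          # one random interaction
        mid = (w[a] + w[b]) / 2
        w[a] += rate * (mid - w[a])       # both speakers accommodate
        w[b] += rate * (mid - w[b])
    return w

# Two triangle communities bridged by a single edge (2, 3): diffusion is
# slow across the bridge, illustrating the role of community overlap.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
weights = simulate(edges, n_agents=6, steps=2000)
```

Note that each pairwise update conserves the total weight, so the population mean is an invariant of this toy dynamics.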

URL sujet détaillé :

Remarques : Co-supervised with

********************************************
Jean-Pierre Chevrot
Institut Universitaire de France
http://iuf.amue.fr/
Université Grenoble 3
Laboratoire Lidilem - BP 25, 38040, Grenoble cedex
France
http://lidilem.u-grenoble3.fr/spip.php?article30&var_mode=calcul&mem_id=4



SM207-118 Machine learning and data mining for vehicle simulation results  
Admin  
Encadrant : Ayman MOAWAD
Labo/Organisme : Argonne National Laboratory
URL : www.anl.gov
Ville : Chicago, USA

Description  

ANALYSIS OF THE IMPACT OF ADVANCED VEHICLE TECHNOLOGIES USING LARGE-SCALE SIMULATIONS

Context
Vehicle energy consumption and powertrain operation for future vehicle powertrain technologies (from current vehicles to year 2045) are predicted through vehicle simulations using Autonomie, a vehicle simulation tool developed at Argonne. Over 4000 vehicles, with different powertrain types (conventional and numerous electrified powertrains) and component technologies (advanced engines, batteries, electrical machines, lightweighting), are simulated for fuel consumption and cost. The output of this study is used by the US government to evaluate the impact of its R&D funding on energy security and CO2 emissions.

Objectives and Tasks
The project will involve running large-scale simulations of advanced technology vehicles using the internally developed large-scale simulation process. The student will also help automate this process.
The project will involve improving/developing analysis methods for the simulation results. In particular:
1. Developing algorithms to extract important parameters/values from large sets of simulation results and results database:
a. For quality check and analysis using mathematical techniques (statistical methods, machine learning, fuzzy clustering and data mining for data validation).
b. For data-analysis.
2. Developing standardized plots to explain component operation and power/energy requirements (e.g. engine operation, battery power and energy requirements) and trends with powertrain technology and time.
3. Time permitting, continuation of the development of a graphical interface for accessing the results and assumptions. This will include making the tool more user-friendly and more customizable, with more plotting possibilities (histograms, scatter plots, pie charts...)

Skills/pre-requisites (desired and/or required): Strong mathematical background: statistical analysis, machine learning, data mining, good knowledge of Matlab, graphical user interface programming (C# preferred).
URL sujet détaillé :

Remarques : Salary: approx. /month
- Benefits: round-trip ticket to Chicago, visa expenses
- Duration: 5.5 months max, 4 months minimum

Please send your resume in English to Ayman Moawad: amoawad.gov



SM207-119 Structure of classes of graphs defined by constraints on chords  
Admin  
Encadrant : Nicolas TROTIGNON
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/nicolas.trotignon/index.html
Ville : Lyon

Description  

The goal is to study structural graph theory through a particular class. A detailed statement is available at the link below.
URL sujet détaillé : http://perso.ens-lyon.fr/nicolas.trotignon/sujet.pdf

Remarques :



SM207-120 Evaluating and Exploiting Dynamic Pricing Strategies  
Admin  
Encadrant : Jannik DREIER
Labo/Organisme : ETH Zurich - Institute of Information Security
URL : http://people.inf.ethz.ch/jdreier/
Ville : Zürich / Switzerland

Description  

Some online shops and other e-commerce providers, including plane or train ticketing, car rentals, or hotel booking websites, use dynamic pricing. When using dynamic pricing, the offered price is not identical for all customers, but varies as a function of many parameters, for example time of the day, day of the week, geographic location, IP address, cookies, browsing history, device or browser used, etc. For example, flights can be sold at different prices in different places, or may become more expensive if you already looked for them in the past. Some web shops offer cheaper prices on price comparison websites, which means that customers arriving via these websites pay lower prices than customers that directly consulted the shop's website.

The goal of these strategies is to maximize profits by always charging the highest price the customer is willing or able to pay.
Since different customers might not all be willing to pay the same
price, the website providers aim to determine the optimal price for each customer, based on the available information.

However, using privacy-enhancing techniques such as proxies or VPN connections, many of these parameters can be influenced by the user.
For example, HTTP headers and cookies can be modified, IP addresses changed, etc. Using such techniques, users could potentially exploit the dynamic pricing strategies to their advantage, and obtain lower prices than they would normally have been offered.

The goals of this project are twofold. In the first part, the goal is to experimentally evaluate dynamic pricing strategies used by different websites. The goal of the second part is to identify effective price-reducing strategies for customers, and implement them in a tool that automatically adapts the user's access to a website to reduce prices.

For the first part, a system to experimentally measure prices as a function of the various parameters has to be developed. For example, header rewriting tools or proxies can be used to prevent browser tracking, and to simulate various possible clients - different browsers, operating systems, devices etc. VPN servers or systems like TOR can be used to change IP addresses, and in particular to obtain IP addresses from various geographical locations all over the world. Finally, a website crawler can connect to the targeted websites via these tools to automatically collect the relevant prices for various configurations. To obtain sufficient data and meaningful statistics, this data acquisition process, i.e. configuration of the tools and price crawling, should be automated.
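One way such an automated acquisition process could enumerate the client configurations to simulate is sketched below; the parameter names and values are placeholders for illustration, not the interface of any particular tool.

```python
from itertools import product

# Illustrative sketch: enumerate the client configurations the crawler
# would simulate for each target URL. All values are placeholders.
USER_AGENTS = ["desktop-firefox", "mobile-safari", "desktop-chrome"]
EXIT_LOCATIONS = ["fr", "us", "jp"]        # VPN/Tor exit countries
COOKIE_STATES = ["fresh", "returning"]     # with/without prior visits

def configurations():
    # Cartesian product of all parameter dimensions.
    for ua, loc, cookies in product(USER_AGENTS, EXIT_LOCATIONS, COOKIE_STATES):
        yield {"user_agent": ua, "exit_location": loc, "cookies": cookies}

configs = list(configurations())   # 3 * 3 * 2 = 18 configurations per URL
```

Each configuration would then drive the header-rewriting proxy and VPN selection before the crawler fetches the price.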

In the second part, once the data is collected, it should be analyzed to identify opportunities for customers to exploit these techniques to their advantage. These strategies then have to be implemented in a tool helping users to automatically obtain the lower prices. The tool will automatically configure the header rewriting tools, proxies and VPN servers (already used in the data collection process) so that a customer can simply connect via the tool to a website to get lower offers.

If precise models of the pricing strategy can be inferred from the collected data, these models could be validated and used to predict e.g. further price evolution. This would be done in collaboration with a behavioural economist from ETH Zurich. Depending on how far this modelling and prediction is taken, the project could also be suitable for two students working in collaboration.
URL sujet détaillé : http://people.inf.ethz.ch/jdreier/projects/dynamic-pricing.pdf

Remarques : Co-supervised with Prof. David Basin.
Remuneration possibly available.



SM207-121 Analyzing Accountability in (E-)Exams  
Admin  
Encadrant : Jannik DREIER
Labo/Organisme : ETH Zurich - Institute of Information Security
URL : http://people.inf.ethz.ch/jdreier/
Ville : Zürich / Switzerland

Description  

Many companies and universities increasingly use electronic exam (e-exam for short) systems: for example Cisco's CCNA certifications or language tests such as IELTS or TOEFL are organized using computer systems, and also at ETH Zurich electronic exams are held regularly. This naturally raises the question of the security of such systems. From the point of view of the organizing institution, cheating is a major concern, but candidates (i.e., students) are typically more concerned with fairness and transparency of the process. In particular the latter concerns are often ignored by current systems. Typically the system is assumed to be trusted, and the candidates can only partly verify whether everything went well. Yet, as recent scandals have shown, (e-)exam systems cannot always be trusted entirely.

In security protocol design the notions of Verifiability, Accountability and Auditability have been known for some time. A verifiable protocol allows the participants to verify, after the protocol's execution, whether the outcome of the protocol is correct. When verifiability fails, accountability additionally makes it possible to identify which party is responsible for the failure. Finally, a protocol is auditable if it logs sufficient evidence to convince an outside judge of the absence of errors.

Some preliminary work on verifiability in e-exams exists; however, accountability and auditability have not yet been studied in this context.

The goal of this project is to study accountability and auditability of (e-)exams. In the first part, based on previous work, formal definitions of accountability and auditability have to be developed. This requires identifying the stakeholders and the properties requiring verification. Moreover, a suitable model for the definitions has to be chosen.

In the second part, these definitions will be applied to a case study. The goal is to validate the model and the definitions, but also to analyze to what extent current systems ensure the desired properties. Possible case studies include existing real e-exam systems (e.g., the system used at ETH), or cryptographic protocols from the literature. A formal analysis using a protocol verification tool such as Tamarin, ProVerif, CryptoVerif or similar is expected.

Optionally also a traditional pen-and-paper exam system can be studied, or new and improved e-exam protocols can be proposed. Moreover, if shortcomings in the case studies are identified, fixes can be developed.

URL sujet détaillé : http://people.inf.ethz.ch/jdreier/projects/e-exam.pdf

Remarques : Co-supervised with Prof. David Basin.
Remuneration possibly available.



SM207-122 Cloud resource management  
Admin  
Encadrant : Jean-marc PIERSON
Labo/Organisme : IRIT
URL : www.irit.fr/~Jean-Marc.Pierson/
Ville : Toulouse

Description  

Clouds are changing the way computing resources are used. From Google to Amazon, most major IT companies are separating their computing resources from their usage through virtualization techniques. Treating the content of virtual machines as a black box, it becomes possible to save energy at an unprecedented scale using migration techniques and DVFS. The SEPIA team at IRIT is investigating two complementary directions to improve the sustainability of large-scale datacenters:

1 - Distributed work stealing applied to virtual machines: the focus will be put on a decentralized decision mechanism able to scale easily, where some physical machines may steal work from others if given threshold budgets (financial, service-level violations, power, thermal) are not exceeded. Obviously we will study the tradeoff between optimal solutions and possibly inaccurate decentralized ones.
2 - Thermal- and energy-aware scheduling: the focus will be on the thermal and energy models of clouds, from the architecture level to the application level. These models will serve decision mechanisms that schedule (and reschedule) the virtual machines among the set of physical hosts in order to achieve good performance while globally saving energy.

The goal of this research project is to merge those two approaches and to investigate how the thermal and energy characteristics of nodes and applications can help reduce the energy consumption of heterogeneous data centers. This project will use the in-house cloud system CloudMip (32 nodes, up to 256 VMs) and Grid5000 (5000 cores, 11 sites in France).
URL sujet détaillé :

Remarques :



SM207-123 Energy-efficient Kernel level DVFS for High Performance Computing  
Admin  
Encadrant : Georges DA COSTA
Labo/Organisme : IRIT
URL : www.irit.fr/~Georges.Da-Costa/
Ville : Toulouse

Description  

Classically, High Performance Computing systems follow the principle: the faster, the better, whatever the cost. Next-generation HPC hardware has to be able to reduce power and energy consumption. This is necessary not only to reduce cost, but also to make such systems possible at all: an HPC system needing a dedicated power plant would be impossible to build. One key element is fine-grained management of processor and core frequencies. In HPC systems, the tradeoff between energy and performance leans toward the latter. The goal of this project is to optimize frequency to reduce power consumption without impacting performance.
To reach this objective, we will first detect the type of bottleneck a particular core is facing: a computing-, memory- or IO-intensive application. This detection will rely on black-box models only, using system information and hardware performance counters. Depending on the particular bottleneck, the kernel DVFS system (governor) will adapt the processor frequency. A second step will be to design a processor model in order to generalise the approach. Finally, a holistic model will be proposed in order to apply DVFS not only at the processor level, but also to manage memory banks and bandwidth and the network card's Adaptive Link Rate.
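As an illustration of the first step, a toy bottleneck classifier could look as follows. The counter names and thresholds are assumptions made for the sketch, not values from an actual governor.

```python
# Illustrative heuristic only: classify a core's bottleneck from a few
# performance-counter ratios. Counter names and thresholds are assumptions.
def classify(counters):
    ipc = counters["instructions"] / max(counters["cycles"], 1)
    miss_rate = counters["llc_misses"] / max(counters["llc_refs"], 1)
    if counters["io_wait_ratio"] > 0.3:
        return "io-bound"          # core mostly waits on I/O: frequency can drop
    if ipc < 0.5 and miss_rate > 0.2:
        return "memory-bound"      # stalled on memory: frequency can drop
    return "compute-bound"         # keep the highest frequency

sample = {"instructions": 8_000_000, "cycles": 4_000_000,
          "llc_misses": 10_000, "llc_refs": 500_000, "io_wait_ratio": 0.01}
kind = classify(sample)   # ipc = 2.0, low miss rate -> "compute-bound"
```

A real governor would of course sample such counters periodically per core and hand the classification to the frequency-scaling policy.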
URL sujet détaillé :

Remarques :



SM207-124 Smart-Campus Geolocation  
Admin  
Encadrant : Georges DA COSTA
Labo/Organisme : IRIT
URL : www.irit.fr/~Georges.Da-Costa/
Ville : Toulouse

Description  

With thousands of inhabitants, campuses are now small towns. For all users (students, teachers, support staff), in-campus movements are part of the everyday job, and finding a particular service or a particular lecture hall can be difficult. It is even harder as last-minute changes often occur (schedule changes, logistic problems in lecture or meeting rooms). But geolocation on a campus is not only a problem; it can be a precious resource. The university can use anonymous geolocation data to improve road infrastructure, but also schedules. Using these data at a personal level also makes it possible to automate some tasks (switching off lights and computers when everyone leaves a room, finding an available room for small meetings, tutoring, or students willing to work on particular projects).
The goal of this project is to propose a global geolocation system based on a new language using geolocation conditions to perform tasks, in accordance with a global behavioral model based on differentiated profiles (students, teachers, ...). The system will include a cooperative learning component able to detect global changes in the paths followed by people, in order to automatically detect mobility problems. This part of the system will be of great use in helping people with motor disabilities reach their destination easily.

URL sujet détaillé :

Remarques : Part of neOCampus operation (www.irit.fr/neocampus)



SM207-125 Green smart campus  
Admin  
Encadrant : Patricia STOLF
Labo/Organisme : IRIT
URL : www.irit.fr/~Patricia.Stolf/
Ville : Toulouse

Description  

In the ambient context of the campus, the objective is to improve the life of the users of a university campus while meeting ecological and financial objectives. The campus then becomes not only a place of learning but also an ecologically responsible place.
We propose to consider the set of machines of a university as a cloud of machines and to virtualize the users' applications. The project will study various scientific aspects in order to propose a dynamic decision center that executes applications with the global objective of reducing energy consumption.
Starting from a model of the set of physical machines and the characteristics of the virtual machines, this project will propose placement algorithms to choose where each VM is executed (consolidation, reduction of energy consumption, reduction of hot spots). This work will be complemented by a study of adaptation/elasticity, so that the decisions also adapt to load peaks (for example, when students arrive or depart at the same time).
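As a point of comparison for such placement algorithms, the classical first-fit-decreasing consolidation heuristic can be sketched in a few lines. Capacities and demands below are illustrative; this is a standard baseline, not the algorithm to be developed.

```python
# First-fit-decreasing consolidation sketch: pack VMs (by normalized CPU
# demand) onto as few hosts as possible, so idle hosts can be powered down.
def consolidate(vm_demands, host_capacity):
    hosts = []                      # each host = list of assigned demands
    for d in sorted(vm_demands, reverse=True):
        for h in hosts:
            if sum(h) + d <= host_capacity:
                h.append(d)         # fits on an already-powered host
                break
        else:
            hosts.append([d])       # open (power on) a new host
    return hosts

placement = consolidate([0.5, 0.2, 0.7, 0.3, 0.3], host_capacity=1.0)
# -> [[0.7, 0.3], [0.5, 0.3, 0.2]]: 2 hosts instead of 5
```

Energy- and thermal-aware variants would replace the "first fit" test with a cost model of the candidate host.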

URL sujet détaillé :

Remarques : Part of neOCampus operation (www.irit.fr/neocampus)



SM207-126 Energy-aware task management in hybrid environment  
Admin  
Encadrant : Patricia STOLF
Labo/Organisme : IRIT
URL : www.irit.fr/~Patricia.Stolf/
Ville : Toulouse

Description  

In this project, we will work on the dynamic management of services (possibly composed of several interdependent individual tasks) offered to users in a hybrid (multi-cloud) environment. These services can be executed either on a dedicated cloud or on the users' resources.
We aim at providing services to users while optimizing several objectives related to the user experience (performance, latency, cost, ...) and reducing the overall energy consumption as much as possible, leading to necessary trade-off studies.

The objectives will be to propose distributed algorithms for the placement, re-placement and scheduling of services in heterogeneous systems. A genetic-algorithm approach will be compared to vector packing.

URL sujet détaillé :

Remarques :



SM207-127 Dynamique des Machines de Turing abstraites / Dynamics of Abstract Turing Machines  
Admin  
Encadrant : Emmanuel JEANDEL
Labo/Organisme : LORIA
URL : www.loria.fr/~ejeandel/
Ville : Nancy

Description  

The purpose of the internship is to study dynamical properties of abstract Turing machines, which correspond intuitively to Turing machines where the set of states may be arbitrary (but compact).

More information may be found on the link.

A familiarity with the course CR05 (Tilings: between Dynamical Systems and Computability) is appreciated but not mandatory.


URL sujet détaillé : http://www.loria.fr/~ejeandel/files/stage-M2.2.eng.pdf

Remarques :



SM207-128 Type refinement in λΠ-calculus modulo  
Admin  
Encadrant : Arnaud SPIWACK
Labo/Organisme : Deducteam, Inria Paris-Rocquencourt & CRI, MINES ParisTech
URL : http://assert-false.net/arnaud/
Ville : Paris & Fontainebleau

Description  

Dedukti is an experimental language designed to write proof checkers for logics. It is used to
implement independent proof checkers for proof assistants such as Coq and Isabelle and trace
checkers for automated deduction tools such as Zenon and iprover modulo. The logic underlying
Dedukti is called λΠ-modulo: it is an extension of λΠ, the simplest form of dependently typed
λ-calculus, with user-defined rewriting rules for type equality.

The goal of this internship is to develop a type-inference engine for Dedukti. In dependently
typed languages, type inference, unification and proof-search merge. Therefore, we propose to
develop, in this internship, a common abstract interface for these, improving on the work which
has been done for the proof-search engine of Coq.

This line of research contains both theoretical and practical programming aspects. The internship
can be geared towards one or the other depending on the student's preferences.
URL sujet détaillé : http://assert-false.net/arnaud/internships/deduktife.pdf

Remarques : Co-supervised by Olivier Hermant



SM207-129 Proof of convergence for a spacecraft guidance rendezvous algorithm  
Admin  
Encadrant : Mioara JOLDES
Labo/Organisme : LAAS-CNRS Toulouse
URL : http://homepages.laas.fr/mmjoldes/
Ville : Toulouse

Description  

The rendezvous process for spacecraft is an orbital operation involving at least two vehicles (satellite or International Space Station servicing, debris removal, formation flight) and consists in bringing the actuated vehicle (chaser) into the proximity of the passive vehicle (target) via a series of elementary phases (phasing, far range rendezvous, close range rendezvous). Each phase is composed of orbital maneuvers that must be carefully planned to reach their goal.

We focus on impulsive rendezvous problems for technological reasons since most satellites are equipped with ergol thrusters. The impulsive approximation for the thrust means that an instantaneous velocity increment is applied to the chaser for each impulse. We intend to obtain validated solutions to these problems under control and continuous trajectory constraints.

Since rendezvous problems are naturally formulated as optimal control problems we distinguish both direct approaches - which transform the dynamic optimal control problem into a static optimization problem [Hull97] - and indirect approaches - which are based on solving optimality conditions obtained from the Pontryagin Maximum Principle [Pontryagin62]-.

Concerning indirect methods, the derivation of the optimality conditions from the Pontryagin principle is known as the primer vector theory [Lawden63]. In [Arzelier13] the framework of the minimum-fuel fixed-time rendezvous problem is presented and necessary and sufficient conditions of optimality are recalled. These conditions are non-convex and transcendental, so a mixed iterative algorithm using dynamic gridding and polynomial optimization techniques is developed.
This algorithm is based on a heuristic approach which consists mainly of two stages: (1) an initialization step solving a 2-impulse rendezvous problem; (2) an iterative procedure which builds up the final k-impulse solution by adding/merging impulses in an attempt to minimize the maximum of the norm of the primer vector.

While the convergence of this algorithm is observed in all practical cases tested, no analytical proof exists so far.

The objective of this internship is to assess the systematic convergence of this algorithm. One intuitive starting point would be to compare and contrast the idea of this iterative approach to classical minimax, "points exchange technique" approximation algorithms, like the one given by Remez [Remez34], [Chap. 3, Cheney98].
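To make the "points exchange" idea concrete, here is a self-contained, discrete sketch of a Remez-style exchange for best uniform approximation by a degree-1 polynomial on a grid. It is illustrative only (pure Python, one exchanged point per iteration), not the LAAS-CNRS algorithm nor a production Remez implementation.

```python
def solve3(A, b):
    # Tiny Gauss-Jordan elimination with partial pivoting for a 3x3 system.
    n = 3
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                fac = M[r][c] / M[c][c]
                M[r] = [x - fac * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def remez_linear(f, grid, iters=20):
    ref = [grid[0], grid[len(grid) // 2], grid[-1]]  # initial reference set
    for _ in range(iters):
        # Solve a + b*x_i + (-1)^i * E = f(x_i) at the 3 reference points,
        # i.e. force the error to equioscillate with leveled error E.
        A = [[1.0, x, (-1.0) ** i] for i, x in enumerate(ref)]
        a, b, E = solve3(A, [f(x) for x in ref])
        errs = [f(x) - (a + b * x) for x in grid]
        k = max(range(len(grid)), key=lambda i: abs(errs[i]))
        if abs(errs[k]) <= abs(E) + 1e-12:
            break                                    # equioscillation reached
        # Exchange: move the worst grid point into the nearest reference slot.
        j = min(range(3), key=lambda i: abs(ref[i] - grid[k]))
        ref[j] = grid[k]
        ref.sort()
    return (a, b), abs(E)

grid = [i / 100 - 1.0 for i in range(201)]           # 201 points on [-1, 1]
(a, b), E = remez_linear(abs, grid)                  # best linear fit to |x|
# -> p(x) = 0.5, with minimax error E = 0.5, equioscillating at -1, 0, 1
```

The add/merge procedure of [Arzelier13] similarly moves a finite set of "active" points (impulse dates) to flatten the maximum of the primer vector norm, which is the analogy the internship would investigate.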

Hence, the candidate should analyse the primer vector theory in order to understand the construction and functioning of the algorithm developed at LAAS-CNRS. Thereafter, he or she should highlight links between this algorithm and the Remez algorithm that would eventually lead to the construction of a proof of convergence.

Keywords: space rendezvous, primer vector theory, minimax approximation

Supervisors: Mioara Joldes, Christophe Louembet, MAC, LAAS-CNRS.

References

[Arzelier13] Arzelier, D. and Louembet, C. and Rondepierre, A. and Kara-Zaitri, M., A New Mixed Iterative Algorithm to Solve the Fuel-Optimal Linear Impulsive Rendezvous Problem, Journal of Optimization Theory and Applications, 2013, vol. 159, no. 1, pages 210-230.

[Cheney98] Cheney, E.W.,Introduction to approximation theory, Amer. Mathematical Society, 1998.

[Hull97] Hull, D. G., Conversion of optimal control problems into parameter optimization problems, J. Guidance, Control and Dynamics, AIAA, 1997, vol. 20, no. 1.

[Lawden63] Lawden, D.F., Optimal trajectories for space navigation, Butterworth, London, England, 1963.

[Pontryagin62] Pontryagin, L.S. and Boltyanskii, V.G. and Gamkrelidze, R.V. and Mishchenko, E.F., The Mathematical Theory of Optimal Processes, Interscience, 1962.

[Remez34] E. Remes, Sur le calcul effectif des polynomes d'approximation de Tchebichef, Comptes rendus hebdomadaires des séances de l'Académie des Sciences, no. 199, pages 337-340, 1934.
URL sujet détaillé :

Remarques :



SM207-130 Compiling process networks on GPU  
Admin  
Encadrant : Christophe ALIAS
Labo/Organisme : LIP/INRIA
URL : http://perso.ens-lyon.fr/christophe.alias
Ville : LYON

Description  

A process network is a model of parallel computation, which serves as an intermediate representation for parallelizing compilers.

GPU accelerating cards, initially designed to fit the performance requirements of video games, now serve as basic blocks for high-performance supercomputers.

The goal of this internship is to find a way to map our process networks on GPUs, that is, to find out how to allocate the process network elements (processes, buffers) on the GPU components (memory hierarchy, SPs, warps, cores).

The trainee will:
- study the state of the art on process networks and automatic parallelization, notably on GPU ;
- translate ``by hand'' several simple process networks (vector sum, matrix multiply, 1D Jacobi, etc) ;
- propose a translation algorithm, and possible improvements to the process model ;
- implement (in C++), test and validate the approach experimentally.
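To make the process-network model concrete, the vector-sum example from the list above can be sketched as a host-side Kahn-style network: processes communicate only through blocking FIFO channels. This illustrates the model itself, not a GPU mapping.

```python
from queue import Queue
from threading import Thread

# Toy Kahn-style process network for "vector sum": two source processes
# feed a sum process through FIFO channels.
def source(data, out):
    for x in data:
        out.put(x)
    out.put(None)                      # end-of-stream marker (assumption)

def vec_sum(in_a, in_b, out):
    while True:
        a, b = in_a.get(), in_b.get()  # blocking reads, as in Kahn networks
        if a is None or b is None:
            out.put(None)
            return
        out.put(a + b)

ch_a, ch_b, ch_out = Queue(), Queue(), Queue()
threads = [Thread(target=source, args=([1, 2, 3], ch_a)),
           Thread(target=source, args=([10, 20, 30], ch_b)),
           Thread(target=vec_sum, args=(ch_a, ch_b, ch_out))]
for t in threads:
    t.start()
for t in threads:
    t.join()

result = []
while True:
    v = ch_out.get()
    if v is None:
        break
    result.append(v)
# result == [11, 22, 33]
```

A GPU mapping would have to decide, for each such process and channel, which hardware resource (core, warp, shared or global memory buffer) plays the corresponding role.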
URL sujet détaillé : http://perso.ens-lyon.fr/christophe.alias/stages/stage_gpu_en.pdf

Remarques : This internship can lead to a PhD thesis.



SM207-131 Liens entre propriétés dynamiques et algorithmique / Links between Dynamical Properties and Algorithmics  
Admin  
Encadrant : Mathieu SABLIK
Labo/Organisme : LATP
URL : http://www.latp.univ-mrs.fr/~sablik/
Ville : MARSEILLE

Description  

Given a class of effective dynamical systems, necessary conditions related to computability naturally appear when describing dynamical properties, and one then asks whether these conditions are the only ones. This approach has been widely studied for cellular automata and for tilings defined by local rules. For the latter, the reals arising as entropies are exactly the non-negative reals that are recursively approximable from above. This type of result relies on technical constructions that encode a Turing machine, but one would like to identify which elements of our class of dynamical systems make this encoding possible.

One approach in this direction, accessible for an M2 internship and certainly opening many research directions, is to look at transformations of the interval (i.e., given a function f : [a,b]^d -> [a,b]^d, consider the dynamical system defined by u_{n+1} = f(u_n) with u_0 in [a,b]^d). Indeed, it is easy to vary the class of dynamical systems according to the properties of the functions considered (continuous, piecewise affine, piecewise linear...). For example, computability-related limitations appear for the entropy, but one would like to know under which restrictions on the class of systems all possible entropies are obtained. The aim of this line of work is to establish a dictionary between dynamical properties and properties related to computability, or even to algorithmics, in the setting of effective dynamical systems.
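As a concrete illustration of such interval maps u_{n+1} = f(u_n), iterating a classical piecewise-linear example (the tent map on [0, 1], whose topological entropy is known to be log 2) takes only a few lines:

```python
# The tent map: a classical piecewise-linear transformation of [0, 1].
def tent(x):
    return 2 * x if x <= 0.5 else 2 * (1 - x)

def orbit(f, u0, n):
    """Iterate u_{k+1} = f(u_k) and return [u_0, ..., u_n]."""
    us = [u0]
    for _ in range(n):
        us.append(f(us[-1]))
    return us

traj = orbit(tent, 0.2, 4)   # [0.2, 0.4, 0.8, 0.4, 0.8] up to rounding
```

Restricting the allowed slopes or the number of pieces of f changes the class of systems, and with it the set of achievable entropies, which is exactly the kind of dictionary the internship aims to build.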

URL sujet détaillé : http://www.latp.univ-mrs.fr/~sablik/Stage.html#SystemeDynamiqueCalculable

Remarques :



SM207-132 Information content of programs  
Admin  
Encadrant : Mathieu HOYRUP
Labo/Organisme : LORIA/Inria Nancy Grand-Est
URL : http://www.loria.fr/~hoyrup/
Ville : Nancy

Description  

Algorithms usually process finite inputs such as natural numbers, finite strings, finite graphs,
etc. In computability theory one is often interested in providing algorithms with infinite
inputs such as functions on natural numbers, infinite graphs, real numbers, etc. There are
mainly two ways of making it possible. In the Type-1 model, one only allows computable infinite
inputs, which can be represented by finite programs, which are the actual inputs to the
algorithm. In the Type-2 model, the infinite input is given by an oracle, or is entirely
written on an infinite input tape. The first model is in a sense more powerful: from a program
computing an input one can recover the input, so every Type-2 algorithm can be converted into a
Type-1 algorithm. Comparing these two models has been a central topic for a long time, but the
Type-1 model is not well understood. Celebrated results such as the undecidability of the halting
problem proved by Turing or Rice's theorem essentially say that having a program does not
help in understanding its behavior, so they provide cases when the Type-1 model is not more
powerful than the Type-2 model. However there exist situations where the two models differ.
Our main question is: what makes the Type-1 model strictly more powerful than the Type-2 model?
What additional information does a program computing an input provide about the input?


We have recently shown that to some extent, giving a program computing an input is the same
as giving the input itself together with an upper bound on the size of a program computing it (its
Kolmogorov complexity). Hence each Type-1 algorithm can be converted to a Type-2 algorithm
that is additionally given an upper bound on the Kolmogorov complexity of its input. This
equivalence holds to some extent, depending on the class of inputs and the task performed by the
algorithm.


The objective of the internship is to continue this work. Several problems are raised about this
equivalence: understanding its consequences, its limits, understanding what is the ``information
content'' of a program.
URL sujet détaillé : http://www.loria.fr/~hoyrup/programs.pdf

Remarques :



SM207-133 Proof of convergence for a spacecraft guidance rendezvous algorithm  
Admin  
Encadrant : Mioara JOLDES
Labo/Organisme : LAAS-CNRS Toulouse
URL : http://homepages.laas.fr/mmjoldes/
Ville : Toulouse

Description  

The rendezvous process for spacecraft is an orbital operation involving at least two vehicles (satellite or International Space Station servicing, debris removal, formation flight). It consists in bringing the actuated vehicle (chaser) into the proximity of the passive vehicle (target) via a series of elementary phases (phasing, far-range rendezvous, close-range rendezvous). Each phase is composed of orbital maneuvers that must be carefully planned to reach their goal. We focus on impulsive rendezvous problems for technological reasons, since most satellites are equipped with chemical (propellant) thrusters. The impulsive approximation for the thrust means that an instantaneous velocity increment is applied to the chaser for each impulse. We intend to obtain validated solutions to these problems under control and continuous trajectory constraints.

Since rendezvous problems are naturally formulated as optimal control problems, we distinguish direct approaches, which transform the dynamic optimal control problem into a static optimization problem [Hull97], from indirect approaches, which are based on solving optimality conditions obtained from the Pontryagin Maximum Principle [Pontryagin62]. Concerning indirect methods, the derivation of the optimality conditions from the Pontryagin principle is known as the primer vector theory [Lawden63]. In [Arzelier13], the framework of the minimum-fuel fixed-time rendezvous problem is presented and necessary and sufficient conditions of optimality are recalled. These conditions are non-convex and transcendental, so a mixed iterative algorithm using dynamic gridding and polynomial optimization techniques is developed. This algorithm is based on a heuristic approach which consists mainly of two stages: (1) an initialization step solving a 2-impulse rendezvous problem; (2) an iterative procedure which builds up the final k-impulse solution by adding/merging impulses in an attempt to minimize the maximum of the norm of the primer vector.

While the convergence of this algorithm is observed in all practical cases tested, no analytical proof exists so far. The objective of this internship is to assess the systematic convergence of this algorithm. One intuitive starting point would be to compare and contrast the idea of this iterative approach with classical minimax "points exchange" approximation algorithms, like the one given by Remez [Remez34], [Chap. 3, Cheney98]. Hence, the candidate should analyse the primer vector theory in order to understand the construction and functioning of the algorithm developed at LAAS-CNRS. Thereafter, he or she should highlight links between this algorithm and the Remez algorithm, links that would eventually lead to the construction of a proof of convergence.
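The "points exchange" idea can be made concrete on the problem Remez designed it for: best uniform polynomial approximation. The sketch below is a simplified discrete-grid exchange, purely illustrative and not the LAAS-CNRS rendezvous algorithm; it assumes the error keeps about n+2 sign-consistent segments, which holds for smooth functions such as exp. It alternates between solving the equioscillation system on a reference set and moving the references to the extrema of the error:

```python
import numpy as np

def remez_minimax(f, a, b, n, grid_size=4000, iters=10):
    """Simplified Remez exchange: degree-n minimax approximation of f on [a, b]."""
    grid = np.linspace(a, b, grid_size)
    fg = f(grid)
    k = np.arange(n + 2)
    # initial reference set: n+2 Chebyshev points
    x = (a + b) / 2 - (b - a) / 2 * np.cos(np.pi * k / (n + 1))
    for _ in range(iters):
        # solve  sum_j c_j x_i^j + (-1)^i E = f(x_i)  for coefficients c and level E
        A = np.hstack([np.vander(x, n + 1, increasing=True),
                       ((-1.0) ** k)[:, None]])
        sol = np.linalg.solve(A, f(x))
        c, E = sol[:n + 1], sol[n + 1]
        # exchange step: move the references to the extrema of the error
        err = fg - np.polyval(c[::-1], grid)
        breaks = np.where(np.diff(np.sign(err)) != 0)[0] + 1
        segments = np.split(np.arange(grid_size), breaks)
        extrema = [seg[np.argmax(np.abs(err[seg]))] for seg in segments]
        extrema = sorted(extrema, key=lambda i: abs(err[i]), reverse=True)[:n + 2]
        x = grid[np.sort(extrema)]
    return c, abs(E)
```

On a smooth function the reference set converges to the n+2 alternation points of the best approximation; the LAAS algorithm analogously adds/merges impulses while driving down the maximum of the primer vector norm.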
Keywords: space rendezvous, primer vector theory, minimax approximation, Remez algorithm

Supervisors: Mioara Joldes, Christophe Louembet, MAC, LAAS-CNRS.

References [Arzelier13] Arzelier, D. and Louembet, C. and Rondepierre, A. and Kara-Zaitri, M., A New Mixed Iterative Algorithm to Solve the Fuel-Optimal Linear Impulsive Rendezvous Problem, Journal of Optimization Theory and Applications,
2013, vol. 159, no. 1, pages 210-230.

[Cheney98] Cheney, E.W.,Introduction to approximation theory, Amer. Mathematical Society,
1998.

[Hull97] Hull, D. G., Conversion of optimal control problems into parameter optimization problems, J. Guidance, Control and Dynamics,
AIAA, 1997, vol. 20, no. 1.

[Lawden63] Lawden, D.F., Optimal trajectories for space navigation, Butterworth, London, England, 1963.

[Pontryagin62] Pontryagin, L.S. and Boltyanskii, V.G. and Gamkrelidze, R.V. and Mishchenko, E.F., The Mathematical Theory of Optimal Processes,
Interscience, 1962.

[Remez34] E. Remes, Sur le calcul effectif des polynomes d'approximation de Tchebichef, Comptes rendus hebdomadaires des séances de l'Académie des Sciences, no. 199, pages 337-340, 1934.
URL sujet détaillé : https://www.laas.fr/boreal/web/fr/stage/voirStage/224

Remarques : Compensation possible



SM207-134 Lambda Calculus on the reals compared to other real models of computation  
Admin  
Encadrant : Emmanuel HAINRY
Labo/Organisme : Loria
URL : carte.loria.fr
Ville : Nancy

Description  

Contrary to discrete computation, there is no Church-Turing thesis for models of computation on the reals. Various models exist: Recursive Analysis [1], which uses Turing machines to compute reals to any precision; the General Purpose Analog Computer [2], which combines basic continuous bricks in a fashion similar to circuits; the Blum-Shub-Smale model [3], which transforms the RAM model so that it can read and store real numbers in registers; Abstract Geometrical machines [4], inspired by cellular automata... Some comparability and incomparability results between these models are known, but none of them is clearly the right one.

Recently, an extension of the lambda calculus was introduced by Di Gianantonio and Edalat [5] for capturing continuous functions with the help of a differential operator. This model presents similarities with the GPAC, as differentiation is a core operator; however, it also has the ability to produce non-continuous functions, through a test of a kind otherwise only found in the BSS model. Where this model can be placed on the map of models of real computation is hence an open question.

The goal of this internship will be to compare this Lambda Calculus with Differential (LCD) with the GPAC. Are all functions generated by the GPAC definable in LCD, and conversely? For possible simulations, do they preserve computing time?
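As a concrete reminder of what the GPAC's "continuous bricks" compute: feeding an integration unit back into itself realizes y' = y, whose solution is the exponential. The forward-Euler discretization below is only an illustration of this feedback circuit, not part of the internship's formal models:

```python
# GPAC circuit for exp: an integrator whose output is its own input (y' = y).
# Forward-Euler discretization with step h; the exact GPAC semantics is the
# ODE itself, so the discrete loop only approximates it.
h = 1e-5
y = 1.0                      # initial condition y(0) = 1
for _ in range(100000):      # integrate from t = 0 to t = 1
    y += h * y
# y now approximates e = exp(1), up to O(h) discretization error
```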
References

1. Weihrauch, K.: Computable Analysis: an Introduction. Springer (2000).

2. Shannon, C.E.: Mathematical theory of the differential analyzer. J. Math. Phys. MIT. 20, 337-354 (1941).

3. Blum, L., Cucker, F., Shub, M., Smale, S.: Complexity and Real Computation. Springer (1998).

4. Durand-Lose, J.: Abstract geometrical computation 5: embedding computable analysis. Natural Computing. 10, 1261-1273 (2011).

5. Di Gianantonio, P., Edalat, A.: A language for differentiable functions. Foundations of Software Science and Computation Structures. pp. 337-352 (2013).

URL sujet détaillé : http://www.loria.fr/~hainry/sujet2_en.html

Remarques : Co-advisor: Romain Péchoux



SM207-135 Complexity Analysis in Java with Tier based Type System  
Admin  
Encadrant : Romain PECHOUX
Labo/Organisme : Loria
URL : carte.loria.fr
Ville : Nancy

Description  

Domain

The aim of Implicit Complexity is to design criteria (type systems, semantic interpretations) to prove that programs belong to given classes of complexity. A new implicit complexity analysis based on a type system for imperative and object oriented languages was proposed in articles [1], [2] and [7]. This analysis is inspired by Data Ramification techniques [3, 4] and by non-interference [5]. It ensures that if a program can be typed and terminates, it will run in polynomial time (or in a different context, polynomial space).
Goal

The goal will be to extend the existing type system to capture more features of Object Oriented Programs, while still ensuring polynomial time or polynomial space execution.

This work should focus on the following ideas:

---increase the number of programs that can be analysed through the use of program transformation techniques;
---improve the expressivity of the considered language by adding, for example, forks, threads, functions, or exceptions;
---explore the common cases in real-world programs for which this analysis fails and correct the type system to capture them.

Prerequisite

Knowledge of Java (or a similar object-oriented language); a good understanding of complexity theory and of type theory would be welcome.
References

(1) J.-Y. Marion, A Type System for Complexity Flow Analysis. LICS 2011, pp. 123-132 (2011)
(2) J.-Y. Marion, R. Péchoux, Complexity Information Flow in a Multi-threaded Imperative Language, TAMC 2014, LNCS.
(3) S. Bellantoni and S. Cook, A new recursion-theoretic characterization of the poly-time functions, Computational Complexity 2 (1992), p. 97-110.
(4) D. Leivant and J.-Y. Marion, Lambda calculus characterizations of poly-time, Fundamenta Informaticae 19 (1993), no. 1,2, p. 167-184.
(5) Dennis M. Volpano, Cynthia E. Irvine, Geoffrey Smith, A Sound Type System for Secure Flow Analysis. Journal of Computer Security 4(2/3): 167-188 (1996)
(6) Byron Cook, Andreas Podelski, Andrey Rybalchenko, Terminator: Beyond Safety. CAV 2006, 415-418
(7) E. Hainry, R. Péchoux, Type-based heap and stack space analysis in Java, http://hal.inria.fr/hal-00773141, 2013


URL sujet détaillé : http://www.loria.fr/~hainry/sujet1.html

Remarques : Co-advisor: Emmanuel Hainry



SM207-136 Study of Schnyder woods in graphs on surfaces  
Admin  
Encadrant : Benjamin LEVEQUE
Labo/Organisme : LIRMM
URL : http://www.lirmm.fr/~leveque/
Ville : MONTPELLIER

Description  

A "Schnyder wood" is an orientation and coloring of the internal edges of a planar triangulation satisfying a nice local property. Each vertex must have exactly three outgoing edges of different colors: an outgoing red, green and blue edge, appearing in clockwise order around the vertex (see Figure 1). All the other edges incident to this vertex must be incoming. Incoming red edges must enter in the sector between the outgoing blue edge and the outgoing green edge, and symmetrically for the other colors. This simple local property implies a global one: the internal edges of the triangulation are partitioned into 3 trees according to the colors of the edges, hence the name Schnyder wood (see Figure 2). Schnyder woods have numerous applications in computer science and mathematics, such as compact encoding, graph drawing, representations by geometric objects, etc.

We propose a generalization of this structure to triangulated graphs on orientable surfaces of arbitrary genus: sphere, torus, double torus, pretzel, etc. In this generalization a vertex has outdegree equal to (0 modulo 3), and the local property on colors is then repeated several times around a vertex (see Figure 3). By a result of Barat and Thomassen, we know that there always exists an orientation of the edges such that every vertex has outdegree equal to (0 modulo 3). But for the moment, the property on colors has only been proved in the case of the sphere and of the torus.

The goal of this internship is to study in depth the case of the double torus. It is easy to see that, given an orientation in which every vertex has outdegree equal to (0 modulo 3), it would suffice to reverse directed cycles to also obtain the property on the colors. The student could first write a program to understand how these cycle reversals behave on examples of triangulations of the double torus, and then try to prove the existence of the studied object in full generality.
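A useful invariant for such experiments is that reversing a directed cycle leaves every outdegree unchanged: each vertex on the cycle loses one outgoing arc and gains one. A minimal sketch, with a hypothetical arc-list representation of the orientation:

```python
def out_degrees(arcs):
    """Outdegree of every vertex in a list of directed arcs (u, v)."""
    deg = {}
    for u, v in arcs:
        deg[u] = deg.get(u, 0) + 1
        deg.setdefault(v, 0)   # make sure sinks appear with outdegree 0
    return deg

def reverse_cycle(arcs, cycle):
    """Reverse the arcs of a directed cycle, leaving all other arcs intact."""
    cycle = set(cycle)
    return [(v, u) if (u, v) in cycle else (u, v) for (u, v) in arcs]

# small example: reverse the directed triangle 0 -> 1 -> 2 -> 0
arcs = [(0, 1), (1, 2), (2, 0), (0, 3), (1, 3), (2, 3)]
flipped = reverse_cycle(arcs, [(0, 1), (1, 2), (2, 0)])
```

Since outdegrees are preserved, a search over cycle reversals can safely explore orientations with all outdegrees equal to (0 modulo 3) while hunting for one that also satisfies the color property.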

The internship proposal, including the Figures, is available here: http://www.lirmm.fr/~leveque/Stage.pdf
URL sujet détaillé : http://www.lirmm.fr/~leveque/Stage.pdf

Remarques : Stipend possible for non-normalien students



SM207-137 Rounding errors in numerical analysis  
Admin  
Encadrant : Sylvie BOLDO
Labo/Organisme : Inria Saclay
URL : https://www.lri.fr/~sboldo/
Ville : Orsay

Description  

This internship concerns the verification of programs that use floating-point arithmetic. This arithmetic is an approximation of the usual arithmetic on real numbers. Computations are carried out in finite precision; intermediate results are not represented exactly. This causes so-called rounding errors, which can propagate, grow, and potentially make the results diverge.

One approach to certifying a program is deductive verification, that is, the proof of logical properties extracted from the program. It relies on a base of theorems formally proved with the Coq system.

Here we will focus on stable numerical schemes suited to ordinary differential equations, and on their rounding error. We have proved a particular case of very substantial error compensation in a simple setting for a partial differential equation. It would therefore be interesting to know whether, in general, numerical schemes behave well or can be pathological with respect to floating-point computations.

The goal of this internship is to clarify, for ordinary differential equations only, the link between two notions: stability in the sense of numerical analysis, and stability in the sense of floating-point arithmetic, i.e., the fact that the computation goes "well", meaning that the errors compensate each other because of the particular form of a stable and convergent numerical scheme.
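The compensation phenomenon can be demonstrated on the simplest possible "scheme", repeated accumulation. This is only an illustrative example, not one of the schemes studied in the internship: Kahan's compensated summation tracks and re-injects the rounding error of every addition, while the naive loop lets those errors accumulate.

```python
import math

def kahan_sum(xs):
    """Compensated summation: c carries the rounding error of each addition."""
    s = c = 0.0
    for x in xs:
        y = x - c          # re-inject the error lost at the previous step
        t = s + y
        c = (t - s) - y    # rounding error committed by the addition s + y
        s = t
    return s

data = [0.1] * 10**6
naive = 0.0
for x in data:
    naive += x             # plain accumulation: errors pile up
exact = math.fsum(data)    # correctly rounded reference sum
```

Here the naive loop drifts by about 1e-6 away from the correctly rounded result, while the compensated sum agrees with it to within a few ulps.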
URL sujet détaillé : https://www.lri.fr/~sboldo/files/stageM2_an.pdf

Remarques :



SM207-138 Bisimulations for the rho-calculus  
Admin  
Encadrant : Serguei LENGLET
Labo/Organisme : Loria
URL : http://www.loria.fr/~slenglet/
Ville : Nancy

Description  

The rho-calculus [1] extends the lambda-calculus with rewriting notions, such as first-order patterns, which can be passed to functions as arguments and manipulated by terms. The calculus is useful for studying the theory of rule-based programming languages such as Tom [2]. The goal of this internship is to study the behavioral theory of the rho-calculus, i.e., to find how to prove formally that two rho-calculus terms behave in the same way. To do so, one has to define relations called bisimulations and prove their soundness. More precisely, we want to extend the existing bisimulations for the lambda-calculus to the rho-calculus. We aim at proving equivalences between patterns or strategies in the rho-calculus, or maybe between programs in Tom.

[1] rho.loria.fr
[2] tom.loria.fr
URL sujet détaillé :

Remarques : Internship co-supervised with Horatiu Cirstea



SM207-139 Multi-scale classification and analysis of data on networks  
Admin  
Encadrant : Paulo GONÇALVES
Labo/Organisme : The internship will be co-supervised by Paulo Gonçalves (Inria DR, LIP ENS Lyon) and Pierre Borgnat (CNRS CR, Physics Lab. ENS Lyon).
URL : https://team.inria.fr/dante/fr/
Ville : Lyon

Description  

Keywords: signal processing, networks, graph wavelets, semi-supervised learning, clustering

In a growing number of scientific studies, it is necessary to add to the data a description layer in terms of networks: this is true when studying transportation networks, biological networks, social networks, sensor networks, or Internet networks [New10, BBV08], or even images, possibly with irregular or only point samples [LG12]. For processing these data, which have a structure of relational properties encoded by networks, the field of data processing on networks is emerging [SNF+13] and needs to rely jointly on approaches from computer science, physics and signal processing.

Elaborating on our previous work on multiscale community detection in networks [TB14] and semi-supervised learning on graphs [GGFSM14], the general idea proposed for this internship is to study graph-based multiscale processing, transformation and filtering of data on networks. For that, one promising approach is to consider that many detection, segmentation or estimation problems for data on graphs can be cast as learning, or classification, problems with a varying degree of supervision. One objective of this project will be to probe how semi-supervised learning (SSL) [AGMS12] can be designed for data on graphs so as to fall between the supervised and unsupervised cases.
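As a toy instance of graph-based SSL in the spirit of [AGMS12] (a minimal sketch with an ad-hoc graph, not the generalized framework itself), one can diffuse two known labels through the combinatorial Laplacian by solving the regularized system (I + μL)f = y:

```python
import numpy as np

# two triangles joined by a bridge; node 0 labeled +1, node 5 labeled -1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A               # combinatorial Laplacian
y = np.zeros(n)
y[0], y[5] = 1.0, -1.0                       # the only labeled nodes
mu = 1.0                                     # smoothness/fit trade-off
f = np.linalg.solve(np.eye(n) + mu * L, y)   # diffuse the labels
labels = np.sign(f)
```

The unlabeled nodes inherit the label of their cluster: nodes 1-2 come out positive, nodes 3-4 negative.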

This internship is clearly envisioned as the preparatory stage of a possible PhD work within the framework of the ANR grant "Graph Data and Signal Processing" in collaboration with the Physics Lab of ENS de Lyon, GREYC (Caen), GIPSA-lab (Grenoble) and LIGM (Marne-la-vallée).


[AGMS12] K. Avrachenkov, P. Gonçalves, A. Mishenin, and M. Sokol. Generalized optimization framework for graph- based semi-supervised learning. In SIAM Data Mining, 2012.
[BBV08] A. Barrat, M. Barthlemy, and A. Vespignani. Dynamical processes on complex networks. Cambridge University Press, 2008.
[GGFSM14] B. Girault, P. Gonçalves, E. Fleury, and A. Singh Mor. Semi-supervised learning for graph to signal mapping: a graph signal wiener filter interpretation. In IEEE Int. Conf. on Acoust. Speech and Sig. Proc., Firenze (Italy), May 2014.
[LG12] O. Lezoray and L. Grady, editors. Image Processing and Analysis with Graphs: Theory and Practice. CRC Press, 2012.
[New10] M. E. J. Newman. Networks: An Introduction. Oxford University Press, 2010.
[SNF+13] D. Shuman, S.K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83-98, 2013.
[TB14] N. Tremblay and P. Borgnat. Graph wavelets for multiscale community mining. IEEE Transactions on Signal Processing, 62(20), October 2014.



URL sujet détaillé :

Remarques : The internship will be co-supervised by Paulo Gonçalves (Inria DR, LIP ENS Lyon) and Pierre Borgnat (CNRS CR, Physics Lab. ENS Lyon).

Internship stipend at the standard rate



SM207-140 The total s-energy of a multi-agent system  
Admin  
Encadrant : Bernadette CHARRON-BOST
Labo/Organisme : LIX, Ecole polytechnique
URL : http://www.lix.polytechnique.fr/
Ville : Palaiseau

Description  

The objective of the internship is to study the algorithmic proof of the general upper bound on the s-energy of a multi-agent system with bidirectional communication graphs, and to extend it to other models for which convergence is also guaranteed (e.g., the decentralized model with semi-simple communication graphs).
One application of particular interest is the variant of the Hegselmann-Krause model in social epistemology which assumes a cognitive division of labor [6]: truth-seeking systems consist of one fixed agent (the truth) and mobile agents (truth seekers and ignorants). Agents thus play non-symmetric roles, and communication graphs are non-bidirectional. However, each truth-seeking system corresponds to the decentralized model, and the convergence result for this model implies that truth-seeking systems converge. An upper bound on the s-energy in the semi-decentralized model would provide information about the convergence time of truth-seeking systems.
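For reference, the total s-energy of an averaging system, as introduced by Chazelle, sums the s-th powers of all pairwise disagreements along the communication edges over all time steps:

```latex
% x_i(t): state of agent i at step t; E_t: edge set of the communication
% graph at step t; the parameter s lies in (0, 1]. (Definition due to Chazelle.)
\[
  \mathcal{E}(s) \;=\; \sum_{t \ge 0} \;\sum_{\{i,j\} \in E_t}
  \bigl|\, x_i(t) - x_j(t) \,\bigr|^{\,s}
\]
```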
URL sujet détaillé : http://www.lix.polytechnique.fr/~charron

Remarques : Possible remuneration within a DGA-LIX contract



SM207-141 Towards a certified encyclopedia of triangle centers  
Admin  
Encadrant : Julien NARBOUX
Labo/Organisme : ICube/Université de Strasbourg
URL : http://dpt-info.u-strasbg.fr/~narboux/index.html
Ville : Strasbourg

Description  

Everyone knows a few characteristic points of triangles: the centroid, the circumcenter, the orthocenter, etc.
These points have numerous properties that geometers have studied for centuries: for instance, the centroid, the circumcenter and the orthocenter are aligned on a line called the Euler line [1].
More recently, under the impulsion of Clark Kimberling, an encyclopedia of triangle centers has been developed: http://faculty.evansville.edu/ck6/encyclopedia/ETC.html
This encyclopedia currently contains the definition of more than 6000 triangle centers and lists numerous properties.
Using a computer algebra system, many properties of these triangle centers can be verified automatically. It is also possible to verify some of these properties with the Coq proof assistant. For instance, testing whether three points are collinear amounts to checking whether a determinant is the zero polynomial.
Our objective is to create a certified version of the ETC encyclopedia.
Preliminary work has already been carried out (by David Braun) to study the feasibility of the project and to generate automatically the proofs of numerous properties. But this work assumes certain facts, for instance:
the fact that the determinant of the 3*3 matrix formed by the trilinear coordinates of three points is zero iff the points are collinear,
the characterization, as an equation in trilinear coordinates, of the circumcircle of a triangle, etc.
The objective of this internship is to fill these gaps by proving, in the context of Tarski's geometry (http://dpt-info.u-strasbg.fr/~narboux/tarski.html), the various lemmas needed to carry out computations using trilinear or barycentric coordinates.
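The vanishing-determinant test for collinearity can be checked numerically on the Euler line. The sketch below works in Cartesian rather than trilinear coordinates, purely as an illustration of the determinant criterion on a concrete triangle:

```python
import numpy as np

def circumcenter(A, B, C):
    # solve |P - A|^2 = |P - B|^2 = |P - C|^2, two linear equations in P
    M = 2.0 * np.array([B - A, C - A])
    b = np.array([B @ B - A @ A, C @ C - A @ A])
    return np.linalg.solve(M, b)

def orthocenter(A, B, C):
    # intersect the altitude from A (perp. to BC) with the one from B (perp. to AC)
    M = np.array([C - B, C - A])
    b = np.array([A @ (C - B), B @ (C - A)])
    return np.linalg.solve(M, b)

A, B, C = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([1.0, 3.0])
G = (A + B + C) / 3.0                      # centroid
O = circumcenter(A, B, C)
H = orthocenter(A, B, C)
# the three centers are collinear iff this 3x3 determinant vanishes
det = np.linalg.det(np.array([[G[0], G[1], 1.0],
                              [O[0], O[1], 1.0],
                              [H[0], H[1], 1.0]]))
```

Besides collinearity, the classical ratio on the Euler line (G divides OH with OG:GH = 1:2) also comes out of the same data.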
URL sujet détaillé : https://docs.google.com/document/d/1R39fSKGhBU3PPLyjemCs6x8yVGUh1kyDsYljTiZ_yXU/pub

Remarques :



SM207-142 Sleep/Memory and advanced signal analysis in Neuroscience  
Admin  
Encadrant : Gaël MALLERET
Labo/Organisme : Centre de Recherche en Neurosciences de Lyon (CRNL)
URL : http://crnl.univ-lyon1.fr/index.php/fr/Recherche/Equipes/7
Ville : Lyon

Description  

Despite the huge progress made recently in understanding memory formation processes, the function of sleep, a critical step for these mechanisms, remains largely unknown.

We are currently studying the mechanisms of plasticity related to the different stages of sleep after learning in rats, a consistent model for understanding this function in humans. For this purpose, we have made very long-term recordings of freely behaving rats, sleeping and learning. These recordings include local field potentials, which are spontaneous electrical oscillations varying according to vigilance states and slightly different between distant brain regions. For the first time, we are coupling these measures with synaptic evoked potentials reflecting the strength of the connections between these areas.

We have developed a Matlab-based graphical software for the visualization and analysis of such signals according to vigilance state. This software allows the quantification of various parameters of advanced signal analysis, such as coherence between oscillations, phase differences, cross-correlation in amplitude at specific frequencies, detection of phasic bursts of oscillations, and other spectral parameters informing us about the evolution of functional connectivity between the brain regions of interest.

We are looking for a student motivated by a trans-disciplinary and highly fundamental approach to continue the development of this software, improving its functionality, and to help us in the treatment and management of the big data we acquired, in order to uncover the roles of sleep in memory.
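The coherence analysis mentioned above can be sketched with SciPy on synthetic LFP-like traces (an illustration only; the channel construction and the 7 Hz frequency are made up). Two channels sharing a rhythm buried in independent noise show high coherence at that frequency and chance-level coherence elsewhere:

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(0)
fs = 250.0                                     # sampling rate, Hz
t = np.arange(0, 60, 1 / fs)                   # 60 s of signal
theta = np.sin(2 * np.pi * 7 * t)              # shared 7 Hz "theta-like" rhythm
x = theta + rng.standard_normal(t.size)        # channel 1: rhythm + its own noise
y = 0.8 * theta + rng.standard_normal(t.size)  # channel 2: same rhythm, other noise
f, Cxy = coherence(x, y, fs=fs, nperseg=1024)  # Welch-averaged coherence spectrum
peak = Cxy[np.argmin(np.abs(f - 7.0))]         # coherence at the shared frequency
```

The same call, applied per vigilance state, gives the frequency-resolved functional-connectivity measure described in the text.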
URL sujet détaillé :

Remarques :



SM207-143 Absorbing in colors  
Admin  
Encadrant : Laurent BEAUDOU
Labo/Organisme : LIMOS / Univ. Blaise Pascal/ CNRS
URL : http://www.isima.fr/~beaudou/
Ville : Clermont-Ferrand

Description  

This internship will deal with the following conjecture, attributed to Erdős:

For every integer k there exists a constant f(k)
such that any k-arc-coloured tournament on n vertices contains an absorbing subset of vertices of size less than f(k).





URL sujet détaillé : http://www.isima.fr/~limouzy/absorption.pdf

Remarques : Internship co-supervised by L. Beaudou and V. Limouzy



SM207-144 Formal Semantics for JavaScript Proxies  
Admin  
Encadrant : Alan SCHMITT
Labo/Organisme : Projet Celtique, INRIA Rennes
URL : http://www.irisa.fr/celtique/aschmitt/
Ville : RENNES

Description  

JavaScript has become the de facto assembly language of the web. Not
only is it used in the setting of web applications, but it is now the
compilation target of several languages, enabling many programs to run
directly in a browser. Unfortunately, JavaScript is a complex and very
dynamic language, thus JavaScript programs are difficult to analyze
and even harder to prove correct. The combination of widespread use
and complexity motivates the setting of this internship: researching
formal techniques for safer JavaScript development.

To control the access to resources, new programming language
constructs have been proposed, as illustrated by Secure EcmaScript [1]
and the Caja language [2]. These constructs rely on "object
capabilities" and "proxies" [3], the latter currently being
standardized in EcmaScript 6 for the next version of JavaScript.

The topic of this internship is to formally define these constructs in
the setting of the JSCert formalization of JavaScript in Coq [4] and
to show that resource isolation may be achieved using proxies.
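To make the proxy/capability idea concrete without JavaScript, here is a rough Python analogue of a revocable read-only proxy. EcmaScript 6 proxies intercept object operations via handler traps in the same spirit; everything below is illustrative, not the JSCert or SES API:

```python
class RevocableReadOnlyProxy:
    """Mediates every attribute access to a target, like a JS Proxy handler."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_revoked", False)

    def revoke(self):
        # cut off the capability: all further accesses fail
        object.__setattr__(self, "_revoked", True)

    def __getattr__(self, name):          # "get" trap
        if self._revoked:
            raise PermissionError("capability revoked")
        return getattr(self._target, name)

    def __setattr__(self, name, value):   # "set" trap: deny all writes
        raise PermissionError("read-only view")
```

Code that holds only the proxy can read the resource until `revoke()` is called, and can never mutate it through the proxy: the essence of resource isolation via proxies, which the internship aims to establish formally for the EcmaScript constructs.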

The internship will be supervised by Alan Schmitt
(http://www.irisa.fr/celtique/aschmitt/), researcher in the Celtique
team (http://www.irisa.fr/celtique/) at Inria Rennes. The student will
have the opportunity to interact with the other members of the team,
but also with researchers of the JSCert project from Inria Saclay and
Imperial College. This internship opens up the opportunity of a PhD
thesis in the setting of the AJACS ANR project.

[1] https://code.google.com/p/es-lab/

[2] https://developers.google.com/caja/

[3] Trustworthy Proxies: Virtualizing Objects with
Invariants. T. V. Cutsem, M. S. Miller. Proceedings of ECOOP '13.
http://research.google.com/pubs/pub40736.html

[4] A Trusted Mechanised JavaScript Specification. M. Bodin,
A. Charguéraud, D. Filaretti, P. Gardner, S. Maffeis,
D. Naudziuniene, A. Schmitt, G. Smith. Proceedings of POPL '14.
http://jscert.org/

URL sujet détaillé : http://ajacs.inria.fr/internships/proxies.html

Remarques : This internship may be paid.



SM207-145 Construction of a statistical atlas for the modeling and registration of dynamic meshes  
Admin  
Encadrant : Hyewon SEO
Labo/Organisme : ICube
URL : http://icube-igg.unistra.fr/index.php/Hyewon_Seo
Ville : Strasbourg

Description  

We will construct a statistical atlas of a population of time-varying (dynamic) 3D objects, in particular human meshes. Linear and nonlinear models will be adopted in order to reduce the dimension of the original data. Several types of distributions will also be studied, such as Gaussian and non-Gaussian. Finally, we will propose an improved registration algorithm using the constructed atlas.

URL sujet détaillé : https://drive.google.com/open?id=0Bxt4xLT1VDZiRDJsdlMtSXBjaGM&authuser=0

Remarques : * Co-supervised with
- Christian Heinrich (ICube, Strasbourg, christian.heinrich.fr)
- Frederic Cordier (ICube, Strasbourg et UHA, Mulhouse, fcordier.fr)



SM207-146 Eye tracking and a visual saliency model for three-dimensional objects  
Admin  
Encadrant : Hyewon SEO
Labo/Organisme : ICube
URL : http://icube-igg.unistra.fr/index.php/Hyewon_Seo
Ville : Strasbourg

Description  

In this project, we will use eye-tracker devices to study the visual saliency of 3D objects, with the aim of building a ground truth for automatic saliency extraction methods. Once the ground truth is constructed, we will propose an error metric or evaluation method for existing saliency extraction methods. Finally, we will propose a new saliency extraction method based on the constructed ground truth, for instance by adopting a learning method.
URL sujet détaillé : https://drive.google.com/open?id=0Bxt4xLT1VDZic2poMzMtNWtWU3c&authuser=0

Remarques : * Co-supervised with
- Guillaume Lavoué (LIRIS, INSA)
- Chaker Larabi (XLIM, Poitiers)




SM207-147 Multidimensional deformation tools  
Admin  
Encadrant : Dominique BECHMANN
Labo/Organisme : ICube
URL : http://icube-igg.unistra.fr/fr/index.php/Dominique_Bechmann
Ville : Strasbourg

Description  


This work is about geometric modeling. Interactive editing of 3D object deformations is a tool for creating and modifying objects.

The objective of this work is to apply deformations based on tools of different dimensions simultaneously (points, curves, cages, etc.).

The difficulty of the work comes from the different mathematical formulation of each tool, which depends on its dimension.

A method for mixing the different tools will have to be proposed. It will ensure the continuity of the deformation at the interfaces between these tools.
URL sujet détaillé : https://dpt-info.u-strasbg.fr/~bechmann/M2ISI-Stage2015-Deformation3D.pdf

Remarques :



SM207-148 Reconstruction of vessels from their topological structure  
Admin  
Encadrant : Dominique BECHMANN
Labo/Organisme : ICube
URL : http://icube-igg.unistra.fr/fr/index.php/Dominique_Bechmann
Ville : Strasbourg

Description  


Reconstructing the surface of organs from a medical image is a difficult problem, especially for structures such as blood vessels or vascular networks.

3D medical images are composed of voxels, from which traditional algorithms reconstruct surface meshes composed of triangles, for example. However, with these reconstruction algorithms, defects are observed at the ends of the vessels. We therefore wish to design and implement a reconstruction method adapted to these structures. This is the subject of this work.
URL sujet détaillé : https://dpt-info.u-strasbg.fr/~bechmann/M2ISI-Stage2015-ReconstructionVaisseaux.pdf

Remarques :



SM207-149 Multi-organ reconstruction of medical images  
Admin  
Encadrant : Dominique BECHMANN
Labo/Organisme : ICube
URL : http://icube-igg.unistra.fr/fr/index.php/Dominique_Bechmann
Ville : Strasbourg

Description  



Reconstructing the surface of organs from a medical image is a difficult problem, especially if the algorithm must ensure topological consistency between the organs.

3D medical images are composed of voxels with different gray levels. These images are difficult to interpret. In addition, they are not suitable for more advanced processing. Thus, it is of interest to analyze these images and model the structures they represent.

A well-known algorithm for reconstructing such meshes, working on the boundary of each organ for all configurations of voxels, is Marching Cubes.

Its results, however, create inconsistencies between the meshes of two adjacent anatomical structures. These topological problems are visually annoying, but above all they do not respect the anatomical reality.

The subject of this work is to develop a multi-organ Marching Cubes algorithm for the simultaneous reconstruction of all the anatomical structures present in the image.
URL sujet détaillé : https://dpt-info.u-strasbg.fr/~bechmann/M2ISI-Stage2015-MarchingCubeMultiOrganes.pdf

Remarques :



SM207-150 Dynamic modeling of a 3D tongue using an articulograph  
Admin  
Encadrant : Slim OUNI
Labo/Organisme : LORIA
URL : http://www.loria.fr/~slim
Ville : Nancy

Description  

** Context
The human tongue plays a significant role in speech production. It participates in producing several kinds of sounds, and its position is critical for some phonemes. Even though it is only partially visible from outside the mouth, the tongue helps to improve the intelligibility of audiovisual speech. The tongue is a complex organ that is highly flexible, expandable and compressible, can be curved, and displays a fine degree of articulation. The dynamic aspects of tongue articulation (mainly coarticulation) are important, as they influence the intelligibility of the overall gestures. It is possible to observe the tongue's behavior and measure its dynamics using electromagnetic articulography (EMA). In particular, the AG501 articulograph records 3D positions and 2 rotation angles at 250 Hz. By gluing sensors on the tongue, it is possible to track its movement spatially and temporally.

**Missions
In our work, we mainly focus on visual intelligibility. In this context, the tongue is partially visible and thus the dynamics are more important than finely modeling the shape of the tongue. For these reasons, the objective of this work is to build a simple dynamic 3D tongue model. The challenge is to find a compromise between a highly flexible 3D structure with very complex movements and a simple representation controlled by few parameters. The initial model will be composed of few geometric primitives. Then, articulatory data will be used to refine the tongue shape. This data will be recorded using an articulograph that we have in our lab. The regions of the tongue where the sensors are glued will be carefully chosen for modeling the shape. In addition, the dynamic behavior of the tongue model will be trained on this collected data. The final result of this work would be a 3D deformable tongue, that can be controlled by few parameters and that will be integrated in an existing talking head.

A strong background in optimization and 3D geometry is recommended.
URL sujet détaillé : http://www.loria.fr/~slim/prop/propositionTongue-Sujet-M2-2015.pdf

Remarques : Paid internship (standard gratification).
Possibility of continuing with a PhD (doctorate).



SM207-151 Modeling the facial expressions of a 3D talking head  
Admin  
Encadrant : Slim OUNI
Labo/Organisme : LORIA
URL : http://www.loria.fr/~slim
Ville : Nancy

Description  

** Context
Speech communication is essentially bimodal: the acoustic signal carries the auditory modality, and the facial deformation carries the visual modality. This audiovisual communication does not just express the linguistic content of the spoken text; it also conveys an emotion (joy, sadness, disgust, etc.) or an expressivity.

In the field of audiovisual speech, we develop talking heads to study how to make audiovisual speech communication expressive. We recall that a talking head is a virtual 3D face animated synchronously with audio. In this context, we focus on the implementation of the facial expressions of the major emotions (joy, fear, sadness, disgust, surprise) in order to build an expressive talking head.

** Mission
The goal of this work is to model the facial expressions of a talking head. The control parameters of the facial expressions have to be determined so that the predefined emotions can be synthesized.
In the context of this project, we will first acquire 3D facial data for several emotions using two motion capture techniques: (1) motion capture software based on a Kinect; (2) a Vicon motion capture system that tracks markers glued on the face of the human speaker. Next, we will study the facial gestures that produce each emotion, and determine the facial features of each emotion. Finally, we will propose techniques to synthesize the emotions by modifying the talking head (either using machine learning techniques or using control parameters).

Good knowledge of computational geometry and data modeling is appreciated.

URL sujet détaillé : http://www.loria.fr/~slim/prop/proposition_ExpressionsFaciales-M2-2015.pdf

Remarques : Paid internship (standard gratification).
Possibility of continuing with a PhD (doctorate).



SM207-152 Categorical operational semantics  
Admin  
Encadrant : Tom HIRSCHOWITZ
Labo/Organisme : LAMA/Université Savoie Mont Blanc
URL : http://lama.univ-savoie.fr/~hirschowitz
Ville : Le Bourget-du-Lac

Description  


Operational semantics is a method for describing programming
languages, or at least their mathematical models, proposed by Plotkin
around 1980. The starting point of this internship is the distinction
between, on the one hand, the data gathered in an operational
semantics (terms, possibly with bound variables, labels, and
transition rules) and, on the other hand, the labelled transition
system generated by such data.

The subject of the internship is to organise the passage from one to
the other using category theory. The goal is to mathematise the
process of describing programming languages, the advantage of the
categorical approach being that every language described in this way
comes with a notion of denotational model.

More precisely, one will define a category whose objects are
operational semantics, and a category whose objects include the
generated labelled transition systems; one will then construct what
is called an "adjunction" between these two categories. One will then
look at certain programming languages already equipped with a
categorical semantics, in order to compare the standard notion of
model with the one obtained in the new framework.

The work will be guided by an earlier article [1], which does roughly
the same thing in the more limited setting of higher-order rewriting.

[1] T. Hirschowitz. Cartesian closed 2-categories and permutation
equivalence in higher-order rewriting. Logical Methods in Computer
Science 9(3:10). 2013.

http://www.lmcs-online.org/ojs/viewarticle.php?id=757

URL sujet détaillé :

Remarques :



SM207-153 Combinatorial study of trees and hyperplane arrangements  
Admin  
Encadrant : Sylvie CORTEEL
Labo/Organisme : LIAFA, CNRS et Universite Paris 7
URL : http://www.ima.umn.edu/2014-2015/W11.10-14.14/activities/Corteel-Sylvie/corteelima9nov.pdf
Ville : PARIS

Description  

The set of rooted labelled trees is well studied and appears in many
problems. We define a generalization, for reasons that will be
explained later, by coloring the edges. The number of rooted labelled
trees on n vertices is known to be n^{n-1}, and since each of the n-1
edges can independently receive one of m colors, the number of colored
trees is (mn)^{n-1}, where n is the number of vertices and m the
number of colors.
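
As a sanity check of the counting formula (an illustrative sketch, not part of the internship material): the brute force below enumerates all spanning trees on n labelled vertices directly, then accounts for the n choices of root and the m^{n-1} edge colorings, and compares with (mn)^{n-1}.

```python
from itertools import combinations

def count_colored_rooted_trees(n, m):
    """Brute force: enumerate all (n-1)-edge subsets of the complete
    graph K_n, keep the acyclic ones (which are exactly the spanning
    trees), then multiply by the n root choices and the m^(n-1)
    edge colorings."""
    edges = list(combinations(range(n), 2))
    trees = 0
    for subset in combinations(edges, n - 1):
        parent = list(range(n))           # union-find forest

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:                  # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            trees += 1
    return trees * n * m ** (n - 1)

# Matches the closed formula (mn)^(n-1):
assert count_colored_rooted_trees(4, 2) == (2 * 4) ** 3   # 512
```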

Among these colored trees we consider some special conditions, such
as: for a given color, the children are smaller than the parent
(decreasingness), or at least one of the children is smaller than the
parent (non-increasingness). These definitions being new, many things
remain to be discovered.

We defined these conditions on colored trees in relation to the study
of some hyperplane arrangements. In dimension d, a hyperplane
arrangement is a set of affine hyperplanes A = {H_1, ..., H_n}, each
hyperplane H_l being given by its equation
a_1^{(l)} x_1 + a_2^{(l)} x_2 + ... + a_d^{(l)} x_d = b^{(l)}.
One looks for combinatorial properties of the arrangement, such as
describing or counting its regions. A more general piece of
combinatorial information is the characteristic polynomial. Some
special hyperplane arrangements are of interest to us because they
have simple defining equations, of the form x_i - x_j = b.
These arrangements are sometimes called deformations of the braid
arrangement, or affinographic arrangements. The first of these
families is the braid arrangement (b = 0). The case b = 1 is called
the Linial arrangement, and the case b in {0, 1} the Shi arrangement.
These three special families, together with many others obtained by
changing the values of b, have been the object of many questions and
results over the last twenty years.

The first question when looking at such arrangements is to determine
the number of regions, or preferably the characteristic polynomial. A
good way to determine the number of regions is to give a bijective
proof, that is, a bijection between the family we want to count and a
family that is already known. For this kind of proof we need to find
the (known) target family. The colored trees with the special
properties above were defined precisely because we proved them to be
in bijection with the regions of these kinds of arrangements.
The main reference for the study of this kind of arrangements is :
Postnikov and Stanley, Deformations of Coxeter hyperplane arrangements.
Journal of Combinatorial Theory Serie A (2000).

The questions that should be studied in this project are:

- the combinatorial study of the colored trees with decreasing or
non-increasing conditions. Some results have just been found, but
there are still many things to look at.

- understanding the relation with the hyperplane arrangements.
Finding bijective proofs was our starting point; it still needs some
more work.

- some obvious direct generalizations. The deformations of the braid
arrangement are very specific; since the approach now works reasonably
well, we can try to apply our tools to other similar families.

URL sujet détaillé :

Remarques : Co-supervision with David Forge (LRI, Universite Paris-Sud, forge.fr)



SM207-154 Reoptimization of the Shortest Superstring Problem  
Admin  
Encadrant : Eric RIVALS
Labo/Organisme : LIRMM
URL : http://www2.lirmm.fr/~rivals/
Ville : Montpellier

Description  

The Shortest Superstring Problem (SSP) is a well known problem. Given a set P of finite strings, it seeks the shortest superstring (a string w such that each string of P is a substring of w). This problem is NP-hard (Gallant et al. 1980) and many approximation algorithms exist (Blum et al. 1994, Paluch 2014). The reoptimization of the Shortest Superstring Problem describes the following scenario: given a set of strings together with an optimal solution for it, we want to find a good solution for a locally modified instance (addition, deletion or modification of a string or of a set of strings).


This internship aims at surveying the literature and proposing ideas toward one of the following goals:

1- Find an algorithm for the reoptimization of the Shortest Superstring Problem under different local modifications of the instance, for example:
- addition of a string (Bilo et al. 2011)
- modification of a string (insertion, deletion or mutation)
2- Given a set P of strings, an approximation algorithm A with approximation ratio r, and the solution w of A for P, find the ratio between w and the solution of A for different locally modified instances of P.
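
To fix ideas, here is the textbook greedy-merge heuristic for SSP (the classic baseline, not one of the algorithms cited above); it assumes no input string is a substring of another:

```python
def overlap(a, b):
    """Length of the longest suffix of a that is a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def greedy_superstring(strings):
    """Greedy heuristic for SSP: repeatedly merge the ordered pair of
    strings with the largest overlap until one string remains.
    Assumes no input string is a substring of another."""
    s = list(strings)
    while len(s) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i in range(len(s)):
            for j in range(len(s)):
                if i != j:
                    k = overlap(s[i], s[j])
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        merged = s[best_i] + s[best_j][best_k:]
        s = [x for t, x in enumerate(s) if t not in (best_i, best_j)]
        s.append(merged)
    return s[0]

w = greedy_superstring(["abc", "bcd", "cde"])
assert w == "abcde" and all(x in w for x in ["abc", "bcd", "cde"])
```

Reoptimization then asks how to update such a solution cheaply when a string is added, deleted or modified, instead of rerunning the whole merge from scratch.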


This topic is well suited to a candidate with a good background in algorithmics and complexity.
URL sujet détaillé : http://www2.lirmm.fr/~rivals/INTERN/

Remarques : The compensation is that of the CNRS for internships.



SM207-155 Algorithmics of locally optimal RNA structures  
Admin  
Encadrant : Hélène TOUZET
Labo/Organisme : Bonsai team, Inria Lille Nord Europe and LIFL
URL : http://www.lifl.fr/~touzet/
Ville : Lille

Description  

RiboNucleic Acids (RNAs) are single-stranded macromolecules which can be crudely described as sequences of length ranging from 20 to 3 000 letters over the four-letter alphabet {A,C,G,U}.
Due to its single-stranded nature, any synthesized RNA molecule undergoes a folding process in which connections, known as base pairs and mediated by hydrogen bonds, are established between its nucleotides. The outcome of RNA folding is a large variety of structures, or spatial conformations, which strongly determine the function of a given RNA within cellular mechanisms. From a computer science perspective, the structure(s) of an RNA can be acceptably abstracted as a graph or a tree.

The main goal of this internship is the design and implementation of one (or several) algorithm(s) for the efficient enumeration and random generation of locally optimal RNA structures. Such structures act as kinetic traps in the energy landscape (local minima of the energy function) and are generally believed to significantly slow down, or even disrupt, the folding of structured RNAs. Their enumeration is a mandatory first step towards an efficient in silico analysis of the kinetic behavior of an RNA from its sequence. Previous work [A] resulted in a polynomial-time algorithm for the enumeration of locally optimal structures in a combinatorial setting based on base-pair maximization. This algorithm will be the starting point of the internship, and will be completed/extended to capture the complete set of features supported by the more accurate and realistic Turner model [B].
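
For reference, the base-pair maximization setting mentioned above can be made concrete with a Nussinov-style dynamic program computing the maximum number of base pairs of a sequence (a sketch of the underlying combinatorial model only, not of the enumeration algorithm of [A]):

```python
def nussinov(seq, min_loop=3):
    """Base-pair maximization: M[i][j] is the maximum number of
    non-crossing Watson-Crick/wobble pairs in seq[i..j], with a
    minimum hairpin loop of min_loop unpaired bases."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}
    n = len(seq)
    M = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = M[i + 1][j]            # base i left unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in pairs:   # pair (i, k)
                    right = M[k + 1][j] if k < j else 0
                    best = max(best, 1 + M[i + 1][k - 1] + right)
            M[i][j] = best
    return M[0][n - 1]

assert nussinov("GAAAC") == 1    # one G-C pair around an AAA hairpin
assert nussinov("AAAAA") == 0    # A cannot pair with A
```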

[A] Azadeh Saffarian, Mathieu Giraud, Antoine de Monte, and Hélène Touzet.
RNA locally optimal secondary structures.
J Comput Biol, 19(10):1120-1133, Oct 2012.
[B] D. H. Mathews, J. Sabina, M. Zuker, and D. H. Turner.
Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure.
J Mol Biol, 288(5):911-940, May 1999.
URL sujet détaillé : http://www.lix.polytechnique.fr/~ponty/stages/LocOptStage-BONSAI-RNALands-2014-EN.pdf

Remarques : Co-supervision with Yann Ponty (LIX-Ecole Polytechnique)
A follow-up PhD is possible within the RNALands project



SM207-156 Estimation of the photometric properties for digitized 3D objects acquired in uncontrolled lighting conditions  
Admin  
Encadrant : Rémi ALLEGRE
Labo/Organisme : ICube UMR 7357 CNRS / Université de Strasbourg
URL : http://icube-igg.unistra.fr/fr/index.php/Accueil
Ville : Strasbourg

Description  

Current 3D digitization technologies for real objects considerably simplify the process of creating 3D models, especially for the digital content production industry (video games, virtual museography, e-business, etc.). In order to perform realistic rendering under variable lighting conditions, the common approach consists in estimating the parameters of a Bidirectional Reflectance Distribution Function (BRDF) model that characterizes the optical properties of a material. Most existing methods for estimating these parameters rely on photometric acquisitions performed in perfectly controlled lighting environments, which imposes too many constraints in an industrial context. Currently there is no sufficiently reliable and robust method to estimate photometric properties in uncontrolled lighting conditions. The goal of this internship is twofold. The first objective is to study and evaluate an existing method based on a Phong model. The second objective is to propose an improvement, e.g. based on a more complex BRDF model such as the Lafortune model. The implementation will be carried out in C++ and incorporated into the digitization software of the IGG group.
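
As an illustration of the kind of model involved, here is the classical Phong reflection term that such a parameter estimation would fit (a sketch; the function and variable names are ours, not the IGG software's):

```python
import math

def phong_intensity(n, l, v, kd, ks, alpha):
    """Phong model: I = kd * max(n.l, 0) + ks * max(r.v, 0)^alpha,
    where r = 2(n.l)n - l is the mirror reflection of the light
    direction l about the surface normal n.  All direction vectors
    are assumed unit-length and pointing away from the surface."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    ndotl = dot(n, l)
    r = tuple(2.0 * ndotl * ni - li for ni, li in zip(n, l))
    diffuse = kd * max(ndotl, 0.0)
    specular = ks * max(dot(r, v), 0.0) ** alpha
    return diffuse + specular

# Head-on light and view: both terms reach their maximum, kd + ks.
i = phong_intensity((0, 0, 1), (0, 0, 1), (0, 0, 1), kd=0.6, ks=0.3, alpha=10)
assert math.isclose(i, 0.9)
```

Estimating the photometric properties then amounts to fitting kd, ks and alpha to intensities observed under (here uncontrolled) lighting.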
URL sujet détaillé : http://dpt-info.u-strasbg.fr/~allegre/SUJETS/MASTER/Stage_M2_Numerisation_2015_EN.pdf

Remarques : - Advisor: Jean-Michel Dischler, Professor of Computer Science
- Co-advisor: Rémi Allègre, Associate Professor of Computer Science
- Duration: 6 months
- Funding: About 520 euros per month, net salary



SM207-157 Sloppy Equations  
Admin  
Encadrant : Tomofumi YUKI
Labo/Organisme : LIP
URL : http://www.ens-lyon.fr/LIP/COMPSYS
Ville : Lyon

Description  

Context: Computational scientists spend a lot of time and effort
implementing their "equations" as programs to run their simulations. Why can't the equations themselves be executed if they define the computation? This is because the equations are usually too "sloppy" to serve as formal executable specifications.

The devil is usually in the details, namely the boundary cases, and completing them is tedious and error-prone. The main topic of this internship is to explore the limits of the "sloppiness" one could enjoy while writing programs/equations.

Subject: The main goal of this internship is to develop methods to derive a precise (non-sloppy) definition of the computation starting from a loosely defined (sloppy) input. The "sloppiness" may come in various forms, such as an unspecified domain, unspecified boundary cases, and so on.

The main focus is on Systems of Affine Recurrence Equations (SAREs), which can be extracted from regular loop programs. SAREs can be analyzed and transformed using the linear algebraic techniques used in automatic parallelizers today. The result of this work is expected to be used for program verification.
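
A toy example of what completing a "sloppy" equation means (our illustration, not taken from the subject): the recurrence below is sloppy because its domain and boundary cases are unspecified, and one possible precise completion makes them explicit.

```python
# Sloppy equation:  F(i) = F(i - 1) + F(i - 2)
# (no domain, no boundary cases: not executable as written)

def F(i):
    """One possible precise (non-sloppy) completion: the domain is the
    non-negative integers, and the boundaries are F(0) = F(1) = 1."""
    if i < 0:
        raise ValueError("i is outside the domain of the equation")
    if i in (0, 1):            # boundary cases, now explicit
        return 1
    return F(i - 1) + F(i - 2)

assert [F(i) for i in range(6)] == [1, 1, 2, 3, 5, 8]
```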
URL sujet détaillé : http://perso.ens-lyon.fr/tomofumi.yuki/M2-SloppyEquations.pdf

Remarques : Possible interactions with Dr. Sanjay Rajopadhye at Colorado State University.

Stipends are possible.



SM207-158 Development of Miniapps for Polyhedral Compilation  
Admin  
Encadrant : Tomofumi YUKI
Labo/Organisme : LIP
URL : http://www.ens-lyon.fr/LIP/COMPSYS
Ville : Lyon

Description  

Context: Experimental validation is an essential part of research in many domains, including optimizing compiler design. New ideas for program transformations are usually implemented as "proof of concept" tools and tested against various programs prior to publication. However, compiler writers are often criticized by application writers to be focused too much on benchmark kernels that can be far from what can be found in real-world applications.

Miniapps attempt to bridge this gap: they are smaller and simpler versions of a full application that can still serve as a performance proxy for the full version. The main topic of this internship is to develop such miniapps for polyhedral compilers, and then to test the limits of current tools.

Subject: The polyhedral model is one of the few successful techniques for automatic parallelization, and is being actively developed today. However, prototype research tools have difficulty handling real-world applications, as they are not as robust as production compilers.

We seek to develop miniapps based on applications from bio-informatics (sequence alignment), numerical simulations (shallow water modeling), and more. Once the miniapps are ready, we will evaluate state-of-the-art tools from polyhedral compilation, and explore opportunities for improvement.


URL sujet détaillé : http://perso.ens-lyon.fr/tomofumi.yuki/M2-PolyApps.pdf

Remarques : Stipends are possible.




SM207-159 Tile assembly and errors  
Admin  
Encadrant : Mathieu SABLIK
Labo/Organisme : I2M
URL : http://www.latp.univ-mrs.fr/~sablik/
Ville : Marseille

Description  

The first important result in the study of tilings defined by local interactions was to find local rules that form only quasi-periodic tilings (i.e. for every pattern appearing in the tiling, there exists a radius r such that this pattern appears in every ball of radius r). Quite naturally, these objects have been used to model the quasi-crystals discovered in 1982 by D. Shechtman, a discovery for which he received the Nobel Prize in Chemistry in 2011.

One is usually interested in the structure of quasi-periodic tilings defined by local rules. One can also study their formation, that is, how to build arbitrarily large patterns while respecting the local rules and allowing only local transformations (i.e. a transformation has no access to the whole pattern). Several growth models have been proposed: self-assembly (start from a correct pattern and add tiles respecting the local rules) and stochastic flips (start from an arbitrary pattern and perform local perturbations to restore the local rules). However, existing studies establish few links between structural properties and growth models.

The goal of this internship is to study the speed of formation of a tiling as a function of the algorithmic properties of the local rules that define it.
URL sujet détaillé : http://www.latp.univ-mrs.fr/~sablik/Stage.html#FormationCristal

Remarques :



SM207-160 Stretch factor of the 3D Delaunay triangulation  
Admin  
Encadrant : Olivier DEVILLERS
Labo/Organisme : INRIA
URL : http://www.loria.fr/~odevil
Ville : Nancy

Description  

The Delaunay triangulation is a popular structure in computational geometry (a particular triangulation, easy to compute and with numerous applications).
The stretch factor of a triangulation is the largest ratio, over all pairs of vertices, between the length of the shortest path linking them in the triangulation and their Euclidean distance. In 2D, finding a tight upper bound on the stretch factor is still open, even though progress has been made recently.

The aim of the internship is to obtain results in 3D, where almost nothing is known.
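
To make the definition concrete, the stretch factor of a given triangulation (in any dimension) can be computed by brute force from its edge list, as in this sketch:

```python
import heapq
import math

def stretch_factor(points, edges):
    """Largest ratio, over all vertex pairs, between the shortest-path
    length inside the triangulation and the Euclidean distance.
    Brute force: one Dijkstra run per source vertex."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    for i, j in edges:
        w = math.dist(points[i], points[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
    worst = 1.0
    for s in range(n):
        dist = {s: 0.0}
        pq = [(0.0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > dist.get(u, math.inf):
                continue
            for v, w in adj[u]:
                if du + w < dist.get(v, math.inf):
                    dist[v] = du + w
                    heapq.heappush(pq, (dist[v], v))
        for t in range(n):
            if t != s:
                worst = max(worst, dist[t] / math.dist(points[s], points[t]))
    return worst

# Unit square triangulated with one diagonal: the pair across the
# missing diagonal must go around, giving stretch 2 / sqrt(2) = sqrt(2).
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
tri = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
assert math.isclose(stretch_factor(pts, tri), math.sqrt(2))
```
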
URL sujet détaillé : http://www.loria.fr/~odevil/stretch3d.pdf

Remarques : a priori the legal internship allowance, 436 euros/month



SM207-161 Analysis of real-world large-scale mobile traffic datasets  
Admin  
Encadrant : Razvan STANICA
Labo/Organisme : CITI - INRIA UrbaNet
URL : http://perso.citi.insa-lyon.fr/rstanica/
Ville : Lyon

Description  

The usage of the Internet is becoming increasingly mobile and nomadic, fostered by the popularity of smartphones and mobile apps. This has led to an explosion of mobile traffic, whose compound annual growth rate (CAGR) was 146% between 2006 and 2013: not even fixed traffic on the Internet grew that fast during the golden age of the World Wide Web at the turn of the millennium. In addition, forecasts by prominent stakeholders, such as Cisco, tell us that the mobile communication hype is far from having reached its peak. Global mobile data traffic is still expected to grow at a compound annual rate of 61%, which means that the volume of mobile traffic in 2017 will be 13 times that recorded in 2012 [1].

As a result, current and upcoming radio access technologies will struggle to keep pace with the demand generated by bandwidth-hungry high-end smartphone applications. This manifests through poor service availability, and represents a major impediment for advanced applications. Mobile telecommunication operators are thus studying new solutions to accommodate the capacity demand in densely populated metropolitan areas. A number of approaches aiming at enhancing the access network infrastructure are under study. Yet, many of these solutions need knowledge of the spatio-temporal dynamics of mobile traffic demand, which, in other words, means answering the question "when, where and how do mobile users consume cellular network resources?"

Today, we do not have a precise answer to this question. This project falls within a larger initiative aiming at providing a definitive answer to it. To that end, we collected, in cooperation with cellular network operators, multiple datasets of real-world mobile traffic in large European and African cities, and we are developing a framework to automatically analyze these datasets and characterize the dynamics of the mobile traffic demand [2]. The successful candidate will work on developing new analytic tools, inspired by fields such as data mining, machine learning, statistical analysis and complex systems, and will integrate these tools into the framework. The work will be carried out in tight cooperation with PhD students, postdoctoral fellows, and researchers at Inria, France, and CNR-IEIIT, Italy.

References
[1] Cisco, "Global Mobile Data Traffic Forecast Update", 2013
[2] D. Naboulsi, R. Stanica, M. Fiore, "Classifying Call Profiles in Large-scale Mobile Traffic Datasets", IEEE Infocom, 2014
URL sujet détaillé :

Remarques :



SM207-162 Authenticating logs in collaborative editing systems  
Admin  
Encadrant : Claudia IGNAT
Labo/Organisme : Inria Nancy-Grand Est
URL : http://www.loria.fr/~ignatcla/
Ville : Nancy

Description  

In recent years, collaborative editing systems such as wikis, Google Drive and version control systems have become very popular. In order to ensure high availability, these systems replicate data. Users work simultaneously on replicas of shared documents, following cycles of divergence and convergence: document replicas diverge when users work in isolation and converge later when users synchronize their changes. Most of these systems keep user modifications in a log of operations that are then exchanged among replicas during synchronisation.

However, users can misbehave by tampering with logs for their own convenience. This can be critical for some collaborative systems, such as version control systems. If logs can be modified, revisions no longer correspond to the expected behavior of the software, and developers cannot be held responsible for the revisions to which they contributed. Furthermore, by modifying the log, a contributor may introduce security holes in the system under the name of another contributor. There is therefore a need to ensure the integrity of the log and, in case the log has been tampered with, to detect the misbehaving user.

The Coast team proposed a solution relying on hash-chain based authenticators [1] for authenticating logs; it ensures the authenticity and integrity of logs, as well as user accountability [2]. Algorithms for the construction of authenticators and the verification of logs were proposed. Our solution is based on a hash chain that is aggregated over time as new interactions among users are performed, corresponding to the addition of new entries to the log. Each time a hash value is computed, all the data used for hashing is processed and the computation starts from scratch.
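
The hash-chain construction described above can be sketched as follows (our simplified illustration, not the exact scheme of [1,2]):

```python
import hashlib

GENESIS = b"\x00" * 32   # initial authenticator value (our convention)

def extend(auth, entry):
    """Fold one new log entry into the running authenticator."""
    return hashlib.sha256(auth + entry.encode("utf-8")).digest()

def authenticate(log):
    """Authenticator of a whole log: the hash chain aggregated over time."""
    auth = GENESIS
    for entry in log:
        auth = extend(auth, entry)
    return auth

def verify(log, auth):
    """Recompute the chain from scratch and compare; this is the costly
    recomputation the internship aims to improve on."""
    return authenticate(log) == auth

log = ["insert(3, 'a')", "delete(7)"]
tag = authenticate(log)
assert verify(log, tag)
assert not verify(["insert(3, 'a')", "delete(8)"], tag)   # tampering detected
```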

The aim of this project is to investigate solutions for a better performance in constructing authenticators by taking advantage of commutativity among certain elements involved in hashing. Application of one-way accumulators with quasi-commutative property will be studied [3]. Theoretical complexity as well as experimental analyses will be performed.

A successful master student can continue with a PhD thesis on related topics in collaborative editing.

References:
[1] Di Ma and Gene Tsudik. 2009. A new approach to secure logging. ACM Transactions on Storage, 5(1), Article 2, March 2009
[2] Authenticating Operation-based History in Collaborative Systems. Hien Thi Thu Truong, Claudia-Lavinia Ignat and Pascal Molli. In Proceedings of the ACM International Conference on Supporting Group Work, Group 2012, pages 131-140, Sanibel Island, Florida, USA, October 2012
[3] Josh Benaloh and Michael de Mare. One-way accumulators: a decentralized alternative to digital signatures. In
Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology, EUROCRYPT '93, pages 274-285, Lofthus, Norway, May 23-27, 1993
URL sujet détaillé : http://www.loria.fr/~ignatcla/pmwiki/pmwiki.php/Main/Thesis14

Remarques :



SM207-163 Datamining for Parallel Corpus for Quality Estimation Applied to Machine Translation  
Admin  
Encadrant : David LANGLOIS
Labo/Organisme : LORIA - SMarT Group
URL : http://smart.loria.fr
Ville : Vandoeuvre-lès-Nancy

Description  

Introduction
------------
In the scope of Machine Translation, Quality Estimation [5] consists in automatically evaluating the quality (correctness, fluency) of a translated sentence, or the correctness of each word in this sentence. This automatic evaluation can be useful to help an expert post-edit and correct the sentence (or the document) [7], or to decide whether the translated document can be disseminated as it is [6].

To put it more directly: "Should this sentence t be considered as a correct translation of this one s?"

To make this decision possible by a computer, one uses a machine learning approach. This approach is based on a training corpus made up of triplets (s,t,score_st), where s is a source sentence, t is the automatic translation of s by a machine translation system, and score_st is an evaluation score given to the translation t by a human expert. Each pair (s,t) is automatically described by a vector v_st made up of numerical values (see below). Then, a machine learning method (SVM, Bayesian classifier, neural networks...) is applied to the pairs (v_st,score_st) in order to estimate a function that predicts the score for any vector. It is then possible to use this function to give a score to another pair (s',t') even if it does not occur in the training corpus.

The vector v_st describes the source sentence s and the target sentence t. It is made up of numerical values/features extracted from s and t. There are two kinds of features: glassbox and blackbox. Glassbox features are given by the machine translation system (confidence measures from the translation tables, from the language model). Blackbox features are independent of the system (for example, the lengths of s and t). Both kinds have drawbacks: glassbox features are not available if the translation system is online, and blackbox features, being independent of the translation system, are not necessarily correlated with the translation quality.
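
A few typical blackbox features can be computed directly from the sentence pair. This sketch (the feature choice is ours, for illustration) shows the kind of numeric vector v_st that a learner would then map to a quality score:

```python
def blackbox_features(source, target):
    """A tiny illustrative blackbox feature vector for a pair (s, t):
    source length, target length, length ratio, and mean source word
    length.  Real QE systems use dozens of such features."""
    s_words = source.split()
    t_words = target.split()
    return [
        len(s_words),                                         # |s|
        len(t_words),                                         # |t|
        len(t_words) / max(len(s_words), 1),                  # length ratio
        sum(len(w) for w in s_words) / max(len(s_words), 1),  # avg word len
    ]

v = blackbox_features("the cat sleeps", "le chat dort")
assert v == [3, 3, 1.0, 4.0]
```

A regressor or classifier is then trained on the pairs (v_st, score_st), as described above.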

Subject
-------
In this work, our objective is to add to the vector v_st a new piece of information that is independent of the translation system but remains correlated with the quality of the translation, because it is extracted from a parallel corpus representative of the language pair.

We propose to use a parallel corpus C. In C, each source sentence C_si comes with a translation C_ti made by a human expert. For each input sentence s and its automatic translation t, we propose to mine the parallel corpus in order to retrieve the sentences C_si that have common parts with s. We argue that if C_sk is close to s, then C_tk is also close to t. In this research work, we first want to check this hypothesis. Second, we want to propose a measure based on this idea and add it to v_st.

Expected work
-------------
The student will first have to become familiar with the approaches to quality estimation for machine translation. Then, they will propose a measure of the degree of overlap between a source sentence and the corpus. For that, a line of inquiry could be to use the state-of-the-art automatic measures of machine translation quality [4,3].

Work environment
----------------
The student will join the SMarT (http://smart.loria.fr) group. This research group has strong skills in machine translation and quality estimation. We have participated in two evaluation campaigns: Quality Estimation (QE) 2012 (http://www.statmt.org/wmt12/quality-estimation-task.html) and 2013 (http://www.statmt.org/wmt13/quality-estimation-task.html) [1,2]. The results of the student will be incorporated into our research platform. This work will lead to participation in the QE 2015 evaluation campaign.

References
----------
[1] D. Langlois, S. Raybaud, and K. Smaïli. Loria system for the WMT12 quality estimation shared task. In Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012.
[2] D. Langlois, S. Raybaud, and K. Smaïli. Loria system for the WMT13 quality estimation shared task. In Proceedings of the Eigth Workshop on Statistical Machine Translation, 2013.
[3] G. Doddington. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the second international conference on Human Language Technology Research, pages 138-145. Morgan Kaufmann Publishers Inc., 2002.
[4] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting on association for computational linguistics, pages 311-318. Association for Computational Linguistics, 2002.
[5] S. Raybaud, D. Langlois, and K. Smaïli. "This sentence is wrong." Detecting errors in machine-translated sentences. Machine Translation, 25(1):1-34, 2011.
[6] R. Soricut and A. Echihabi. Trustrank: Inducing trust in automatic translations via ranking. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 612-621, 2010.
[7] L. Specia. Exploiting objective annotations for measuring translation post-editing effort. In Proceedings of the 15th Conference of the European Association for Machine Translation, pages 73-80, 2011.

URL sujet détaillé : http://smart.loria.fr/sujet_datamining_qe.pdf

Remarques : The student will be supervised by David Langlois (http://www.loria.fr/~langlois) and Kamel Smaïli (http://www.loria.fr/~smaili).

The remuneration will be standard.





SM207-164 Alliances in graphs  
Admin  
Encadrant : Cristina BAZGAN
Labo/Organisme : LAMSADE, Université Paris-Dauphine
URL : http://www.lamsade.dauphine.fr/~bazgan/
Ville : Paris

Description  

We are interested in the study of defensive alliances and secure sets of a graph. Alliances occur in many contexts: members of different political parties, people who unite by friendship, companies with common economic interests, nations wishing to form defensive coalitions. A defensive alliance in a graph, introduced in [KHH04], is a set of vertices S such that each vertex of S has at least as many neighbors in S as outside S; that means that each vertex of S, together with its neighbors in S, can defend itself against any attack coming from its neighbors outside S. Finding a defensive alliance of minimum size in a graph is NP-hard. Nothing is known about the approximation of this problem. A defensive alliance concerns the defense of a single vertex. In many contexts, alliances must be formed so that any attack on any subset X of S can be defended by X and its neighbors in S. In this case, the set S is called secure. The security number of a graph, introduced in [BDT07], is the smallest size of a secure set. Nothing is known about the complexity and the approximation of this number.
Another interesting question is, given an insecure set N, how to add a minimal number of vertices in order to transform N into a secure set.
Some related problems, including complexity and approximation results, were studied in [BTV10]. A generalization of defensive alliances to defensive k-alliances was defined in [RYS09]: a set of vertices S is a defensive k-alliance if every vertex of S has at least k more neighbors in S than outside S. A similar generalization can be considered for secure sets.
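
The defensive-alliance condition is straightforward to check. This sketch follows the definition as phrased in this description (some variants in the literature also count the vertex itself among its defenders):

```python
def is_defensive_alliance(adj, S):
    """Check whether S is a defensive alliance in the graph given by
    the adjacency dict adj: every vertex of S must have at least as
    many neighbors inside S as outside S."""
    S = set(S)
    for v in S:
        inside = sum(1 for u in adj[v] if u in S)
        outside = len(adj[v]) - inside
        if inside < outside:
            return False
    return True

# Triangle: any two vertices form a defensive alliance, one alone does not.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
assert is_defensive_alliance(triangle, {0, 1})
assert not is_defensive_alliance(triangle, {0})
```

The hard question is of course not checking a candidate S but finding a smallest one.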


The goal of the internship is to study the complexity and the approximation of these problems. To tackle them, we will build on our experience with problems asking whether a graph admits a partition of its vertex set into two nonempty parts satisfying some degree constraints. We wrote a survey on the complexity and the approximation of these problems in [BTV10].
URL sujet détaillé : http://www.lamsade.dauphine.fr/~bazgan/Stages/stagesecure.pdf

Remarques :



SM207-165 Non-monotony in interaction networks  
Admin  
Encadrant : Sylvain SENÉ
Labo/Organisme : Laboratoire d'Informatique Fondamentale
URL : http://pageperso.lif.univ-mrs.fr/~sylvain.sene
Ville : Marseille

Description  

Most studies and known results on the dynamics of Boolean automata networks assume that the set of local transition functions defining a network contains only locally monotone functions. Intuitively, this means that if an automaton i influences an automaton j in the network, then there is no ambiguity about the nature of this influence: it must be either positive (i tends to activate j) or negative (i tends to inhibit j). This local monotony is a strong property that de facto induces a strong restriction on the networks studied. However, no biological results support such a restriction a priori. So, now that many interesting properties have been proven for locally monotone networks, it is important and relevant to study networks that no longer have this feature.

Moreover, our specific interest in this class of Boolean automata networks comes from previous works on the differences in their behaviours depending on their synchronous or asynchronous nature. In particular, we have shown that the smallest Boolean automata networks that are not robust to changes of synchronicity are non-monotone ones. We have also shown that monotone networks can have this property of non-robustness. One of the remarkable points here is that non-robust monotone networks can apparently always be encoded by a smaller non-monotone network. At present, no general results explaining this phenomenon have been found, although certain conditions for a monotone network to be non-robust to synchronicity perturbations have shown the need for very specific cycles in its definition. In our view, this emphasises the essential role of non-monotone architectural patterns in the complex dynamical behaviour of interaction networks.

Thus, for biological arguments related to the possible intrinsic non-monotone nature of regulations, as well as for computer science arguments related to the influence of synchronicity, studying the impact of non-monotony is very pertinent and promising, and should open up other interesting questions.
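To make the notion of local monotony concrete, here is a small illustrative check (our own sketch, not taken from the subject) that classifies the influence of one input of a Boolean local transition function by examining all input vectors:

```python
from itertools import product

def local_monotony(f, n, i):
    """Classify the influence of input i on f: {0,1}^n -> {0,1} as
    'positive', 'negative', 'constant' (no influence) or 'non-monotone',
    by comparing f on every pair of inputs differing only in coordinate i."""
    up = down = False
    for x in product((0, 1), repeat=n):
        if x[i] == 0:
            y = x[:i] + (1,) + x[i + 1:]
            if f(x) < f(y):
                up = True
            if f(x) > f(y):
                down = True
    if up and down:
        return "non-monotone"
    if up:
        return "positive"
    if down:
        return "negative"
    return "constant"
```

For instance AND is locally monotone (positive in each input), while XOR is non-monotone in each input: flipping one input can either raise or lower the output depending on the other input.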
URL sujet détaillé : http://pageperso.lif.univ-mrs.fr/~sylvain.sene/files/stage2015_nonmon_en.pdf

Remarques : - Co-supervision: Kévin Perrot (kevin.perrot.univ-mrs.fr)
- Duration: 6 months
- Remuneration: €523.26/month
- A follow-up PhD is envisaged and desired



SM207-166 Alternating-time Temporal Logic with Imperfect Information  
Admin  
Encadrant : Francesco BELARDINELLI
Labo/Organisme : Laboratoire IBISC
URL : https://www.ibisc.univ-evry.fr/~belardinelli/
Ville : Evry

Description  

Alternating-time Temporal Logic (ATL) has recently been put forward as a logical language to specify strategic properties of autonomous agents in distributed systems [1]. ATL extends the linear-time temporal logic LTL with modal formulas <<A>> f, for a group A of agents, whose intended meaning is that the agents in A have a strategy guaranteeing that f is achieved, irrespective of the moves of A's opponents.
ATL has been successfully applied to the specification and verification of game structures. Indeed, the ATL language is rich enough to express key game-theoretic concepts such as (strong) Nash equilibria and Pareto optimality, as well as typical safety properties (e.g., "the agents in group A have a strategy to perpetually avoid a deadlock": <<A>> G !deadlock) and liveness properties (e.g., "once a message is sent, the agents in A cannot prevent it from eventually being received": sent -> !<<A>> !F received).

Alternating-time Temporal Logic has been thoroughly investigated as regards its theoretical features (decidability and complexity of the satisfiability and model checking problems [2,3]), and it has been observed that these depend on various strategic capabilities of agents. In one dimension the strategies played by agents might depend on the whole sequence of states visited during the system's execution. If that is the case, we say that agents have perfect recall of visited states, and they use this information to decide which action to perform next. On the other hand, if agents only use information available in the current state to decide which action to take next, then they are said to deploy memoryless (or positional) strategies. In another dimension, agents might have perfect knowledge of the game structure (local states of other agents, available actions, etc -- consider chess or go for instance), or only a partial view of the game might be accessible to them, i.e., agents might have imperfect knowledge of the game structure (e.g., poker or bridge).
It is well known that perfect/imperfect recall and perfect/imperfect knowledge affect the decision problems for ATL. For instance, the decision problems for ATL with perfect recall are typically computationally harder than those for its imperfect-recall counterpart (even though there are some notable exceptions).
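As a toy illustration of memoryless strategies under perfect information (our own sketch, not part of the internship subject), the coalition modality <<A>> G can be model-checked as a greatest fixpoint over an explicit concurrent game structure; all names below are illustrative:

```python
from itertools import product

def coalition_pre(states, agents, moves, delta, A, Q):
    """States from which coalition A can force the successor into Q,
    whatever the other agents do (one-step strategic ability)."""
    others = [ag for ag in agents if ag not in A]
    pre = set()
    for s in states:
        for ours in product(*(moves(s, ag) for ag in A)):
            if all(delta(s, {**dict(zip(A, ours)), **dict(zip(others, theirs))}) in Q
                   for theirs in product(*(moves(s, ag) for ag in others))):
                pre.add(s)
                break
    return pre

def coalition_globally(states, agents, moves, delta, A, safe):
    """Memoryless model checking of <<A>> G safe as a greatest fixpoint."""
    Q = set(safe)
    while True:
        nxt = set(safe) & coalition_pre(states, agents, moves, delta, A, Q)
        if nxt == Q:
            return Q
        Q = nxt
```

For example, if agent "1" alone decides whether the system stays in the safe state s0 (action "a") or leaves it (action "b"), then <<{"1"}>> G safe holds in s0, while the empty coalition cannot enforce it.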

This project aims at contributing to the study of Alternating-time Temporal Logic under the assumption of imperfect information. Specifically, the state of the art for ATL is much more clear-cut for game structures with perfect knowledge. In this setting we have decision procedures for the model checking and satisfiability problems [2,3], and extensions are available [4].
However, in contexts of imperfect information the techniques to decide satisfiability for ATL fail. As a result, satisfiability of ATL with imperfect information is still an open problem.
The project is intended to map the boundary between perfect and imperfect information in ATL, by building on recent contributions on the subject [5]. Specifically, the envisaged workpackages are as follows:
1) analysis of state-of-the-art on tableau- and automata-based decision procedures for ATL;
2) diagnosis of failure of aforementioned procedures in contexts of imperfect knowledge;
3) development of a method to decide ATL satisfiability with imperfect knowledge;
4) possible implementation of the results in (3) in a tool.

The project programme is ambitious and challenging. If the logical language turns out to be undecidable, it will be of interest to explore syntactic and semantic restrictions that still admit a decidable satisfiability problem.
The candidate needs a strong background in formal methods for computer science and logic, as well as the ability to work autonomously and proactively. Further details on the project proposal are available upon request.


Selected References

1) Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. 2002. Alternating-time temporal logic. J. ACM 49, 5 (September 2002), 672-713. DOI=10.1145/585265.585270 http://doi.acm.org/10.1145/585265.585270

2) N. Bulling, J. Dix and W. Jamroga. Model Checking Logics of Strategic Ability: Complexity. In M. Dastani, K. V. Hindriks and J.-J. C. Meyer (Eds.), Specification and Verification of Multi-agent Systems, Springer US, 2010, 125-159.

3) Goranko, V., van Drimmelen, G.: Complete axiomatization and decidability of the Alternating-time Temporal Logic (2003)

4) Serenella CERRITO, Amélie DAVID, Valentin GORANKO. Optimal Tableaux-based Decision Procedure for Testing Satisfiability in the Alternating-time Temporal Logic ATL+. Automated Reasoning (IJCAR), LNAI 8562, 2014.

5) Nils Bulling and Wojciech Jamroga. Comparing variants of strategic ability: how uncertainty and memory influence general properties of games. Autonomous Agents and Multi-Agent Systems 28 (2014), 474-518. DOI 10.1007/s10458-013-9231-3

URL sujet détaillé :

Remarques : The internship is co-supervised with Catalin Dima, professor at Université Paris - Créteil.





SM207-167 Effective formalisation of algebraic reals  
Admin  
Encadrant : Yves BERTOT
Labo/Organisme : Inria
URL : http://www-sop.inria.fr/members/Yves.Bertot
Ville : Sophia Antipolis Cedex

Description  

Describe techniques to represent and manipulate algebraic numbers, and carry out the formal proofs in Coq (relying on the ssreflect library).
URL sujet détaillé : http://www-sop.inria.fr/members/Yves.Bertot/internships/algebraic_reals.pdf

Remarques : Paid internship
Co-supervised by Cyril Cohen



SM207-168 Exact QR decomposition over the rationals  
Admin  
Encadrant : Clément PERNET
Labo/Organisme : LIP
URL : http://lig-membres.imag.fr/pernet/
Ville : Lyon

Description  

Exact linear algebra, i.e. computing exactly over arbitrary-precision integers, rationals and finite fields, has developed over the last 15 years, motivated by applications in cryptology, computational mathematics and high-precision numerical computations.
State-of-the-art implementations, proposed in libraries such as LinBox (http://linalg.org) and fflas-ffpack (http://linalg.org/projects/fflas-ffpack), rely on efficient dense linear algebra kernels over word-size finite fields, and on lifting to the integers and rationals via the Chinese remainder algorithm or p-adic representations.
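The Chinese-remainder lifting mentioned above can be illustrated on the simplest example, an exact integer determinant. This is a didactic sketch, not the LinBox/fflas-ffpack API; in practice the primes are word-size and chosen so that their product exceeds twice the Hadamard bound on |det|:

```python
def det_mod_p(matrix, p):
    """Determinant over Z/pZ by Gaussian elimination (p prime)."""
    m = [[x % p for x in row] for row in matrix]
    n, det = len(m), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # row swap flips the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)       # modular inverse (Python >= 3.8)
        for r in range(col + 1, n):
            f = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - f * m[col][c]) % p
    return det % p

def det_crt(matrix, primes):
    """Recombine modular determinants by incremental CRT; the result is
    returned in the symmetric range (-M/2, M/2] to recover the sign."""
    M, x = 1, 0
    for p in primes:
        r = det_mod_p(matrix, p)
        t = (r - x) * pow(M, -1, p) % p
        x += M * t
        M *= p
    return x - M if x > M // 2 else x
```

For instance det([[2,1],[1,3]]) = 5 and det([[0,1],[1,0]]) = -1 are recovered exactly from their residues modulo a few small primes.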

Most elimination-based computations in exact linear algebra rely on efficient Gaussian elimination and the related PLUQ decomposition, which has already been well studied for both sequential and parallel implementations.
The QR decomposition, which transforms a matrix to triangular form by unitary transformations, although widely used in numerical linear algebra, has received less attention in computer algebra until recently. However, applications such as lattice reduction have raised a need for high-performance, high-precision QR decomposition, which motivates this research project.

After reviewing the state-of-the-art numerical algorithms for this computation, the student will design several variants of an exact QR decomposition algorithm, adapting the techniques of exact linear algebra used for Gaussian elimination: block-tiled and slab-recursive structure, sub-cubic matrix arithmetic, the Chinese remainder algorithm and rational reconstruction. Communication-efficient variants and their analysis will also be investigated.

This exact QR decomposition will then be adapted in a symbolic-numeric framework to offer an arbitrary-precision numerical QR algorithm. It will be benchmarked against alternative approaches, such as the direct use of the multi-precision floating-point numbers of the mpfr library (http://www.mpfr.org). Finally, applications to lattice basis reduction could be considered, time permitting.

URL sujet détaillé : http://lig-membres.imag.fr/pernet/Publications/exactQR.pdf

Remarques :



SM207-169 Ultimate periodicity in chip-firing games  
Admin  
Encadrant : Kévin PERROT
Labo/Organisme : Laboratoire d'Informatique Fondamentale
URL : http://pageperso.lif.univ-mrs.fr/~kevin.perrot/
Ville : Marseille

Description  

Chip-firing games are played on directed graphs. Each vertex is given a number of chips, and a single rule is applied in parallel on every vertex at each time step: if a vertex contains enough chips then it gives one chip to each of its out-neighbors, otherwise it does not give any chip.

Chip-firing games are very expressive (the model has been shown to be Turing-powerful). Nonetheless, on a finite graph and with a finite number of chips, the trajectory of every game is ultimately periodic.
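The ultimate periodicity is easy to observe experimentally. The sketch below (our illustration, assuming the usual rule where a vertex fires when it holds at least as many chips as its out-degree) simulates the parallel dynamics and returns the preperiod and period of a trajectory:

```python
def parallel_chip_firing(adj, chips, max_steps=10000):
    """Iterate the parallel firing rule until a configuration repeats.

    adj: dict vertex -> list of out-neighbours; a vertex fires when it
    holds at least out-degree chips, sending one chip along each out-arc.
    Returns (preperiod, period) of the trajectory.
    """
    seen = {}
    state = dict(chips)
    for t in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:                      # configuration already visited
            return seen[key], t - seen[key]
        seen[key] = t
        firing = [v for v in adj if adj[v] and state[v] >= len(adj[v])]
        nxt = dict(state)
        for v in firing:
            nxt[v] -= len(adj[v])
            for u in adj[v]:
                nxt[u] += 1
        state = nxt
    raise RuntimeError("no period found within max_steps")
```

On the directed triangle with a single chip, every trajectory is immediately periodic: the chip travels around the cycle with period 3.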

According to the interests of the candidate, the internship may consist in:
- producing a digest inventory of known results;
- continuing the classification of families of graphs according to whether:
1. the period of every trajectory is smaller than or equal to the number of vertices;
2. the period of every trajectory is polynomial;
3. the period of some trajectories may be exponential;
- establishing and exploiting links with other models, for example threshold automata;
- studying the impact of the update scheme on the limit cycles.
URL sujet détaillé : http://pageperso.lif.univ-mrs.fr/~kevin.perrot/documents/sujetM2-CFGPeriods_en.pdf

Remarques : - Co-supervision: Sylvain Sené (sylvain.sene.univ-mrs.fr)
- Duration: 6 months
- Remuneration: €523.26/month
- A follow-up PhD is envisaged and desired



SM207-170 Independent sets in hypergraphs  
Admin  
Encadrant : Yoshiharu KOHAYAKAWA
Labo/Organisme : Institute of Mathematics and Statistics, U. of São Paulo
URL : https://www.ime.usp.br/~yoshi/
Ville : São Paulo

Description  

Balogh, Morris and Samotij, and independently Saxton and Thomason, have recently developed a novel method for approaching quite diverse problems in asymptotic graph and hypergraph theory. They have shown that certain rather general results on independent sets in hypergraphs can be proved from simple hypotheses and can be applied very effectively in quite different areas, from colouring problems to asymptotic enumeration and probabilistic combinatorics.

Students involved in this project will be expected to study the original papers by the authors above, and will be given related literature for the applications. The main aim of this project is to reach a deep understanding of this method and to seek new applications in extremal and probabilistic combinatorics.

References

J. Balogh, R. Morris and W. Samotij, Independent sets in hypergraphs, J. Amer. Math. Soc., to appear.

D. Saxton and A. Thomason, Hypergraph containers, arXiv:1204.6595 [math.CO].

URL sujet détaillé :

Remarques :



SM207-171 Architecture synthesis for linear time-invariant filters  
Admin  
Encadrant : Florent DE DINECHIN
Labo/Organisme : Inria Socrate team, CITI / INSA
URL : http://flopoco.gforge.inria.fr/
Ville : Lyon

Description  

Linear time-invariant filters are a widely used class of digital signal processing blocks. They can be found in most signal processing applications (from sound processing to software-defined radio) but also in control applications. These mathematical objects can be implemented in a variety of ways. Such realizations always consist of a combination of registers and of a number of scalar products by constant vectors. However, the choice of a realization impacts performance, cost, and accuracy.

The literature offers papers comparing different realizations of the same filter, and observing that more accuracy can be obtained at the expense of more multipliers. Such studies are relevant in software implementations, where the multiplier size is a given. In the context of hardware or FPGA implementations, more accuracy can also be bought by larger multipliers. This internship will study this trade-off and design tools that explore it. The ultimate objective is to generate the smallest or fastest filter satisfying a given accuracy specification.

For this, the state of the art is Hilaire's formalism of the Specialized Implicit Form (SIF), which captures all possible realizations of a filter. Lopez' thesis at LIP6 recently showed how to formally compute the accuracy of a realization expressed in SIF form, depending on the fixed-point formats used. In parallel, we have worked in Socrate on the minimal implementation of a scalar product with a constant vector. The internship should build upon these recent complementary works.
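As a minimal illustration of what a realization is (our own sketch, unrelated to the SIF formalism itself), a direct-form FIR filter is exactly a delay line of registers plus one scalar product by the constant coefficient vector per output sample:

```python
def fir_direct_form(coeffs, samples):
    """Direct-form realization of an FIR linear time-invariant filter:
    a delay line of registers and, per output sample, one scalar product
    by the constant coefficient vector."""
    regs = [0.0] * len(coeffs)            # the registers (delay line)
    out = []
    for x in samples:
        regs = [x] + regs[:-1]            # shift the new sample in
        out.append(sum(c * r for c, r in zip(coeffs, regs)))
    return out
```

With coefficients [1, 1] this realizes a moving sum; other realizations of the same transfer function (transposed, cascaded, ...) trade multiplier count, depth and round-off behaviour, which is precisely the trade-off the internship explores.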

URL sujet détaillé :

Remarques :



SM207-172 Decompositions of Graphs  
Admin  
Encadrant : Yoshiko WAKABAYASHI
Labo/Organisme : Institute of Mathematics and Statistics - University of São Paulo (USP), Brazil
URL : http://www.ime.usp.br/~yw/
Ville : São Paulo

Description  

Area: Graph Theory

Title: Decompositions of Graphs

Abstract: The topic of research comprises problems on graph decompositions, in a very broad sense. We mention some problems on this topic that we are investigating or would like to welcome students to investigate.

OBS: A pdf file corresponding to the text below (with some definitions and references) can be found at: http://www.ime.usp.br/~yw/2014/projects/project.pdf

Given a graph G, a family of subgraphs {H_1,...,H_k} of G is a decomposition of G if these subgraphs are pairwise edge-disjoint and their union is precisely G. If each H_i is a path (resp. cycle) then we say that G admits a path (resp. cycle) decomposition. If each H_i is isomorphic to a graph H, then we say that G admits an H-decomposition.

The path number of a graph G, denoted by pn(G), is the size of a minimum path decomposition. According to Lovász, in 1966 Erdős asked about this parameter, and Gallai stated the following conjecture.

Conjecture [Gallai, 1966]
If G is an n-vertex connected graph, then pn(G) is at most (n+1)/2.

This parameter is not known for most graph classes. Lovász (1968) found an upper bound for a similar parameter: he proved that every n-vertex connected graph G can be decomposed into at most n/2 paths and cycles. From this result, it follows that Gallai's conjecture holds for connected graphs with at most one even-degree vertex. Some upper bounds for the path number of a connected graph in terms of its numbers of odd-degree and even-degree vertices were obtained by Lovász (1968), and improved by Donald (1980) and by Dean and Kouider (2000). Finding good upper bounds for pn(G) is a challenging problem; even for special classes of graphs not much is known.
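For concreteness (our illustration, not part of the subject), verifying that a candidate family of paths is a path decomposition of a graph is straightforward:

```python
def is_path_decomposition(edges, paths):
    """Check that `paths` (each a vertex sequence) are pairwise
    edge-disjoint paths whose union is exactly the undirected edge set
    `edges`, as in the definition of a decomposition above."""
    target = {frozenset(e) for e in edges}
    used = set()
    for p in paths:
        if len(set(p)) != len(p):            # a path repeats no vertex
            return False
        for u, v in zip(p, p[1:]):
            e = frozenset((u, v))
            if e not in target or e in used:  # unknown or reused edge
                return False
            used.add(e)
    return used == target
```

For the triangle K3, the two paths 1-0-2 and 1-2 form a decomposition, so pn(K3) = 2 = (3+1)/2, matching Gallai's bound.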

Another related problem concerns H-decompositions of a graph, where H is some fixed graph. This is also a topic that has raised much interest among researchers, and a number of interesting results have been found. An interesting variant concerns the case in which H is a tree. In this respect, there is a conjecture posed by Barát and Thomassen (2006), which states that, for each tree T, there exists a natural number k(T) such that, if G is a k(T)-edge-connected graph and |E(T)| divides |E(G)|, then G admits a T-decomposition. In a series of papers, Thomassen has proved this conjecture for stars, some bistars, paths of length 3, and paths whose length is a power of 2. Recently, Botler, Mota, Oshiro and Wakabayashi (2014) have verified this conjecture for paths of length 5.

The topic is very broad, so an interested student may choose one of the variants to focus on, or may write a survey collecting the known results. Some PhD students and post-doc fellows of our Combinatorics group at IME-USP are working (or have worked) on this topic. Other results on decompositions of regular graphs into paths or bistars have also been proved by members of this group. We would be happy to welcome students interested in working with us.

URL sujet détaillé : http://www.ime.usp.br/~yw/2014/projects/project.pdf

Remarques :



SM207-173 Social networks  
Admin  
Encadrant : Luca CASTELLI ALEARDI
Labo/Organisme : LIX (Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~amturing/stages.html
Ville : Palaiseau

Description  

Large networks (social networks, the web graph, neuronal networks, ...) are at the core of complex systems: the nodes correspond to the entities of a system, while arcs describe their interactions.
While there is a huge body of work providing models and analyses of complex networks, there are very few attempts to model and exploit the possible geometric structure of networks.

Our goal is to exploit tools from geometric modeling and computational geometry (such as spectral methods) in order to analyze, visualize and better represent such large graphs. The main idea is to take advantage of the underlying geometric constraints of the spatial embedding of the graph in order to better characterize its structure. This could help in designing new and more efficient algorithms for network visualization and network compression.
URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_GeometricNetworks.pdf

Remarques : Internship co-supervised by Maks Ovsjanikov (maks.poytechnique.fr)



SM207-174 Complex networks  
Admin  
Encadrant : Luca CASTELLI ALEARDI
Labo/Organisme : LIX (Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~amturing/stages.html
Ville : Palaiseau

Description  

We address the problem of compressing and compactly representing the connectivity of large graphs. In particular we consider complex networks: social networks, web graphs and biological networks are a few examples of classes of complex networks arising from real-world applications. Complex networks are usually very sparse graphs of huge size (having millions and even billions of nodes) and exhibit a surprisingly high number of structural properties.
The main goal of this internship is to exploit methods from network analysis in order to provide rigorous and precise evaluations (and comparisons) of compression schemes, from both the practical and theoretical points of view. We also hope to take advantage of some structural properties of complex networks in order to design more efficient representations, finely tuned for some classes of graphs (such as social networks).
URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_ComplexNetworks.pdf

Remarques :



SM207-175 Graphs embedded on surfaces  
Admin  
Encadrant : Luca CASTELLI ALEARDI
Labo/Organisme : LIX (Ecole Polytechnique)
URL : http://www.lix.polytechnique.fr/~amturing/stages.html
Ville : Palaiseau

Description  

Our main goal is to study, from both the combinatorial and algorithmic points of view, a class of structural properties characterizing the notion of graph planarity.
We will consider the characterization known as "Schnyder woods" (and the closely related canonical orderings), which can be expressed in terms of edge orientations and colorings: they lead to a number of applications in several domains, such as graph drawing and encoding, random and exhaustive generation, compact representations, plane spanners, ...
In particular we deal with higher-genus graphs, i.e. graphs embedded on higher-genus surfaces. For these graphs, two recent generalizations of Schnyder woods have been proposed, leading to new efficient algorithms for drawing and encoding graphs on surfaces.
One of the goals of the internship will be to extend to the toroidal case the algorithms for efficiently generating and embedding triangulated graphs: the solutions to these two problems, in the planar case, make use of canonical orderings.
URL sujet détaillé : http://www.lix.polytechnique.fr/~amturing/stages/pdf/Stage_GenerationTriangulations.pdf

Remarques :



SM207-176 Source-to-source inlining in MATLAB  
Admin  
Encadrant : Claude TADONKI
Labo/Organisme : Centre de Recherche en Informatique(CRI)
URL : http://www.cri.ensmp.fr/
Ville : Fontainebleau

Description  

MATLAB is a popular mathematical framework composed of a built-in internal library containing a significant set of commonly needed routines; it also provides a language which allows the user to script macro calculations or to write complete programs, hence its tagline ``the language of technical computing". When considering a source-to-source compilation chain to achieve better performance through high-level code optimisation techniques, we may need a self-contained code in order to have better control of data dependencies. Inlining does not seem to be a native feature of MATLAB, so we need to write a tool that performs the inlining task, merging the body of a given function into all its call sites. This operation could be performed just at the top-level call, or applied recursively along the calling paths. The main goal of this master 2 proposal is to design and implement a tool for MATLAB source-to-source inlining.
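To fix ideas, here is a deliberately naive sketch (in Python, our own illustration) of the core substitution step for inlining a single-expression function. A real tool for MATLAB would need a proper parser to handle nested calls, multiple return values and name capture:

```python
import re

def inline_call(code, fname, params, body_expr):
    """Replace each call `fname(arg1, ...)` in `code` by the function's
    single return expression with parameters substituted.
    Handles only simple comma-separated arguments (no nested calls)."""
    pattern = re.compile(rf"\b{re.escape(fname)}\(([^()]*)\)")

    def repl(m):
        args = [a.strip() for a in m.group(1).split(",")]
        expr = body_expr
        for p, a in zip(params, args):
            # parenthesise each argument to preserve operator precedence
            expr = re.sub(rf"\b{re.escape(p)}\b", f"({a})", expr)
        return f"({expr})"

    return pattern.sub(repl, code)
```

For example, inlining a squaring function `sq(x) = x * x` into `y = sq(a + b) * 2;` yields `y = ((a + b) * (a + b)) * 2;`, making the data dependencies explicit for later high-level optimisations.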
URL sujet détaillé :

Remarques : Corinne Ancourt



SM207-177 Parallel algorithms for geometric information analysis from 3D scanners  
Admin  
Encadrant : Sylvain CONTASSOT-VIVIER
Labo/Organisme : LORIA
URL : http://www.loria.fr/~contasss
Ville : Nancy

Description  

The objective of this internship is to design and implement parallel versions of algorithms dedicated to geometric information extraction from 3D CT images. These algorithms should be adapted to the latest parallel computing architectures, such as multi-core CPUs, the MIC Xeon Phi and GPUs.
URL sujet détaillé : http://www.loria.fr/~contasss/documents/StageENS2015.pdf

Remarques : The internship is co-supervised by Adrien Krähenbühl:
http://www.loria.fr/~krahenbu

The legal remuneration is provided for non-salaried students.




SM207-178 Acyclic partitions of oriented graphs  
Admin  
Encadrant : Stéphane BESSY
Labo/Organisme : lirmm
URL : http://www.lirmm.fr/~bessy
Ville : montpellier

Description  


This internship is related to graph theory, oriented graphs and the probabilistic method in combinatorics. The main context is the study of the structure of oriented cycles in oriented graphs. Many (algorithmic or structural) questions are open in this field, and the related tools are very rich: structural graph theory, linear programming, the probabilistic method, the Borsuk-Ulam theorem...
More precisely, we will look at a question linked to a particular oriented colouring of oriented graphs (where each colour class induces an acyclic oriented graph). Details are given in the attachment.
URL sujet détaillé : http://www.lirmm.fr/~bessy/SujetM2-14-Chi-orient.pdf

Remarques :



SM207-179 Topological embeddings of graphs into simplicial 2-complexes  
Admin  
Encadrant : Éric COLIN DE VERDIÈRE
Labo/Organisme : Dpt informatique, ENS Paris
URL : http://www.di.ens.fr/~colin/
Ville : Paris

Description  

A natural and standard problem in topology is to decide whether a topological space X can be _embedded_ into another space Y, i.e., whether X can be viewed as a subspace of Y. The goal of the internship is to develop algorithms to decide whether a graph X can be embedded in a 2-dimensional simplicial complex Y (a space obtained from a graph by attaching two-dimensional triangles to some of the cycles of length three). Special cases of this problem have already been studied in very different contexts (topological graph theory, graph drawing, graph algorithms), but never at this level of generality.

URL sujet détaillé : http://www.di.ens.fr/~colin/stages/14ens-lyon/embed-simpl-compl.pdf

Remarques :



SM207-180 Uncrossing graphs in the plane with obstacles  
Admin  
Encadrant : Éric COLIN DE VERDIÈRE
Labo/Organisme : Dpt informatique, ENS Paris
URL : http://www.di.ens.fr/~colin/
Ville : Paris

Description  

The topic of the internship belongs to the field of computational topology, which aims at developing efficient algorithms for topological problems.

Suppose one is given a graph drawn in the plane with crossings, and a set of obstacles. The goal of the internship is to develop an efficient algorithm to decide whether it is possible to "uncross" the graph by moving its edges and vertices, without passing over any obstacle.

This is obviously a generalization of the planarity testing problem. Further extensions are also worth investigating, for example, the case where the graph is drawn with crossings not in the plane, but on a more complicated surface (e.g., a torus).

URL sujet détaillé : http://www.di.ens.fr/~colin/stages/14ens-lyon/untangle-graphs.pdf

Remarques :



SM207-181 Proof-Theoretic Strength of MSO on Infinite Words  
Admin  
Encadrant : Colin RIBA
Labo/Organisme : LIP
URL : http://perso.ens-lyon.fr/colin.riba/
Ville : Lyon

Description  

The goal of this internship is to calibrate the proof-theoretic strength of Monadic Second-Order logic (MSO) on infinite words, seen as a subsystem of second-order arithmetic.

See the pdf for details.
URL sujet détaillé : http://perso.ens-lyon.fr/colin.riba/sujets/mso-arith-M2-2015.pdf

Remarques :



SM207-182 Time Dilation in the Distem emulator  
Admin  
Encadrant : Lucas NUSSBAUM
Labo/Organisme : LORIA, Université de Lorraine
URL : http://www.loria.fr/~lnussbau/
Ville : Nancy

Description  

Distem is a distributed systems emulator. When doing research on Cloud, P2P, High Performance Computing or Grid systems, it can be used to transform a homogeneous cluster (composed of identical nodes) into an experimental platform where nodes have different performance and are linked together through a complex network topology, making it the ideal tool to benchmark applications targeting such environments. Distem relies on modern Linux technology to steal resources from applications and reproduce the desired experimental conditions.

For a more detailed introduction to Distem, refer to http://distem.gforge.inria.fr/

The goal of this internship is to extend Distem to add support for time dilation. The basic idea is that if you slow down an application's perception of time, the hardware appears correspondingly faster. This enables the experimenter to create an experimental environment with hardware performance that isn't available yet (want a 1 Tb/s network? Take your 1 Gb/s network, and make time run 1000 times slower).
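The arithmetic behind this idea can be sketched in a few lines (our illustration; Distem's actual mechanism works by intercepting the system clock, and the names below are ours):

```python
import time

class DilatedClock:
    """Sketch of time dilation: report virtual time that advances 1/factor
    as fast as real time, so hardware appears `factor` times faster to any
    application that measures rates against this clock."""
    def __init__(self, factor):
        self.factor = factor
        self.start = time.monotonic()

    def now(self):
        return self.start + (time.monotonic() - self.start) / self.factor

def perceived_rate(real_rate_bps, factor):
    """The data transferred is real, but the elapsed virtual time is
    `factor` times smaller, so the perceived rate is scaled up."""
    return real_rate_bps * factor
```

For instance a 1 Gb/s link under a 1000x dilation is perceived as a 1 Tb/s link.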

The intern will add support for Time Dilation to Distem (a prototype for this
already exists) and perform experiments using Distem on the Grid'5000 testbed
to validate the approach at large scale.
URL sujet détaillé : http://www.loria.fr/~lnussbau/files/distem-dilation.pdf

Remarques :



SM207-183 Content-Centric Networks and Android on the Distem emulator  
Admin  
Encadrant : Lucas NUSSBAUM
Labo/Organisme : LORIA, Université de Lorraine
URL : http://www.loria.fr/~lnussbau/
Ville : Nancy

Description  

Distem is a distributed systems emulator. When doing research on Cloud, P2P, High Performance Computing or Grid systems, it can be used to transform a homogeneous cluster (composed of identical nodes) into an experimental platform where nodes have different performance and are linked together through a complex network topology, making it the ideal tool to benchmark applications targeting such environments. Distem relies on modern Linux technology to steal resources from applications and reproduce the desired experimental conditions.

See http://distem.gforge.inria.fr/ for more details about Distem.

The goal of this internship is to extend Distem to enable experiments on Content-Centric Networks, and on Android applications. Please refer to the detailed description for additional information.
URL sujet détaillé : http://www.loria.fr/~lnussbau/files/distem-ccn.pdf

Remarques :



SM207-184 Statistical analysis of pattern matching algorithms  
Admin  
Encadrant : Gilles DIDIER
Labo/Organisme : Institut de Mathématiques de Marseille
URL : https://www.i2m.univ-amu.fr/
Ville : Marseille

Description  

Context:
This project aims to study the (statistical) performance of text pattern matching algorithms (cf. the work of Lecroq and Faro). As part of an ongoing work, we have developed a general framework allowing us to exactly compute the average speed of such an algorithm while parsing a random text in search of a given pattern. More precisely, given a pattern matching algorithm, a pattern and the letter frequencies of the text, we compute the expected number of reads (of text characters) made by this algorithm during a search for the pattern in an iid text (a text whose letters follow a Bernoulli distribution). Given a pattern and the letter frequencies of the text, our approach also allows us to determine an efficient (even optimal) pattern matching algorithm. The approach uses an original lattice-like structure associated with a given pattern, which, in a sense, contains all the possible states of any pattern matching algorithm (i.e. the combinations of the states of its internal variables) when it looks for this pattern. If the text is an iid sequence, then for a given algorithm and pattern, the succession of states of the algorithm during a parse follows a Markov chain.
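The quantity of interest, the expected number of text-character reads, can at least be estimated by simulation (the exact Markov-chain computation is the subject's contribution). A sketch for the naive matcher on a Bernoulli text, entirely our own illustration:

```python
import random

def naive_search_reads(text, pattern):
    """Count the character reads performed by the naive matcher."""
    reads = 0
    for i in range(len(text) - len(pattern) + 1):
        for j in range(len(pattern)):
            reads += 1                      # one read of text[i + j]
            if text[i + j] != pattern[j]:
                break
    return reads

def expected_reads_per_char(pattern, freqs, n=10000, trials=20, seed=0):
    """Monte Carlo estimate of the average number of reads per text
    character when searching `pattern` in an iid text with letter
    frequencies `freqs` (dict letter -> probability)."""
    rng = random.Random(seed)
    letters, weights = zip(*freqs.items())
    total = 0
    for _ in range(trials):
        text = "".join(rng.choices(letters, weights=weights, k=n))
        total += naive_search_reads(text, pattern)
    return total / (trials * n)
```

For the pattern "aa" over a uniform binary alphabet, the naive matcher reads about 1.5 characters per text position (one read always, a second one with probability 1/2), which the estimate reproduces.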

Objectives:
Several developments are possible during the internship.
The approach may be extended to the multiple pattern matching case (that is, the simultaneous search for several patterns), which has many applications in genetic sequence analysis.
Another development would be to deal with more complex models of random texts, such as Markov or hidden Markov models.
Last, a more technical improvement would be to consider not only the number of text reads during the execution of an algorithm but also other elementary operations such as comparisons, increments, etc.

Keywords: pattern matching algorithms, random texts, lattices, Markov chains
URL sujet détaillé : :

Remarques : The internship is co-supervised with Laurent Tichit and is paid...




SM207-185 Modelling the structure of ego networks  
Admin  
Encadrant : Christophe PRIEUR
Labo/Organisme : Univ. Paris-Diderot
URL : http://liafa.fr/~prieur
Ville : Paris

Description  

The success of user-centric services based on the structure of the personal network of their users has led to a renewal of the classical study of so-called ego-networks. In social network analysis, an ego-network is, roughly speaking, a network of "contacts" of a given individual called ego.

From any large network, coming from any type of data, it is possible to extract as many ego-networks as there are nodes in the network: the neighborhood of a node (in the graph-theoretic sense) is an ego-network. One may even consider the 2-neighborhood (roughly speaking, the "friends-of-friends" network) as an extended ego-network, making it possible to study the effective relational influence an individual has to deal with.

One structural property of social networks that remains an open question is the relationship between the number of links and the number of nodes.
The study of large corpora of ego-networks extracted from several datasets (phone calls, blogs, Facebook friends) shows that the density of such networks decreases drastically as the number of nodes increases.

With a simple probabilistic model, it is possible to generate a large graph whose ego-networks have similar densities, as shown in the accompanying figure.
The aim of this internship is to elaborate on such models, to analyze them, and to compare them with various types of data.
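The objects involved can be sketched in a few lines of plain Python (hypothetical toy graph, no graph library assumed):

```python
def ego_network(adj, ego):
    """Subgraph induced by the neighbors of `ego` (ego itself excluded)."""
    nodes = adj[ego]
    return {u: adj[u] & nodes for u in nodes}

def density(adj):
    """Fraction of realized links among all possible pairs of nodes."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    return edges / (n * (n - 1) / 2)

# Hypothetical toy graph: node 0 is the ego; friends 1 and 2 know each other.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(density(ego_network(adj, 0)))  # 1 edge among 3 nodes -> 0.333...
```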

URL sujet détaillé : http://www.liafa.univ-paris-diderot.fr/~prieur/Sujets/ego-structure.html

Remarques : Co-supervision: Zvi Lotker, Univ. Ben Gurion and Liafa.



SM207-186 Topology of singular curves and surfaces  
Admin  
Encadrant : Guillaume MOROZ
Labo/Organisme : Loria/Inria
URL : http://www.loria.fr/equipes/vegas/
Ville : Nancy

Description  

The description of singular manifolds (self-intersecting curves or surfaces, for example) arises in a wide range of applications, from scientific visualization to the design of robotic mechanisms. Currently, most software either (a) provides numerical approximations without topological guarantees, or (b) relies on symbolic computer-algebra methods ([1]) that handle singular manifolds but suffer from poor efficiency in practice. Computing the topology of a manifold consists in computing a piecewise-linear graph or triangulation that can be deformed continuously onto the input manifold. In particular, it should preserve the number of connected components, the self-intersection points, etc. The goal of this subject is to design and implement algorithms based on interval arithmetic (see [2] and references therein) to compute the topology of restricted classes of singular manifolds. The methods developed will be validated on mechanism models coming from robotics.
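The interval-arithmetic ingredient can be sketched as follows (a simplified illustration: it ignores the outward rounding a rigorous implementation would need, and is not the subdivision algorithm of the subject): if the interval image of f over a box excludes 0, the box is certified to contain no point of the curve f = 0.

```python
class Interval:
    """Closed interval [lo, hi]; rounding directions are ignored for simplicity,
    whereas a rigorous implementation would round outward."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))
    def contains_zero(self):
        return self.lo <= 0.0 <= self.hi

def f(x, y):
    """Interval extension of f(x, y) = x^2 + y^2 - 1 (the unit circle)."""
    return x * x + y * y + Interval(-1.0, -1.0)

# The box [2,3] x [2,3] is certified not to meet the circle:
print(f(Interval(2.0, 3.0), Interval(2.0, 3.0)).contains_zero())  # False
```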

For more information, see the full subject at http://www.loria.fr/~moroz/internships/2015-master-singcast.pdf

URL sujet détaillé : http://www.loria.fr/~moroz/internships/2015-master-singcast.pdf

Remarques : This internship is co-supervised by Guillaume Moroz and Marc Pouget.



SM207-187 Certified Parsing of Binary Data  
Admin  
Encadrant : Pierre-evariste DAGAND
Labo/Organisme : LIP6, Université Pierre et Marie Curie
URL : http://pagesperso-systeme.lip6.fr/Pierre-Evariste.Dagand/
Ville : Paris

Description  

This internship proposes to design and implement a formally verified format language in the Coq proof assistant. The role of a format language is to describe the structure of binary data. From a format description, one obtains a parser -- taking binary data to high-level data structures -- and a serializer -- mapping those data structures back to binary data. The parser is correct if it is the left-inverse of the serializer. We shall strive to provide a mechanically checked proof of correctness of the parsers/serializers obtained from our format language. This internship will be held at LIP6 (UPMC, Paris).
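The left-inverse property can be illustrated on a hypothetical toy format in Python (the internship would state and mechanically prove this property in Coq, rather than merely test it):

```python
import struct

def serialize(payload: bytes) -> bytes:
    """16-bit big-endian length prefix followed by the payload."""
    return struct.pack(">H", len(payload)) + payload

def parse(data: bytes) -> bytes:
    (n,) = struct.unpack_from(">H", data)
    assert len(data) == 2 + n, "trailing or missing bytes"
    return data[2:2 + n]

# Correctness property: parse is a left-inverse of serialize.
for msg in (b"", b"hello", bytes(range(256))):
    assert parse(serialize(msg)) == msg
print("parse . serialize = id holds on these inputs")
```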

URL sujet détaillé : http://pagesperso-systeme.lip6.fr/Pierre-Evariste.Dagand/stuffs/internship-2014-ens-whisper/proposal.pdf

Remarques :



SM207-188 Decidability and Complexity in Separation Logic  
Admin  
Encadrant : Didier GALMICHE
Labo/Organisme : LORIA
URL : http://www.loria.fr/~galmiche
Ville : Nancy

Description  

The goal of this subject is to search for and identify new decidable fragments of separation logic, with and/or without recursive and quantitative extensions, if possible less restrictive than the purely existential fragment, using an approach based on proof search and counter-model construction.

URL sujet détaillé : http://www.loria.fr/~galmiche/=stages/sujet1-M2R-2014.pdf

Remarques : Co-supervised with Daniel Méry.



SM207-189 Magic wand and septraction in BI logic  
Admin  
Encadrant : Didier GALMICHE
Labo/Organisme : LORIA
URL : http://www.loria.fr/~galmiche
Ville : Nancy

Description  

The objective of this subject is to study the intuitionistic version of BI (the logic of Bunched Implications) in which the magic-wand operator is replaced by the septraction operator. In particular, we will seek to adapt a proof-search method based on tableaux and constraint solving to this logic and, where appropriate, to establish an embedding into Boolean BI. In this context, we may also study the variant of BI in which the two operators coexist, both from the point of view of its expressiveness and from that of its proof theory.
URL sujet détaillé : http://www.loria.fr/~galmiche/=stages/sujet2-M2R-2014.pdf

Remarques : Co-supervised with Dominique Larchey-Wendling.



SM207-190 Code generation for correctly rounded elementary functions  
Admin  
Encadrant : Jean-michel MULLER
Labo/Organisme : Aric team, LIP, ENS-Lyon
URL : http://www.ens-lyon.fr/LIP/AriC/
Ville : Lyon

Description  


The mathematical library (libm) is a component of the operating system that provides the elementary functions (exponential, logarithm, trigonometric functions and their inverses, hyperbolic functions, ...) and some common special functions (erf, gamma, ...). Performance is critical, since many compute-intensive applications spend a significant fraction of their time evaluating such functions. Each processor evolution requires reassessing the relevance of the algorithmic choices made: teams of engineers at the main processor vendors work on reimplementing the libm for each new processor. Moreover, several versions of each function are needed nowadays: different contexts need different specifications, whether in terms of performance (minimizing latency or maximizing computational throughput) or in terms of accuracy (favoring the accuracy and reproducibility of computations, or favoring performance at the expense of accuracy).

Yet despite this complexity, the functions themselves, as mathematical objects, do not change: their implementation is based on a limited collection of well-known recipes, such as polynomial approximation or tables of precomputed values. Developing a libm is therefore currently an art that consists in assembling these recipes as well as possible, to maximize performance under an accuracy constraint. The MetaLibm project aims at formalizing this art, with the goal of automating the expertise involved.

This internship fits into this project and will integrate correctly rounded functions into it. This novelty of the IEEE-754-2008 standard requires always returning the floating-point number closest to the real value f(x). This sometimes requires carrying out the computations in an intermediate precision much higher than machine precision, which calls for much more complex code, and correspondingly involved program proofs. However, experience from the CRLibm project has shown that automating code generation makes it possible to keep this complexity under control.
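The underlying strategy (evaluate in a much higher intermediate precision, then round once, in the spirit of Ziv's "onion peeling") can be sketched with Python's decimal module; this is purely illustrative, since a real libm uses carefully chosen fixed-precision steps rather than arbitrary-precision arithmetic:

```python
from decimal import Decimal, getcontext
import math

def exp_correctly_rounded(x: float) -> float:
    getcontext().prec = 50          # ~166 bits of intermediate precision
    y = Decimal(x).exp()            # high-precision evaluation
    return float(y)                 # a single final rounding to binary64

# exp(1.0) rounded this way coincides with the double closest to e:
print(exp_correctly_rounded(1.0) == math.e)  # True
```

With 50 digits of intermediate precision the final rounding agrees with rounding the exact value except for extremely hard-to-round inputs; bounding that required precision is precisely the Table Maker's Dilemma.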


The interest of this internship is thus that it spans a spectrum ranging from mathematics (analysis and fine-grained error computation) to architecture (making the best use of the resources of current processors).

The internship may take place at LIP in Lyon or at LIP6 in Paris. It will be jointly supervised by Jean-Michel Muller and Florent de Dinechin (Lyon), and Christoph Lauter (Paris).
URL sujet détaillé : :

Remarques :



SM207-191 Categories and Types for handling large hierarchies of classes for modeling Mathematics  
Admin  
Encadrant : Nicolas M. THIÉRY
Labo/Organisme : Laboratoire de Recherche en Informatique, Université Paris Sud
URL : http://Nicolas.Thiery.name/
Ville : Orsay

Description  

The modeling of thousands of different types of mathematical objects
in a general purpose mathematical software requires large hierarchies
of abstract classes whose design and maintenance is a challenge by
itself. Inspired by earlier work in Axiom and its followers (Aldor,
MuPAD, FriCAS), Sage uses mathematical categories (category of groups,
...) and axioms (associativity, ...) as a design pattern to
dynamically construct this class hierarchy from duplication-free
semantic information. In Sage, this infrastructure is built on top of
(and not into) the standard Python object oriented and class hierarchy
mechanism, and did not require changing the language itself.

http://www.sagemath.org/doc/reference/categories/sage/categories/primer.html

The purpose of this internship is to explore, through the
implementation of a small scale but feature rich prototype, how this
design pattern could translate into some other programming language
like, for example, Julia, Scala, or OCaml. Specific questions involve
investigating how paradigms like type inference, traits, or dependent
types, could possibly help streamline the infrastructure, add type
safety, or enable code compilation.
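The design pattern itself can be sketched in a few lines of Python (a toy illustration assuming nothing of Sage's actual API: generic code is attached to a "category" and injected as a mixin when the concrete class is built):

```python
class MonoidElementMethods:
    """Generic methods every monoid element gets 'for free' from its category."""
    def __pow__(self, n):
        result = self.parent().one()
        for _ in range(n):
            result = result * self
        return result

def category_class(cls):
    """Dynamically rebuild cls with the generic methods of its category mixed in."""
    return type(cls.__name__, (cls, MonoidElementMethods), {})

class IntMult:
    """Integers under multiplication: only the 'semantic' data is written by hand."""
    def __init__(self, v):
        self.v = v
    def __repr__(self):
        return str(self.v)
    def __mul__(self, other):
        return IntMult(self.v * other.v)
    def parent(self):
        return IntMult
    @staticmethod
    def one():
        return IntMult(1)

IntMult = category_class(IntMult)
print(IntMult(3) ** 4)  # 81: __pow__ comes from the category, not from IntMult
```

Whether type inference, traits, or dependent types could replace this purely dynamic class construction is exactly the kind of question the internship raises.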

URL sujet détaillé : :

Remarques : The intern will have, in their immediate environment (LRI), specialists
in compilation, type systems, etc.




SM207-192 Multi-commodity flow assignment problem for wireless mesh network with dynamic flows  
Admin  
Encadrant : Hervé RIVANO
Labo/Organisme : CITI/INRIA
URL : http://www.citi-lab.fr/team/urbanet/
Ville : Villeurbanne

Description  

This work focuses on the design of a joint multi-path routing and scheduling algorithm for a backhaul wireless mesh network. A backhaul wireless mesh network is composed of a set of fixed nodes that use their wireless interfaces to create a network topology. This topology is used to relay data flows between a set of origin and destination nodes. A typical example of such a network is a Wireless Sensor Network.
In previous work [1][2], we designed multi-path routing and scheduling algorithms that aim to maximize the efficiency of the network in terms of capacity and/or energy consumption. A second criterion is to satisfy a certain fairness among the carried flows, using for instance a max-min fairness strategy.
When the carried flows are modeled as commodities with constant capacity requirements, the above problem can be formalized as a constrained multi-commodity flow assignment problem. Solutions to this problem have been proposed in the literature [2][3].
The purpose of this work is to study the extension of this problem to dynamic flows. A first step in this direction is to analyze, at the packet level, the behavior of a wireless backhaul topology whose capacity has been dimensioned to satisfy the multi-commodity problem (i.e. the capacity assignment resulting from the joint multi-path routing and scheduling algorithm). The analysis will focus on the temporal dynamics of the input queues of the nodes, which depend on the scheduling strategy and on the characteristics of the input flows. It will rely on discrete-event simulations and possibly on mathematical modeling using stochastic processes (e.g. queueing theory). The objective of this analysis is to characterize the temporal behavior of the system at the packet level rather than at the flow level. From these observations we would like to propose a new formulation of the multi-commodity flow problem for wireless mesh networks.
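The max-min fairness criterion mentioned above can be sketched by the classical progressive-filling algorithm (a minimal illustration on capacitated links, ignoring wireless interference and scheduling): all rates grow together, and a flow is frozen as soon as one of its links saturates.

```python
def max_min_fair(flows, capacity):
    """flows: {flow_id: set of links it crosses}; capacity: {link: capacity}."""
    rate = {f: 0.0 for f in flows}
    cap = dict(capacity)
    active = set(flows)
    while active:
        # Largest uniform rate increment before some used link saturates.
        inc = min(cap[l] / sum(1 for f in active if l in flows[f])
                  for f in active for l in flows[f])
        for f in active:
            rate[f] += inc
        for l in cap:
            cap[l] -= inc * sum(1 for f in active if l in flows[f])
        # Freeze the flows that cross a saturated link.
        active = {f for f in active if all(cap[l] > 1e-12 for l in flows[f])}
    return rate

flows = {"A": {"l1"}, "B": {"l1", "l2"}, "C": {"l2"}}
capacity = {"l1": 1.0, "l2": 2.0}
print(max_min_fair(flows, capacity))  # {'A': 0.5, 'B': 0.5, 'C': 1.5}
```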

References
[1] Abdelhak Farsi, Nadjib Achir, Khaled Boussetta, Backhaul Topology Design and Weighted Max-Min Fair Capacity Allocation in Wireless Mesh Networks, V. Guyot (Ed.): ICAIT 2012, LNCS 7593, pp. 296-309, 2013.
[2] Anis Ouni, Hervé Rivano, Fabrice Valois, Catherine Rosenberg, Energy and Throughput Optimization of Wireless Mesh Networks with Continuous Power Control, IEEE Transactions on Wireless Communications, 2014,
[3] Miriam Allalouf, Yuval Shavitt, Centralized and distributed algorithms for routing and weighted max-min fair bandwidth allocation. IEEE/ACM Transactions on Networking, 16:1015-1024, 2008.

URL sujet détaillé : :

Remarques : Co-supervision:
Khaled Boussetta: khaled.boussetta-lyon.fr

Remuneration is possible.





SM207-193 Algorithms for detecting and correcting silent-errors  
Admin  
Encadrant : Yves ROBERT
Labo/Organisme : LIP, ENS Lyon
URL : http://graal.ens-lyon.fr/~yrobert/
Ville : Lyon

Description  

(1) Motivation

Silent errors (or silent data corruptions)
have become a major problem
for large-scale distributed systems.
Silent errors are caused for instance by soft errors in the L1 cache, or by bit flips
in main memory
due to cosmic radiation.
The key problem is that
the detection of a silent error is not immediate, because the error can be identified only when
the corrupted data is activated.
Silent error detection is a difficult task, but several application-specific algorithms have been developed
recently. Silent error correction is even more difficult, and only partial solutions
are currently available.
The major goal of this training period is to develop
generic algorithms to cope with silent errors, and to provide resilient
mechanisms able to both detect and correct them. The general-purpose approach
will combine verification mechanisms and checkpointing.
However, application-specific information, if available,
can always be used to decrease detection and correction costs.


(2) Technical setting

Initial studies will be focused around divisible-load applications,
where verifications and checkpoints can be inserted at any location during execution.
Further work will consider
task graph scheduling, starting with
an application task graph deployed on a large-scale platform. In terms of traditional scheduling,
this corresponds to scheduling with unlimited resources and has polynomial complexity
in the absence of error management techniques.
With soft errors, which tasks should we verify and/or checkpoint?
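As a hedged first-order illustration of the verification/checkpoint trade-off (a simplified model chosen for this sketch, not a result of the internship): with work W between verified checkpoints, verification cost V, checkpoint cost C and silent-error rate lambda, a detected error forces re-execution of the whole period, so the waste rate is roughly (V + C)/W + lambda*W, minimized at W* = sqrt((V + C)/lambda).

```python
import math

def optimal_period(cost_v, cost_c, error_rate):
    """First-order optimal amount of work between verified checkpoints."""
    return math.sqrt((cost_v + cost_c) / error_rate)

def waste_rate(w, cost_v, cost_c, error_rate):
    """Overhead per unit of work: pattern cost plus expected re-execution."""
    return (cost_v + cost_c) / w + error_rate * w

V, C, LAMBDA = 5.0, 30.0, 1e-4   # seconds, seconds, silent errors per second
w_star = optimal_period(V, C, LAMBDA)
print(round(w_star, 1))          # 591.6 seconds of work per verified checkpoint
```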


(3) Background

We have extensive expertise in the area of parallel algorithms, scheduling techniques,
checkpointing strategies and multi-criteria optimization problems.

Relevant literature will be provided.

(4) Required Skills

Algorithm design, probabilities.

URL sujet détaillé : :

Remarques : Co-supervision:
Anne Benoit, Anne.benoit-lyon.fr




SM207-194 Optimal mechanisms for the protection of confidential information  
Admin  
Encadrant : Catuscia PALAMIDESSI
Labo/Organisme : LIX/Ecole Polytechnique
URL : http://www.lix.polytechnique.fr/~catuscia/
Ville : Palaiseau

Description  

The massive use of computing devices poses serious threats to the confidentiality of sensitive data and to the privacy of the user. Not surprisingly, then, there is a growing interest in mechanisms that protect the confidential part of the data while still allowing the user to benefit from the service offered by the computing device. This is not an easy task, though, because in general the device requires certain information in order to provide the service, and this information may be correlated with the information that the user wishes to keep secret. There is therefore a trade-off between confidentiality and utility, and it is important to design mechanisms that are optimal, i.e., that offer the maximum utility for the desired level of confidentiality.

In this internship, we plan to investigate classes of utility and confidentiality measures, and to devise methods to design optimal mechanisms for given measures. We will consider a probabilistic setting, in which data are random variables, and the mechanisms themselves can be randomized. We will focus, in particular, on measures that are expressible as linear functions of the probabilities. This is the case for several notions of confidentiality and privacy considered in the literature. We mention, in particular, the notions of g-vulnerability [1] and location privacy [2].
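The posterior g-vulnerability of [1] is exactly such a linear-in-the-probabilities measure, and can be computed directly; a minimal sketch: for each observation y the adversary picks the guess w with the best expected gain, and gains are summed over observations.

```python
def g_vulnerability(prior, channel, gain):
    """Posterior g-vulnerability: prior[x], channel[x][y] = P(y|x), gain[w][x]."""
    xs = range(len(prior))
    ys = range(len(channel[0]))
    ws = range(len(gain))
    return sum(
        max(sum(prior[x] * channel[x][y] * gain[w][x] for x in xs) for w in ws)
        for y in ys)

prior = [0.5, 0.5]
identity_gain = [[1.0, 0.0], [0.0, 1.0]]   # gain 1 iff the guess is the secret
noiseless = [[1.0, 0.0], [0.0, 1.0]]       # the observation reveals the secret
noisy = [[0.5, 0.5], [0.5, 0.5]]           # the observation reveals nothing
print(g_vulnerability(prior, noiseless, identity_gain))  # 1.0
print(g_vulnerability(prior, noisy, identity_gain))      # 0.5
```

With the identity gain this recovers Bayes vulnerability; designing the optimal mechanism then amounts to optimizing the channel entries under such linear objectives.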

Bibliography

[1] M.S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith. Measuring Information Leakage using Generalized Gain Functions. Proc. of CSF, 265-279, 2012. [https://hal.inria.fr/hal-00734044/]

[2] R. Shokri, G. Theodorakopoulos, C. Troncoso, J-P. Hubaux, and J-Y. Le Boudec. Protecting Location Privacy: Optimal Strategy against Localization Attacks. Proc. of the 19th ACM CCS, 617-627, 2012. [http://infoscience.epfl.ch/record/180460/files/ShokriTTHL_CCS12.pdf]

URL sujet détaillé : http://www.lix.polytechnique.fr/comete/stages/

Remarques : The host team is the INRIA team Comète.

The intern will be co-supervised by Kostas Chatzikokolakis (CNRS, École Polytechnique).

Remuneration: standard INRIA internship gratification.



SM207-196 Study and implementation of support for big-data management within the DIET middleware on dedicated application environments  
Admin  
Encadrant : Eddy CARON
Labo/Organisme : LIP
URL : http://avalon.ens-lyon.fr
Ville : Lyon

Description  

(1.) Context

The deFAB company designs and develops technologically innovative
products. Building on the engineering experience of its founders, deFAB
aims to contribute to improving the energy practices associated with new
technologies. Materials, management and optimization of surfaces and of
energy: deFAB follows the logic of the 21st century, with intelligence,
ecological footprint and innovation as its watchwords.

deFAB is developing an innovative infrastructure that "consumes"
computing tasks. A collaboration with the LIP has been set up to address
this problem in a dedicated, specific environment. In this context, it
is important to study the mechanisms for distributing computing tasks,
but also migration mechanisms. The scheduling to be implemented is a
complex problem, all the more so as it relies on a set of specific,
heterogeneous parameters. The problem is further complicated by the
dynamic nature of the processing to be carried out, both in terms of
resource consumption and of the computing tasks themselves.

Indeed, several operating modes will have to be considered: the standard
mode, the saturated mode (how to handle a computing load greater than
the available resources), the starvation mode (how to handle a demand
for computation greater than the available workload), and the degraded
mode (resources are throttled or unavailable: network problems,
vacations, etc.).

(2.) Work to be done

First, the student will draw up a two-level state of the art: one level
on solutions combining energy consumption and computing, and a second on
scheduling algorithms dedicated to dynamic architectures (dynamic not in
the number of compute nodes, but in the sense of being subject to
variations in computing capacity).

Second, the student will model the infrastructure and determine which
parameters must be taken into account to distribute the tasks. It will
be important to determine which information about the deployed
infrastructure we will have access to. Conversely, this analysis may
also have an impact on the infrastructure itself and, where appropriate,
influence its design.

Third, on the basis of this information, the student will propose
scheduling algorithms and demonstrate, analytically or by simulation,
the efficiency of the proposed solutions.

A final, optional step, depending on the progress of the previous steps
and on the availability of the deFAB platform, would be to implement
these algorithms. For this, the student will be able to draw on the
Avalon team's expertise in developing scheduling plug-ins for the DIET
middleware (http://graal.ens-lyon.fr/diet).

URL sujet détaillé : http://graal.ens-lyon.fr/~ecaron/sujet_M2_avalon_DeFAB_14.pdf

Remarques : The internship will be co-supervised by Benjamin Laplane (deFAB) and Thomas Garnier (deFAB).

The internship will take place in the Avalon joint project-team (Inria,
ENS-Lyon, CNRS, UCBL, http://avalon.ens-lyon.fr) within the LIP, on the
ENS-Lyon premises. However, regular visits and contacts with deFAB's
technical team will be organized. CIFRE PhD funding is possible at the
end of the internship.




SM207-197 Object detection using geometric constraints and target tracking for intelligent transportation systems  
Admin  
Encadrant : Sergio RODRIGUEZ
Labo/Organisme : Dept. Systèmes Autonomes, IEF, Université Paris Sud
URL : http://hebergement.u-psud.fr/sergio.rodriguez/doku.php?id=start
Ville : Orsay

Description  

The aim of this proposal is to adapt current state-of-the-art algorithms in order to perform moving-object detection and tracking from moving platforms. There are two major challenges: the first concerns the geometric constraints of the detection, which must be able to cope with strong dynamics; the second is more related to the task of learning the targets and tracking them in complex environments. The student is expected to get involved in both aspects, but depending on their background they may focus more on one of these two topics.
URL sujet détaillé : https://www.dropbox.com/s/rgrl2zqphardvra/internship2015dynamique.pdf?dl=0

Remarques : Co-supervised with Emanuel Aldea, from the same laboratory.



SM207-198 Unsupervised analysis of cumulative spaces using a contrario approaches  
Admin  
Encadrant : Samia BOUCHAFA
Labo/Organisme : IBISC, Université d'Evry
URL : https://www.ibisc.univ-evry.fr/
Ville : Evry

Description  

The internship we propose is part of a broader study dealing with the perception of the environment by an autonomous vehicle. In this kind of application, the robustness of the algorithms employed is essential. The objective of the internship is to propose, on a practical example such as the detection of frontal objects observed from a moving vehicle, a decision mechanism that relies on both cumulative
and a contrario strategies.

URL sujet détaillé : https://www.dropbox.com/s/q91ui5kl32fnrhg/internship2015acontrariocumulative.pdf?dl=0

Remarques : Co-supervised with Prof. Sylvie Le Hégarat, Université Paris Sud.



SM207-199 Self-Optimization of service-oriented architectures for Mobile and Cloud Applications  
Admin  
Encadrant : Naouel MOHA
Labo/Organisme : Université du Québec à Montréal (UQAM)
URL : http://info.uqam.ca/
Ville : Montréal, Canada

Description  

In the case of Software-as-a-Service (SaaS) and mobile apps, the presence of defects inevitably
leads to resource leaks (CPU, memory, battery, etc.), thus preventing the deployment of sustainable solutions. The detection and correction of these defects are therefore critical activities to improve the design and the QoS of Service-oriented Distributed Systems (SDS), in order to ease and speed up both the maintenance and evolution tasks assigned to software engineers. However, current methods and techniques for the detection and correction of defects in SDS are still in their infancy, as one can assess from the questionable performance of existing systems and the fragility they can exhibit. Beyond the technological complexity of the underlying platforms, we believe that addressing this challenge requires combining expertise in software engineering and in distributed systems in order to correlate symptoms collected at runtime with the software artefacts being developed (source code, configuration files, architecture descriptions, etc.).

The goal of this research project is to implement an innovative approach to repair and optimize service-oriented distributed systems deployed across Cloud and mobile environments. The student will work with a PhD student who has proposed this approach, which supports the runtime detection and correction of SOA anti-patterns in large-scale SDS in order to continuously optimize their QoS. One originality of this approach lies in the dynamic nature of the SOA environment and in its application to emerging frameworks for embedded and distributed systems (e.g., Android/iOS for mobile devices, PaaS/SaaS for Cloud environments), and in particular to mobile
systems interacting with remote services hosted on the Cloud.
URL sujet détaillé : :

Remarques : The student needs to have a very good knowledge of mobile applications. Knowledge of Cloud-based applications is appreciated but not mandatory. The student also needs to have excellent or very good skills in Java programming and to be familiar with development environments such as Eclipse. A good knowledge of sound design practices such as design patterns is appreciated, but not required.

The student will mainly implement the approach proposed by the research team and perform experiments to evaluate it. The student will work with a PhD student and the professor in charge of this project.

The internship will be



SM207-200 Detection of linguistic inconsistencies in service-based systems  
Admin  
Encadrant : Naouel MOHA
Labo/Organisme : Université du Québec à Montréal (UQAM)
URL : http://info.uqam.ca/
Ville : Montréal, Canada

Description  

Context

Service-Oriented Architecture (SOA) is an emerging architectural style, now being widely adopted, that makes it possible to develop flexible, scalable and low-cost distributed systems by combining ready-to-use services. This architectural style enables the construction of a wide range of Service-Based Systems (SBSs) in many domains, such as home automation and, more recently, cloud computing. Facebook, Google, Amazon, eBay, PayPal and FedEx are typical examples of such systems.

Problem statement

Like any complex software system, SBSs evolve to meet new user requirements in terms of functionality and quality of service (QoS). Changes are often made under time pressure and with limited resources, which can lead to a degradation of QoS. For example, when a service evolves, its interface and documentation may not be updated and thus no longer correctly reflect the functionality of the service. In the context of object-oriented (OO) systems, anti-patterns capturing this kind of inconsistency, i.e. inconsistencies between the name, documentation and implementation of methods and attributes, have been defined; they are called linguistic anti-patterns (LAs). Still in the OO context, empirical studies show that developers perceive source code containing LAs as being of poor quality. In the context of SBSs, such inconsistencies can have an even greater impact on service quality, because the documentation and the name of a service are the only information available to its consumers. Thus, to support the evolution of SBSs, it is essential to detect and correct these inconsistencies as early as possible.

Work to be done

The goal of the internship is to integrate the detection of linguistic anti-patterns into an existing tool for the automatic detection of defects in SBSs. More precisely, the work consists in 1) implementing detection strategies for inconsistencies between the name, the documentation and the implementation of a service, 2) studying how prevalent such inconsistencies are in SBSs by running the implemented strategies on several SBSs, and 3) studying the correlation of LAs with other defects (e.g. those already implemented in the existing tool). The existing tool is developed in Java and uses several technologies and plugins, notably lexical-analysis plugins.
URL sujet détaillé : :

Remarques : Connaissances et habilités requises
L'étudiant doit avoir une connaissance solide de JAVA et être familier avec les environnements de développement tels qu'Eclipse. De plus, une bonne connaissance des patrons de conception serait appréciée.

Relevant fields of study for the position:
Computer science, in particular software design.

Paid internship: depending on the objectives and results achieved.

Work environment: This work will be carried out in collaboration with a computer science PhD student at UQÀM, and is therefore part of a collaborative team effort.



SM207-201 Dynamic Monitoring of Information-Flow Past-Linear Time Logic Formulas in Virtualized Environments  
Admin  
Encadrant : Eddy CARON
Labo/Organisme : LIP, ENS Lyon
URL : http://avalon.ens-lyon.fr/
Ville : Lyon

Description  

Context:
Security is one of the main concerns for the massive adoption of Cloud Computing. A Cloud environment faces a variety of threats, from traditional ones targeting the Operating System (OS), whether at the hypervisor level or at the VM OS level, to new threats arising from massively shared environments.
Traditionally, a security expert is in charge of specifying the security policy of a company's system. Nevertheless, manual security specification is an exhausting and error-prone task, often leading to insecure setups. A formal specification, on the other hand, is more suitable, as it opens the path to many opportunities such as automatic verification, enforcement, or proofs.
Information-Flow Past-Linear Time Logic (IF-PLTL) is a new security-aware logic designed to
encompass security concerns such as indirect and intransitive information flows.

Internship:
This internship aims at investigating solutions for an efficient dynamic monitor of IF-PLTL formulas.
First, the student will become acquainted with the state of the art in both information-flow security and temporal logics for security. A good starting point will be to implement the current version of the dynamic monitoring algorithm using the Linux Security Module framework exposed by the Linux kernel. During this work, the student is encouraged to study practical limitations, to improve the algorithm's efficiency, or to envision hypervisor-level solutions.
URL sujet détaillé : http://perso.ens-lyon.fr/arnaud.lefray/enseignement/SujetsStageM2/SujetM2_DynMonitoringIFPLTL.pdf

Remarques : Co-supervised with Arnaud Lefray.



SM207-202 Matroid-based packing of arborescences  
Admin  
Encadrant : Zoltán SZIGETI
Labo/Organisme : Laboratoire G-SCOP
URL : http://pagesperso.g-scop.grenoble-inp.fr/~szigetiz/
Ville : Grenoble

Description  

The subject of this master internship concerns a packing-type problem in Combinatorial Optimization, namely the existence of arc-disjoint arborescences in digraphs. The classic result of Edmonds characterizes the digraphs having k arc-disjoint spanning arborescences. The paper of O. Durand de Gevigney, V.-H. Nguyen and Z. Szigeti extended that result by characterizing the digraphs having a packing of arborescences with matroid constraints. Such a packing can be found in polynomial time using submodular function minimization. The minimum-cost version can also be solved in polynomial time using the ellipsoid method. The aim of this master internship is to find combinatorial algorithms for these problems.
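For context (an illustrative sketch, not part of the internship problem itself): finding a single spanning arborescence rooted at r is a simple graph search, and Edmonds' theorem says k arc-disjoint ones exist iff every other vertex can be reached from r by k arc-disjoint paths; the algorithmic difficulty lies entirely in the packing.

```python
# Sketch: one spanning arborescence rooted at `root` via BFS.
from collections import deque

def spanning_arborescence(n, arcs, root):
    """n vertices 0..n-1, arcs = list of (u, v). Returns the list of
    parent arcs of one spanning arborescence, or None if none exists."""
    adj = [[] for _ in range(n)]
    for u, v in arcs:
        adj[u].append(v)
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if len(parent) < n:
        return None  # some vertex is unreachable from the root
    return [(parent[v], v) for v in range(n) if v != root]

print(spanning_arborescence(4, [(0, 1), (0, 2), (1, 3), (2, 3)], 0))
# → [(0, 1), (0, 2), (1, 3)]
```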
URL sujet détaillé : http://pagesperso.g-scop.grenoble-inp.fr/~szigetiz/articles/packing_arborescences_21.pdf

Remarques :



SM207-203 Development of a constraint solver based on abstract domains  
Admin  
Encadrant : Charlotte TRUCHET
Labo/Organisme : LINA (UMR 6241), Université de Nantes
URL : http://www.normalesup.org/~truchet/
Ville : Nantes

Description  

Development of a constraint solver based on abstract domains

Constraint Programming (CP) is a declarative programming framework for modeling and solving hard combinatorial problems. The constraints, which are first-order logic predicates, express the relationships between the variables of the problem. Efficient and generic solving algorithms exist for a wide range of constraints, with many successful applications in logistics, design, bioinformatics, music, etc. However, these solvers usually deal with either problems on discrete domains or problems on continuous domains. Mixed problems, with both real and integer variables, must be redefined before solving, with a possible loss of the properties of the solvers.

This internship focuses on a prototype constraint solver called Absolute, developed jointly by the TASC team in Nantes (LINA, Université de Nantes) and the Abstract Interpretation team in Paris (ENS). It is based on notions and tools from Abstract Interpretation (AI), a theory of approximations of the semantics of programs. In AI, the semantics are approximated as abstract domains, which are sound approximations of the program traces. Many abstract domains have been defined, along with a unified theory and tools making it possible to use them in the same analyzer. The abstract domains can also be used in CP to generalize some of its key components (consistency and propagation). Absolute exploits this link to solve constraints in the abstract domains, and possibly over any type of variables.
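To illustrate the connection (a toy sketch on the interval abstract domain, not the Absolute solver's actual API): propagation narrows each variable's domain to the values consistent with a constraint, here x + y = c, using plain interval arithmetic.

```python
# Sketch: CP-style propagation on the interval abstract domain for the
# constraint x + y == c. Domains are (lo, hi) pairs.

def narrow_sum(x, y, c):
    """Narrow the boxes of x and y to enforce x + y = c."""
    xlo, xhi = x
    ylo, yhi = y
    # Rewrite as x = c - y and y = c - x, then intersect with the
    # current domains.
    nx = (max(xlo, c - yhi), min(xhi, c - ylo))
    ny = (max(ylo, c - xhi), min(yhi, c - xlo))
    return nx, ny

x, y = (0, 10), (4, 9)
x, y = narrow_sum(x, y, 10)
print(x, y)
# → (1, 6) (4, 9)
```

In a full solver this narrowing runs inside a propagation loop until a fixpoint, and the same interface can be instantiated with other abstract domains (integer sets, octagons, ...), which is precisely what makes the mixed discrete/continuous setting tractable.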

Absolute works well, but it is still a prototype. The goal of this internship is to improve its solving algorithms and its heuristics (choice of an abstract domain, variable choice, value choice, propagation loop, etc.). As a first step, the intern will test the solver on a wide range of benchmarks from CP and Operations Research, including problems with both integer and real variables. From this work, he/she will gain a better understanding of the efficiency of these heuristics on practical problems. The second step will be to propose generic heuristics for choosing the best abstract domain for each constraint of the problem.
URL sujet détaillé :

Remarques : This internship is part of a collaboration with Antoine Miné, from the Abstract Interpretation team at ENS, and Marie Pelleau, from the Université de Montréal.




Last modification : 2014-12-12 09:36:13 laurent.lefevre@ens-lyon.fr