On the Impact of the Flow-Size Distribution's Tail Index on Network Performance with TCP Connections
Oana Goga, Patrick Loiseau, Paulo Gonçalves
Performance 2011, October 2011, Amsterdam, The Netherlands
In this paper, we study the impact of the flow-size distribution on network performance in the case of a single bottleneck with a finite buffer. To tackle the case where flows are transmitted with the TCP protocol, we use real experiments and ns-2 simulations. Our preliminary results show that the distribution's tail index impacts performance in a more complex way than reported in the existing literature. In particular, we exhibit situations where a heavier tail gives better performance for certain metrics.
We argue that a main cause of our observed results is the transient behavior at the beginning of each flow.
Modeling TCP Throughput: an Elaborated Large-Deviations-Based Model and its Empirical Validation
Patrick Loiseau, Paulo Gonçalves, Julien Barral, Pascale Vicat-Blanc Primet
Performance 2010, November 2010, Namur, Belgium
In today's Internet, a large part of the traffic is carried using the TCP transport protocol. Characterization of the variations of TCP traffic is thus a major challenge, both for resource provisioning and Quality of Service purposes. However, most existing models are limited to the prediction of the (almost-sure) mean TCP throughput and are unable to characterize deviations from this value.
In this paper, we propose a method to describe the deviations of a long TCP flow's throughput from its almost-sure mean value. This method relies on an ergodic large-deviations result, which was recently proved to hold on almost every single realization for a large class of stochastic processes. Applying this result to a Markov chain modeling the congestion window's evolution of a long-lived TCP flow, we show that it is practically possible to quantify and to statistically bound the throughput's variations at different scales of interest for applications. Our Markov-chain model can take into account various network conditions and we demonstrate the accuracy of our method's prediction in different situations using simulations, experiments and real-world Internet traffic. In particular, in the classical case of Bernoulli losses, we demonstrate: i) the consistency of our method with the widely-used square-root formula predicting the almost-sure mean throughput, and ii) its ability to additionally predict finer properties reflecting the traffic variability at different scales.
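As a rough illustration of the Bernoulli-loss case mentioned above, the following Python sketch simulates a toy AIMD congestion window and compares its time-averaged value with the commonly quoted square-root formula, sqrt(3 / (2p)) packets per round-trip time. It is not the Markov-chain model or the large-deviations method of the paper: the function names, the one-update-per-round simplification and the choice of constant in the formula are assumptions made for illustration only.

```python
import math
import random

def simulate_aimd(p, rounds=200_000, seed=0):
    """Toy AIMD window process: returns the time-averaged window (packets per RTT)."""
    rng = random.Random(seed)
    w = 1.0
    total = 0.0
    for _ in range(rounds):
        total += w
        # Probability that at least one of the w packets sent in this round is lost,
        # assuming independent (Bernoulli) per-packet losses with probability p.
        p_round_loss = 1.0 - (1.0 - p) ** w
        if rng.random() < p_round_loss:
            w = max(1.0, w / 2.0)  # multiplicative decrease
        else:
            w += 1.0               # additive increase
    return total / rounds

def square_root_formula(p):
    """Mean window (packets per RTT) from the square-root formula; the constant is an assumption."""
    return math.sqrt(3.0 / (2.0 * p))

if __name__ == "__main__":
    for p in (0.001, 0.01):
        print(p, simulate_aimd(p), square_root_formula(p))
```

The simulated mean should land in the same range as the formula without matching it exactly, since the constant depends on the loss model. The mean alone says nothing about the deviations around it, which is the gap the abstract addresses.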
Contributions to the Analysis of Scaling Laws and Quality of Service in Networks: Experimental and Theoretical Aspects
Patrick Loiseau
PhD thesis, École Normale Supérieure de Lyon
In today's context of rapid expansion of the Internet, a deep understanding of the statistical properties of network traffic is essential for Internet Service Providers to offer users the best possible Quality of Service. A major breakthrough in that direction was the discovery in 1993 of the self-similarity of network traffic, followed by the ON/OFF model proposed in 1997, which posits the heavy-tailedness of flow-size distributions as a plausible origin of this property. While of great interest, such mathematical models always rely on simplifying assumptions that can limit their practical applicability to real networks, in particular due to the complexity of the TCP protocol.
In this thesis, we use a hybrid approach based on the combination of real traffic traces, controlled experiments and theoretical developments to address some open questions concerning network traffic properties and their impact on QoS. Our experiments are based on a large-scale controllable testbed and an efficient traffic capture system.
We first address issues related to aggregate network traffic: we extend previous long-range dependent models and we propose an estimator of the flow-size distribution's tail index under sampling.
We also perform an empirical study of the impact of long-range dependence and heavy tails on QoS. Finally, we turn to the packet-level traffic of one TCP source and show, using a large-deviation principle, that it can be finely characterized by a multifractal structure intimately related to the AIMD control mechanism, and naturally reproduced by Markov models.
Investigating self-similarity and heavy-tailed distributions on a large scale experimental facility
Patrick Loiseau, Paulo Gonçalves, Guillaume Dewaele, Pierre Borgnat, Patrice Abry, Pascale Vicat-Blanc Primet
to appear in ToN, 2010
After the seminal work by Taqqu et al. relating self-similarity to heavy-tailed distributions, a number of research articles verified that aggregated Internet traffic time series show self-similarity and that Internet attributes, like Web file sizes and flow lengths, are heavy-tailed.
However, the validation of the theoretical prediction relating self-similarity and heavy tails remains unsatisfactorily addressed, having been investigated either through numerical or network simulations, or from uncontrolled Web traffic data. Notably, this prediction has never been conclusively verified on real networks using controlled and stationary scenarios, prescribing specific heavy-tailed distributions, and estimating confidence intervals.
With this goal in mind, we use the potential and facilities offered by the large-scale, deeply reconfigurable and fully controllable experimental Grid5000 instrument to investigate whether the prediction can be observed on real networks.
To this end, we organize a large number of controlled traffic circulation sessions on a nation-wide real network involving two hundred independent hosts. We use an FPGA-based measurement system to collect the corresponding traffic at packet level. We then estimate both the self-similarity exponent of the aggregated time series and the heavy-tail index of flow size distributions, independently.
On the one hand, our results complement and validate with striking accuracy some conclusions drawn from a series of pioneering studies. On the other hand, they bring new insights into the controversial role of certain components of real networks.
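The theoretical prediction discussed above is usually stated as H = (3 - alpha) / 2 for a tail index 1 < alpha < 2. The following Python sketch estimates alpha from synthetic Pareto flow sizes with a Hill estimator and then applies that relation. The synthetic data, the helper names and the choice of estimator are assumptions for illustration; they are not the estimation procedure used in the paper.

```python
import math
import random

def hill_estimator(samples, k):
    """Hill estimate of the tail index from the k largest samples."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    log_excess = sum(math.log(x / threshold) for x in ordered[:k])
    return k / log_excess

def predicted_hurst(alpha):
    """Self-similarity exponent predicted for a tail index 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("the relation H = (3 - alpha) / 2 assumes 1 < alpha < 2")
    return (3.0 - alpha) / 2.0

# Synthetic flow sizes drawn from a Pareto law with tail index 1.5 (illustrative value).
rng = random.Random(42)
flow_sizes = [rng.paretovariate(1.5) for _ in range(20_000)]
alpha_hat = hill_estimator(flow_sizes, k=2_000)
print(alpha_hat, predicted_hurst(alpha_hat))
```

In a real experiment the self-similarity exponent would be estimated separately from the aggregated time series and then compared with the value predicted from the tail index.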
Impact of the Correlation between Flow Rates and Durations on the Large-Scale Properties of Aggregate Network Traffic
Patrick Loiseau, Paulo Gonçalves, Pascale Vicat-Blanc Primet
INRIA Research Report 7100, November 2009
Since the discovery of long-range dependence in network traffic in 1993, many models have appeared to reproduce this property, based on heavy-tailed distributions of some flow-scale properties of the traffic. However, none of these models considers the correlation existing between flow rates and flow durations. In this work, we extend previously proposed models to include this correlation. Based on a planar Poisson process setting, which describes the flow-scale traffic structure, we analytically compute the auto-covariance function of the aggregate traffic's bandwidth and show that it exhibits long-range dependence with a different Hurst parameter. In the uncorrelated case, the model that we propose is consistent with existing models and predicts the same Hurst parameter. We also prove that pseudo long-range dependence with a different index can arise from highly variable flow rates. The pertinence of our model choices is validated on real web traffic traces.
Maximum Likelihood Estimation of the Flow Size Distribution Tail Index from Sampled Packet Data
Patrick Loiseau, Paulo Gonçalves, Stéphane Girard, Florence Forbes, Pascale Primet Vicat-Blanc
Sigmetrics/Performance 2009, June 2009, Seattle, WA, USA
In the context of network traffic analysis, we address the problem of estimating the tail index of the flow-size (or, more generally, any group-size) distribution from the observation of a sampled population of packets (individuals).
We give an exhaustive bibliography of the existing methods and show the relations between them. The main contribution of this work is then to propose a new method to estimate the tail index from sampled data, based on solving the maximum likelihood problem.
To assess the performance of our method, we present a full performance evaluation based on numerical simulations and on a recently acquired real Internet traffic trace.
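To make the sampling setting concrete, the following Python sketch thins synthetic flows packet by packet and keeps only the flows that retain at least one sampled packet. It illustrates the observation model only, not the maximum likelihood estimator proposed in the paper. The independent per-packet sampling, the function name and the parameter values are assumptions.

```python
import math
import random

def sample_flows(flow_sizes, p, rng):
    """Keep each packet independently with probability p; return the sampled flow sizes."""
    sampled = []
    for size in flow_sizes:
        kept = sum(1 for _ in range(size) if rng.random() < p)
        if kept > 0:  # flows with no sampled packet are never observed
            sampled.append(kept)
    return sampled

rng = random.Random(7)
# Synthetic integer flow sizes with a Pareto tail (tail index 1.5, illustrative value).
flows = [math.ceil(rng.paretovariate(1.5)) for _ in range(5_000)]
observed = sample_flows(flows, p=0.1, rng=rng)
print(len(flows), len(observed))
```

Many small flows disappear entirely after sampling, so the observed sizes are a distorted view of the original distribution. Any estimator of the tail index from sampled data has to account for this distortion.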
Metroflux: A high performance system for analyzing flow at very fine-grain
Patrick Loiseau, Paulo Gonçalves, Romaric Guillier, Matthieu Imbert, Yuetsu Kodama, Pascale Vicat-Blanc Primet
TridentCom 2009, April 2009, Washington DC, USA
Research in network traffic analysis embraces a large diversity of goals and is based on a variety of methodologies and tools. To gain better insight into the real nature and the evolution of network traffic, we argue that fine-grain analysis of real traffic traces has to complement simulation studies as well as the coarse-grain measurements performed by classical flow measurement systems. In particular, packet-level measurements and analysis are needed. However, such methodologies are resource-consuming and require very high performance devices to be operational in real high-speed networks. In this paper, we present the Metroflux system, which aims at providing researchers and network operators with a very flexible and accurate packet-level traffic analysis toolkit configured for 1 Gbps and 10 Gbps links. This system is based on the GtrcNet FPGA-based device technology and on specific statistical analysis tools. We show the potential and the facilities offered by the Metroflux system coupled with the Grid5000 large-scale experimental platform and the Network eXperiment Engine (NXE) we have developed. We illustrate the application of the Metroflux system with the practical validation of the theoretical prediction relating self-similarity and heavy tails given by Taqqu's theorem. We also illustrate several usages of this toolset, such as the investigation of conditions under which several traffic theories apply, as well as studies on the interactions between traffic, protocols and systems.
Investigating self-similarity and heavy-tailed distributions on a large scale experimental facility
Patrick Loiseau, Paulo Gonçalves, Guillaume Dewaele, Pierre Borgnat, Patrice Abry, Pascale Vicat-Blanc Primet
INRIA Research Report 6472, March 2008
After the seminal work by Taqqu et al. relating self-similarity to heavy-tailed distributions, a number of research articles verified that aggregated Internet traffic time series show self-similarity and that Internet attributes, like Web file sizes and flow lengths, are heavy-tailed.
However, the validation of the theoretical prediction relating self-similarity and heavy tails remains unsatisfactorily addressed, having been investigated either through numerical or network simulations, or from uncontrolled Web traffic data. Notably, this prediction has never been conclusively verified on real networks using controlled and stationary scenarios, prescribing specific heavy-tailed distributions, and estimating confidence intervals.
In the present work, we use the potential and facilities offered by the large-scale, deeply reconfigurable and fully controllable experimental Grid5000 instrument to investigate whether the prediction can be observed on real networks.
To this end, we organize a large number of controlled traffic circulation sessions on a nation-wide real network involving two hundred independent hosts. We use an FPGA-based measurement system to collect the corresponding traffic at packet level. We then estimate both the self-similarity exponent of the aggregated time series and the heavy-tail index of flow size distributions, independently.
Comparison of these two estimated parameters enables us to discuss the practical applicability conditions of the theoretical prediction.
A comparative study of different heavy tail index estimators of the flow size from sampled data
Patrick Loiseau, Paulo Gonçalves, Pascale Primet Vicat-Blanc
MetroGrid workshop, GridNets-07, October 17-19 2007, Lyon, France
In this article, we address the problem of estimating the tail parameter of a flow size distribution from sampled packet traffic. Based on synthetic data, we perform a systematic comparison of several estimators proposed in the literature. In the process, we propose a variant of an existing method that takes into account some prior statistical knowledge of the expected distribution. This adapted estimator shows significantly improved performance compared to the others.
Empirical mode decomposition to assess cardiovascular autonomic control in rats
Edmundo Pereira de Souza Neto, Patrice Abry, Patrick Loiseau, Jean-Christophe Cejka, Marc-Antoine Custaud, Jean Frutoso, Claude Gharib, Patrick Flandrin
Fundamental & Clinical Pharmacology, Volume 21, Issue 5, Page 481-496, October 2007
Heart beat rate and blood pressure, together with baroreflex sensitivity, have become important tools in assessing cardiac autonomic system control and in studying sympathovagal balance. These analyses are usually performed using spectral indices computed from standard spectral analysis techniques. However, standard spectral analysis and its corresponding rigid band-pass filter formulation suffer from two major drawbacks: it can be significantly distorted by non-stationarity issues, and it proves unable to adjust to natural intra- and inter-individual variability. Empirical mode decomposition (EMD), a tool recently introduced in the literature, provides a signal-adaptive decomposition that proves useful for the analysis of non-stationary data and shows a strong capability to adjust precisely to the spectral content of the analyzed data. It is based on the concept that any complicated set of data can be decomposed into a finite number of components, called intrinsic mode functions, associated with different spectral contributions. The aims of this study were twofold. First, we studied the changes in the sympathovagal balance induced by various pharmacological blockades (phentolamine, atropine and atenolol) of the autonomic nervous system in normotensive rats. Second, we assessed the use of EMD for the analysis of the cardiac sympathovagal balance after pharmacological injections. For this, we developed a new (EMD-based) low frequency vs. high frequency spectral decomposition of heart beat variability and systolic blood pressure, defined the corresponding EMD spectral indices, and studied their relevance for accurately detecting and analyzing changes in the sympathovagal balance without recourse to any a priori fixed high-pass/low-pass filters.