Handbook of Floating-Point Arithmetic
Jean-Michel Muller (coordinator), Nicolas Brisebarre, Florent de Dinechin, Claude-Pierre Jeannerod, Vincent Lefèvre, Guillaume Melquiond, Nathalie Revol, Damien Stehlé, Serge Torres.
Birkhäuser Boston, Dec. 2009. 572 p., 62 illus. ISBN: 978-0-8176-4704-9

BibTeX entry:
@Book{MullerEtAl2010,
  title     = {Handbook of Floating-Point Arithmetic},
  author    = {Muller, Jean-Michel and Brisebarre, Nicolas and de
               Dinechin, Florent and Jeannerod, Claude-Pierre and
               Lef{\`e}vre, Vincent and Melquiond, Guillaume and Revol,
               Nathalie and Stehl{\'e}, Damien and Torres, Serge},
  publisher = {{B}irkh\"auser {B}oston},
  pages     = {572},
  note      = {{ACM} {G}.1.0; {G}.1.2; {G}.4; {B}.2.0; {B}.2.4; {F}.2.1.,
               ISBN 978-0-8176-4704-9},
  year      = {2010},
}
See a preliminary version of chapter 1.
Our bibliographic database (BibTeX format)
Springer/Birkhäuser web site for the book
Floating-point arithmetic is by far the most widely used way of
implementing real-number arithmetic on modern computers. Although the
basic principles of floating-point arithmetic can be explained in a
short amount of time, making such an arithmetic reliable and portable,
yet fast, is a very difficult task. From the 1960s to the early 1980s,
many different arithmetics were developed, but their implementation
varied widely from one machine to another, making it difficult for
nonexperts to design, learn, and use the required algorithms. As a
result, floating-point arithmetic is far from being exploited to its
full potential.
This handbook aims to provide a complete overview of modern
floating-point arithmetic, including a detailed treatment of the newly
revised (IEEE 754-2008) standard for floating-point arithmetic.
Presented throughout are algorithms for implementing floating-point
arithmetic as well as algorithms that use floating-point arithmetic. So
that the techniques presented can be put directly into practice in
actual coding or design, they are illustrated, whenever possible, by a
corresponding program.
Key topics and features include:
- Presentation of the history and basic concepts of floating-point arithmetic and various aspects of the past and current standards
- Development of smart and nontrivial algorithms, and algorithmic possibilities induced by the availability of a fused multiply-add (fma) instruction, e.g., correctly rounded software division and square roots
- Implementation of floating-point arithmetic, either in software (on an integer processor) or in hardware, and a discussion of issues related to compilers and languages
- Coverage of several recent advances related to elementary functions: correct rounding of these functions and computation of very accurate approximations under constraints
- Extensions of floating-point arithmetic such as certification, verification, and big precision
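As a small illustration of the kind of algorithmic possibility an fma opens up, the sketch below recovers the exact rounding error of a floating-point product with a single fma. It is not taken from the book: the names `fma_emulated` and `two_mult_fma` are hypothetical, and the fma itself is emulated with exact rational arithmetic so that the example does not depend on a hardware instruction or a particular library version. It assumes binary64 inputs and no overflow or underflow.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Emulate fma(a, b, c): compute a*b + c exactly, then round once.

    Hypothetical helper for illustration only. Fraction arithmetic is exact,
    and converting a Fraction to float rounds to nearest, so the result has a
    single rounding, as a real fma would (special values are not handled).
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_mult_fma(a: float, b: float) -> tuple[float, float]:
    """Return (p, e) with p = RN(a*b) and e = a*b - p.

    Assumes no overflow or underflow, in which case the error of a
    floating-point product is itself a floating-point number, so the
    fma returns it exactly.
    """
    p = a * b                    # rounded product
    e = fma_emulated(a, b, -p)   # exact error of that product
    return p, e


if __name__ == "__main__":
    a, b = 0.1, 0.3
    p, e = two_mult_fma(a, b)
    exact = Fraction(a) * Fraction(b)
    print("rounded product:", p)
    print("rounding error :", e)
    print("p + e equals a*b exactly:", Fraction(p) + Fraction(e) == exact)
```

On a platform with a genuine fma (for instance the C99 `fma` function), the emulation would simply be replaced by that call; the rational-arithmetic version is used here only to keep the sketch self-contained.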
Handbook of Floating-Point Arithmetic is designed for programmers of numerical
applications, compiler designers, programmers of floating-point
algorithms, designers of arithmetic operators, and more generally,
students and researchers in numerical analysis who wish to better
understand a tool used in their daily work and research.
Some links related to Floating-Point Arithmetic:
Ercegovac and Lang's book "Digital Arithmetic"
Koren's book "Computer arithmetic algorithms"
Markstein's book "IA-64 and Elementary Functions"
Cornea, Harrison and Tang's book "Scientific Computing on Itanium-Based Systems"
Muller's book "Elementary Functions, Algorithms and Implementation" (second edition, Nov. 2005)
Nick Higham's book "Accuracy and Stability of Numerical Algorithms" (SIAM, second edition, August 2002, xxx+680 pp.)
BibTeX bibliography fparith.bib (maintained by Norbert Juffa and Nelson H. F. Beebe)
People and groups
Stanford architecture and arithmetic group
David G. Hough's validlab home page
John Harrison
William Kahan's home page
Peter Markstein
Lehigh University Computer Architecture and Arithmetic Group
David Matula
Paul Zimmermann
Nick Higham
Michael Schulte
Interval computations
Errata
- page 4, line 7, "of w" should be deleted (i.e., the sentence should end after "representation").
- page 128, line 1, the proof of Theorem 4 should start with "Without loss of generality, we assume a > 0".
- page 128, 7th line from the bottom, "If ... then s = a-b, since..." should be replaced by "If ... then s = a+b, since..."
- page 129, 4th line, "a+b" should be replaced by "|a+b|"
- page 153, 2nd line of Theorem 13, "Let sigma be x..." should be replaced by "Let sigma be sqrt(x)..."
- page 555 (reference [320]), "M. A. Overton" should be replaced by "M. L. Overton"