Handbook of Floating-Point Arithmetic

Jean-Michel Muller (coordinator), Nicolas Brisebarre, Florent de Dinechin, Claude-Pierre Jeannerod, Vincent Lefèvre, Guillaume Melquiond, Nathalie Revol, Damien Stehlé, Serge Torres.

Birkhäuser Boston, Dec. 2009. 572 p., 62 illus. ISBN: 978-0-8176-4704-9

BibTeX entry:

@Book{MullerEtAl2010,
  title     = {Handbook of Floating-Point Arithmetic},
  author    = {Muller, Jean-Michel and Brisebarre, Nicolas and
               de Dinechin, Florent and Jeannerod, Claude-Pierre and
               Lef{\`e}vre, Vincent and Melquiond, Guillaume and
               Revol, Nathalie and Stehl{\'e}, Damien and Torres, Serge},
  publisher = {{B}irkh{\"a}user {B}oston},
  pages     = {572},
  note      = {{ACM} {G}.1.0; {G}.1.2; {G}.4; {B}.2.0; {B}.2.4; {F}.2.1,
               ISBN 978-0-8176-4704-9},
  year      = {2010},
}



See a preliminary version of chapter 1.

Our bibliographic database (BibTeX format)

Springer/Birkhäuser web site for the book


Floating-point arithmetic is by far the most widely used way of implementing real-number arithmetic on modern computers. Although the basic principles of floating-point arithmetic can be explained in a short amount of time, making such an arithmetic reliable and portable, yet fast, is a very difficult task. From the 1960s to the early 1980s, many different arithmetics were developed, but their implementation varied widely from one machine to another, making it difficult for nonexperts to design, learn, and use the required algorithms. As a result, floating-point arithmetic is far from being exploited to its full potential.
This handbook aims to provide a complete overview of modern floating-point arithmetic, including a detailed treatment of the newly revised (IEEE 754-2008) standard for floating-point arithmetic. Presented throughout are algorithms for implementing floating-point arithmetic as well as algorithms that use floating-point arithmetic. So that the techniques presented can be put directly into practice in actual coding or design, they are illustrated, whenever possible, by a corresponding program.
Key topics and features include:
Handbook of Floating-Point Arithmetic is designed for programmers of numerical applications, compiler designers, programmers of floating-point algorithms, designers of arithmetic operators, and more generally, students and researchers in numerical analysis who wish to better understand a tool used in their daily work and research.


Some links related to Floating-Point Arithmetic:

Ercegovac and Lang's book "Digital Arithmetic"

Kornerup and Matula's book "Finite Precision Number Systems and Arithmetic"

Koren's book "Computer Arithmetic Algorithms"

Markstein's book "IA-64 and Elementary Functions"

Cornea, Harrison and Tang's book "Scientific Computing on Itanium-Based Systems"

Muller's book "Elementary Functions: Algorithms and Implementation" (second edition, Nov. 2005)

Nick Higham's book "Accuracy and Stability of Numerical Algorithms" (SIAM, second edition, August 2002, xxx+680 pp.)
 

BibTeX bibliography fparith.bib (maintained by Norbert Juffa and Nelson H. F. Beebe)


People and groups

Stanford architecture and arithmetic group

David G. Hough's validlab home page

John Harrison

William Kahan's home page

Peter Markstein

Lehigh University Computer Architecture and Arithmetic Group

David Matula

Paul Zimmermann

Nick Higham

Michael Schulte

Interval computations


Errata


- page 4, line 7, "of w" should be deleted (i.e., the sentence should end after "representation").
- page 128, line 1, the proof of Theorem 4 should start with "Without loss of generality, we assume a > 0".
- page 128, 7th line from the bottom, "If ... then s = a-b, since..." should be replaced by "If ... then s = a+b, since..."
- page 129, 4th line, "a+b" should be replaced by "|a+b|"
- page 153, 2nd line of Theorem 13, "Let sigma be x..." should be replaced by "Let sigma be sqrt(x)..."
- page 514, the complexity of Fürer's multiplication algorithm is O(n·log(n)·2^O(log*(n))), and not O(n·log(n)·log*(n))
- page 555 (reference [320]), "M. A. Overton" should be replaced by "M. L. Overton"