
Doc Comments /
Float





Floating Point

More Information

Examples

Message

Put your comments about the official/non-official page here.

Rounding Control

IEEE 754 floating point arithmetic includes the ability to select among four different rounding modes. D adds syntax to access them: [blah, blah, blah]. [NOTE: this is perhaps better done with a standard library call]
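As a hedged illustration of the standard-library route mentioned in the note above: later versions of Phobos expose the rounding mode through std.math.FloatingPointControl. Whether that is the mechanism this page has in mind is an assumption; the snippet below is only a sketch of the library-call approach.

    import std.math : FloatingPointControl;

    void main()
    {
        FloatingPointControl fpctrl;
        // Select round-toward-zero; the previous mode is restored automatically
        // when fpctrl goes out of scope.
        fpctrl.rounding = FloatingPointControl.roundToZero;

        // ... computations that should observe the selected rounding mode ...
    }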

Exception Flags

IEEE 754 floating point arithmetic can set several flags based on what happened during a computation: [blah, blah, blah]. These flags can be set/reset with the syntax: [blah, blah, blah]. [NOTE: this is perhaps better done with a standard library call]
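Again as a hedged sketch rather than the page's intended syntax: Phobos later provided ieeeFlags and resetIeeeFlags in std.math for reading and clearing the sticky exception flags. Treat the snippet as illustrative only.

    import std.math : ieeeFlags, resetIeeeFlags;
    import std.stdio : writeln;

    void main()
    {
        resetIeeeFlags();              // clear all sticky exception flags

        double x = 1.0;
        double y = x / 0.0;            // raises the divide-by-zero flag

        writeln(ieeeFlags.divByZero);  // true
        writeln(y);                    // inf
    }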


How about finishing this page?

Amendments

You may add a link to this page on the DocumentationAmendments page to draw extra attention to your suggestion.


On many computers, greater precision operations do not take any longer than lesser precision operations, so it makes numerical sense to use the greatest precision available for internal temporaries... On Intel x86 machines, for example, it is expected (but not required) that the intermediate calculations be done to the full 80 bits of precision implemented by the hardware.

This rationale is incorrect. Quoting the IA-32 Intel® Architecture Optimization Reference Manual:

Do not use double precision unless necessary. Set the precision control (PC) field in the x87 FPU control word to "Single Precision". This allows single precision (32-bit) computation to complete faster on some operations...

Single precision operations allow the use of longer SIMD vectors, since more single precision data elements can fit in a register...

x87 supports 80-bit precision, double extended floating point. Streaming SIMD Extensions support a maximum of 32-bit precision, and Streaming SIMD Extensions 2 supports a maximum of 64-bit precision.

In a Pentium 4, the x87 instructions are effectively deprecated. They're painfully slow, slower than they were in the Pentium III. Using 80-bit arithmetic for all intermediate operations will make floating point performance three times slower just for this reason alone. The optimization reference manual cited above explains why smaller operand size is so important -- memory bandwidth is often a performance bottleneck. And vectorization (if gcc ever gets around to supporting it) will make for another factor of two slowdown from 32-bit to 64-bit. -- TimStarling


(Rebuttal)

In the context of the Intel doc., it looks like what they are suggesting is that the application programmer (as opposed to the compiler developer) use single precision when double precision is not needed. It is a common recommendation that application programmers use single precision (floats) rather than doubles when the extra precision is not needed and a lot of floating point data is moving around, because it is often faster.

On Intel (including the P4) the floating point registers are 80 bit. All the author of http://digitalmars.com/d/float.html is suggesting, in the context of the D language, is that compiler developers shouldn't have to limit precision to 32 bits (floats) or 64 bits (doubles) if keeping 80 bit precision results in faster code. D allows for this, where other languages may specify a maximum precision regardless of what is best for the hardware.

The best contemporary (Fall 2004) optimizing compilers all use 80 bit precision to/from the Intel floating point registers for intermediate data when "maximum performance" switches are set. And for cases where strict adherence to the declared precision is needed, all also have a switch to "improve floating point consistency" by rounding/truncating intermediate values, which is often a speed "deoptimization" [this includes code generation for both the P4 and AMD64 chips]. D, on the other hand, follows the IEEE 754 minimum precision guidelines for floats and doubles, doesn't specify a maximum precision, and also offers the real (80 bit floating point) type for code that would benefit from it.

I don't see anywhere in that Intel doc. where it says that 80 bit floating point register operations are "deprecated".

For operations (and compilers) that take advantage of SIMD instructions, it is probably best to stick to 32 or 64 bit floating point types for code that can be vectorized. From what I've seen, contemporary compilers often do no better than a mediocre job of vectorizing and often fall back to using the 80 bit floating point register operations.


SSE2 optimization

The only reason Intel is keeping around the x87 math instructions is backwards compatibility. Their documentation recommends switching to SSE and SSE2 for floating point functionality. Compiler optimizations that use SSE2 are now a reality (e.g. Microsoft Visual C++ .NET). I and others have noted 50% to 100% speedups in floating point code using these optimizations. I would love to use D for some of my scientific computing, but without these optimizations it's a nonstarter. Contrary to the mantra of some developers, speed does matter; I still have floating point Monte Carlo simulations that take days to run. I wonder whether there are any plans for backend support for SSE and/or SSE2 optimization in D.

Comment: The reason 80-bit instructions are being "deprecated" is that they aren't used by most compilers (especially Microsoft's), so Intel and AMD are paying less attention to them. The change is driven by compilers, not by the chip makers.


Properties
reals and ireals support the .re and .im properties. If

real x = 7;
ireal y = 2i;   // imaginary literal; a plain 2 would not convert implicitly

then

x.re == 7, x.im == 0, y.re == 0, y.im == 2
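A runnable form of the same example (D1-era imaginary types assumed; ireal was later deprecated):

    void main()
    {
        real  x = 7;
        ireal y = 2i;   // imaginary literal

        assert(x.re == 7 && x.im == 0);
        assert(y.re == 0 && y.im == 2);
    }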

Floating Point Quirks

It is a mistake to assume that, in floating point, it is always possible to design an algorithm whose results do not degrade with increased precision.

For instance, many computations (of the Gamma function, for example) rely on series expansions with pre-computed constants in order to calculate the result. It may make a great deal of difference whether I use 3.14 for single precision, 3.14159 for double precision, or 3.14159265 for extended precision.
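A hypothetical sketch of such precision-matched constants (the identifiers are invented for illustration; the point is that each constant carries only as many digits as its type was designed for):

    // If intermediates are silently evaluated at a higher precision than these
    // constants were chosen for, a series built on them may behave differently.
    const float  PI_F = 3.1415927f;               // ~7 significant decimal digits
    const double PI_D = 3.141592653589793;        // ~16 digits
    const real   PI_R = 3.14159265358979323846L;  // ~19-20 digits for 80-bit real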

Next, IEEE floating point traps are NOT the same as exceptions - sometimes a trap can be signaling and sometimes not. The same goes for NaNs - some are signaling, some are not.

Here is where you can get into a flame war - GCC has an option, -ffinite-math-only, that allows the compiler to assume things like "a==a" are always true. With true IEEE arithmetic, "a==a" can be false if 'a' is a NaN. Which is more important to you - fast or IEEE-correct?
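A minimal D sketch of the point about "a==a" (the assertions pass under IEEE semantics; an optimizer told to assume finite math may break them):

    import std.math : isNaN;

    void main()
    {
        double a = double.nan;
        assert(a != a);      // a NaN compares unequal to itself
        assert(!(a == a));   // so "a == a" is not always true
        assert(isNaN(a));    // the reliable way to test for NaN
    }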

The same goes for vectorization primitives. Don't assume that you'll be running on SSE-type hardware - IBM builds lots of number crunchers with AltiVec - and few compilers (even Intel's) are particularly good at auto-vectorization.

For many floating point issues, see www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf - it is a pain to read, but full of good info. Also, look into the Fortran2000 community... those people KNOW how to implement floating point...

Cheers, -Andrew <andrew AT fernandes.org>


Floating point evaluation may be different at compile-time and runtime.

The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands. A literal type suffix (like 'f') only indicates the type. The compiler may internally maintain as much precision as possible for purposes of constant folding; committing to the actual precision of the result is done as late as possible.

For a low-precision constant, put the value into a static, non-const variable. Since this is not really a constant, it cannot be constant folded and is therefore not affected by a possible compile-time increase in precision. However, if it is mixed with higher-precision values at runtime, an increase in precision will still occur. A sketch of this workaround follows below.
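A hedged sketch of that workaround (identifiers are invented; whether the two printed values actually differ depends on the compiler and target):

    import std.stdio : writefln;

    // May be constant folded, possibly at a higher intermediate precision.
    const float folded = 1.0f / 3.0f * 3.0f;

    // Static, non-const: not a compile-time constant, so the expression below
    // is evaluated at runtime at whatever precision the generated code uses.
    static float runtime = 1.0f;

    void main()
    {
        runtime = runtime / 3.0f * 3.0f;
        writefln("folded:  %a", folded);   // %a shows the exact bit pattern
        writefln("runtime: %a", runtime);
    }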

Links

See the corresponding page in the D Specification: DigitalMars:d/float.html