summaryrefslogtreecommitdiff
path: root/recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch
diff options
context:
space:
mode:
authorMarcin Juszkiewicz <marcin@juszkiewicz.com.pl>2009-09-30 15:06:17 +0200
committerMarcin Juszkiewicz <marcin@juszkiewicz.com.pl>2009-09-30 19:06:36 +0200
commita261b5ea923854b9a84f91cec0177ff57e905c98 (patch)
tree1d2563f24426102c3d6f83873349e79f288f0db3 /recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch
parentfe2fa19a4a00b3c7778933c476d57ca46c303c61 (diff)
gcc: update Maverick Crunch support to 20090908 version
From Martin W. Guy page http://martinwguy.co.uk/martin/crunch/ The 20090908 version * performs single and double precision floating point in the FPU (add, sub, mul, neg, abs, cmp and conversions from single and double precision floats to integral types). * by default, disables the floating point cfnegs and cfnegd instructions, which fail to convert 0 to -0 as they should. You can re-enable them with the -funsafe-math-optimizations flag, which is one of those enabled by -ffast-math (gcc-4.3 has an even more specific -fno-signed-zeros flag, which is one of those enabled by -funsafe-math-optimizations). * by default, does not respect denormalised values, so the smallest representable values are ±2-126 for floats and ±2-1022 for doubles instead of the usual ±2-149 and ±2-1074. * has a -mieee flag, which enables handling of denormalized values by disabling all the buggy instructions. With this, floating point addition, subtraction, negation, absolute value and conversion between floats and integer types are performed in software, leaving only floating point multiplication and comparison performed in hardware. * has no negative impact on regular ARM code generation. * always works round the hardware bugs in the FPU and no longer has the -mcirrus-fix-invalid-insns flag since chip development has stopped and all existing silicon has the same bugs except for the original revision D0 which is not supported. * passes GCC's IEEE testsuite except for the one specific test that checks for correct handling of denormalized values. With -mieee it passes all the math tests. * passes all other testsuites that I've tried (see below) including the stringent "paranoia" floating point IEEE conformance test. * produces the fastest Maverick code yet: 5.94 MFLOPS according to FFTW's tests/bench -opatient cf1024 benchmark and LAME takes 2m25 to encode that 30-second WAV file on a 200MHz EP9307 (compared to 5.4 and 2m30 for the futaris patches for 4.1.2 and 4.2.0). * does not use the FPU's buggy 64-bit integer instructions unless the new -mcirrus-di flag is given. Programs that do a lot of 64-bit integer operations (add, sub, mul, neg, abs, shifts) may be faster using this, but rigorous testing will be necessary to ensure that bad code is not being produced. OpenSSL's testsuite fails if this is enabled. There is more detail at the head of the arm-crunch-cirrus-di-flag.patch file. Known bugs * C: Values held in Maverick registers are not restored when performing a setjmp/longjmp pair. There is a fix to glibc for this in a message to the linux-cirrus mailing list. * C++: Similarly, exception unwinding (performing a throw back to a catch block in a different function) does not restore floating point and 64-bit values held in Maverick registers. * C++: Some C++ files will not compile, saying ".save {mv8}" Error: register expected although the same files will compile with optimization disabled. There is a patch to make binutils recognize these registers in the .save macro in a message to the linux-cirrus mailing list.
Diffstat (limited to 'recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch')
-rw-r--r--recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch109
1 files changed, 109 insertions, 0 deletions
diff --git a/recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch b/recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch
new file mode 100644
index 0000000000..8e07712b48
--- /dev/null
+++ b/recipes/gcc/gcc-4.2.4/ep93xx/arm-crunch-readme.patch
@@ -0,0 +1,109 @@
+Index: gcc-4.2.4/gcc/config/arm/README-Maverick
+===================================================================
+--- /dev/null 1970-01-01 00:00:00.000000000 +0000
++++ gcc-4.2.4/gcc/config/arm/README-Maverick 2009-08-09 15:43:44.000000000 +0100
+@@ -0,0 +1,104 @@
++Cirrus Logic MaverickCrunch FPU support
++=======================================
++
++The MaverickCrunch is an FPU coprocessor that only exists in combination
++with an arm920t (ARMv4t arch) integer core in the 200MHz EP93xx devices.
++Code generation for it is usually selected with
++ -mcpu=ep9312 -mfpu=maverick (and most likely -mfloat-abi=softfp)
++
++Within GCC, the names "cirrus" "maverick" and "crunch" are used randomly
++in filenames and identifiers, but they all refer to the same thing.
++
++Initial support was mainlined by RedHat in gcc-3 but this never generated
++working code. Cirrus Logic funded the company Nucleusys to produce a modified
++GCC for it, but this never worked either. The first set of patches to pass
++the testsuite were made by Hasjim Williams for Open Embedded, though they
++did this by disabling various features and optimisations, therby incurring
++a small negative impact on regular ARM code generation.
++The OE ideas were reimplemented by Martin Guy to produce a working compiler
++with no negwative impact on regular code generation.
++
++The FPU's characteristics
++-------------------------
++Like most ARM coprocessors, it runs in parallel with the ARM though its
++instructions are inserted into the regular ARM instructions stream.
++It has 16 64-bit registers that can be operated as 32-bit or 64-bit integers
++or as 32-bit or 64-bit floats, three 72-bit saturating multiplier-accumulators.
++It can add, sub, mul, cmp, abs and neg these types, convert between them and
++transfer values between its registers and the ARM registers or main memory.
++
++Comparisons performed in the Maverick unit set the condition codes differently
++from the ARM/FPA/VFP instructions:
++
++ ARM/FPA/VFP - (cmp*): MaverickCrunch - (cfcmp*):
++ N Z C V N Z C V
++ A == B 0 1 1 0 A == B 0 1 0 0
++ A < B 1 0 0 0 A < B 1 0 0 0
++ A > B 0 0 1 0 A > B 1 0 0 1
++ unord 0 0 1 1 unord 0 0 0 0
++
++which means that the same conditions have to be tested for with different ARM
++conditions after a Maverick comparison. Furthermore, some conditions cannot
++be tested with a single condition.
++This was already true on ARM/FPA/VFP for conditions UNEQ and LTGT;
++on Maverick comparisons it is GE UNLT ORDERED and UNORDERED that cannot.
++(GE after a floating point comparison, that is, not after an integer comarison)
++
++GCC's use of the Maverick unit
++------------------------------
++GCC only uses the 32-bit and 64-bit floating point modes and the 64-bit
++integer mode. It does not use the 72-bit accumulators or the 32-bit integer
++mode because, from "GCC Machine Descriptions":
++ "It is obligatory to support floating point `movm' instructions
++ into and out of any registers that can hold fixed point values,
++ because unions and structures (which have modes SImode or DImode)
++ can be in those registers and they may have floating point members."
++(search also for "grief" in arm.c).
++
++It does not use the 64-bit integer comparison instruction because it can only
++do a signed or an unsigned comparison, while GCC expect the comparison to set
++the conditions codes for both modes and then to use the signed or unsigned
++mode when the condition code bits are tested.
++
++The different setting of the condition codes is tracked with an additional
++CCMAV mode for the condition code register, set when a comparison is performed
++in the Maverick unit. This always indicates a floating point comparison since
++the Maverick's 64-bit comparison is not used.
++
++Hardware bugs and workarounds
++-----------------------------
++All silicon implementations of the FPU have a dozen hardware bugs, mostly
++timing-dependent bugs that can write garbage into registers or memory or get
++conditional tests wrong, as well as a widespread failure to respect
++denormalised values. See http://wiki.debian.org/ArmEabiMaverickCrunch
++
++There used to be a -mcirrus-fix-invalid-instructions flag that claimed
++to avoid bugs in revision D0 silicon but its code was broken junk.
++Currently GCC always avoids the timing bugs in revision D1 to E2 silicon,
++while the many extra timing bugs in the now rare revision D0 are not handled.
++
++By default, the instructions that drop denoermalized values are enabled
++so as to obtain maximum speed at lower precision. By default, the cfnegs
++and cfnegd instrutions are disabled, since they also fail to produce negative
++zero. They can be enabled with -fno-signed-zeros.
++
++When -mfpu=maverick is selected, an additional -mieee flag is active that
++gives full IEEE precision by performs all the non-denorm-respecting
++floating point instructions in the softfloat library routines or in the
++integer registers.
++
++The 64-bit integer support is still buggy so it is disabled unless the
++-mcirrus-di flag is supplied. As well as the having unidentified
++hardware bugs which make openssl's testsuite fail in */sha/sha512.c and in
++*/bn/bn_adm.c, 64-bit shifts only work up to 31 places left or 32 right.
++
++Other bugs
++----------
++There seems to be no way to configure GCC to select Maverick code generation
++as the default.
++
++--with-arch=ep9312 the assembler barfs saying that ep9312 is not a
++recognised architecture.
++--with-arch=armv4t the build fails when it tries to compile hard FPA
++instructions into libgcc,
++--with-cpu=ep9312 it compiles armv5t instructions into libgcc