summaryrefslogtreecommitdiff
path: root/recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch
diff options
context:
space:
mode:
authorMarcin Juszkiewicz <marcin@juszkiewicz.com.pl>2009-09-30 15:06:17 +0200
committerMarcin Juszkiewicz <marcin@juszkiewicz.com.pl>2009-09-30 19:06:36 +0200
commita261b5ea923854b9a84f91cec0177ff57e905c98 (patch)
tree1d2563f24426102c3d6f83873349e79f288f0db3 /recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch
parentfe2fa19a4a00b3c7778933c476d57ca46c303c61 (diff)
gcc: update Maverick Crunch support to 20090908 version
From Martin W. Guy page http://martinwguy.co.uk/martin/crunch/ The 20090908 version * performs single and double precision floating point in the FPU (add, sub, mul, neg, abs, cmp and conversions from single and double precision floats to integral types). * by default, disables the floating point cfnegs and cfnegd instructions, which fail to convert 0 to -0 as they should. You can re-enable them with the -funsafe-math-optimizations flag, which is one of those enabled by -ffast-math (gcc-4.3 has an even more specific -fno-signed-zeros flag, which is one of those enabled by -funsafe-math-optimizations). * by default, does not respect denormalised values, so the smallest representable values are ±2-126 for floats and ±2-1022 for doubles instead of the usual ±2-149 and ±2-1074. * has a -mieee flag, which enables handling of denormalized values by disabling all the buggy instructions. With this, floating point addition, subtraction, negation, absolute value and conversion between floats and integer types are performed in software, leaving only floating point multiplication and comparison performed in hardware. * has no negative impact on regular ARM code generation. * always works round the hardware bugs in the FPU and no longer has the -mcirrus-fix-invalid-insns flag since chip development has stopped and all existing silicon has the same bugs except for the original revision D0 which is not supported. * passes GCC's IEEE testsuite except for the one specific test that checks for correct handling of denormalized values. With -mieee it passes all the math tests. * passes all other testsuites that I've tried (see below) including the stringent "paranoia" floating point IEEE conformance test. * produces the fastest Maverick code yet: 5.94 MFLOPS according to FFTW's tests/bench -opatient cf1024 benchmark and LAME takes 2m25 to encode that 30-second WAV file on a 200MHz EP9307 (compared to 5.4 and 2m30 for the futaris patches for 4.1.2 and 4.2.0). * does not use the FPU's buggy 64-bit integer instructions unless the new -mcirrus-di flag is given. Programs that do a lot of 64-bit integer operations (add, sub, mul, neg, abs, shifts) may be faster using this, but rigorous testing will be necessary to ensure that bad code is not being produced. OpenSSL's testsuite fails if this is enabled. There is more detail at the head of the arm-crunch-cirrus-di-flag.patch file. Known bugs * C: Values held in Maverick registers are not restored when performing a setjmp/longjmp pair. There is a fix to glibc for this in a message to the linux-cirrus mailing list. * C++: Similarly, exception unwinding (performing a throw back to a catch block in a different function) does not restore floating point and 64-bit values held in Maverick registers. * C++: Some C++ files will not compile, saying ".save {mv8}" Error: register expected although the same files will compile with optimization disabled. There is a patch to make binutils recognize these registers in the .save macro in a message to the linux-cirrus mailing list.
Diffstat (limited to 'recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch')
-rw-r--r--recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch325
1 files changed, 325 insertions, 0 deletions
diff --git a/recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch b/recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch
new file mode 100644
index 0000000000..1be8499643
--- /dev/null
+++ b/recipes/gcc/gcc-4.3.3/ep93xx/arm-crunch-fix-cirrus-reorg7.patch
@@ -0,0 +1,325 @@
+This patch:
+- maps branch-cirrus_insn to branch-nop-nop-cirrus_insn
+- maps branch-noncirrus-cirrus to branch-noncirrus-nop-cirrus
+- inserts a nop in load rN - load/store64 mvX,[rN] sequences to avoid an
+ undocumented hardware bug.
+- always fixes up invalid code sequences when compiling hard Maverick insns
+ and removes the -mcirrus-fix-invalid-insns flag because chip development
+ has stopped and all existing silicon has these bugs, while the extra code
+ that claimed to do other things for the extra bugs in the old revision D0
+ silicon was complete junk.
+- Takes the cirrus checking out of the main arm_reorg loop, to remove the
+ speed penalty it caused when not compiling for Maverick.
+
+ Martin Guy <martinwguy@yahoo.it> 3 March 2009
+
+Index: gcc-4.3.4/gcc/config/arm/arm.c
+===================================================================
+--- gcc-4.3.4.orig/gcc/config/arm/arm.c 2009-08-09 14:46:51.000000000 +0100
++++ gcc-4.3.4/gcc/config/arm/arm.c 2009-08-09 14:47:11.000000000 +0100
+@@ -134,7 +134,7 @@
+ static int arm_address_cost (rtx);
+ static bool arm_memory_load_p (rtx);
+ static bool arm_cirrus_insn_p (rtx);
+-static void cirrus_reorg (rtx);
++static void cirrus_reorg (void);
+ static void arm_init_builtins (void);
+ static rtx arm_expand_builtin (tree, rtx, rtx, enum machine_mode, int);
+ static void arm_init_iwmmxt_builtins (void);
+@@ -6538,6 +6538,9 @@
+
+ body = PATTERN (insn);
+
++ if (GET_CODE (body) == COND_EXEC)
++ body = COND_EXEC_CODE (body);
++
+ if (GET_CODE (body) != SET)
+ return false;
+
+@@ -6580,122 +6583,118 @@
+
+ /* Cirrus reorg for invalid instruction combinations. */
+ static void
+-cirrus_reorg (rtx first)
++cirrus_reorg (void)
+ {
+- enum attr_cirrus attr;
+- rtx body = PATTERN (first);
+- rtx t;
+- int nops;
+-
+- /* Any branch must be followed by 2 non Cirrus instructions. */
+- if (GET_CODE (first) == JUMP_INSN && GET_CODE (body) != RETURN)
+- {
+- nops = 0;
+- t = next_nonnote_insn (first);
+-
+- if (arm_cirrus_insn_p (t))
+- ++ nops;
+-
+- if (arm_cirrus_insn_p (next_nonnote_insn (t)))
+- ++ nops;
+-
+- while (nops --)
+- emit_insn_after (gen_nop (), first);
+-
+- return;
+- }
+-
+- /* (float (blah)) is in parallel with a clobber. */
+- if (GET_CODE (body) == PARALLEL && XVECLEN (body, 0) > 0)
+- body = XVECEXP (body, 0, 0);
+-
+- if (GET_CODE (body) == SET)
+- {
+- rtx lhs = XEXP (body, 0), rhs = XEXP (body, 1);
++ rtx insn, body;
+
+- /* cfldrd, cfldr64, cfstrd, cfstr64 must
+- be followed by a non Cirrus insn. */
+- if (get_attr_cirrus (first) == CIRRUS_DOUBLE)
+- {
+- if (arm_cirrus_insn_p (next_nonnote_insn (first)))
+- emit_insn_after (gen_nop (), first);
+-
+- return;
+- }
+- else if (arm_memory_load_p (first))
+- {
+- unsigned int arm_regno;
++ /* Examine every instruction and see if it needs adjusting */
++ for (insn = get_insns (); insn; insn = next_insn (insn))
++ switch (GET_CODE (insn))
++ {
++ case JUMP_INSN:
++ /* Any branch must be followed by 2 non Cirrus instructions. */
++ body = PATTERN (insn);
++ if (GET_CODE (body) != RETURN)
++ {
++ rtx next = next_real_insn (insn);
+
+- /* Any ldr/cfmvdlr, ldr/cfmvdhr, ldr/cfmvsr, ldr/cfmv64lr,
+- ldr/cfmv64hr combination where the Rd field is the same
+- in both instructions must be split with a non Cirrus
+- insn. Example:
++ if (arm_cirrus_insn_p (next))
++ {
++ emit_insn_after (gen_nop (), insn);
++ emit_insn_after (gen_nop (), insn);
++ }
++ else
++ if (arm_cirrus_insn_p (next_real_insn (next)))
++ emit_insn_after (gen_nop (), next);
++ }
++ break;
+
++ case INSN:
++ /* Any ldr/cfstrd combination where the Rd field is the same
++ in both instructions must be split with a non Cirrus insn.
++ Example:
+ ldr r0, blah
+ nop
+- cfmvsr mvf0, r0. */
+-
+- /* Get Arm register number for ldr insn. */
+- if (GET_CODE (lhs) == REG)
+- arm_regno = REGNO (lhs);
+- else
+- {
+- gcc_assert (GET_CODE (rhs) == REG);
+- arm_regno = REGNO (rhs);
+- }
+-
+- /* Next insn. */
+- first = next_nonnote_insn (first);
++ cfstrd mvd0, [r0]
++ otherwise the FPU stores to random memory locations.
++ */
++ body = PATTERN (insn);
++
++ /* Also applies to conditionally executed ldr */
++ if (GET_CODE (body) == COND_EXEC)
++ body = COND_EXEC_CODE (body);
+
+- if (! arm_cirrus_insn_p (first))
+- return;
++ /* If first insn is ldr rN, <mem>... */
++ if (GET_CODE (body) == SET && arm_memory_load_p (insn))
++ {
++ rtx next = next_real_insn (insn);
+
+- body = PATTERN (first);
++ /* ...and second is cirrus double word load or store... */
++ if (arm_cirrus_insn_p (next)
++ && get_attr_cirrus (next) == CIRRUS_DOUBLE)
++ {
++ rtx nextbody = PATTERN (next);
++ rtx ldr_target; /* destination of ldr insn: rN */
++ rtx arm_part; /* src or dest espression involving [rN] */
++ unsigned int arm_regno; /* the arm reg in the [rN] part */
+
+- /* (float (blah)) is in parallel with a clobber. */
+- if (GET_CODE (body) == PARALLEL && XVECLEN (body, 0))
+- body = XVECEXP (body, 0, 0);
+-
+- if (GET_CODE (body) == FLOAT)
+- body = XEXP (body, 0);
+-
+- if (get_attr_cirrus (first) == CIRRUS_MOVE
+- && GET_CODE (XEXP (body, 1)) == REG
+- && arm_regno == REGNO (XEXP (body, 1)))
+- emit_insn_after (gen_nop (), first);
++ ldr_target = XEXP (body, 0);
++ gcc_assert (GET_CODE (ldr_target) == REG);
+
+- return;
+- }
+- }
++ gcc_assert (GET_CODE (nextbody) == SET);
+
+- /* get_attr cannot accept USE or CLOBBER. */
+- if (!first
+- || GET_CODE (first) != INSN
+- || GET_CODE (PATTERN (first)) == USE
+- || GET_CODE (PATTERN (first)) == CLOBBER)
+- return;
++ /* Find the load or store address of the insn */
++ switch (GET_CODE (XEXP (nextbody, 0)))
++ {
++ case MEM: /* it's cfstrd/64 */
++ gcc_assert (GET_CODE (XEXP (nextbody, 1)) == REG);
++ arm_part = XEXP (XEXP (nextbody, 0), 0);
++ break;
+
+- attr = get_attr_cirrus (first);
++ case REG: /* it's cfldrd/64 */
++ if (GET_CODE (XEXP (nextbody, 1)) == MEM)
++ arm_part = XEXP (XEXP (nextbody, 1), 0);
++ else
++ /* It can also be const_double or const_int, which will
++ * turn into harmless [pc, #offset] in arm_reorg() */
++ continue;
++ break;
+
+- /* Any coprocessor compare instruction (cfcmps, cfcmpd, ...)
+- must be followed by a non-coprocessor instruction. */
+- if (attr == CIRRUS_COMPARE)
+- {
+- nops = 0;
++ default:
++ gcc_unreachable ();
++ }
+
+- t = next_nonnote_insn (first);
++ /* Find the arm register number in the [rN] expression */
++ arm_regno = 0; /* none */
++ switch (GET_CODE (arm_part))
++ {
++ case REG: /* it's [rN] */
++ arm_regno = REGNO (arm_part);
++ break;
+
+- if (arm_cirrus_insn_p (t))
+- ++ nops;
++ case PLUS: /* it's [rN, #XXX] or [rN, -#YYY]. */
++ case PRE_INC:
++ case POST_INC:
++ case PRE_DEC:
++ case POST_DEC:
++ gcc_assert (GET_CODE (XEXP (arm_part, 0)) == REG);
++ arm_regno = REGNO (XEXP (arm_part, 0));
++ break;
+
+- if (arm_cirrus_insn_p (next_nonnote_insn (t)))
+- ++ nops;
++ default:
++ /* Do nothing */
++ continue;
++ }
+
+- while (nops --)
+- emit_insn_after (gen_nop (), first);
++ if (arm_regno == REGNO (ldr_target))
++ emit_insn_after (gen_nop (), insn);
++ }
++ }
++ break;
+
+- return;
+- }
++ default:
++ break;
++ }
+ }
+
+ /* Return TRUE if X references a SYMBOL_REF. */
+@@ -9293,6 +9292,10 @@
+
+ minipool_fix_head = minipool_fix_tail = NULL;
+
++ /* Do cirrus_reorg() first as it may insert extra instructions */
++ if (TARGET_MAVERICK && TARGET_HARD_FLOAT)
++ cirrus_reorg ();
++
+ /* The first insn must always be a note, or the code below won't
+ scan it properly. */
+ insn = get_insns ();
+@@ -9302,12 +9305,6 @@
+ /* Scan all the insns and record the operands that will need fixing. */
+ for (insn = next_nonnote_insn (insn); insn; insn = next_nonnote_insn (insn))
+ {
+- if (TARGET_CIRRUS_FIX_INVALID_INSNS
+- && (arm_cirrus_insn_p (insn)
+- || GET_CODE (insn) == JUMP_INSN
+- || arm_memory_load_p (insn)))
+- cirrus_reorg (insn);
+-
+ if (GET_CODE (insn) == BARRIER)
+ push_minipool_barrier (insn, address);
+ else if (INSN_P (insn))
+Index: gcc-4.3.4/gcc/config/arm/arm.opt
+===================================================================
+--- gcc-4.3.4.orig/gcc/config/arm/arm.opt 2009-08-09 14:46:51.000000000 +0100
++++ gcc-4.3.4/gcc/config/arm/arm.opt 2009-08-09 14:47:11.000000000 +0100
+@@ -63,10 +63,6 @@
+ Target Report Mask(CALLER_INTERWORKING)
+ Thumb: Assume function pointers may go to non-Thumb aware code
+
+-mcirrus-fix-invalid-insns
+-Target Report Mask(CIRRUS_FIX_INVALID_INSNS)
+-Cirrus: Place NOPs to avoid invalid instruction combinations
+-
+ mcpu=
+ Target RejectNegative Joined
+ Specify the name of the target CPU
+Index: gcc-4.3.4/gcc/doc/invoke.texi
+===================================================================
+--- gcc-4.3.4.orig/gcc/doc/invoke.texi 2009-08-09 14:46:51.000000000 +0100
++++ gcc-4.3.4/gcc/doc/invoke.texi 2009-08-09 14:47:12.000000000 +0100
+@@ -429,7 +429,6 @@
+ -msingle-pic-base -mno-single-pic-base @gol
+ -mpic-register=@var{reg} @gol
+ -mnop-fun-dllimport @gol
+--mcirrus-fix-invalid-insns -mno-cirrus-fix-invalid-insns @gol
+ -mieee @gol
+ -mpoke-function-name @gol
+ -mthumb -marm @gol
+@@ -8673,18 +8672,6 @@
+ Specify the register to be used for PIC addressing. The default is R10
+ unless stack-checking is enabled, when R9 is used.
+
+-@item -mcirrus-fix-invalid-insns
+-@opindex mcirrus-fix-invalid-insns
+-@opindex mno-cirrus-fix-invalid-insns
+-Insert NOPs into the instruction stream to in order to work around
+-problems with invalid Maverick instruction combinations. This option
+-is only valid if the @option{-mcpu=ep9312} option has been used to
+-enable generation of instructions for the Cirrus Maverick floating
+-point co-processor. This option is not enabled by default, since the
+-problem is only present in older Maverick implementations. The default
+-can be re-enabled by use of the @option{-mno-cirrus-fix-invalid-insns}
+-switch.
+-
+ @item -mieee
+ When compiling for the Maverick FPU, disable the instructions that fail
+ to honor denormalized values. As these include floating point add, sub,