re PR target/54222 ([avr] Implement fixed-point support)

libgcc/
	PR target/54222
	* config/avr/lib1funcs-fixed.S: New file.
	* config/avr/lib1funcs.S: Include it.  Undefine some divmodsi
	after they are used.
	(neg2, neg4): New macros.
	(__mulqihi3,__umulqihi3,__mulhi3): Rewrite non-MUL variants.
	(__mulhisi3,__umulhisi3,__mulsi3): Rewrite non-MUL variants.
	(__umulhisi3): Speed up MUL variant if there is enough flash.
	* config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's
	avr-modes.def.
	* config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf,
	_fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf,
	_fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq,
	_fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3,
	_mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3,
	_udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3.
	(LIB2FUNCS_EXCLUDE): Add supported functions.

gcc/
	PR target/54222
	* avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes.
	* avr/avr-fixed.md: New file.
	* avr/avr.md: Include it.
	(cc): Add: minus.
	(adjust_len): Add: minus, minus64, ufract, sfract.
	(ALL1, ALL2, ALL4, ORDERED234): New mode iterators.
	(MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3,
	subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi, 
	cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1.
	(*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3,
	ashlhi3, *ashlhi3_const, ashrhi3, *ashirhi3_const, lshrhi3,
	*lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all
	16-bit modes in ALL2.
	(subhi3, casesi, strlenhi): Add clobber when expanding minus:HI.
	(*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const,
	ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const,
	*reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all
	32-bit modes in ALL4.
	* avr-dimode.md (ALL8): New mode iterator.
	(adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn,
	subdi3_const_insn, cbranchdi4, compare_di2,
	compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn,
	ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle
	all 64-bit modes in ALL8.
	* config/avr/avr-protos.h (avr_to_int_mode): New prototype.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes.
	* config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Define to...
	(avr_fixed_point_supported_p): ...this new static function.
	(TARGET_BUILD_BUILTIN_VA_LIST): Define to...
	(avr_build_builtin_va_list): ...this new static function.
	(avr_adjust_type_node): New static function.
	(avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P.
	(avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new
	pseudo instead of gen_rtx_MINUS.
	(avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED.
	(notice_update_cc): Handle: CC_MINUS.
	(output_movqi): Generalize to handle respective fixed-point modes.
	(output_movhi, output_movsisf, avr_2word_insn_p): Ditto.
	(avr_out_compare, avr_out_plus_1): Also handle fixed-point modes.
	(avr_assemble_integer): Ditto.
	(output_reload_in_const, output_reload_insisf): Ditto.
	(avr_compare_pattern): Skip all modes > 4 bytes.
	(avr_2word_insn_p): Skip movuqq_insn, movqq_insn.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New functions.
	(avr_to_int_mode): New function.
	(adjust_insn_length): Handle: ADJUST_LEN_SFRACT,
	ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64.
	* config/avr/predicates.md (const0_operand): Allow const_fixed.
	(const_operand, const_or_immediate_operand): New.
	(nonmemory_or_const_operand): New.
	* config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ):
	New constraints.
	* config/avr/avr.h (LONG_LONG_ACCUM_TYPE_SIZE): Define.

From-SVN: r190644
Georg-Johann Lay, 2012-08-24 12:42:48 +0000 (author and committer)
commit e55e405619, parent 2960a36853
15 changed files with 3046 additions and 677 deletions
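For orientation, a short usage sketch of what this commit enables (an illustration based on the ChangeLog, not part of the patch): avr-gcc now accepts the ISO/IEC TR 18037 fixed-point types, which map onto the new machine modes and libgcc routines listed above. Assumed type/mode mapping: short _Fract = QQ, _Fract = HQ, short _Accum = HA, _Accum = SA, long long _Accum = TA (64 bits per LONG_LONG_ACCUM_TYPE_SIZE).

/* Example C, compiled with avr-gcc after this patch (sketch).  */

short _Accum
scale (short _Accum x)
{
  /* HAmode multiply goes through the new __mulha3 libgcc routine.  */
  return x * 0.75hk + 1.5hk;
}

unsigned _Fract
udiv (unsigned _Fract a, unsigned _Fract b)
{
  /* UHQmode division calls the new __udivuhq3 routine.  */
  return a / b;
}

float
to_float (_Accum x)
{
  /* SAmode -> SFmode conversion calls the new __fractsasf routine.  */
  return (float) x;
}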

gcc/config/avr/avr-dimode.md

@ -47,44 +47,58 @@
[(ACC_A 18)
(ACC_B 10)])
;; Supported modes that are 8 bytes wide
(define_mode_iterator ALL8 [(DI "")
(DQ "") (UDQ "")
(DA "") (UDA "")
(TA "") (UTA "")])
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Addition
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(define_expand "adddi3"
[(parallel [(match_operand:DI 0 "general_operand" "")
(match_operand:DI 1 "general_operand" "")
(match_operand:DI 2 "general_operand" "")])]
;; "adddi3"
;; "adddq3" "addudq3"
;; "addda3" "adduda3"
;; "addta3" "adduta3"
(define_expand "add<mode>3"
[(parallel [(match_operand:ALL8 0 "general_operand" "")
(match_operand:ALL8 1 "general_operand" "")
(match_operand:ALL8 2 "general_operand" "")])]
"avr_have_dimode"
{
rtx acc_a = gen_rtx_REG (DImode, ACC_A);
rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
emit_move_insn (acc_a, operands[1]);
if (s8_operand (operands[2], VOIDmode))
if (DImode == <MODE>mode
&& s8_operand (operands[2], VOIDmode))
{
emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
emit_insn (gen_adddi3_const8_insn ());
}
else if (CONST_INT_P (operands[2])
|| CONST_DOUBLE_P (operands[2]))
else if (const_operand (operands[2], GET_MODE (operands[2])))
{
emit_insn (gen_adddi3_const_insn (operands[2]));
emit_insn (gen_add<mode>3_const_insn (operands[2]));
}
else
{
emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
emit_insn (gen_adddi3_insn ());
emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
emit_insn (gen_add<mode>3_insn ());
}
emit_move_insn (operands[0], acc_a);
DONE;
})
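As an illustration of what the generalized expander produces (my sketch, not from the patch): a 64-bit fixed-point addition now takes the same path as DImode, staging the operands in the ACC_A (r18-r25) and ACC_B (r10-r17) register windows and calling __adddi3, because the bit-level addition is identical for all 8-byte modes.

/* Sketch: TAmode addition reuses the DImode libcall.  */
long long _Accum
add_llk (long long _Accum a, long long _Accum b)
{
  return a + b;   /* expands via add<mode>3 to a call to __adddi3 */
}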
(define_insn "adddi3_insn"
[(set (reg:DI ACC_A)
(plus:DI (reg:DI ACC_A)
(reg:DI ACC_B)))]
;; "adddi3_insn"
;; "adddq3_insn" "addudq3_insn"
;; "addda3_insn" "adduda3_insn"
;; "addta3_insn" "adduta3_insn"
(define_insn "add<mode>3_insn"
[(set (reg:ALL8 ACC_A)
(plus:ALL8 (reg:ALL8 ACC_A)
(reg:ALL8 ACC_B)))]
"avr_have_dimode"
"%~call __adddi3"
[(set_attr "adjust_len" "call")
@ -99,10 +113,14 @@
[(set_attr "adjust_len" "call")
(set_attr "cc" "clobber")])
(define_insn "adddi3_const_insn"
[(set (reg:DI ACC_A)
(plus:DI (reg:DI ACC_A)
(match_operand:DI 0 "const_double_operand" "n")))]
;; "adddi3_const_insn"
;; "adddq3_const_insn" "addudq3_const_insn"
;; "addda3_const_insn" "adduda3_const_insn"
;; "addta3_const_insn" "adduta3_const_insn"
(define_insn "add<mode>3_const_insn"
[(set (reg:ALL8 ACC_A)
(plus:ALL8 (reg:ALL8 ACC_A)
(match_operand:ALL8 0 "const_operand" "n Ynn")))]
"avr_have_dimode
&& !s8_operand (operands[0], VOIDmode)"
{
@ -116,30 +134,62 @@
;; Subtraction
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(define_expand "subdi3"
[(parallel [(match_operand:DI 0 "general_operand" "")
(match_operand:DI 1 "general_operand" "")
(match_operand:DI 2 "general_operand" "")])]
;; "subdi3"
;; "subdq3" "subudq3"
;; "subda3" "subuda3"
;; "subta3" "subuta3"
(define_expand "sub<mode>3"
[(parallel [(match_operand:ALL8 0 "general_operand" "")
(match_operand:ALL8 1 "general_operand" "")
(match_operand:ALL8 2 "general_operand" "")])]
"avr_have_dimode"
{
rtx acc_a = gen_rtx_REG (DImode, ACC_A);
rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
emit_move_insn (acc_a, operands[1]);
emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
emit_insn (gen_subdi3_insn ());
if (const_operand (operands[2], GET_MODE (operands[2])))
{
emit_insn (gen_sub<mode>3_const_insn (operands[2]));
}
else
{
emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
emit_insn (gen_sub<mode>3_insn ());
}
emit_move_insn (operands[0], acc_a);
DONE;
})
(define_insn "subdi3_insn"
[(set (reg:DI ACC_A)
(minus:DI (reg:DI ACC_A)
(reg:DI ACC_B)))]
;; "subdi3_insn"
;; "subdq3_insn" "subudq3_insn"
;; "subda3_insn" "subuda3_insn"
;; "subta3_insn" "subuta3_insn"
(define_insn "sub<mode>3_insn"
[(set (reg:ALL8 ACC_A)
(minus:ALL8 (reg:ALL8 ACC_A)
(reg:ALL8 ACC_B)))]
"avr_have_dimode"
"%~call __subdi3"
[(set_attr "adjust_len" "call")
(set_attr "cc" "set_czn")])
;; "subdi3_const_insn"
;; "subdq3_const_insn" "subudq3_const_insn"
;; "subda3_const_insn" "subuda3_const_insn"
;; "subta3_const_insn" "subuta3_const_insn"
(define_insn "sub<mode>3_const_insn"
[(set (reg:ALL8 ACC_A)
(minus:ALL8 (reg:ALL8 ACC_A)
(match_operand:ALL8 0 "const_operand" "n Ynn")))]
"avr_have_dimode"
{
return avr_out_minus64 (operands[0], NULL);
}
[(set_attr "adjust_len" "minus64")
(set_attr "cc" "clobber")])
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Negation
@ -180,15 +230,19 @@
(pc)))]
"avr_have_dimode")
(define_expand "cbranchdi4"
[(parallel [(match_operand:DI 1 "register_operand" "")
(match_operand:DI 2 "nonmemory_operand" "")
;; "cbranchdi4"
;; "cbranchdq4" "cbranchudq4"
;; "cbranchda4" "cbranchuda4"
;; "cbranchta4" "cbranchuta4"
(define_expand "cbranch<mode>4"
[(parallel [(match_operand:ALL8 1 "register_operand" "")
(match_operand:ALL8 2 "nonmemory_operand" "")
(match_operator 0 "ordered_comparison_operator" [(cc0)
(const_int 0)])
(label_ref (match_operand 3 "" ""))])]
"avr_have_dimode"
{
rtx acc_a = gen_rtx_REG (DImode, ACC_A);
rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
emit_move_insn (acc_a, operands[1]);
@ -197,25 +251,28 @@
emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
emit_insn (gen_compare_const8_di2 ());
}
else if (CONST_INT_P (operands[2])
|| CONST_DOUBLE_P (operands[2]))
else if (const_operand (operands[2], GET_MODE (operands[2])))
{
emit_insn (gen_compare_const_di2 (operands[2]));
emit_insn (gen_compare_const_<mode>2 (operands[2]));
}
else
{
emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
emit_insn (gen_compare_di2 ());
emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
emit_insn (gen_compare_<mode>2 ());
}
emit_jump_insn (gen_conditional_jump (operands[0], operands[3]));
DONE;
})
(define_insn "compare_di2"
;; "compare_di2"
;; "compare_dq2" "compare_udq2"
;; "compare_da2" "compare_uda2"
;; "compare_ta2" "compare_uta2"
(define_insn "compare_<mode>2"
[(set (cc0)
(compare (reg:DI ACC_A)
(reg:DI ACC_B)))]
(compare (reg:ALL8 ACC_A)
(reg:ALL8 ACC_B)))]
"avr_have_dimode"
"%~call __cmpdi2"
[(set_attr "adjust_len" "call")
@ -230,10 +287,14 @@
[(set_attr "adjust_len" "call")
(set_attr "cc" "compare")])
(define_insn "compare_const_di2"
;; "compare_const_di2"
;; "compare_const_dq2" "compare_const_udq2"
;; "compare_const_da2" "compare_const_uda2"
;; "compare_const_ta2" "compare_const_uta2"
(define_insn "compare_const_<mode>2"
[(set (cc0)
(compare (reg:DI ACC_A)
(match_operand:DI 0 "const_double_operand" "n")))
(compare (reg:ALL8 ACC_A)
(match_operand:ALL8 0 "const_operand" "n Ynn")))
(clobber (match_scratch:QI 1 "=&d"))]
"avr_have_dimode
&& !s8_operand (operands[0], VOIDmode)"
@ -254,29 +315,39 @@
;; Shift functions from libgcc are called without defining these insns,
;; but with them we can describe their reduced register footprint.
;; "ashldi3"
;; "ashrdi3"
;; "lshrdi3"
;; "rotldi3"
(define_expand "<code_stdname>di3"
[(parallel [(match_operand:DI 0 "general_operand" "")
(di_shifts:DI (match_operand:DI 1 "general_operand" "")
(match_operand:QI 2 "general_operand" ""))])]
;; "ashldi3" "ashrdi3" "lshrdi3" "rotldi3"
;; "ashldq3" "ashrdq3" "lshrdq3" "rotldq3"
;; "ashlda3" "ashrda3" "lshrda3" "rotlda3"
;; "ashlta3" "ashrta3" "lshrta3" "rotlta3"
;; "ashludq3" "ashrudq3" "lshrudq3" "rotludq3"
;; "ashluda3" "ashruda3" "lshruda3" "rotluda3"
;; "ashluta3" "ashruta3" "lshruta3" "rotluta3"
(define_expand "<code_stdname><mode>3"
[(parallel [(match_operand:ALL8 0 "general_operand" "")
(di_shifts:ALL8 (match_operand:ALL8 1 "general_operand" "")
(match_operand:QI 2 "general_operand" ""))])]
"avr_have_dimode"
{
rtx acc_a = gen_rtx_REG (DImode, ACC_A);
rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
emit_move_insn (acc_a, operands[1]);
emit_move_insn (gen_rtx_REG (QImode, 16), operands[2]);
emit_insn (gen_<code_stdname>di3_insn ());
emit_insn (gen_<code_stdname><mode>3_insn ());
emit_move_insn (operands[0], acc_a);
DONE;
})
(define_insn "<code_stdname>di3_insn"
[(set (reg:DI ACC_A)
(di_shifts:DI (reg:DI ACC_A)
(reg:QI 16)))]
;; "ashldi3_insn" "ashrdi3_insn" "lshrdi3_insn" "rotldi3_insn"
;; "ashldq3_insn" "ashrdq3_insn" "lshrdq3_insn" "rotldq3_insn"
;; "ashlda3_insn" "ashrda3_insn" "lshrda3_insn" "rotlda3_insn"
;; "ashlta3_insn" "ashrta3_insn" "lshrta3_insn" "rotlta3_insn"
;; "ashludq3_insn" "ashrudq3_insn" "lshrudq3_insn" "rotludq3_insn"
;; "ashluda3_insn" "ashruda3_insn" "lshruda3_insn" "rotluda3_insn"
;; "ashluta3_insn" "ashruta3_insn" "lshruta3_insn" "rotluta3_insn"
(define_insn "<code_stdname><mode>3_insn"
[(set (reg:ALL8 ACC_A)
(di_shifts:ALL8 (reg:ALL8 ACC_A)
(reg:QI 16)))]
"avr_have_dimode"
"%~call __<code_stdname>di3"
[(set_attr "adjust_len" "call")

gcc/config/avr/avr-fixed.md (new file, 287 lines)

@ -0,0 +1,287 @@
;; This file contains instructions that support fixed-point operations
;; for Atmel AVR micro controllers.
;; Copyright (C) 2012
;; Free Software Foundation, Inc.
;;
;; Contributed by Sean D'Epagnier (sean@depagnier.com)
;; Georg-Johann Lay (avr@gjlay.de)
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
(define_mode_iterator ALL1Q [(QQ "") (UQQ "")])
(define_mode_iterator ALL2Q [(HQ "") (UHQ "")])
(define_mode_iterator ALL2A [(HA "") (UHA "")])
(define_mode_iterator ALL2QA [(HQ "") (UHQ "")
(HA "") (UHA "")])
(define_mode_iterator ALL4A [(SA "") (USA "")])
;;; Conversions
(define_mode_iterator FIXED_A
[(QQ "") (UQQ "")
(HQ "") (UHQ "") (HA "") (UHA "")
(SQ "") (USQ "") (SA "") (USA "")
(DQ "") (UDQ "") (DA "") (UDA "")
(TA "") (UTA "")
(QI "") (HI "") (SI "") (DI "")])
;; Same as above so that we can build cross products
(define_mode_iterator FIXED_B
[(QQ "") (UQQ "")
(HQ "") (UHQ "") (HA "") (UHA "")
(SQ "") (USQ "") (SA "") (USA "")
(DQ "") (UDQ "") (DA "") (UDA "")
(TA "") (UTA "")
(QI "") (HI "") (SI "") (DI "")])
(define_insn "fract<FIXED_B:mode><FIXED_A:mode>2"
[(set (match_operand:FIXED_A 0 "register_operand" "=r")
(fract_convert:FIXED_A
(match_operand:FIXED_B 1 "register_operand" "r")))]
"<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
{
return avr_out_fract (insn, operands, true, NULL);
}
[(set_attr "cc" "clobber")
(set_attr "adjust_len" "sfract")])
(define_insn "fractuns<FIXED_B:mode><FIXED_A:mode>2"
[(set (match_operand:FIXED_A 0 "register_operand" "=r")
(unsigned_fract_convert:FIXED_A
(match_operand:FIXED_B 1 "register_operand" "r")))]
"<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
{
return avr_out_fract (insn, operands, false, NULL);
}
[(set_attr "cc" "clobber")
(set_attr "adjust_len" "ufract")])
;******************************************************************************
; mul
;; "mulqq3" "muluqq3"
(define_expand "mul<mode>3"
[(parallel [(match_operand:ALL1Q 0 "register_operand" "")
(match_operand:ALL1Q 1 "register_operand" "")
(match_operand:ALL1Q 2 "register_operand" "")])]
""
{
emit_insn (AVR_HAVE_MUL
? gen_mul<mode>3_enh (operands[0], operands[1], operands[2])
: gen_mul<mode>3_nomul (operands[0], operands[1], operands[2]));
DONE;
})
(define_insn "mulqq3_enh"
[(set (match_operand:QQ 0 "register_operand" "=r")
(mult:QQ (match_operand:QQ 1 "register_operand" "a")
(match_operand:QQ 2 "register_operand" "a")))]
"AVR_HAVE_MUL"
"fmuls %1,%2\;dec r1\;brvs 0f\;inc r1\;0:\;mov %0,r1\;clr __zero_reg__"
[(set_attr "length" "6")
(set_attr "cc" "clobber")])
(define_insn "muluqq3_enh"
[(set (match_operand:UQQ 0 "register_operand" "=r")
(mult:UQQ (match_operand:UQQ 1 "register_operand" "r")
(match_operand:UQQ 2 "register_operand" "r")))]
"AVR_HAVE_MUL"
"mul %1,%2\;mov %0,r1\;clr __zero_reg__"
[(set_attr "length" "3")
(set_attr "cc" "clobber")])
(define_expand "mulqq3_nomul"
[(set (reg:QQ 24)
(match_operand:QQ 1 "register_operand" ""))
(set (reg:QQ 25)
(match_operand:QQ 2 "register_operand" ""))
;; "*mulqq3.call"
(parallel [(set (reg:QQ 23)
(mult:QQ (reg:QQ 24)
(reg:QQ 25)))
(clobber (reg:QI 22))
(clobber (reg:HI 24))])
(set (match_operand:QQ 0 "register_operand" "")
(reg:QQ 23))]
"!AVR_HAVE_MUL")
(define_expand "muluqq3_nomul"
[(set (reg:UQQ 22)
(match_operand:UQQ 1 "register_operand" ""))
(set (reg:UQQ 24)
(match_operand:UQQ 2 "register_operand" ""))
;; "*umulqihi3.call"
(parallel [(set (reg:HI 24)
(mult:HI (zero_extend:HI (reg:QI 22))
(zero_extend:HI (reg:QI 24))))
(clobber (reg:QI 21))
(clobber (reg:HI 22))])
(set (match_operand:UQQ 0 "register_operand" "")
(reg:UQQ 25))]
"!AVR_HAVE_MUL")
(define_insn "*mulqq3.call"
[(set (reg:QQ 23)
(mult:QQ (reg:QQ 24)
(reg:QQ 25)))
(clobber (reg:QI 22))
(clobber (reg:HI 24))]
"!AVR_HAVE_MUL"
"%~call __mulqq3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
;; "mulhq3" "muluhq3"
;; "mulha3" "muluha3"
(define_expand "mul<mode>3"
[(set (reg:ALL2QA 18)
(match_operand:ALL2QA 1 "register_operand" ""))
(set (reg:ALL2QA 26)
(match_operand:ALL2QA 2 "register_operand" ""))
;; "*mulhq3.call.enh"
(parallel [(set (reg:ALL2QA 24)
(mult:ALL2QA (reg:ALL2QA 18)
(reg:ALL2QA 26)))
(clobber (reg:HI 22))])
(set (match_operand:ALL2QA 0 "register_operand" "")
(reg:ALL2QA 24))]
"AVR_HAVE_MUL")
;; "*mulhq3.call" "*muluhq3.call"
;; "*mulha3.call" "*muluha3.call"
(define_insn "*mul<mode>3.call"
[(set (reg:ALL2QA 24)
(mult:ALL2QA (reg:ALL2QA 18)
(reg:ALL2QA 26)))
(clobber (reg:HI 22))]
"AVR_HAVE_MUL"
"%~call __mul<mode>3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
;; On the enhanced core, don't clobber either input and use a separate output
;; "mulsa3" "mulusa3"
(define_expand "mul<mode>3"
[(set (reg:ALL4A 16)
(match_operand:ALL4A 1 "register_operand" ""))
(set (reg:ALL4A 20)
(match_operand:ALL4A 2 "register_operand" ""))
(set (reg:ALL4A 24)
(mult:ALL4A (reg:ALL4A 16)
(reg:ALL4A 20)))
(set (match_operand:ALL4A 0 "register_operand" "")
(reg:ALL4A 24))]
"AVR_HAVE_MUL")
;; "*mulsa3.call" "*mulusa3.call"
(define_insn "*mul<mode>3.call"
[(set (reg:ALL4A 24)
(mult:ALL4A (reg:ALL4A 16)
(reg:ALL4A 20)))]
"AVR_HAVE_MUL"
"%~call __mul<mode>3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
; / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
; div
(define_code_iterator usdiv [udiv div])
;; "divqq3" "udivuqq3"
(define_expand "<code><mode>3"
[(set (reg:ALL1Q 25)
(match_operand:ALL1Q 1 "register_operand" ""))
(set (reg:ALL1Q 22)
(match_operand:ALL1Q 2 "register_operand" ""))
(parallel [(set (reg:ALL1Q 24)
(usdiv:ALL1Q (reg:ALL1Q 25)
(reg:ALL1Q 22)))
(clobber (reg:QI 25))])
(set (match_operand:ALL1Q 0 "register_operand" "")
(reg:ALL1Q 24))])
;; "*divqq3.call" "*udivuqq3.call"
(define_insn "*<code><mode>3.call"
[(set (reg:ALL1Q 24)
(usdiv:ALL1Q (reg:ALL1Q 25)
(reg:ALL1Q 22)))
(clobber (reg:QI 25))]
""
"%~call __<code><mode>3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
;; "divhq3" "udivuhq3"
;; "divha3" "udivuha3"
(define_expand "<code><mode>3"
[(set (reg:ALL2QA 26)
(match_operand:ALL2QA 1 "register_operand" ""))
(set (reg:ALL2QA 22)
(match_operand:ALL2QA 2 "register_operand" ""))
(parallel [(set (reg:ALL2QA 24)
(usdiv:ALL2QA (reg:ALL2QA 26)
(reg:ALL2QA 22)))
(clobber (reg:HI 26))
(clobber (reg:QI 21))])
(set (match_operand:ALL2QA 0 "register_operand" "")
(reg:ALL2QA 24))])
;; "*divhq3.call" "*udivuhq3.call"
;; "*divha3.call" "*udivuha3.call"
(define_insn "*<code><mode>3.call"
[(set (reg:ALL2QA 24)
(usdiv:ALL2QA (reg:ALL2QA 26)
(reg:ALL2QA 22)))
(clobber (reg:HI 26))
(clobber (reg:QI 21))]
""
"%~call __<code><mode>3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
;; Note the first parameter gets passed in already offset by 2 bytes
;; "divsa3" "udivusa3"
(define_expand "<code><mode>3"
[(set (reg:ALL4A 24)
(match_operand:ALL4A 1 "register_operand" ""))
(set (reg:ALL4A 18)
(match_operand:ALL4A 2 "register_operand" ""))
(parallel [(set (reg:ALL4A 22)
(usdiv:ALL4A (reg:ALL4A 24)
(reg:ALL4A 18)))
(clobber (reg:HI 26))
(clobber (reg:HI 30))])
(set (match_operand:ALL4A 0 "register_operand" "")
(reg:ALL4A 22))])
;; "*divsa3.call" "*udivusa3.call"
(define_insn "*<code><mode>3.call"
[(set (reg:ALL4A 22)
(usdiv:ALL4A (reg:ALL4A 24)
(reg:ALL4A 18)))
(clobber (reg:HI 26))
(clobber (reg:HI 30))]
""
"%~call __<code><mode>3"
[(set_attr "type" "xcall")
(set_attr "cc" "clobber")])
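A usage sketch for the multiply and divide expanders above (illustrative, not part of the patch): arithmetic on the 1-, 2- and 4-byte fixed-point types turns into calls to the new libgcc routines listed in t-avr.

_Accum
mul_k (_Accum a, _Accum b)
{
  /* SAmode multiply: with a hardware multiplier the expander stages the
     operands in R16/R20 and calls __mulsa3; without MUL the same libcall
     is emitted through the generic fixed-point optabs.  */
  return a * b;
}

unsigned short _Fract
div_uqq (unsigned short _Fract a, unsigned short _Fract b)
{
  /* UQQmode divide: __udivuqq3 via the usdiv expander for ALL1Q.  */
  return a / b;
}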

gcc/config/avr/avr-modes.def

@ -1 +1,28 @@
FRACTIONAL_INT_MODE (PSI, 24, 3);
/* On 8-bit machines the fixed-point routines need fewer instructions
   if the binary point lies on a byte boundary, which is not the
   default for signed accum types. */
ADJUST_IBIT (HA, 7);
ADJUST_FBIT (HA, 8);
ADJUST_IBIT (SA, 15);
ADJUST_FBIT (SA, 16);
ADJUST_IBIT (DA, 31);
ADJUST_FBIT (DA, 32);
/* Make TA and UTA 64 bits wide.
128-bit wide modes would be insane on an 8-bit machine.
This needs special treatment in avr.c and avr-lib.h. */
ADJUST_BYTESIZE (TA, 8);
ADJUST_ALIGNMENT (TA, 1);
ADJUST_IBIT (TA, 15);
ADJUST_FBIT (TA, 48);
ADJUST_BYTESIZE (UTA, 8);
ADJUST_ALIGNMENT (UTA, 1);
ADJUST_IBIT (UTA, 16);
ADJUST_FBIT (UTA, 48);
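A worked illustration of the IBIT/FBIT adjustment above (my reading, not from the patch): with IBIT 7 / FBIT 8, a signed short _Accum is laid out as sign : 7 integer bits : 8 fraction bits, so the binary point falls exactly between the two bytes.

/* HAmode after ADJUST_IBIT (HA, 7) / ADJUST_FBIT (HA, 8):
     bit 15 | bits 14..8 | bits 7..0
     sign   | integer    | fraction     (binary point on a byte boundary)  */

short _Accum
add_one (short _Accum x)
{
  /* 1.0hk now has the bit pattern 0x0100, so this needs only an add on
     the high byte instead of a shifted constant touching both bytes.  */
  return x + 1.0hk;
}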

gcc/config/avr/avr-protos.h

@ -79,6 +79,9 @@ extern const char* avr_load_lpm (rtx, rtx*, int*);
extern bool avr_rotate_bytes (rtx operands[]);
extern const char* avr_out_fract (rtx, rtx[], bool, int*);
extern rtx avr_to_int_mode (rtx);
extern void expand_prologue (void);
extern void expand_epilogue (bool);
extern bool avr_emit_movmemhi (rtx*);
@ -92,6 +95,8 @@ extern const char* avr_out_plus (rtx*, int*, int*);
extern const char* avr_out_plus_noclobber (rtx*, int*, int*);
extern const char* avr_out_plus64 (rtx, int*);
extern const char* avr_out_addto_sp (rtx*, int*);
extern const char* avr_out_minus (rtx*, int*, int*);
extern const char* avr_out_minus64 (rtx, int*);
extern const char* avr_out_xload (rtx, rtx*, int*);
extern const char* avr_out_movmem (rtx, rtx*, int*);
extern const char* avr_out_insert_bits (rtx*, int*);

gcc/config/avr/avr.c

@ -49,6 +49,10 @@
#include "params.h"
#include "df.h"
#ifndef CONST_FIXED_P
#define CONST_FIXED_P(X) (CONST_FIXED == GET_CODE (X))
#endif
/* Maximal allowed offset for an address in the LD command */
#define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE))
@ -264,6 +268,23 @@ avr_popcount_each_byte (rtx xval, int n_bytes, int pop_mask)
return true;
}
/* Access some RTX as INT_MODE. If X is a CONST_FIXED we can get
the bit representation of X by "casting" it to CONST_INT. */
rtx
avr_to_int_mode (rtx x)
{
enum machine_mode mode = GET_MODE (x);
return VOIDmode == mode
? x
: simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0);
}
/* Implement `TARGET_OPTION_OVERRIDE'. */
static void
avr_option_override (void)
{
@ -389,9 +410,14 @@ avr_regno_reg_class (int r)
}
/* Implement `TARGET_SCALAR_MODE_SUPPORTED_P'. */
static bool
avr_scalar_mode_supported_p (enum machine_mode mode)
{
if (ALL_FIXED_POINT_MODE_P (mode))
return true;
if (PSImode == mode)
return true;
@ -715,6 +741,58 @@ avr_initial_elimination_offset (int from, int to)
}
}
/* Helper for the function below. */
static void
avr_adjust_type_node (tree *node, enum machine_mode mode, int sat_p)
{
*node = make_node (FIXED_POINT_TYPE);
TYPE_SATURATING (*node) = sat_p;
TYPE_UNSIGNED (*node) = UNSIGNED_FIXED_POINT_MODE_P (mode);
TYPE_IBIT (*node) = GET_MODE_IBIT (mode);
TYPE_FBIT (*node) = GET_MODE_FBIT (mode);
TYPE_PRECISION (*node) = GET_MODE_BITSIZE (mode);
TYPE_ALIGN (*node) = 8;
SET_TYPE_MODE (*node, mode);
layout_type (*node);
}
/* Implement `TARGET_BUILD_BUILTIN_VA_LIST'. */
static tree
avr_build_builtin_va_list (void)
{
/* avr-modes.def adjusts [U]TA to be 64-bit modes with 48 fractional bits.
This is more appropriate for the 8-bit machine AVR than 128-bit modes.
The ADJUST_IBIT/FBIT are handled in toplev:init_adjust_machine_modes()
which is auto-generated by genmodes, but the compiler assigns [U]DAmode
to the long long accum modes instead of the desired [U]TAmode.
Fix this now, right after node setup in tree.c:build_common_tree_nodes().
This must run before c-cppbuiltin.c:builtin_define_fixed_point_constants()
which built-in defines macros like __ULLACCUM_FBIT__ that are used by
libgcc to detect IBIT and FBIT. */
avr_adjust_type_node (&ta_type_node, TAmode, 0);
avr_adjust_type_node (&uta_type_node, UTAmode, 0);
avr_adjust_type_node (&sat_ta_type_node, TAmode, 1);
avr_adjust_type_node (&sat_uta_type_node, UTAmode, 1);
unsigned_long_long_accum_type_node = uta_type_node;
long_long_accum_type_node = ta_type_node;
sat_unsigned_long_long_accum_type_node = sat_uta_type_node;
sat_long_long_accum_type_node = sat_ta_type_node;
/* Dispatch to the default handler. */
return std_build_builtin_va_list ();
}
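Since the comment above relies on c-cppbuiltin.c picking up the adjusted IBIT/FBIT values, here is a small hypothetical sanity check of the resulting type size and predefined macros (macro names per builtin_define_fixed_point_constants; _Static_assert needs C11):

_Static_assert (sizeof (long long _Accum) == 8, "TAmode is 8 bytes on AVR");

#if __ULLACCUM_FBIT__ != 48 || __LLACCUM_FBIT__ != 48
# error "libgcc's fixed-bit glue expects 48 fractional bits for [U]TAmode"
#endif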
/* Implement `TARGET_BUILTIN_SETJMP_FRAME_VALUE'. */
/* Actual start of frame is virtual_stack_vars_rtx; this is offset from the
frame pointer by +STARTING_FRAME_OFFSET.
Using saved frame = virtual_stack_vars_rtx - STARTING_FRAME_OFFSET
@ -723,10 +801,13 @@ avr_initial_elimination_offset (int from, int to)
static rtx
avr_builtin_setjmp_frame_value (void)
{
return gen_rtx_MINUS (Pmode, virtual_stack_vars_rtx,
gen_int_mode (STARTING_FRAME_OFFSET, Pmode));
rtx xval = gen_reg_rtx (Pmode);
emit_insn (gen_subhi3 (xval, virtual_stack_vars_rtx,
gen_int_mode (STARTING_FRAME_OFFSET, Pmode)));
return xval;
}
/* Return contents of MEM at frame pointer + stack size + 1 (+2 if 3 byte PC).
This is return address of function. */
rtx
@ -1580,7 +1661,7 @@ avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict)
MEM, strict);
if (strict
&& DImode == mode
&& GET_MODE_SIZE (mode) > 4
&& REG_X == REGNO (x))
{
ok = false;
@ -2081,6 +2162,14 @@ avr_print_operand (FILE *file, rtx x, int code)
/* Use normal symbol for direct address no linker trampoline needed */
output_addr_const (file, x);
}
else if (GET_CODE (x) == CONST_FIXED)
{
HOST_WIDE_INT ival = INTVAL (avr_to_int_mode (x));
if (code != 0)
output_operand_lossage ("Unsupported code '%c' for fixed-point:",
code);
fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
}
else if (GET_CODE (x) == CONST_DOUBLE)
{
long val;
@ -2116,6 +2205,7 @@ notice_update_cc (rtx body ATTRIBUTE_UNUSED, rtx insn)
case CC_OUT_PLUS:
case CC_OUT_PLUS_NOCLOBBER:
case CC_MINUS:
case CC_LDI:
{
rtx *op = recog_data.operand;
@ -2139,6 +2229,11 @@ notice_update_cc (rtx body ATTRIBUTE_UNUSED, rtx insn)
cc = (enum attr_cc) icc;
break;
case CC_MINUS:
avr_out_minus (op, &len_dummy, &icc);
cc = (enum attr_cc) icc;
break;
case CC_LDI:
cc = (op[1] == CONST0_RTX (GET_MODE (op[0]))
@ -2779,9 +2874,11 @@ output_movqi (rtx insn, rtx operands[], int *real_l)
if (real_l)
*real_l = 1;
if (register_operand (dest, QImode))
gcc_assert (1 == GET_MODE_SIZE (GET_MODE (dest)));
if (REG_P (dest))
{
if (register_operand (src, QImode)) /* mov r,r */
if (REG_P (src)) /* mov r,r */
{
if (test_hard_reg_class (STACK_REG, dest))
return "out %0,%1";
@ -2803,7 +2900,7 @@ output_movqi (rtx insn, rtx operands[], int *real_l)
rtx xop[2];
xop[0] = dest;
xop[1] = src == const0_rtx ? zero_reg_rtx : src;
xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src;
return out_movqi_mr_r (insn, xop, real_l);
}
@ -2825,6 +2922,8 @@ output_movhi (rtx insn, rtx xop[], int *plen)
return avr_out_lpm (insn, xop, plen);
}
gcc_assert (2 == GET_MODE_SIZE (GET_MODE (dest)));
if (REG_P (dest))
{
if (REG_P (src)) /* mov r,r */
@ -2843,7 +2942,6 @@ output_movhi (rtx insn, rtx xop[], int *plen)
return TARGET_NO_INTERRUPTS
? avr_asm_len ("out __SP_H__,%B1" CR_TAB
"out __SP_L__,%A1", xop, plen, -2)
: avr_asm_len ("in __tmp_reg__,__SREG__" CR_TAB
"cli" CR_TAB
"out __SP_H__,%B1" CR_TAB
@ -2880,7 +2978,7 @@ output_movhi (rtx insn, rtx xop[], int *plen)
rtx xop[2];
xop[0] = dest;
xop[1] = src == const0_rtx ? zero_reg_rtx : src;
xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src;
return out_movhi_mr_r (insn, xop, plen);
}
@ -3403,9 +3501,10 @@ output_movsisf (rtx insn, rtx operands[], int *l)
if (!l)
l = &dummy;
if (register_operand (dest, VOIDmode))
gcc_assert (4 == GET_MODE_SIZE (GET_MODE (dest)));
if (REG_P (dest))
{
if (register_operand (src, VOIDmode)) /* mov r,r */
if (REG_P (src)) /* mov r,r */
{
if (true_regnum (dest) > true_regnum (src))
{
@ -3440,10 +3539,10 @@ output_movsisf (rtx insn, rtx operands[], int *l)
{
return output_reload_insisf (operands, NULL_RTX, real_l);
}
else if (GET_CODE (src) == MEM)
else if (MEM_P (src))
return out_movsi_r_mr (insn, operands, real_l); /* mov r,m */
}
else if (GET_CODE (dest) == MEM)
else if (MEM_P (dest))
{
const char *templ;
@ -4126,14 +4225,25 @@ avr_out_compare (rtx insn, rtx *xop, int *plen)
rtx xval = xop[1];
/* MODE of the comparison. */
enum machine_mode mode = GET_MODE (xreg);
enum machine_mode mode;
/* Number of bytes to operate on. */
int i, n_bytes = GET_MODE_SIZE (mode);
int i, n_bytes = GET_MODE_SIZE (GET_MODE (xreg));
/* Value (0..0xff) held in clobber register xop[2] or -1 if unknown. */
int clobber_val = -1;
/* Map fixed mode operands to integer operands with the same binary
representation. They are easier to handle in the remainder. */
if (CONST_FIXED == GET_CODE (xval))
{
xreg = avr_to_int_mode (xop[0]);
xval = avr_to_int_mode (xop[1]);
}
mode = GET_MODE (xreg);
gcc_assert (REG_P (xreg));
gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4)
|| (const_double_operand (xval, VOIDmode) && n_bytes == 8));
@ -4143,7 +4253,7 @@ avr_out_compare (rtx insn, rtx *xop, int *plen)
/* Comparisons == +/-1 and != +/-1 can be done similarly to comparing
against 0 by ORing the bytes. This is one instruction shorter.
Notice that DImode comparisons are always against reg:DI 18
Notice that 64-bit comparisons are always against reg:ALL8 18 (ACC_A)
and therefore don't use this. */
if (!test_hard_reg_class (LD_REGS, xreg)
@ -5884,6 +5994,9 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc)
/* MODE of the operation. */
enum machine_mode mode = GET_MODE (xop[0]);
/* INT_MODE of the same size. */
enum machine_mode imode = int_mode_for_mode (mode);
/* Number of bytes to operate on. */
int i, n_bytes = GET_MODE_SIZE (mode);
@ -5908,8 +6021,11 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc)
*pcc = (MINUS == code) ? CC_SET_CZN : CC_CLOBBER;
if (CONST_FIXED_P (xval))
xval = avr_to_int_mode (xval);
if (MINUS == code)
xval = simplify_unary_operation (NEG, mode, xval, mode);
xval = simplify_unary_operation (NEG, imode, xval, imode);
op[2] = xop[3];
@ -5920,7 +6036,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc)
{
/* We operate byte-wise on the destination. */
rtx reg8 = simplify_gen_subreg (QImode, xop[0], mode, i);
rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i);
rtx xval8 = simplify_gen_subreg (QImode, xval, imode, i);
/* 8-bit value to operate with this byte. */
unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
@ -5941,7 +6057,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enum rtx_code code, int *pcc)
&& i + 2 <= n_bytes
&& test_hard_reg_class (ADDW_REGS, reg8))
{
rtx xval16 = simplify_gen_subreg (HImode, xval, mode, i);
rtx xval16 = simplify_gen_subreg (HImode, xval, imode, i);
unsigned int val16 = UINTVAL (xval16) & GET_MODE_MASK (HImode);
/* Registers R24, X, Y, Z can use ADIW/SBIW with constants < 64
@ -6085,6 +6201,41 @@ avr_out_plus_noclobber (rtx *xop, int *plen, int *pcc)
}
/* Output subtraction of register XOP[0] and compile time constant XOP[2]:
XOP[0] = XOP[0] - XOP[2]
This is basically the same as `avr_out_plus' except that we subtract.
It's needed because (minus x const) is not mapped to (plus x -const)
for the fixed point modes. */
const char*
avr_out_minus (rtx *xop, int *plen, int *pcc)
{
rtx op[4];
if (pcc)
*pcc = (int) CC_SET_CZN;
if (REG_P (xop[2]))
return avr_asm_len ("sub %A0,%A2" CR_TAB
"sbc %B0,%B2", xop, plen, -2);
if (!CONST_INT_P (xop[2])
&& !CONST_FIXED_P (xop[2]))
return avr_asm_len ("subi %A0,lo8(%2)" CR_TAB
"sbci %B0,hi8(%2)", xop, plen, -2);
op[0] = avr_to_int_mode (xop[0]);
op[1] = avr_to_int_mode (xop[1]);
op[2] = gen_int_mode (-INTVAL (avr_to_int_mode (xop[2])),
GET_MODE (op[0]));
op[3] = xop[3];
return avr_out_plus (op, plen, pcc);
}
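An illustration of why this helper exists (my example, not from the patch): for the fixed-point modes the middle end keeps subtraction of a constant as a MINUS, so the insn carries a CONST_FIXED operand which avr_out_minus negates and feeds back into the avr_out_plus machinery.

short _Accum
sub_const (short _Accum x)
{
  /* Stays (minus:HA (reg) (const_fixed 0.25)) in RTL; avr_out_minus maps
     the constant to its integer bit pattern, negates it and reuses
     avr_out_plus.  */
  return x - 0.25hk;
}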
/* Prepare operands of adddi3_const_insn to be used with avr_out_plus_1. */
const char*
@ -6103,6 +6254,19 @@ avr_out_plus64 (rtx addend, int *plen)
return "";
}
/* Prepare operands of subdi3_const_insn to be used with avr_out_plus64. */
const char*
avr_out_minus64 (rtx subtrahend, int *plen)
{
rtx xneg = avr_to_int_mode (subtrahend);
xneg = simplify_unary_operation (NEG, DImode, xneg, DImode);
return avr_out_plus64 (xneg, plen);
}
/* Output bit operation (IOR, AND, XOR) with register XOP[0] and compile
time constant XOP[2]:
@ -6442,6 +6606,349 @@ avr_rotate_bytes (rtx operands[])
return true;
}
/* Outputs instructions needed for fixed point type conversion.
This includes converting between any fixed point type, as well
as converting to any integer type. Conversion between integer
types is not supported.
The number of instructions generated depends on the types
being converted and the registers assigned to them.
The number of instructions required to complete the conversion
is least if the registers for source and destination are overlapping
and are aligned at the decimal place as actual movement of data is
completely avoided. In some cases, the conversion may already be
complete without any instructions needed.
When converting to signed types from signed types, sign extension
is implemented.
Converting signed fractional types requires a bit shift if converting
to or from any unsigned fractional type because the decimal place is
shifted by 1 bit. When the destination is a signed fractional, the sign
is stored in either the carry or T bit. */
const char*
avr_out_fract (rtx insn, rtx operands[], bool intsigned, int *plen)
{
int i;
bool sbit[2];
/* ilen: Length of integral part (in bytes)
flen: Length of fractional part (in bytes)
tlen: Length of operand (in bytes)
blen: Length of operand (in bits) */
int ilen[2], flen[2], tlen[2], blen[2];
int rdest, rsource, offset;
int start, end, dir;
bool sign_in_T = false, sign_in_Carry = false, sign_done = false;
bool widening_sign_extend = false;
int clrword = -1, lastclr = 0, clr = 0;
rtx xop[6];
const int dest = 0;
const int src = 1;
xop[dest] = operands[dest];
xop[src] = operands[src];
if (plen)
*plen = 0;
/* Determine format (integer and fractional parts)
of types needing conversion. */
for (i = 0; i < 2; i++)
{
enum machine_mode mode = GET_MODE (xop[i]);
tlen[i] = GET_MODE_SIZE (mode);
blen[i] = GET_MODE_BITSIZE (mode);
if (SCALAR_INT_MODE_P (mode))
{
sbit[i] = intsigned;
ilen[i] = GET_MODE_SIZE (mode);
flen[i] = 0;
}
else if (ALL_SCALAR_FIXED_POINT_MODE_P (mode))
{
sbit[i] = SIGNED_SCALAR_FIXED_POINT_MODE_P (mode);
ilen[i] = (GET_MODE_IBIT (mode) + 1) / 8;
flen[i] = (GET_MODE_FBIT (mode) + 1) / 8;
}
else
fatal_insn ("unsupported fixed-point conversion", insn);
}
/* Perform sign extension if source and dest are both signed,
and there are more integer parts in dest than in source. */
widening_sign_extend = sbit[dest] && sbit[src] && ilen[dest] > ilen[src];
rdest = REGNO (xop[dest]);
rsource = REGNO (xop[src]);
offset = flen[src] - flen[dest];
/* Position of MSB resp. sign bit. */
xop[2] = GEN_INT (blen[dest] - 1);
xop[3] = GEN_INT (blen[src] - 1);
/* Store the sign bit if the destination is a signed fract and the source
has a sign in the integer part. */
if (sbit[dest] && ilen[dest] == 0 && sbit[src] && ilen[src] > 0)
{
/* To avoid using BST and BLD if the source and destination registers
overlap or the source is unused after, we can use LSL to store the
sign bit in carry since we don't need the integral part of the source.
Restoring the sign from carry saves one BLD instruction below. */
if (reg_unused_after (insn, xop[src])
|| (rdest < rsource + tlen[src]
&& rdest + tlen[dest] > rsource))
{
avr_asm_len ("lsl %T1%t3", xop, plen, 1);
sign_in_Carry = true;
}
else
{
avr_asm_len ("bst %T1%T3", xop, plen, 1);
sign_in_T = true;
}
}
/* Pick the correct direction to shift bytes. */
if (rdest < rsource + offset)
{
dir = 1;
start = 0;
end = tlen[dest];
}
else
{
dir = -1;
start = tlen[dest] - 1;
end = -1;
}
/* Perform conversion by moving registers into place, clearing
destination registers that do not overlap with any source. */
for (i = start; i != end; i += dir)
{
int destloc = rdest + i;
int sourceloc = rsource + i + offset;
/* Source register location is outside range of source register,
so clear this byte in the dest. */
if (sourceloc < rsource
|| sourceloc >= rsource + tlen[src])
{
if (AVR_HAVE_MOVW
&& i + dir != end
&& (sourceloc + dir < rsource
|| sourceloc + dir >= rsource + tlen[src])
&& ((dir == 1 && !(destloc % 2) && !(sourceloc % 2))
|| (dir == -1 && (destloc % 2) && (sourceloc % 2)))
&& clrword != -1)
{
/* Use already cleared word to clear two bytes at a time. */
int even_i = i & ~1;
int even_clrword = clrword & ~1;
xop[4] = GEN_INT (8 * even_i);
xop[5] = GEN_INT (8 * even_clrword);
avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1);
i += dir;
}
else
{
if (i == tlen[dest] - 1
&& widening_sign_extend
&& blen[src] - 1 - 8 * offset < 0)
{
/* The SBRC below that sign-extends would come
up with a negative bit number because the sign
bit is out of reach.  Also avoid some early-clobber
situations because of premature CLR. */
if (reg_unused_after (insn, xop[src]))
avr_asm_len ("lsl %T1%t3" CR_TAB
"sbc %T0%t2,%T0%t2", xop, plen, 2);
else
avr_asm_len ("mov __tmp_reg__,%T1%t3" CR_TAB
"lsl __tmp_reg__" CR_TAB
"sbc %T0%t2,%T0%t2", xop, plen, 3);
sign_done = true;
continue;
}
/* Do not clear the register if it is going to get
sign extended with a MOV later. */
if (sbit[dest] && sbit[src]
&& i != tlen[dest] - 1
&& i >= flen[dest])
{
continue;
}
xop[4] = GEN_INT (8 * i);
avr_asm_len ("clr %T0%t4", xop, plen, 1);
/* If the last byte was cleared too, we have a cleared
word we can MOVW to clear two bytes at a time. */
if (lastclr)
clrword = i;
clr = 1;
}
}
else if (destloc == sourceloc)
{
/* Source byte is already in destination: Nothing needed. */
continue;
}
else
{
/* Registers do not line up and source register location
is within range: Perform move, shifting with MOV or MOVW. */
if (AVR_HAVE_MOVW
&& i + dir != end
&& sourceloc + dir >= rsource
&& sourceloc + dir < rsource + tlen[src]
&& ((dir == 1 && !(destloc % 2) && !(sourceloc % 2))
|| (dir == -1 && (destloc % 2) && (sourceloc % 2))))
{
int even_i = i & ~1;
int even_i_plus_offset = (i + offset) & ~1;
xop[4] = GEN_INT (8 * even_i);
xop[5] = GEN_INT (8 * even_i_plus_offset);
avr_asm_len ("movw %T0%t4,%T1%t5", xop, plen, 1);
i += dir;
}
else
{
xop[4] = GEN_INT (8 * i);
xop[5] = GEN_INT (8 * (i + offset));
avr_asm_len ("mov %T0%t4,%T1%t5", xop, plen, 1);
}
}
lastclr = clr;
clr = 0;
}
/* Perform sign extension if source and dest are both signed,
and there are more integer parts in dest than in source. */
if (widening_sign_extend)
{
if (!sign_done)
{
xop[4] = GEN_INT (blen[src] - 1 - 8 * offset);
/* Register was cleared above, so can become 0xff and extended.
Note: Instead of the CLR/SBRC/COM the sign extension could
be performed after the LSL below by means of a SBC if only
one byte has to be shifted left. */
avr_asm_len ("sbrc %T0%T4" CR_TAB
"com %T0%t2", xop, plen, 2);
}
/* Sign extend additional bytes by MOV and MOVW. */
start = tlen[dest] - 2;
end = flen[dest] + ilen[src] - 1;
for (i = start; i != end; i--)
{
if (AVR_HAVE_MOVW && i != start && i-1 != end)
{
i--;
xop[4] = GEN_INT (8 * i);
xop[5] = GEN_INT (8 * (tlen[dest] - 2));
avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1);
}
else
{
xop[4] = GEN_INT (8 * i);
xop[5] = GEN_INT (8 * (tlen[dest] - 1));
avr_asm_len ("mov %T0%t4,%T0%t5", xop, plen, 1);
}
}
}
/* If destination is a signed fract, and the source was not, a shift
by 1 bit is needed. Also restore sign from carry or T. */
if (sbit[dest] && !ilen[dest] && (!sbit[src] || ilen[src]))
{
/* We have flen[src] non-zero fractional bytes to shift.
Because of the right shift, handle one byte more so that the
LSB won't be lost. */
int nonzero = flen[src] + 1;
/* If the LSB is in the T flag and there are no fractional
bits, the high byte is zero and no shift needed. */
if (flen[src] == 0 && sign_in_T)
nonzero = 0;
start = flen[dest] - 1;
end = start - nonzero;
for (i = start; i > end && i >= 0; i--)
{
xop[4] = GEN_INT (8 * i);
if (i == start && !sign_in_Carry)
avr_asm_len ("lsr %T0%t4", xop, plen, 1);
else
avr_asm_len ("ror %T0%t4", xop, plen, 1);
}
if (sign_in_T)
{
avr_asm_len ("bld %T0%T2", xop, plen, 1);
}
}
else if (sbit[src] && !ilen[src] && (!sbit[dest] || ilen[dest]))
{
/* If source was a signed fract and dest was not, shift 1 bit
other way. */
start = flen[dest] - flen[src];
if (start < 0)
start = 0;
for (i = start; i < flen[dest]; i++)
{
xop[4] = GEN_INT (8 * i);
if (i == start)
avr_asm_len ("lsl %T0%t4", xop, plen, 1);
else
avr_asm_len ("rol %T0%t4", xop, plen, 1);
}
}
return "";
}
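Two C conversions that end up in avr_out_fract (illustrative only; the exact instruction sequence depends on how the registers happen to overlap, as described in the comment above):

_Accum
widen (short _Accum x)
{
  /* HA -> SA: signed-to-signed widening, so the sign-extension path is
     used; with byte-aligned, overlapping registers this is mostly MOV/MOVW
     or nothing at all.  */
  return x;
}

_Fract
narrow (_Accum x)
{
  /* SA -> HQ: the destination is a signed fract while the source has an
     integer part, so the sign bit is parked in carry or T and the
     fractional bytes are shifted by one bit.  */
  return x;
}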
/* Modifies the length assigned to instruction INSN
LEN is the initially computed length of the insn. */
@ -6489,6 +6996,8 @@ adjust_insn_length (rtx insn, int len)
case ADJUST_LEN_OUT_PLUS: avr_out_plus (op, &len, NULL); break;
case ADJUST_LEN_PLUS64: avr_out_plus64 (op[0], &len); break;
case ADJUST_LEN_MINUS: avr_out_minus (op, &len, NULL); break;
case ADJUST_LEN_MINUS64: avr_out_minus64 (op[0], &len); break;
case ADJUST_LEN_OUT_PLUS_NOCLOBBER:
avr_out_plus_noclobber (op, &len, NULL); break;
@ -6502,6 +7011,9 @@ adjust_insn_length (rtx insn, int len)
case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, &len); break;
case ADJUST_LEN_LOAD_LPM: avr_load_lpm (insn, op, &len); break;
case ADJUST_LEN_SFRACT: avr_out_fract (insn, op, true, &len); break;
case ADJUST_LEN_UFRACT: avr_out_fract (insn, op, false, &len); break;
case ADJUST_LEN_TSTHI: avr_out_tsthi (insn, op, &len); break;
case ADJUST_LEN_TSTPSI: avr_out_tstpsi (insn, op, &len); break;
case ADJUST_LEN_TSTSI: avr_out_tstsi (insn, op, &len); break;
@ -6683,6 +7195,20 @@ avr_assemble_integer (rtx x, unsigned int size, int aligned_p)
return true;
}
else if (CONST_FIXED_P (x))
{
unsigned n;
/* varasm fails to handle big fixed modes that don't fit in hwi. */
for (n = 0; n < size; n++)
{
rtx xn = simplify_gen_subreg (QImode, x, GET_MODE (x), n);
default_assemble_integer (xn, 1, aligned_p);
}
return true;
}
return default_assemble_integer (x, size, aligned_p);
}
@ -7489,6 +8015,7 @@ avr_operand_rtx_cost (rtx x, enum machine_mode mode, enum rtx_code outer,
return 0;
case CONST_INT:
case CONST_FIXED:
case CONST_DOUBLE:
return COSTS_N_INSNS (GET_MODE_SIZE (mode));
@ -7518,6 +8045,7 @@ avr_rtx_costs_1 (rtx x, int codearg, int outer_code ATTRIBUTE_UNUSED,
switch (code)
{
case CONST_INT:
case CONST_FIXED:
case CONST_DOUBLE:
case SYMBOL_REF:
case CONST:
@ -8446,11 +8974,17 @@ avr_compare_pattern (rtx insn)
if (pattern
&& NONJUMP_INSN_P (insn)
&& SET_DEST (pattern) == cc0_rtx
&& GET_CODE (SET_SRC (pattern)) == COMPARE
&& DImode != GET_MODE (XEXP (SET_SRC (pattern), 0))
&& DImode != GET_MODE (XEXP (SET_SRC (pattern), 1)))
&& GET_CODE (SET_SRC (pattern)) == COMPARE)
{
return pattern;
enum machine_mode mode0 = GET_MODE (XEXP (SET_SRC (pattern), 0));
enum machine_mode mode1 = GET_MODE (XEXP (SET_SRC (pattern), 1));
/* The 64-bit comparisons have fixed operands ACC_A and ACC_B.
They must not be swapped, thus skip them. */
if ((mode0 == VOIDmode || GET_MODE_SIZE (mode0) <= 4)
&& (mode1 == VOIDmode || GET_MODE_SIZE (mode1) <= 4))
return pattern;
}
return NULL_RTX;
@ -8788,6 +9322,8 @@ avr_2word_insn_p (rtx insn)
return false;
case CODE_FOR_movqi_insn:
case CODE_FOR_movuqq_insn:
case CODE_FOR_movqq_insn:
{
rtx set = single_set (insn);
rtx src = SET_SRC (set);
@ -8796,7 +9332,7 @@ avr_2word_insn_p (rtx insn)
/* Factor out LDS and STS from movqi_insn. */
if (MEM_P (dest)
&& (REG_P (src) || src == const0_rtx))
&& (REG_P (src) || src == CONST0_RTX (GET_MODE (dest))))
{
return CONSTANT_ADDRESS_P (XEXP (dest, 0));
}
@ -9021,7 +9557,7 @@ output_reload_in_const (rtx *op, rtx clobber_reg, int *len, bool clear_p)
if (NULL_RTX == clobber_reg
&& !test_hard_reg_class (LD_REGS, dest)
&& (! (CONST_INT_P (src) || CONST_DOUBLE_P (src))
&& (! (CONST_INT_P (src) || CONST_FIXED_P (src) || CONST_DOUBLE_P (src))
|| !avr_popcount_each_byte (src, n_bytes,
(1 << 0) | (1 << 1) | (1 << 8))))
{
@ -9048,6 +9584,7 @@ output_reload_in_const (rtx *op, rtx clobber_reg, int *len, bool clear_p)
ldreg_p = test_hard_reg_class (LD_REGS, xdest[n]);
if (!CONST_INT_P (src)
&& !CONST_FIXED_P (src)
&& !CONST_DOUBLE_P (src))
{
static const char* const asm_code[][2] =
@ -9239,6 +9776,7 @@ output_reload_insisf (rtx *op, rtx clobber_reg, int *len)
if (AVR_HAVE_MOVW
&& !test_hard_reg_class (LD_REGS, op[0])
&& (CONST_INT_P (op[1])
|| CONST_FIXED_P (op[1])
|| CONST_DOUBLE_P (op[1])))
{
int len_clr, len_noclr;
@ -10834,6 +11372,12 @@ avr_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *arg,
#undef TARGET_SCALAR_MODE_SUPPORTED_P
#define TARGET_SCALAR_MODE_SUPPORTED_P avr_scalar_mode_supported_p
#undef TARGET_BUILD_BUILTIN_VA_LIST
#define TARGET_BUILD_BUILTIN_VA_LIST avr_build_builtin_va_list
#undef TARGET_FIXED_POINT_SUPPORTED_P
#define TARGET_FIXED_POINT_SUPPORTED_P hook_bool_void_true
#undef TARGET_ADDR_SPACE_SUBSET_P
#define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p

gcc/config/avr/avr.h

@ -261,6 +261,7 @@ enum
#define FLOAT_TYPE_SIZE 32
#define DOUBLE_TYPE_SIZE 32
#define LONG_DOUBLE_TYPE_SIZE 32
#define LONG_LONG_ACCUM_TYPE_SIZE 64
#define DEFAULT_SIGNED_CHAR 1

gcc/config/avr/avr.md: file diff suppressed because it is too large.

gcc/config/avr/constraints.md

@ -192,3 +192,47 @@
"32-bit integer constant where no nibble equals 0xf."
(and (match_code "const_int")
(match_test "!avr_has_nibble_0xf (op)")))
;; CONST_FIXED is not covered by the 'n' constraint, so cook up our own.
;; "i" or "s" would match but because the insn uses iterators that cover
;; INT_MODE, "i" or "s" is not always possible.
(define_constraint "Ynn"
"Fixed-point constant known at compile time."
(match_code "const_fixed"))
(define_constraint "Y00"
"Fixed-point or integer constant with bit representation 0x0"
(and (match_code "const_fixed,const_int")
(match_test "op == CONST0_RTX (GET_MODE (op))")))
(define_constraint "Y01"
"Fixed-point or integer constant with bit representation 0x1"
(ior (and (match_code "const_fixed")
(match_test "1 == INTVAL (avr_to_int_mode (op))"))
(match_test "satisfies_constraint_P (op)")))
(define_constraint "Ym1"
"Fixed-point or integer constant with bit representation -0x1"
(ior (and (match_code "const_fixed")
(match_test "-1 == INTVAL (avr_to_int_mode (op))"))
(match_test "satisfies_constraint_N (op)")))
(define_constraint "Y02"
"Fixed-point or integer constant with bit representation 0x2"
(ior (and (match_code "const_fixed")
(match_test "2 == INTVAL (avr_to_int_mode (op))"))
(match_test "satisfies_constraint_K (op)")))
(define_constraint "Ym2"
"Fixed-point or integer constant with bit representation -0x2"
(ior (and (match_code "const_fixed")
(match_test "-2 == INTVAL (avr_to_int_mode (op))"))
(match_test "satisfies_constraint_Cm2 (op)")))
;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED.
(define_constraint "YIJ"
"Fixed-point constant from @minus{}0x003f to 0x003f."
(and (match_code "const_fixed")
(match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)")))

gcc/config/avr/predicates.md

@ -74,7 +74,7 @@
;; Return 1 if OP is the zero constant for MODE.
(define_predicate "const0_operand"
(and (match_code "const_int,const_double")
(and (match_code "const_int,const_fixed,const_double")
(match_test "op == CONST0_RTX (mode)")))
;; Return 1 if OP is the one constant integer for MODE.
@ -248,3 +248,21 @@
(define_predicate "o16_operand"
(and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), -(1<<16), -1)")))
;; Const int, fixed, or double operand
(define_predicate "const_operand"
(ior (match_code "const_fixed")
(match_code "const_double")
(match_operand 0 "const_int_operand")))
;; Nonmemory, const fixed, or const double operand
(define_predicate "nonmemory_or_const_operand"
(ior (match_code "const_fixed")
(match_code "const_double")
(match_operand 0 "nonmemory_operand")))
;; Immediate, const fixed, or const double operand
(define_predicate "const_or_immediate_operand"
(ior (match_code "const_fixed")
(match_code "const_double")
(match_operand 0 "immediate_operand")))

libgcc/config/avr/avr-lib.h

@ -4,3 +4,79 @@
#define DI SI
typedef int QItype __attribute__ ((mode (QI)));
#endif
/* fixed-bit.h does not define functions for TA and UTA because
that part is wrapped in #if MIN_UNITS_PER_WORD > 4.
This would lead to empty functions for TA and UTA.
Thus, supply appropriate defines as if HAVE_[U]TA == 1.
#define HAVE_[U]TA 1 won't work because avr-modes.def
uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough
to arrange for such changes of the mode size. */
typedef unsigned _Fract UTAtype __attribute__ ((mode (UTA)));
#if defined (UTA_MODE)
#define FIXED_SIZE 8 /* in bytes */
#define INT_C_TYPE UDItype
#define UINT_C_TYPE UDItype
#define HINT_C_TYPE USItype
#define HUINT_C_TYPE USItype
#define MODE_NAME UTA
#define MODE_NAME_S uta
#define MODE_UNSIGNED 1
#endif
#if defined (FROM_UTA)
#define FROM_TYPE 4 /* ID for fixed-point */
#define FROM_MODE_NAME UTA
#define FROM_MODE_NAME_S uta
#define FROM_INT_C_TYPE UDItype
#define FROM_SINT_C_TYPE DItype
#define FROM_UINT_C_TYPE UDItype
#define FROM_MODE_UNSIGNED 1
#define FROM_FIXED_SIZE 8 /* in bytes */
#elif defined (TO_UTA)
#define TO_TYPE 4 /* ID for fixed-point */
#define TO_MODE_NAME UTA
#define TO_MODE_NAME_S uta
#define TO_INT_C_TYPE UDItype
#define TO_SINT_C_TYPE DItype
#define TO_UINT_C_TYPE UDItype
#define TO_MODE_UNSIGNED 1
#define TO_FIXED_SIZE 8 /* in bytes */
#endif
/* Same for TAmode */
typedef _Fract TAtype __attribute__ ((mode (TA)));
#if defined (TA_MODE)
#define FIXED_SIZE 8 /* in bytes */
#define INT_C_TYPE DItype
#define UINT_C_TYPE UDItype
#define HINT_C_TYPE SItype
#define HUINT_C_TYPE USItype
#define MODE_NAME TA
#define MODE_NAME_S ta
#define MODE_UNSIGNED 0
#endif
#if defined (FROM_TA)
#define FROM_TYPE 4 /* ID for fixed-point */
#define FROM_MODE_NAME TA
#define FROM_MODE_NAME_S ta
#define FROM_INT_C_TYPE DItype
#define FROM_SINT_C_TYPE DItype
#define FROM_UINT_C_TYPE UDItype
#define FROM_MODE_UNSIGNED 0
#define FROM_FIXED_SIZE 8 /* in bytes */
#elif defined (TO_TA)
#define TO_TYPE 4 /* ID for fixed-point */
#define TO_MODE_NAME TA
#define TO_MODE_NAME_S ta
#define TO_INT_C_TYPE DItype
#define TO_SINT_C_TYPE DItype
#define TO_UINT_C_TYPE UDItype
#define TO_MODE_UNSIGNED 0
#define TO_FIXED_SIZE 8 /* in bytes */
#endif

libgcc/config/avr/lib1funcs-fixed.S (new file, 874 lines)

@ -0,0 +1,874 @@
/* -*- Mode: Asm -*- */
;; Copyright (C) 2012
;; Free Software Foundation, Inc.
;; Contributed by Sean D'Epagnier (sean@depagnier.com)
;; Georg-Johann Lay (avr@gjlay.de)
;; This file is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by the
;; Free Software Foundation; either version 3, or (at your option) any
;; later version.
;; In addition to the permissions in the GNU General Public License, the
;; Free Software Foundation gives you unlimited permission to link the
;; compiled version of this file into combinations with other programs,
;; and to distribute those combinations without any restriction coming
;; from the use of this file. (The General Public License restrictions
;; do apply in other respects; for example, they cover modification of
;; the file, and distribution when not linked into a combine
;; executable.)
;; This file is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program; see the file COPYING. If not, write to
;; the Free Software Foundation, 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Fixed point library routines for AVR
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.section .text.libgcc.fixed, "ax", @progbits
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Conversions to float
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#if defined (L_fractqqsf)
DEFUN __fractqqsf
;; Move in place for SA -> SF conversion
clr r22
mov r23, r24
lsl r23
;; Sign-extend
sbc r24, r24
mov r25, r24
XJMP __fractsasf
ENDF __fractqqsf
#endif /* L_fractqqsf */
#if defined (L_fractuqqsf)
DEFUN __fractuqqsf
;; Move in place for USA -> SF conversion
clr r22
mov r23, r24
;; Zero-extend
clr r24
clr r25
XJMP __fractusasf
ENDF __fractuqqsf
#endif /* L_fractuqqsf */
#if defined (L_fracthqsf)
DEFUN __fracthqsf
;; Move in place for SA -> SF conversion
wmov 22, 24
lsl r22
rol r23
;; Sign-extend
sbc r24, r24
mov r25, r24
XJMP __fractsasf
ENDF __fracthqsf
#endif /* L_fracthqsf */
#if defined (L_fractuhqsf)
DEFUN __fractuhqsf
;; Move in place for USA -> SF conversion
wmov 22, 24
;; Zero-extend
clr r24
clr r25
XJMP __fractusasf
ENDF __fractuhqsf
#endif /* L_fractuhqsf */
#if defined (L_fracthasf)
DEFUN __fracthasf
;; Move in place for SA -> SF conversion
clr r22
mov r23, r24
mov r24, r25
;; Sign-extend
lsl r25
sbc r25, r25
XJMP __fractsasf
ENDF __fracthasf
#endif /* L_fracthasf */
#if defined (L_fractuhasf)
DEFUN __fractuhasf
;; Move in place for USA -> SF conversion
clr r22
mov r23, r24
mov r24, r25
;; Zero-extend
clr r25
XJMP __fractusasf
ENDF __fractuhasf
#endif /* L_fractuhasf */
#if defined (L_fractsqsf)
DEFUN __fractsqsf
XCALL __floatsisf
;; Divide non-zero results by 2^31 to move the
;; decimal point into place
tst r25
breq 0f
subi r24, exp_lo (31)
sbci r25, exp_hi (31)
0: ret
ENDF __fractsqsf
#endif /* L_fractsqsf */
#if defined (L_fractusqsf)
DEFUN __fractusqsf
XCALL __floatunsisf
;; Divide non-zero results by 2^32 to move the
;; decimal point into place
cpse r25, __zero_reg__
subi r25, exp_hi (32)
ret
ENDF __fractusqsf
#endif /* L_fractusqsf */
#if defined (L_fractsasf)
DEFUN __fractsasf
XCALL __floatsisf
;; Divide non-zero results by 2^16 to move the
;; decimal point into place
cpse r25, __zero_reg__
subi r25, exp_hi (16)
ret
ENDF __fractsasf
#endif /* L_fractsasf */
#if defined (L_fractusasf)
DEFUN __fractusasf
XCALL __floatunsisf
;; Divide non-zero results by 2^16 to move the
;; decimal point into place
cpse r25, __zero_reg__
subi r25, exp_hi (16)
ret
ENDF __fractusasf
#endif /* L_fractusasf */
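All of the __fract*sf routines above share one trick: convert the integer bits with __floatsisf / __floatunsisf, then rescale by a power of two by patching the IEEE-754 exponent field directly; exp_lo(N) and exp_hi(N) (defined in lib1funcs.S) are just the two affected bytes of N << 23, and the tst/cpse guards skip the adjustment for a zero result. A hedged C sketch of that rescaling step (helper name invented):

#include <stdint.h>
#include <string.h>

/* Multiply a non-zero, normal float by 2^-n by subtracting n from the
   exponent field in bits 30..23 -- what the subi/sbci on r24/r25 do.  */
float scale_down_pow2 (float f, int n)
{
  uint32_t u;
  memcpy (&u, &f, sizeof (u));
  u -= (uint32_t) n << 23;
  memcpy (&f, &u, sizeof (f));
  return f;
}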
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Conversions from float
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#if defined (L_fractsfqq)
DEFUN __fractsfqq
;; Multiply with 2^{24+7} to get a QQ result in r25
subi r24, exp_lo (-31)
sbci r25, exp_hi (-31)
XCALL __fixsfsi
mov r24, r25
ret
ENDF __fractsfqq
#endif /* L_fractsfqq */
#if defined (L_fractsfuqq)
DEFUN __fractsfuqq
;; Multiply with 2^{24+8} to get a UQQ result in r25
subi r25, exp_hi (-32)
XCALL __fixunssfsi
mov r24, r25
ret
ENDF __fractsfuqq
#endif /* L_fractsfuqq */
#if defined (L_fractsfha)
DEFUN __fractsfha
;; Multiply with 2^24 to get a HA result in r25:r24
subi r25, exp_hi (-24)
XJMP __fixsfsi
ENDF __fractsfha
#endif /* L_fractsfha */
#if defined (L_fractsfuha)
DEFUN __fractsfuha
;; Multiply with 2^24 to get a UHA result in r25:r24
subi r25, exp_hi (-24)
XJMP __fixunssfsi
ENDF __fractsfuha
#endif /* L_fractsfuha */
#if defined (L_fractsfhq)
DEFUN __fractsfsq
ENDF __fractsfsq
DEFUN __fractsfhq
;; Multiply with 2^{16+15} to get a HQ result in r25:r24
;; resp. with 2^31 to get a SQ result in r25:r22
subi r24, exp_lo (-31)
sbci r25, exp_hi (-31)
XJMP __fixsfsi
ENDF __fractsfhq
#endif /* L_fractsfhq */
#if defined (L_fractsfuhq)
DEFUN __fractsfusq
ENDF __fractsfusq
DEFUN __fractsfuhq
;; Multiply with 2^{16+16} to get a UHQ result in r25:r24
;; resp. with 2^32 to get a USQ result in r25:r22
subi r25, exp_hi (-32)
XJMP __fixunssfsi
ENDF __fractsfuhq
#endif /* L_fractsfuhq */
#if defined (L_fractsfsa)
DEFUN __fractsfsa
;; Multiply with 2^16 to get a SA result in r25:r22
subi r25, exp_hi (-16)
XJMP __fixsfsi
ENDF __fractsfsa
#endif /* L_fractsfsa */
#if defined (L_fractsfusa)
DEFUN __fractsfusa
;; Multiply with 2^16 to get a USA result in r25:r22
subi r25, exp_hi (-16)
XJMP __fixunssfsi
ENDF __fractsfusa
#endif /* L_fractsfusa */
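Going from float to fixed-point, the exponent is bumped up before calling __fixsfsi / __fixunssfsi, so the integer conversion already yields the fixed-point bit pattern. A rough C model of __fractsfqq (name invented; assumes the value is inside the QQ range so the scaled intermediate fits in 32 bits):

#include <stdint.h>

int8_t fractsfqq_model (float x)
{
  /* Scale by 2^31 via the exponent, convert to int32, keep the top
     byte: roughly x * 2^7, i.e. the QQ bit pattern.  */
  return (int8_t) ((int32_t) (x * 2147483648.0f) >> 24);
}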
;; For multiplication the functions here are called directly from
;; avr-fixed.md instead of using the standard libcall mechanisms.
;; This can make better code because GCC knows exactly which
;; of the call-used registers (not all of them) are clobbered.
/*******************************************************
Fractional Multiplication 8 x 8 without MUL
*******************************************************/
#if defined (L_mulqq3) && !defined (__AVR_HAVE_MUL__)
;;; R23 = R24 * R25
;;; Clobbers: __tmp_reg__, R22, R24, R25
;;; Rounding: ???
DEFUN __mulqq3
XCALL __fmuls
;; TR 18037 requires that (-1) * (-1) does not overflow
;; The only input that can produce -1 is (-1)^2.
dec r23
brvs 0f
inc r23
0: ret
ENDF __mulqq3
#endif /* L_mulqq3 && ! HAVE_MUL */
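A rough C model of the routine above (name invented; QQ taken as an int8_t scaled by 2^7). The dec/brvs/inc sequence detects the single overflowing input combination, (-1.0) * (-1.0), and saturates it as TR 18037 requires:

#include <stdint.h>

int8_t mulqq3_model (int8_t a, int8_t b)
{
  if (a == -128 && b == -128)       /* (-1.0) * (-1.0) must not overflow */
    return 0x7f;
  /* __fmuls returns (a * b) << 1; keeping its high byte is a >> 7.  */
  return (int8_t) (((int16_t) a * b) >> 7);
}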
/*******************************************************
Fractional Multiply .16 x .16 with and without MUL
*******************************************************/
#if defined (L_mulhq3)
;;; Same code with and without MUL, but the interfaces differ:
;;; no MUL: (R25:R24) = (R23:R22) * (R25:R24)
;;; Clobbers: ABI, called by optabs
;;; MUL: (R25:R24) = (R19:R18) * (R27:R26)
;;; Clobbers: __tmp_reg__, R22, R23
;;; Rounding: -0.5 LSB <= error <= 0.5 LSB
DEFUN __mulhq3
XCALL __mulhisi3
;; Shift result into place
lsl r23
rol r24
rol r25
brvs 1f
;; Round
sbrc r23, 7
adiw r24, 1
ret
1: ;; Overflow. TR 18037 requires (-1)^2 not to overflow
ldi r24, lo8 (0x7fff)
ldi r25, hi8 (0x7fff)
ret
ENDF __mulhq3
#endif /* defined (L_mulhq3) */
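Value-wise (ignoring the two register interfaces) the HQ multiply above behaves like this rough C model (name invented; HQ taken as an int16_t scaled by 2^15):

#include <stdint.h>

int16_t mulhq3_model (int16_t a, int16_t b)
{
  if (a == INT16_MIN && b == INT16_MIN)   /* (-1.0)^2: saturate, per TR 18037 */
    return 0x7fff;
  int32_t p = (int32_t) a * b;            /* 30 fractional bits */
  return (int16_t) ((p + 0x4000) >> 15);  /* round to nearest, as sbrc/adiw do */
}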
#if defined (L_muluhq3)
;;; Same code with and without MUL, but the interfaces differ:
;;; no MUL: (R25:R24) *= (R23:R22)
;;; Clobbers: ABI, called by optabs
;;; MUL: (R25:R24) = (R19:R18) * (R27:R26)
;;; Clobbers: __tmp_reg__, R22, R23
;;; Rounding: -0.5 LSB < error <= 0.5 LSB
DEFUN __muluhq3
XCALL __umulhisi3
;; Round
sbrc r23, 7
adiw r24, 1
ret
ENDF __muluhq3
#endif /* L_muluhq3 */
/*******************************************************
Fixed Multiply 8.8 x 8.8 with and without MUL
*******************************************************/
#if defined (L_mulha3)
;;; Same code with and without MUL, but the interfaces differ:
;;; no MUL: (R25:R24) = (R23:R22) * (R25:R24)
;;; Clobbers: ABI, called by optabs
;;; MUL: (R25:R24) = (R19:R18) * (R27:R26)
;;; Clobbers: __tmp_reg__, R22, R23
;;; Rounding: -0.5 LSB <= error <= 0.5 LSB
DEFUN __mulha3
XCALL __mulhisi3
XJMP __muluha3_round
ENDF __mulha3
#endif /* L_mulha3 */
#if defined (L_muluha3)
;;; Same code with and without MUL, but the interfaces differ:
;;; no MUL: (R25:R24) *= (R23:R22)
;;; Clobbers: ABI, called by optabs
;;; MUL: (R25:R24) = (R19:R18) * (R27:R26)
;;; Clobbers: __tmp_reg__, R22, R23
;;; Rounding: -0.5 LSB < error <= 0.5 LSB
DEFUN __muluha3
XCALL __umulhisi3
XJMP __muluha3_round
ENDF __muluha3
#endif /* L_muluha3 */
#if defined (L_muluha3_round)
DEFUN __muluha3_round
;; Shift result into place
mov r25, r24
mov r24, r23
;; Round
sbrc r22, 7
adiw r24, 1
ret
ENDF __muluha3_round
#endif /* L_muluha3_round */
/*******************************************************
Fixed Multiplication 16.16 x 16.16
*******************************************************/
#if defined (__AVR_HAVE_MUL__)
;; Multiplier
#define A0 16
#define A1 A0+1
#define A2 A1+1
#define A3 A2+1
;; Multiplicand
#define B0 20
#define B1 B0+1
#define B2 B1+1
#define B3 B2+1
;; Result
#define C0 24
#define C1 C0+1
#define C2 C1+1
#define C3 C2+1
#if defined (L_mulusa3)
;;; (C3:C0) = (A3:A0) * (B3:B0)
;;; Clobbers: __tmp_reg__
;;; Rounding: -0.5 LSB < error <= 0.5 LSB
DEFUN __mulusa3
;; Some of the MUL instructions have LSBs outside the result.
;; Don't ignore these LSBs in order to tame rounding error.
;; Use C2/C3 for these LSBs.
clr C0
clr C1
mul A0, B0 $ movw C2, r0
mul A1, B0 $ add C3, r0 $ adc C0, r1
mul A0, B1 $ add C3, r0 $ adc C0, r1 $ rol C1
;; Round
sbrc C3, 7
adiw C0, 1
;; The following MULs don't have LSBs outside the result.
;; C2/C3 is the high part.
mul A0, B2 $ add C0, r0 $ adc C1, r1 $ sbc C2, C2
mul A1, B1 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0
mul A2, B0 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0
neg C2
mul A0, B3 $ add C1, r0 $ adc C2, r1 $ sbc C3, C3
mul A1, B2 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
mul A2, B1 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
mul A3, B0 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0
neg C3
mul A1, B3 $ add C2, r0 $ adc C3, r1
mul A2, B2 $ add C2, r0 $ adc C3, r1
mul A3, B1 $ add C2, r0 $ adc C3, r1
mul A2, B3 $ add C3, r0
mul A3, B2 $ add C3, r0
clr __zero_reg__
ret
ENDF __mulusa3
#endif /* L_mulusa3 */
#if defined (L_mulsa3)
;;; (C3:C0) = (A3:A0) * (B3:B0)
;;; Clobbers: __tmp_reg__
;;; Rounding: -0.5 LSB <= error <= 0.5 LSB
DEFUN __mulsa3
XCALL __mulusa3
tst B3
brpl 1f
sub C2, A0
sbc C3, A1
1: sbrs A3, 7
ret
sub C2, B0
sbc C3, B1
ret
ENDF __mulsa3
#endif /* L_mulsa3 */
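__mulsa3 derives the signed product from the unsigned one via the usual correction p_signed = p_unsigned - 2^32 * (B*[A<0] + A*[B<0]) (mod 2^64); since only bits 16..47 of the full product are kept, each term collapses to subtracting the other operand's low word from C3:C2. A hedged C sketch (name invented; the rounding of the discarded low bits is ignored):

#include <stdint.h>

int32_t mulsa3_model (int32_t a, int32_t b)
{
  uint32_t r = (uint32_t) (((uint64_t) (uint32_t) a * (uint32_t) b) >> 16);
  if (b < 0) r -= (uint32_t) a << 16;   /* sub C2,A0; sbc C3,A1 */
  if (a < 0) r -= (uint32_t) b << 16;   /* sub C2,B0; sbc C3,B1 */
  return (int32_t) r;
}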
#undef A0
#undef A1
#undef A2
#undef A3
#undef B0
#undef B1
#undef B2
#undef B3
#undef C0
#undef C1
#undef C2
#undef C3
#else /* __AVR_HAVE_MUL__ */
#define A0 18
#define A1 A0+1
#define A2 A0+2
#define A3 A0+3
#define B0 22
#define B1 B0+1
#define B2 B0+2
#define B3 B0+3
#define C0 22
#define C1 C0+1
#define C2 C0+2
#define C3 C0+3
;; __tmp_reg__
#define CC0 0
;; __zero_reg__
#define CC1 1
#define CC2 16
#define CC3 17
#define AA0 26
#define AA1 AA0+1
#define AA2 30
#define AA3 AA2+1
#if defined (L_mulsa3)
;;; (R25:R22) *= (R21:R18)
;;; Clobbers: ABI, called by optabs
;;; Rounding: -1 LSB <= error <= 1 LSB
DEFUN __mulsa3
push B0
push B1
bst B3, 7
XCALL __mulusa3
;; A survived in 31:30:27:26
rcall 1f
pop AA1
pop AA0
bst AA3, 7
1: brtc 9f
;; 1-extend A/B
sub C2, AA0
sbc C3, AA1
9: ret
ENDF __mulsa3
#endif /* L_mulsa3 */
#if defined (L_mulusa3)
;;; (R25:R22) *= (R21:R18)
;;; Clobbers: ABI, called by optabs and __mulsa3
;;; Rounding: -1 LSB <= error <= 1 LSB
;;; Does not clobber T and A[] survives in 26, 27, 30, 31
DEFUN __mulusa3
push CC2
push CC3
; clear result
clr __tmp_reg__
wmov CC2, CC0
; save multiplicand
wmov AA0, A0
wmov AA2, A2
rjmp 3f
;; Loop the integral part
1: ;; CC += A * 2^n; n >= 0
add CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3
2: ;; A <<= 1
lsl A0 $ rol A1 $ rol A2 $ rol A3
3: ;; IBIT(B) >>= 1
;; Carry = n-th bit of B; n >= 0
lsr B3
ror B2
brcs 1b
sbci B3, 0
brne 2b
;; Loop the fractional part
;; B2/B3 is 0 now, use as guard bits for rounding
;; Restore multiplicand
wmov A0, AA0
wmov A2, AA2
rjmp 5f
4: ;; CC += A:Guard * 2^n; n < 0
add B3,B2 $ adc CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3
5:
;; A:Guard >>= 1
lsr A3 $ ror A2 $ ror A1 $ ror A0 $ ror B2
;; FBIT(B) <<= 1
;; Carry = n-th bit of B; n < 0
lsl B0
rol B1
brcs 4b
sbci B0, 0
brne 5b
;; Move result into place and round
lsl B3
wmov C2, CC2
wmov C0, CC0
clr __zero_reg__
adc C0, __zero_reg__
adc C1, __zero_reg__
adc C2, __zero_reg__
adc C3, __zero_reg__
;; Epilogue
pop CC3
pop CC2
ret
ENDF __mulusa3
#endif /* L_mulusa3 */
#undef A0
#undef A1
#undef A2
#undef A3
#undef B0
#undef B1
#undef B2
#undef B3
#undef C0
#undef C1
#undef C2
#undef C3
#undef AA0
#undef AA1
#undef AA2
#undef AA3
#undef CC0
#undef CC1
#undef CC2
#undef CC3
#endif /* __AVR_HAVE_MUL__ */
/*******************************************************
Fractional Division 8 / 8
*******************************************************/
#define r_divd r25 /* dividend */
#define r_quo r24 /* quotient */
#define r_div r22 /* divisor */
#if defined (L_divqq3)
DEFUN __divqq3
mov r0, r_divd
eor r0, r_div
sbrc r_div, 7
neg r_div
sbrc r_divd, 7
neg r_divd
cp r_divd, r_div
breq __divqq3_minus1 ; if equal return -1
XCALL __udivuqq3
lsr r_quo
sbrc r0, 7 ; negate result if needed
neg r_quo
ret
__divqq3_minus1:
ldi r_quo, 0x80
ret
ENDF __divqq3
#endif /* defined (L_divqq3) */
#if defined (L_udivuqq3)
DEFUN __udivuqq3
clr r_quo ; clear quotient
inc __zero_reg__ ; init loop counter, used per shift
__udivuqq3_loop:
lsl r_divd ; shift dividend
brcs 0f ; dividend overflow
cp r_divd,r_div ; compare dividend & divisor
brcc 0f ; dividend >= divisor
rol r_quo ; shift quotient (with CARRY)
rjmp __udivuqq3_cont
0:
sub r_divd,r_div ; restore dividend
lsl r_quo ; shift quotient (without CARRY)
__udivuqq3_cont:
lsl __zero_reg__ ; shift loop-counter bit
brne __udivuqq3_loop
com r_quo ; complement result
; because C flag was complemented in loop
ret
ENDF __udivuqq3
#endif /* defined (L_udivuqq3) */
#undef r_divd
#undef r_quo
#undef r_div
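A rough C model of the signed QQ division above (name invented; assumes b != 0 and |a| < |b| so the quotient is representable). __udivuqq3 produces the 8 fraction bits of dividend/divisor by restoring division; __divqq3 runs it on the absolute values, halves the 8-bit fraction into Q7 (the lsr) and applies the sign, special-casing equal magnitudes to -1.0 (0x80):

#include <stdint.h>

int8_t divqq3_model (int8_t a, int8_t b)
{
  /* Truncates toward zero, as the asm does.  */
  return (int8_t) (((int16_t) a << 7) / b);
}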
/*******************************************************
Fractional Division 16 / 16
*******************************************************/
#define r_divdL 26 /* dividend Low */
#define r_divdH 27 /* dividend High */
#define r_quoL 24 /* quotient Low */
#define r_quoH 25 /* quotient High */
#define r_divL 22 /* divisor */
#define r_divH 23 /* divisor */
#define r_cnt 21
#if defined (L_divhq3)
DEFUN __divhq3
mov r0, r_divdH
eor r0, r_divH
sbrs r_divH, 7
rjmp 1f
NEG2 r_divL
1:
sbrs r_divdH, 7
rjmp 2f
NEG2 r_divdL
2:
cp r_divdL, r_divL
cpc r_divdH, r_divH
breq __divhq3_minus1 ; if equal return -1
XCALL __udivuhq3
lsr r_quoH
ror r_quoL
brpl 9f
;; negate result if needed
NEG2 r_quoL
9:
ret
__divhq3_minus1:
ldi r_quoH, 0x80
clr r_quoL
ret
ENDF __divhq3
#endif /* defined (L_divhq3) */
#if defined (L_udivuhq3)
DEFUN __udivuhq3
sub r_quoH,r_quoH ; clear quotient and carry
;; FALLTHRU
ENDF __udivuhq3
DEFUN __udivuha3_common
clr r_quoL ; clear quotient
ldi r_cnt,16 ; init loop counter
__udivuhq3_loop:
rol r_divdL ; shift dividend (with CARRY)
rol r_divdH
brcs __udivuhq3_ep ; dividend overflow
cp r_divdL,r_divL ; compare dividend & divisor
cpc r_divdH,r_divH
brcc __udivuhq3_ep ; dividend >= divisor
rol r_quoL ; shift quotient (with CARRY)
rjmp __udivuhq3_cont
__udivuhq3_ep:
sub r_divdL,r_divL ; restore dividend
sbc r_divdH,r_divH
lsl r_quoL ; shift quotient (without CARRY)
__udivuhq3_cont:
rol r_quoH ; shift quotient
dec r_cnt ; decrement loop counter
brne __udivuhq3_loop
com r_quoL ; complement result
com r_quoH ; because C flag was complemented in loop
ret
ENDF __udivuha3_common
#endif /* defined (L_udivuhq3) */
/*******************************************************
Fixed Division 8.8 / 8.8
*******************************************************/
#if defined (L_divha3)
DEFUN __divha3
mov r0, r_divdH
eor r0, r_divH
sbrs r_divH, 7
rjmp 1f
NEG2 r_divL
1:
sbrs r_divdH, 7
rjmp 2f
NEG2 r_divdL
2:
XCALL __udivuha3
sbrs r0, 7 ; negate result if needed
ret
NEG2 r_quoL
ret
ENDF __divha3
#endif /* defined (L_divha3) */
#if defined (L_udivuha3)
DEFUN __udivuha3
mov r_quoH, r_divdL
mov r_divdL, r_divdH
clr r_divdH
lsl r_quoH ; shift quotient into carry
XJMP __udivuha3_common ; same as fractional after rearrange
ENDF __udivuha3
#endif /* defined (L_udivuha3) */
#undef r_divdL
#undef r_divdH
#undef r_quoL
#undef r_quoH
#undef r_divL
#undef r_divH
#undef r_cnt
/*******************************************************
Fixed Division 16.16 / 16.16
*******************************************************/
#define r_arg1L 24 /* arg1 gets passed already in place */
#define r_arg1H 25
#define r_arg1HL 26
#define r_arg1HH 27
#define r_divdL 26 /* dividend Low */
#define r_divdH 27
#define r_divdHL 30
#define r_divdHH 31 /* dividend High */
#define r_quoL 22 /* quotient Low */
#define r_quoH 23
#define r_quoHL 24
#define r_quoHH 25 /* quotient High */
#define r_divL 18 /* divisor Low */
#define r_divH 19
#define r_divHL 20
#define r_divHH 21 /* divisor High */
#define r_cnt __zero_reg__ /* loop count (0 after the loop!) */
#if defined (L_divsa3)
DEFUN __divsa3
mov r0, r_arg1HH
eor r0, r_divHH
sbrs r_divHH, 7
rjmp 1f
NEG4 r_divL
1:
sbrs r_arg1HH, 7
rjmp 2f
NEG4 r_arg1L
2:
XCALL __udivusa3
sbrs r0, 7 ; negate result if needed
ret
NEG4 r_quoL
ret
ENDF __divsa3
#endif /* defined (L_divsa3) */
#if defined (L_udivusa3)
DEFUN __udivusa3
ldi r_divdHL, 32 ; init loop counter
mov r_cnt, r_divdHL
clr r_divdHL
clr r_divdHH
wmov r_quoL, r_divdHL
lsl r_quoHL ; shift quotient into carry
rol r_quoHH
__udivusa3_loop:
rol r_divdL ; shift dividend (with CARRY)
rol r_divdH
rol r_divdHL
rol r_divdHH
brcs __udivusa3_ep ; dividend overflow
cp r_divdL,r_divL ; compare dividend & divisor
cpc r_divdH,r_divH
cpc r_divdHL,r_divHL
cpc r_divdHH,r_divHH
brcc __udivusa3_ep ; dividend >= divisor
rol r_quoL ; shift quotient (with CARRY)
rjmp __udivusa3_cont
__udivusa3_ep:
sub r_divdL,r_divL ; restore dividend
sbc r_divdH,r_divH
sbc r_divdHL,r_divHL
sbc r_divdHH,r_divHH
lsl r_quoL ; shift quotient (without CARRY)
__udivusa3_cont:
rol r_quoH ; shift quotient
rol r_quoHL
rol r_quoHH
dec r_cnt ; decrement loop counter
brne __udivusa3_loop
com r_quoL ; complement result
com r_quoH ; because C flag was complemented in loop
com r_quoHL
com r_quoHH
ret
ENDF __udivusa3
#endif /* defined (L_udivusa3) */
#undef r_arg1L
#undef r_arg1H
#undef r_arg1HL
#undef r_arg1HH
#undef r_divdL
#undef r_divdH
#undef r_divdHL
#undef r_divdHH
#undef r_quoL
#undef r_quoH
#undef r_quoHL
#undef r_quoHH
#undef r_divL
#undef r_divH
#undef r_divHL
#undef r_divHH
#undef r_cnt

libgcc/config/avr/lib1funcs.S

@@ -91,6 +91,35 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
.endfunc
.endm
;; Negate a 2-byte value held in consecutive registers
.macro NEG2 reg
com \reg+1
neg \reg
sbci \reg+1, -1
.endm
;; Negate a 4-byte value held in consecutive registers
.macro NEG4 reg
com \reg+3
com \reg+2
com \reg+1
.if \reg >= 16
neg \reg
sbci \reg+1, -1
sbci \reg+2, -1
sbci \reg+3, -1
.else
com \reg
adc \reg, __zero_reg__
adc \reg+1, __zero_reg__
adc \reg+2, __zero_reg__
adc \reg+3, __zero_reg__
.endif
.endm
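Both macros compute the two's complement of the multi-byte value in place. In NEG2, NEG \reg leaves ~lo + 1 and sets the carry exactly when the low byte was non-zero, so SBCI \reg+1, -1 (i.e. ~hi + 1 - C) gives ~hi when a borrow is pending and ~hi + 1 when it is not -- the correct high byte of -x. The .if in NEG4 is needed because SBCI only accepts r16..r31; for lower registers the macro complements all four bytes and then adds 1 through the ADC chain, relying on COM always setting the carry.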
#define exp_lo(N) hlo8 ((N) << 23)
#define exp_hi(N) hhi8 ((N) << 23)
.section .text.libgcc.mul, "ax", @progbits
@@ -126,175 +155,246 @@ ENDF __mulqi3
#endif /* defined (L_mulqi3) */
#if defined (L_mulqihi3)
DEFUN __mulqihi3
clr r25
sbrc r24, 7
dec r25
clr r23
sbrc r22, 7
dec r22
XJMP __mulhi3
ENDF __mulqihi3:
#endif /* defined (L_mulqihi3) */
#if defined (L_umulqihi3)
DEFUN __umulqihi3
clr r25
clr r23
XJMP __mulhi3
ENDF __umulqihi3
#endif /* defined (L_umulqihi3) */
/*******************************************************
Widening Multiplication 16 = 8 x 8 without MUL
Multiplication 16 x 16 without MUL
*******************************************************/
#define A0 r22
#define A1 r23
#define B0 r24
#define BB0 r20
#define B1 r25
;; Output overlaps input, thus expand result in CC0/1
#define C0 r24
#define C1 r25
#define CC0 __tmp_reg__
#define CC1 R21
#if defined (L_umulqihi3)
;;; R25:R24 = (unsigned int) R22 * (unsigned int) R24
;;; (C1:C0) = (unsigned int) A0 * (unsigned int) B0
;;; Clobbers: __tmp_reg__, R21..R23
DEFUN __umulqihi3
clr A1
clr B1
XJMP __mulhi3
ENDF __umulqihi3
#endif /* L_umulqihi3 */
#if defined (L_mulqihi3)
;;; R25:R24 = (signed int) R22 * (signed int) R24
;;; (C1:C0) = (signed int) A0 * (signed int) B0
;;; Clobbers: __tmp_reg__, R20..R23
DEFUN __mulqihi3
;; Sign-extend B0
clr B1
sbrc B0, 7
com B1
;; The multiplication runs twice as fast if A1 is zero, thus:
;; Zero-extend A0
clr A1
#ifdef __AVR_HAVE_JMP_CALL__
;; Store B0 * sign of A
clr BB0
sbrc A0, 7
mov BB0, B0
call __mulhi3
#else /* have no CALL */
;; Skip sign-extension of A if A >= 0
;; Same size as with the first alternative but avoids errata skip
;; and is faster if A >= 0
sbrs A0, 7
rjmp __mulhi3
;; If A < 0 store B
mov BB0, B0
rcall __mulhi3
#endif /* HAVE_JMP_CALL */
;; 1-extend A after the multiplication
sub C1, BB0
ret
ENDF __mulqihi3
#endif /* L_mulqihi3 */
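The fix-up above relies on the identity (signed)A * B == (unsigned)A * B - (B << 8) (mod 2^16) when A is negative, so a single subtraction of B's low byte from the high result byte "1-extends" A after the fact. A hedged C sketch (name invented):

#include <stdint.h>

int16_t mulqihi3_model (int8_t a, int8_t b)
{
  uint16_t p = (uint16_t) ((uint8_t) a * (int16_t) b);  /* A zero-extended, B sign-extended */
  if (a < 0)
    p -= (uint16_t) (uint8_t) b << 8;                   /* the "sub C1, BB0" correction */
  return (int16_t) p;
}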
#if defined (L_mulhi3)
#define r_arg1L r24 /* multiplier Low */
#define r_arg1H r25 /* multiplier High */
#define r_arg2L r22 /* multiplicand Low */
#define r_arg2H r23 /* multiplicand High */
#define r_resL __tmp_reg__ /* result Low */
#define r_resH r21 /* result High */
;;; R25:R24 = R23:R22 * R25:R24
;;; (C1:C0) = (A1:A0) * (B1:B0)
;;; Clobbers: __tmp_reg__, R21..R23
DEFUN __mulhi3
clr r_resH ; clear result
clr r_resL ; clear result
__mulhi3_loop:
sbrs r_arg1L,0
rjmp __mulhi3_skip1
add r_resL,r_arg2L ; result + multiplicand
adc r_resH,r_arg2H
__mulhi3_skip1:
add r_arg2L,r_arg2L ; shift multiplicand
adc r_arg2H,r_arg2H
cp r_arg2L,__zero_reg__
cpc r_arg2H,__zero_reg__
breq __mulhi3_exit ; while multiplicand != 0
;; Clear result
clr CC0
clr CC1
rjmp 3f
1:
;; Bit n of A is 1 --> C += B << n
add CC0, B0
adc CC1, B1
2:
lsl B0
rol B1
3:
;; If B == 0 we are ready
sbiw B0, 0
breq 9f
lsr r_arg1H ; gets LSB of multiplier
ror r_arg1L
sbiw r_arg1L,0
brne __mulhi3_loop ; exit if multiplier = 0
__mulhi3_exit:
mov r_arg1H,r_resH ; result to return register
mov r_arg1L,r_resL
ret
ENDF __mulhi3
;; Carry = n-th bit of A
lsr A1
ror A0
;; If bit n of A is set, then go add B * 2^n to C
brcs 1b
#undef r_arg1L
#undef r_arg1H
#undef r_arg2L
#undef r_arg2H
#undef r_resL
#undef r_resH
;; Carry = 0 --> The ROR above acts like CP A0, 0
;; Thus, it is sufficient to CPC the high part to test A against 0
cpc A1, __zero_reg__
;; Only proceed if A != 0
brne 2b
9:
;; Move Result into place
mov C0, CC0
mov C1, CC1
ret
ENDF __mulhi3
#endif /* L_mulhi3 */
#endif /* defined (L_mulhi3) */
#undef A0
#undef A1
#undef B0
#undef BB0
#undef B1
#undef C0
#undef C1
#undef CC0
#undef CC1
#define A0 22
#define A1 A0+1
#define A2 A0+2
#define A3 A0+3
#define B0 18
#define B1 B0+1
#define B2 B0+2
#define B3 B0+3
#define CC0 26
#define CC1 CC0+1
#define CC2 30
#define CC3 CC2+1
#define C0 22
#define C1 C0+1
#define C2 C0+2
#define C3 C0+3
/*******************************************************
Widening Multiplication 32 = 16 x 16 without MUL
*******************************************************/
#if defined (L_mulhisi3)
DEFUN __mulhisi3
;;; FIXME: This is dead code (noone calls it)
mov_l r18, r24
mov_h r19, r25
clr r24
sbrc r23, 7
dec r24
mov r25, r24
clr r20
sbrc r19, 7
dec r20
mov r21, r20
XJMP __mulsi3
ENDF __mulhisi3
#endif /* defined (L_mulhisi3) */
#if defined (L_umulhisi3)
DEFUN __umulhisi3
;;; FIXME: This is dead code (noone calls it)
mov_l r18, r24
mov_h r19, r25
clr r24
clr r25
mov_l r20, r24
mov_h r21, r25
wmov B0, 24
;; Zero-extend B
clr B2
clr B3
;; Zero-extend A
wmov A2, B2
XJMP __mulsi3
ENDF __umulhisi3
#endif /* defined (L_umulhisi3) */
#endif /* L_umulhisi3 */
#if defined (L_mulhisi3)
DEFUN __mulhisi3
wmov B0, 24
;; Sign-extend B
lsl r25
sbc B2, B2
mov B3, B2
#ifdef __AVR_ERRATA_SKIP_JMP_CALL__
;; Sign-extend A
clr A2
sbrc A1, 7
com A2
mov A3, A2
XJMP __mulsi3
#else /* no __AVR_ERRATA_SKIP_JMP_CALL__ */
;; Zero-extend A and __mulsi3 will run at least twice as fast
;; compared to a sign-extended A.
clr A2
clr A3
sbrs A1, 7
XJMP __mulsi3
;; If A < 0 then perform the B * 0xffff.... before the
;; very multiplication by initializing the high part of the
;; result CC with -B.
wmov CC2, A2
sub CC2, B0
sbc CC3, B1
XJMP __mulsi3_helper
#endif /* __AVR_ERRATA_SKIP_JMP_CALL__ */
ENDF __mulhisi3
#endif /* L_mulhisi3 */
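The non-MUL __mulhisi3 plays the same trick one size up: B is sign-extended, A is zero-extended so the shift-and-add loop terminates sooner, and (in the non-errata path) a negative A is handled by pre-loading the result's high word CC3:CC2 with -B, which folds the missing -(B << 16) term into the product. In rough C (name invented; assumes 32-bit int):

#include <stdint.h>

int32_t mulhisi3_model (int16_t a, int16_t b)
{
  uint32_t p = (uint32_t) ((uint16_t) a * (int32_t) b);  /* A zero-extended */
  if (a < 0)
    p -= (uint32_t) (uint16_t) b << 16;                  /* the pre-loaded -B */
  return (int32_t) p;
}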
#if defined (L_mulsi3)
/*******************************************************
Multiplication 32 x 32 without MUL
*******************************************************/
#define r_arg1L r22 /* multiplier Low */
#define r_arg1H r23
#define r_arg1HL r24
#define r_arg1HH r25 /* multiplier High */
#define r_arg2L r18 /* multiplicand Low */
#define r_arg2H r19
#define r_arg2HL r20
#define r_arg2HH r21 /* multiplicand High */
#define r_resL r26 /* result Low */
#define r_resH r27
#define r_resHL r30
#define r_resHH r31 /* result High */
#if defined (L_mulsi3)
DEFUN __mulsi3
clr r_resHH ; clear result
clr r_resHL ; clear result
clr r_resH ; clear result
clr r_resL ; clear result
__mulsi3_loop:
sbrs r_arg1L,0
rjmp __mulsi3_skip1
add r_resL,r_arg2L ; result + multiplicand
adc r_resH,r_arg2H
adc r_resHL,r_arg2HL
adc r_resHH,r_arg2HH
__mulsi3_skip1:
add r_arg2L,r_arg2L ; shift multiplicand
adc r_arg2H,r_arg2H
adc r_arg2HL,r_arg2HL
adc r_arg2HH,r_arg2HH
lsr r_arg1HH ; gets LSB of multiplier
ror r_arg1HL
ror r_arg1H
ror r_arg1L
brne __mulsi3_loop
sbiw r_arg1HL,0
cpc r_arg1H,r_arg1L
brne __mulsi3_loop ; exit if multiplier = 0
__mulsi3_exit:
mov_h r_arg1HH,r_resHH ; result to return register
mov_l r_arg1HL,r_resHL
mov_h r_arg1H,r_resH
mov_l r_arg1L,r_resL
ret
ENDF __mulsi3
;; Clear result
clr CC2
clr CC3
;; FALLTHRU
ENDF __mulsi3
#undef r_arg1L
#undef r_arg1H
#undef r_arg1HL
#undef r_arg1HH
#undef r_arg2L
#undef r_arg2H
#undef r_arg2HL
#undef r_arg2HH
#undef r_resL
#undef r_resH
#undef r_resHL
#undef r_resHH
DEFUN __mulsi3_helper
clr CC0
clr CC1
rjmp 3f
#endif /* defined (L_mulsi3) */
1: ;; If bit n of A is set, then add B * 2^n to the result in CC
;; CC += B
add CC0,B0 $ adc CC1,B1 $ adc CC2,B2 $ adc CC3,B3
2: ;; B <<= 1
lsl B0 $ rol B1 $ rol B2 $ rol B3
3: ;; A >>= 1: Carry = n-th bit of A
lsr A3 $ ror A2 $ ror A1 $ ror A0
brcs 1b
;; Only continue if A != 0
sbci A1, 0
brne 2b
sbiw A2, 0
brne 2b
;; All bits of A are consumed: Copy result to return register C
wmov C0, CC0
wmov C2, CC2
ret
ENDF __mulsi3_helper
#endif /* L_mulsi3 */
#undef A0
#undef A1
#undef A2
#undef A3
#undef B0
#undef B1
#undef B2
#undef B3
#undef C0
#undef C1
#undef C2
#undef C3
#undef CC0
#undef CC1
#undef CC2
#undef CC3
#endif /* !defined (__AVR_HAVE_MUL__) */
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -316,7 +416,7 @@ ENDF __mulsi3
#define C3 C0+3
/*******************************************************
Widening Multiplication 32 = 16 x 16
Widening Multiplication 32 = 16 x 16 with MUL
*******************************************************/
#if defined (L_mulhisi3)
@@ -364,7 +464,17 @@ DEFUN __umulhisi3
mul A1, B1
movw C2, r0
mul A0, B1
#ifdef __AVR_HAVE_JMP_CALL__
;; This function is used by many other routines, often multiple times.
;; Therefore, if the flash size is not too limited, avoid the RCALL
;; and invest 6 bytes to speed things up.
add C1, r0
adc C2, r1
clr __zero_reg__
adc C3, __zero_reg__
#else
rcall 1f
#endif
mul A1, B0
1: add C1, r0
adc C2, r1
@@ -375,7 +485,7 @@ ENDF __umulhisi3
#endif /* L_umulhisi3 */
/*******************************************************
Widening Multiplication 32 = 16 x 32
Widening Multiplication 32 = 16 x 32 with MUL
*******************************************************/
#if defined (L_mulshisi3)
@@ -425,7 +535,7 @@ ENDF __muluhisi3
#endif /* L_muluhisi3 */
/*******************************************************
Multiplication 32 x 32
Multiplication 32 x 32 with MUL
*******************************************************/
#if defined (L_mulsi3)
@@ -468,7 +578,7 @@ ENDF __mulsi3
#endif /* __AVR_HAVE_MUL__ */
/*******************************************************
Multiplication 24 x 24
Multiplication 24 x 24 with MUL
*******************************************************/
#if defined (L_mulpsi3)
@@ -1247,6 +1357,19 @@ __divmodsi4_exit:
ENDF __divmodsi4
#endif /* defined (L_divmodsi4) */
#undef r_remHH
#undef r_remHL
#undef r_remH
#undef r_remL
#undef r_arg1HH
#undef r_arg1HL
#undef r_arg1H
#undef r_arg1L
#undef r_arg2HH
#undef r_arg2HL
#undef r_arg2H
#undef r_arg2L
#undef r_cnt
/*******************************************************
Division 64 / 64
@@ -2757,9 +2880,7 @@ DEFUN __fmulsu_exit
XJMP __fmul
1: XCALL __fmul
;; C = -C iff A0.7 = 1
com C1
neg C0
sbci C1, -1
NEG2 C0
ret
ENDF __fmulsu_exit
#endif /* L_fmulsu */
@@ -2794,3 +2915,5 @@ ENDF __fmul
#undef B1
#undef C0
#undef C1
#include "lib1funcs-fixed.S"

libgcc/config/avr/t-avr

@@ -2,6 +2,7 @@ LIB1ASMSRC = avr/lib1funcs.S
LIB1ASMFUNCS = \
_mulqi3 \
_mulhi3 \
_mulqihi3 _umulqihi3 \
_mulpsi3 _mulsqipsi3 \
_mulhisi3 \
_umulhisi3 \
@@ -55,6 +56,24 @@ LIB1ASMFUNCS = \
_cmpdi2 _cmpdi2_s8 \
_fmul _fmuls _fmulsu
# Fixed point routines in avr/lib1funcs-fixed.S
LIB1ASMFUNCS += \
_fractqqsf _fractuqqsf \
_fracthqsf _fractuhqsf _fracthasf _fractuhasf \
_fractsasf _fractusasf _fractsqsf _fractusqsf \
\
_fractsfqq _fractsfuqq \
_fractsfhq _fractsfuhq _fractsfha _fractsfuha \
_fractsfsa _fractsfusa \
_mulqq3 \
_mulhq3 _muluhq3 \
_mulha3 _muluha3 _muluha3_round \
_mulsa3 _mulusa3 \
_divqq3 _udivuqq3 \
_divhq3 _udivuhq3 \
_divha3 _udivuha3 \
_divsa3 _udivusa3
LIB2FUNCS_EXCLUDE = \
_moddi3 _umoddi3 \
_clz
@@ -81,3 +100,49 @@ libgcc-objects += $(patsubst %,%$(objext),$(hiintfuncs16))
ifeq ($(enable_shared),yes)
libgcc-s-objects += $(patsubst %,%_s$(objext),$(hiintfuncs16))
endif
# Filter out supported conversions from fixed-bit.c
conv_XY=$(conv)$(mode1)$(mode2)
conv_X=$(conv)$(mode)
# Conversions supported by the compiler
convf_modes = QI UQI QQ UQQ \
HI UHI HQ UHQ HA UHA \
SI USI SQ USQ SA USA \
DI UDI DQ UDQ DA UDA \
TI UTI TQ UTQ TA UTA
LIB2FUNCS_EXCLUDE += \
$(foreach conv,_fract _fractuns,\
$(foreach mode1,$(convf_modes),\
$(foreach mode2,$(convf_modes),$(conv_XY))))
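To illustrate the filtering (example values only): with conv = _fract, mode1 = QQ and mode2 = HA, $(conv_XY) expands to _fractQQHA, so every conversion the compiler itself expands is dropped from the generic fixed-bit.c build; the following blocks do the same for the SF conversions that lib1funcs-fixed.S now provides in assembly.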
# Conversions supported by lib1funcs-fixed.S
conv_to_sf_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA
conv_from_sf_modes = QQ UQQ HQ UHQ HA UHA SA USA
LIB2FUNCS_EXCLUDE += \
$(foreach conv,_fract, \
$(foreach mode1,$(conv_to_sf_modes), \
$(foreach mode2,SF,$(conv_XY))))
LIB2FUNCS_EXCLUDE += \
$(foreach conv,_fract,\
$(foreach mode1,SF,\
$(foreach mode2,$(conv_from_sf_modes),$(conv_XY))))
# Arithmetic supported by the compiler
allfix_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA DA UDA DQ UDQ TQ UTQ TA UTA
LIB2FUNCS_EXCLUDE += \
$(foreach conv,_add _sub,\
$(foreach mode,$(allfix_modes),$(conv_X)3))
LIB2FUNCS_EXCLUDE += \
$(foreach conv,_lshr _ashl _ashr _cmp,\
$(foreach mode,$(allfix_modes),$(conv_X)))