Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch prepares __MATH_TG to handle more such types.
Various unhandled cases, which do not correspond to any current glibc
configuration, have explicit #errors added. _Float32 and _Float64x
are then handled appropriately in the _Generic case, which is the only
one, other than the cases where use of sizeof is sufficient, where
they should ever be explicit types at the language level instead of
typedefs. There is no need to handle _Float64 or _Float32x explicitly
there because the default case calling a double function is correct
for those types.
Tested for x86_64.
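As a rough illustration of the _Generic-based selection described above,
the following is a minimal sketch, not the actual __MATH_TG definition.
The sketch_* functions and the SKETCH_TG macro are hypothetical, and
standard types stand in for the _FloatN types. The point is only that a
type without an explicit association falls through to the default case,
which calls the double function.

```c
/* Hypothetical stand-ins for the per-type functions that a
   type-generic macro selects between.  */
static int sketch_f (float x) { (void) x; return 1; }
static int sketch_d (double x) { (void) x; return 2; }
static int sketch_l (long double x) { (void) x; return 3; }

/* Select a function from the static type of X.  Only types that need
   a function other than the double one get an explicit association;
   every other type is handled by the default case.  */
#define SKETCH_TG(x)                    \
  _Generic ((x),                        \
            float: sketch_f,            \
            long double: sketch_l,      \
            default: sketch_d) (x)
```

With these definitions, SKETCH_TG (1.0f) calls sketch_f, while
SKETCH_TG (1.0) and SKETCH_TG (1) both reach sketch_d through the
default association.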
* math/math.h [__HAVE_DISTINCT_FLOAT16
|| __HAVE_DISTINCT_FLOAT32 || __HAVE_DISTINCT_FLOAT64
|| __HAVE_DISTINCT_FLOAT32X || __HAVE_DISTINCT_FLOAT64X
|| __HAVE_DISTINCT_FLOAT128X]: Use #error.
[__NO_LONG_DOUBLE_MATH && __HAVE_DISTINCT_FLOAT128]: Likewise.
[__HAVE_DISTINCT_FLOAT128 && !__HAVE_GENERIC_SELECTION
&& __HAVE_FLOATN_NOT_TYPEDEF]: Likewise.
[__HAVE_DISTINCT_FLOAT128 && __HAVE_GENERIC_SELECTION]
(__MATH_TG_F32): New macro.
[__HAVE_DISTINCT_FLOAT128 && __HAVE_GENERIC_SELECTION]
(__MATH_TG_F64X): Likewise.
[__HAVE_DISTINCT_FLOAT128 && __HAVE_GENERIC_SELECTION]
(__MATH_TG): Use __MATH_TG_F32 and __MATH_TG_F64X.
Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch improves how <tgmath.h> handles such types.
Use of #error is added for cases of distinct types that are not
supported by the header, to indicate that additional work on the
header would be needed if, for example, _Float16 support were added to
glibc. Given that #error, types with the same format as other types
are handled automatically by the sizeof-based logic, so the only case
needing special handling is that where _Float64x exists, has the same
format as _Float128, does not have the same format as long double, and
is not a typedef for _Float128. In this case (which will apply for
powerpc64le once _Float64x support is added to glibc), the
__builtin_types_compatible_p calls testing for _Float128 need
corresponding calls testing for _Float64x, which this patch adds.
Tested for x86_64.
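The need for a second type test can be sketched as follows. This is an
illustration under stated assumptions, not the code in <tgmath.h>: the
sketch_wide_a and sketch_wide_b typedefs and the SKETCH_IS_WIDE macro are
hypothetical, and long / long long stand in for _Float128 / _Float64x as
two types that can share a representation (as they do on LP64 targets)
while remaining distinct types. It relies on the GCC builtin
__builtin_types_compatible_p and on __typeof.

```c
/* Hypothetical stand-ins: two distinct types that may share a
   representation, playing the roles of _Float128 and _Float64x.  */
typedef long sketch_wide_a;
typedef long long sketch_wide_b;

/* Nonzero if X has either of the two stand-in types.  Each type needs
   its own test because __builtin_types_compatible_p compares types,
   not representations.  */
#define SKETCH_IS_WIDE(x)                                           \
  (__builtin_types_compatible_p (__typeof (x), sketch_wide_a)      \
   || __builtin_types_compatible_p (__typeof (x), sketch_wide_b))
```

Without the second call, an argument of type sketch_wide_b would not be
recognized even where its format matches that of sketch_wide_a.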
* math/tgmath.h [__HAVE_DISTINCT_FLOAT16
|| __HAVE_DISTINCT_FLOAT32 || __HAVE_DISTINCT_FLOAT64
|| __HAVE_DISTINCT_FLOAT32X || __HAVE_DISTINCT_FLOAT64X
|| __HAVE_DISTINCT_FLOAT128X]: Use #error.
[__HAVE_DISTINCT_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT)
&& __HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE
&& __HAVE_FLOATN_NOT_TYPEDEF] (__TGMATH_F128): Handle _Float64x
the same as _Float128.
[__HAVE_DISTINCT_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT)
&& __HAVE_FLOAT64X && !__HAVE_FLOAT64X_LONG_DOUBLE
&& __HAVE_FLOATN_NOT_TYPEDEF] (__TGMATH_CF128): Likewise.
Using the cache hierarchy linesize minimum in CTR_EL0.
See the comment within the code for rationale.
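For readers unfamiliar with CTR_EL0, decoding its minimum line size fields
can be sketched as below. This is not the code in the new sysconf.c; the
function names are hypothetical, and the register value is passed in as a
parameter so that the sketch does not depend on AArch64 (reading the
register itself requires an mrs instruction). The field positions assumed
here are IminLine in bits [3:0] and DminLine in bits [19:16].

```c
#include <stdint.h>

/* Decode a minimum line size in bytes from a CTR_EL0 value.  The
   field starting at bit SHIFT holds log2 of the number of 4-byte
   words in the smallest line.  */
static unsigned int
sketch_min_line_bytes (uint64_t ctr, unsigned int shift)
{
  return 4u << ((ctr >> shift) & 0xf);
}

/* IminLine: smallest instruction cache line, bits [3:0].  */
static unsigned int
sketch_icache_min_line (uint64_t ctr)
{
  return sketch_min_line_bytes (ctr, 0);
}

/* DminLine: smallest data or unified cache line, bits [19:16].  */
static unsigned int
sketch_dcache_min_line (uint64_t ctr)
{
  return sketch_min_line_bytes (ctr, 16);
}
```

For example, a register value of 0x84448004 has both fields equal to 4,
which decodes to 64-byte lines.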
* sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.
Remove some load/store instructions from the dynamic tlsdesc resolver
fast path. This gives around 20% faster tls access in dlopened shared
libraries (assuming glibc ran out of static tls space).
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Optimize.
Lazy tlsdesc initialization is no longer used in the dynamic linker
so all related code can be removed.
* sysdeps/arm/dl-machine.h (elf_machine_runtime_setup): Remove
DT_TLSDESC_GOT initialization.
* sysdeps/arm/dl-tlsdesc.S (_dl_tlsdesc_lazy_resolver): Remove.
(_dl_tlsdesc_resolve_hold): Likewise.
* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_lazy_resolver): Remove.
(_dl_tlsdesc_resolve_hold): Likewise.
* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_lazy_resolver_fixup): Remove.
(_dl_tlsdesc_resolve_hold_fixup): Likewise.
Follow up to
https://sourceware.org/ml/libc-alpha/2015-11/msg00272.html
Always do tls descriptor initialization at load time during relocation
processing (as if DF_BIND_NOW were set for the binary) to avoid barriers
at every tls access. This patch mimics bind-now semantics in the lazy
relocation code of the arm target (elf_machine_lazy_rel).
Ideally the static linker should be updated too to not emit tlsdesc
relocs in DT_REL*, so elf_machine_lazy_rel is not called on them at all.
[BZ #18572]
* sysdeps/arm/dl-machine.h (elf_machine_lazy_rel): Do symbol binding
non-lazily for R_ARM_TLS_DESC.
This patch reverts
commit 9c82da17b5
Author: Maciej W. Rozycki <macro@codesourcery.com>
Date: 2014-07-17 19:22:05 +0100
[BZ #17078] ARM: R_ARM_TLS_DESC prelinker support
This only implemented support for the lazy binding case (and thus
closed the bugzilla ticket prematurely), however tlsdesc on arm is
not correct with lazy binding because there is a data race between
the lazy initialization code and tlsdesc resolver functions.
Lazy initialization of tlsdesc entries will be removed from arm to
fix the data races and thus this half-finished prelinker support
is no longer useful.
[BZ #17078]
* sysdeps/arm/dl-machine.h (elf_machine_rela): Remove the
R_ARM_TLS_DESC case.
(elf_machine_lazy_rel): Remove the prelink check.
Always do TLS descriptor initialization at load time during relocation
processing to avoid barriers at every TLS access. In non-dlopened shared
libraries the overhead of tls access vs static global access is > 3x
bigger when lazy initialization is used (_dl_tlsdesc_return_lazy)
compared to bind-now (_dl_tlsdesc_return) so the barriers dominate tls
access performance.
TLSDESC relocs are in DT_JMPREL which are processed at load time using
elf_machine_lazy_rel which is only supposed to do lightweight
initialization using the DT_TLSDESC_PLT trampoline (the trampoline code
jumps to the entry point in DT_TLSDESC_GOT which does the lazy tlsdesc
initialization at runtime). This patch changes elf_machine_lazy_rel
in aarch64 to do the symbol binding and initialization as if DF_BIND_NOW
was set, so the non-lazy code path of elf/do-rel.h was replicated.
The static linker could be changed to emit TLSDESC relocs in DT_REL*,
which are processed non-lazily, but the goal of this patch is to always
guarantee bind-now semantics, even if the binary was produced with an
old linker, so the barriers can be dropped in tls descriptor functions.
After this change the synchronizing ldar instructions can be dropped
as well as the lazy initialization machinery including the DT_TLSDESC_GOT
setup.
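Why lazy initialization costs a barrier on every access can be sketched in
portable C11. This is only an analogy under stated assumptions, not the
resolver code: struct sketch_tlsdesc and the sketch_* functions are
hypothetical, a flag stands in for the descriptor entry point, and the
locking a real lazy resolver needs is omitted.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical descriptor: READY is used by the lazy scheme only.  */
struct sketch_tlsdesc
{
  atomic_int ready;   /* Nonzero once ARG is valid.  */
  ptrdiff_t arg;      /* Resolved value, for example an offset.  */
};

/* Initialize the descriptor: write ARG, then publish it with a
   release store so that acquire readers see a complete ARG.  */
static void
sketch_publish (struct sketch_tlsdesc *d, ptrdiff_t value)
{
  d->arg = value;
  atomic_store_explicit (&d->ready, 1, memory_order_release);
}

/* Lazy scheme: every access pays for an acquire load, because
   initialization may race with the access.  Returns 0 if the
   descriptor is not initialized yet.  */
static int
sketch_read_lazy (struct sketch_tlsdesc *d, ptrdiff_t *out)
{
  if (!atomic_load_explicit (&d->ready, memory_order_acquire))
    return 0;
  *out = d->arg;
  return 1;
}

/* Bind-now scheme: the descriptor is initialized before any code
   that uses it runs, so a plain load is sufficient.  */
static ptrdiff_t
sketch_read_bound (const struct sketch_tlsdesc *d)
{
  return d->arg;
}
```

In this analogy sketch_read_lazy corresponds to an access path that needs
a synchronizing load, and sketch_read_bound to one that does not.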
I believe this should be done on all targets, including ones where no
barrier is needed for lazy initialization. There is very little gain in
optimizing for a large number of symbolic tlsdesc relocations, which is an
extremely uncommon case. And currently the tlsdesc entries are only
readonly protected with -z now, and some hardenings against writable
JUMPSLOT relocs don't work for TLSDESC, so they are a security hazard.
(But to fix that the static linker has to be changed.)
* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Do symbol
binding and initialization non-lazily for R_AARCH64_TLSDESC.
These static functions are not needed if a target does not do lazy
tlsdesc initialization.
* elf/tlsdeschtab.h (_dl_tls_resolve_early_return_p): Mark unused.
(_dl_tlsdesc_wake_up_held_fixups): Likewise.
Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch arranges for <bits/cmathcalls.h> to be included by
<complex.h> for each such type under conditions and with macros
defined corresponding to those used for _Float128.
Tested for x86_64.
* math/complex.h
[(__HAVE_DISTINCT_FLOAT16 || (__HAVE_FLOAT16 && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Include <bits/cmathcalls.h>
with appropriate macros defined and undefined.
[(__HAVE_DISTINCT_FLOAT32 || (__HAVE_FLOAT32 && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
[(__HAVE_DISTINCT_FLOAT64 || (__HAVE_FLOAT64 && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
[(__HAVE_DISTINCT_FLOAT32X || (__HAVE_FLOAT32X && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
[(__HAVE_DISTINCT_FLOAT64X || (__HAVE_FLOAT64X && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
[(__HAVE_DISTINCT_FLOAT128X || (__HAVE_FLOAT128X && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
This patch cleans up the way complex.h handles inclusion of
bits/cmathcalls.h for float128. The inclusion was between those for
the types float and long double; the patch moves it after that for
long double, matching how bits/mathcalls.h and bits/math-finite.h
inclusions are ordered. There is no need for the undefine and define
of _Mdouble_complex_ to be conditional, since __CFLOAT128 is always
defined by bits/floatn.h when _Float128 is supported, so the patch
removes the unnecessary conditionals.
Tested for x86_64.
* math/complex.h
[(__HAVE_DISTINCT_FLOAT128 || (__HAVE_FLOAT128 && !_LIBC))
&& __GLIBC_USE (IEC_60559_TYPES_EXT)]: Move conditional code after
that for long double. Do not condition define and undefine of
_Mdouble_complex_ on [__CFLOAT128].
Add a new header file, sysdeps/x86/sysdep.h, for common assembly code
macros between i386 and x86-64. Tested on i686 and x86-64. There are
no differences in outputs of "readelf -a" and "objdump -dw" on all glibc
shared objects before and after the patch.
* sysdeps/i386/sysdep.h: Include <sysdeps/x86/sysdep.h> instead
of <sysdeps/generic/sysdep.h>.
(ALIGNARG): Removed.
(ASM_SIZE_DIRECTIVE): Likewise.
(ENTRY): Likewise.
(END): Likewise.
(ENTRY_CHK): Likewise.
(END_CHK): Likewise.
(syscall_error): Likewise.
(mcount): Likewise.
(PSEUDO_END): Likewise.
(L): Likewise.
(atom_text_section): Likewise.
* sysdeps/x86/sysdep.h: New file.
* sysdeps/x86_64/sysdep.h: Include <sysdeps/x86/sysdep.h> instead
of <sysdeps/generic/sysdep.h>.
(ALIGNARG): Removed.
(ASM_SIZE_DIRECTIVE): Likewise.
(ENTRY): Likewise.
(END): Likewise.
(ENTRY_CHK): Likewise.
(END_CHK): Likewise.
(syscall_error): Likewise.
(mcount): Likewise.
(PSEUDO_END): Likewise.
(L): Likewise.
(atom_text_section): Likewise.
Following the previous work by Carlos O'Donell the category of LC_CTYPE
is correctly set to "i18n:2012" rather than "unicode:2014" and the
i18n_ctype file is once again regenerated from scratch to make sure it
does not contain any manual additions except the copyright message.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* localedata/unicode-gen/gen_unicode_ctype.py (output_head):
category of LC_CTYPE set to "i18n:2012".
* localedata/locales/i18n_ctype: Regenerate.
The sigprocmask.c, sigtimedwait.c, sigwait.c and sigwaitinfo.c files from
sysdeps/unix/sysv/linux include nptl-signals.h via nptl/pthreadP.h,
so SIGCANCEL and SIGSETXID are defined unconditionally. Later in the
code, however, there are checks for whether these symbols are defined,
which are useless. This patch removes the useless checks.
Checked on x86_64-linux-gnu.
* sysdeps/unix/sysv/linux/sigprocmask.c: Remove useless #ifdefs.
* sysdeps/unix/sysv/linux/sigtimedwait.c: Likewise.
* sysdeps/unix/sysv/linux/sigwait.c: Likewise.
* sysdeps/unix/sysv/linux/sigwaitinfo.c: Likewise.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
ia64, s390-64, sparc64 and x86_64 host their own implementations of
sigpending() in corresponding files, but they are identical to the generic
Linux file apart from a few comments. This patch removes those files, so
the implementation of sigpending() is taken from sysdeps/unix/sysv/linux
for all ports.
Build-tested on x86_64.
* sysdeps/unix/sysv/linux/ia64/sigpending.c: Remove file.
* sysdeps/unix/sysv/linux/s390/s390-64/sigpending.c: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/sigpending.c: Likewise.
* sysdeps/unix/sysv/linux/x86_64/sigpending.c: Likewise.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch adds an additional case in the definition of
__MATH_EVAL_FMT2, as used in defining iseqsig: when
__FLT_EVAL_METHOD__ is 0 or 32, it adds 0.0f to the arguments, so that
the correct function would be selected in the case of _Float16
arguments with excess precision (were glibc to support _Float16, which
of course __MATH_TG and other facilities do not at present - and
_Float16 support is not part of what this patch series is aiming for,
but this particular fix is simple so is included anyway).
Tested for x86_64.
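The effect of adding 0.0f can be sketched with standard types. This is a
hypothetical analogue, not the math.h definition: SKETCH_EVAL_FMT2 and
SKETCH_TYPE_TAG are made-up names, and integer arguments stand in for a
type narrower than float, since _Float16 may not be available. The sketch
shows that the sum has type float for narrow arguments while float,
double and long double arguments keep their own type.

```c
/* Hypothetical analogue of the new case: adding 0.0f converts
   arguments narrower than float to float, without widening float,
   double or long double.  */
#define SKETCH_EVAL_FMT2(x, y) ((x) + (y) + 0.0f)

/* Report the type of an expression as a character tag.  */
#define SKETCH_TYPE_TAG(e) \
  _Generic ((e), float: 'f', double: 'd', long double: 'l', default: '?')
```

A selection based on the type or size of SKETCH_EVAL_FMT2 (x, y) therefore
never sees anything narrower than float.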
* math/math.h
[__FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == 32]
(__MATH_EVAL_FMT2): Define to add 0.0f.
This is another one where we'll be wanting the base symbols for
powerpc64le rather than just a power7 variant.
* sysdeps/powerpc/powerpc64/multiarch/strncase_l-power7.c: Include
string/strncase_l.c, not string/strncase.c.
(USE_IN_EXTENDED_LOCALE_MODEL): Don't define.
(libc_hidden_def): Redefine.
The routine being assembled here is strcasecmp_l, so ask for that via
__STRCMP and STRCMP defines. That change means tweaking the power7
override. Needed for later powerpc64le changes where we want the base
symbols, not just a power7 variant.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S:
(__STRCMP, STRCMP, __strcasecmp_l): Define.
(__strcasecmp): Don't define.
These functions aren't used in ld.so at the moment since we don't have
strcmp or strncmp ifuncs for them there. Remove the ld.so bloat.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Wrap in
IS_IN (libc).
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
USE_AS_STPNCPY is defined by sysdeps/powerpc/powerpc64/power8/stpncpy.S,
included by this file.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Don't define
USE_AS_STPNCPY.
It seems to me that libc.a should not contain any of the __GI_
symbols, and certainly --enable-multi-arch ought to not add to the
list. At the end of this patch series we have the following in both
--enable-multi-arch and --disable-multi-arch libc.a:
0000000000000000 T __GI___readdir64
0000000000000000 T __GI___fxstatat64
0000000000000000 T __GI_getrlimit
0000000000000000 T __GI___getrlimit
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S (hidden_def):
Redefine only when SHARED.
Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch extends the includes of <bits/math-finite.h> to
cover all such types, under conditions analogous to those for
_Float128.
Tested for x86_64.
* math/math.h [__HAVE_DISTINCT_FLOAT16 || (__HAVE_FLOAT16 && !_LIBC)]:
Include <bits/math-finite.h> with appropriate macros defined and
undefined.
[__HAVE_DISTINCT_FLOAT32 || (__HAVE_FLOAT32 && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT64 || (__HAVE_FLOAT64 && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT32X || (__HAVE_FLOAT32X && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT64X || (__HAVE_FLOAT64X && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT128X || (__HAVE_FLOAT128X && !_LIBC)]: Likewise.
math.h has a macro _Mlong_double_ for the type to use when declaring
long double functions, and similar macros for other types.
math/Makefile uses -D_Mlong_double_=double in the case of long double
having the same ABI as double.
This originates with:
Mon Jul 8 13:37:40 1996 Roland McGrath <roland@delasyd.gnu.ai.mit.edu>
* math/math.h (_Mfloat_, _Mlong_double_): New macros, defined iff not
already defined to float, long double. Use those macros for _Mdouble_
defns when including mathcalls.h.
* math/Makefile [$(long-double-fcts) != yes] (CPPFLAGS): Append
-D_Mlong_double_=double.
However, math.h stopped declaring long double functions in the case of
long double having the same ABI as double (and thus probably stopped
actually needing the Makefile definition of _Mlong_double_) with:
1998-11-05 Ulrich Drepper <drepper@cygnus.com>
* math/math.h: Unconditionally include bits/mathdef.h. Declare
long double functions only if __NO_LONG_DOUBLE_MATH is not
defined.
* sysdeps/generic/bits/mathdef.h: Define only if __USE_ISOC9X.
Define __NO_LONG_DOUBLE_MATH.
* sysdeps/m68k/fpu/bits/mathdef.h: Define only if __USE_ISOC9X.
* sysdeps/i386/fpu/bits/mathdef.h: Likewise.
The declarations were since restored for compiling user code, but
remain absent when _LIBC is defined, which is sufficient to avoid
problems declaring function aliases of incompatible types. Thus the
indirection through the _Mlong_double_ macro is not needed (probably
since that 1998 patch), and this patch removes _Mlong_double_ and
associated macros for other types, leaving only the macro _Mdouble_
which is actually used as the type for which a given inclusion of
<bits/mathcalls.h> should declare functions.
Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by this patch.
* math/math.h [!_Mfloat_] (_Mfloat_): Do not define.
[!_Mlong_double_] (_Mlong_double_): Likewise.
[!_Mfloat16_] (_Mfloat16_): Likewise.
[!_Mfloat32_] (_Mfloat32_): Likewise.
[!_Mfloat64_] (_Mfloat64_): Likewise.
[!_Mfloat128_] (_Mfloat128_): Likewise.
[!_Mfloat32x_] (_Mfloat32x_): Likewise.
[!_Mfloat64x_] (_Mfloat64x_): Likewise.
[!_Mfloat128x_] (_Mfloat128x_): Likewise.
(_Mdouble_): Define without indirection through those macros.
* math/complex.h [!_Mfloat_] (_Mfloat_): Do not define.
[!_Mfloat128_] (_Mfloat128_): Likewise.
[!_Mlong_double_] (_Mlong_double_): Likewise.
(_Mdouble_): Define without indirection through those macros.
* math/Makefile [$(long-double-fcts) != yes] (math-CPPFLAGS): Do
not add -D_Mlong_double_=double.
* include/math.h [_ISOMAC] (_Mlong_double_): Do not undefine.
* math/test-signgam-finite-c99.c (_Mlong_double_): Likewise.
i586 strcpy.S used a clever trick with LEA to implement a jump table:
/* ECX has the last 2 bits of the address of source - 1. */
andl $3, %ecx
call 2f
2: popl %edx
/* 0xb is the distance between 2: and 1:. */
leal 0xb(%edx,%ecx,8), %ecx
jmp *%ecx
.align 8
1: /* ECX == 0 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 1 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 2 */
orb (%esi), %al
jz L(end)
stosb
xorl %eax, %eax
incl %esi
/* ECX == 3 */
L(1): movl (%esi), %ecx
leal 4(%esi),%esi
This fails if there are instruction length changes before L(1):. This
patch replaces it with conditional branches:
cmpb $2, %cl
je L(Src2)
ja L(Src3)
cmpb $1, %cl
je L(Src1)
L(Src0):
which have similar performance and work with any instruction lengths.
Tested on i586 and i686 with and without --disable-multi-arch.
[BZ #22353]
* sysdeps/i386/i586/strcpy.S (STRCPY): Use conditional branches.
(1): Renamed to ...
(L(Src0)): This.
(L(Src1)): New.
(L(Src2)): Likewise.
(L(1)): Renamed to ...
(L(Src3)): This.
[BZ #19485]
* localedata/locales/csb_PL (LC_TIME): Fix “abmon” for March
and use a better translation for March in “mon”.
* localedata/locales/csb_PL: Use more ASCII to improve the
readability of the source.
[BZ #13953]
* localedata/locales/km_KH: Use ASCII as much
as possible for better readability of the source and
remove useless comments.
* localedata/locales/km_KH (LC_TIME): Remove era stuff, it
was commented out and apparently wrong anyway because it was
using Lao characters. If Buddhist era should be used
for km_KH, a native speaker should write the correct format
for Khmer.
* localedata/locales/km_KH (LC_TIME): Add first_weekday 1
(According to CLDR, the first weekday for Cambodia is Sunday).
* localedata/locales/km_KH (LC_NAME): Remove name_mr and name_mrs
(These were using Lao characters which must be wrong. If we get
the correct data from a native speaker, we could add it back, until
then it is better not to have name_mr and name_mrs at all than
having it wrong).
There were several problems with checking the array size in the past,
for example BZ#356, caused by incorrectly assuming that every locale
token represents one element. In fact, if a token represented
a subarray, for example an array of month names or a character category,
and it appeared at the end of the array, the compiler assumed that
the array ended just after the first element of the subarray.
A workaround used in the past was to skip some categories while testing,
for example LC_CTYPE. Now that we are about to add alternative month
names to LC_TIME (BZ#10871), this would fail again.
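The size problem can be reproduced in a few lines. This is a simplified
sketch, not the code in locale/loadlocale.c: the enums, array names and
the SKETCH_ELEMS macro are hypothetical. An array whose size is inferred
from designated initializers ends right after the largest designated
index, so a trailing subarray of which only the first element is named
gets cut short.

```c
/* Hypothetical items: the last three form one subarray that is
   named only by its first element.  */
enum sketch_item
{
  SKETCH_A, SKETCH_B, SKETCH_LIST_1, SKETCH_LIST_2, SKETCH_LIST_3,
  SKETCH_COUNT
};

enum sketch_type { SKETCH_NONE, SKETCH_STRING, SKETCH_STRINGLIST };

/* Size inferred by the compiler: ends after SKETCH_LIST_1.  */
static const enum sketch_type sketch_inferred[] =
{
  [SKETCH_A] = SKETCH_STRING,
  [SKETCH_B] = SKETCH_STRING,
  [SKETCH_LIST_1] = SKETCH_STRINGLIST,
};

/* Size given explicitly: covers the whole subarray.  */
static const enum sketch_type sketch_sized[SKETCH_COUNT] =
{
  [SKETCH_A] = SKETCH_STRING,
  [SKETCH_B] = SKETCH_STRING,
  [SKETCH_LIST_1] = SKETCH_STRINGLIST,
};

#define SKETCH_ELEMS(a) (sizeof (a) / sizeof ((a)[0]))
```

Here sketch_inferred has three elements while sketch_sized has five, so a
check against the inferred size would wrongly reject SKETCH_LIST_2 and
SKETCH_LIST_3.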
* locale/loadlocale.c: Correct size of
_nl_value_type_LC_<category> arrays.
Reviewed-by: Zack Weinberg <zackw@panix.com>
Continuing the preparation for additional _FloatN / _FloatNx type
support, this patch arranges for <bits/mathcalls.h> and
<bits/mathcalls-helper-functions.h> to be included for each such type
under conditions and with macros defined corresponding to those
already present for _Float128.
Tested for x86_64.
* math/math.h [__HAVE_DISTINCT_FLOAT16 || (__HAVE_FLOAT16 && !_LIBC)]:
Include <bits/mathcalls-helper-functions.h> and <bits/mathcalls.h>
with appropriate macros defined and undefined.
[__HAVE_DISTINCT_FLOAT32 || (__HAVE_FLOAT32 && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT64 || (__HAVE_FLOAT64 && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT32X || (__HAVE_FLOAT32X && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT64X || (__HAVE_FLOAT64X && !_LIBC)]: Likewise.
[__HAVE_DISTINCT_FLOAT128X || (__HAVE_FLOAT128X && !_LIBC)]: Likewise.