To support building glibc with GCC 6 configured with --enable-default-pie,
which generates PIE by default, we need to build programs as PIE. But
elf/tst-dlopen-aout must not be built as PIE since it tests dlopen on
ET_EXEC file and PIE is ET_DYN.
[BZ #17841]
* Makeconfig (no-pie-ldflag): New.
(+link): Set to $(+link-pie) if default to PIE.
(+link-tests): Set to $(+link-pie-tests) if default to PIE.
* config.make.in (build-pie-default): New.
* configure.ac (libc_cv_pie_default): New. Set to yes if -fPIE
is default. AC_SUBST.
* configure: Regenerated.
* elf/Makefile (LDFLAGS-tst-dlopen-aout): New.
cexp, ccos, ccosh, csin and csinh have spurious underflows in cases
where they compute sin of the smallest normal, that produces an
underflow exception (depending on which sin implementation is in use)
but the final result does not underflow. ctan and ctanh may also have
such underflows, or they may be latent (the issue there is that
e.g. ctan (DBL_MIN) should, rounded upwards, be the next double value
above DBL_MIN, which under glibc's accuracy goals may not have an
underflow exception, but the intermediate computation of sin (DBL_MIN)
would legitimately underflow on before-rounding architectures).
This patch fixes all those functions so they use plain comparisons (>
DBL_MIN etc.) instead of comparing the result of fpclassify with
FP_SUBNORMAL (in all these cases, we already know the number being
compared is finite). Note that in the case of csin / csinf / csinl,
there is no need for fabs calls in the comparison because the real
part has already been reduced to its absolute value.
As the patch fixes the failures that previously obstructed moving
tests of cexp to use ALL_RM_TEST, those tests are moved to ALL_RM_TEST
by the patch (two functions remain yet to be converted).
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18594]
* math/s_ccosh.c (__ccosh): Compare with least normal value
instead of comparing class with FP_SUBNORMAL.
* math/s_ccoshf.c (__ccoshf): Likewise.
* math/s_ccoshl.c (__ccoshl): Likewise.
* math/s_cexp.c (__cexp): Likewise.
* math/s_cexpf.c (__cexpf): Likewise.
* math/s_cexpl.c (__cexpl): Likewise.
* math/s_csin.c (__csin): Likewise.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/s_ctan.c (__ctan): Likewise.
* math/s_ctanf.c (__ctanf): Likewise.
* math/s_ctanh.c (__ctanh): Likewise.
* math/s_ctanhf.c (__ctanhf): Likewise.
* math/s_ctanhl.c (__ctanhl): Likewise.
* math/s_ctanl.c (__ctanl): Likewise.
* math/auto-libm-test-in: Add more tests of ccos, ccosh, cexp,
csin, csinh, ctan and ctanh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (cexp_test): Use ALL_RM_TEST.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Many packages, including GCC, install Python files for GDB in library
diretory. ldconfig reads them and issue errors since they aren't ELF
files:
ldconfig: /usr/gcc-5.1.1/lib/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
ldconfig: /usr/gcc-5.1.1/libx32/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
ldconfig: /usr/gcc-5.1.1/lib64/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
This patch silences ldconfig on GDB Python files by checking filenames
with -gdb.py suffix.
[BZ #18585]
* elf/readlib.c (is_gdb_python_file): New.
(process_file): Don't issue errors on filenames with -gdb.py
suffix.
csin and csinh can produce bad results when overflowing in directed
rounding modes, because a multiplication that can overflow is followed
by a possible negation. This patch fixes this by negating one of the
arguments of the multiplication before the multiplication instead of
negating the result.
The new tests for this issue are added to auto-libm-test-in, starting
use of that file for csin and csinh. The issue was found in the
course of moving existing tests for csin and csinh (existing tests, by
being enabled in more cases than previously, showed the issue for
float and double but not for long double); that move will now be done
separately.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18593]
* math/s_csin.c (__csin): Negate before rather than after possibly
overflowing multiplication.
* math/s_csinf.c (__csinf): Likewise.
* math/s_csinh.c (__csinh): Likewise.
* math/s_csinhf.c (__csinhf): Likewise.
* math/s_csinhl.c (__csinhl): Likewise.
* math/s_csinl.c (__csinl): Likewise.
* math/auto-libm-test-in: Add some tests of csin and csinh.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (csin_test_data): Use AUTO_TESTS_c_c.
(csinh_test_data): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
Similar to various other bugs in this area, the ldbl-128 expl
implementation does not raise the underflow exception for all
subnormal results, if the scaling down is exact although the actual
result is inexact. This patch fixes this by forcing the exception in
this case (the tests that failed before and pass after the test are
already in the testsuite).
Tested for mips64.
[BZ #18586]
* sysdeps/ieee754/ldbl-128/e_expl.c (__ieee754_expl): Force
underflow exception for small results.
Similar to various other bugs in this area, some sin and sincos
implementations do not raise the underflow exception for subnormal
arguments, when the result is tiny and inexact. This patch forces the
exception in a similar way to previous fixes.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #16526]
[BZ #16538]
* sysdeps/ieee754/dbl-64/s_sin.c: Include <float.h>.
(__sin): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sincosl.c: Include <float.h>.
(__kernel_sincosl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/ieee754/ldbl-96/k_sinl.c: Include <float.h>.
(__kernel_sinl): Force underflow exception for arguments with
small absolute value.
* sysdeps/powerpc/fpu/k_sinf.c: Include <float.h>.
(__kernel_sinf): Force underflow exception for arguments with
small absolute value.
* math/auto-libm-test-in: Add more tests of sin and sincos.
* math/auto-libm-test-out: Regenerated.
__kernel_standard_l converts long double arguments to double for use
in SVID "struct exception". This has special-case handling for when
that conversion would overflow or underflow but the original long
double function wouldn't. However, it turns out that "inexact"
exceptions can be spurious here as well, when the function is exactly
determined and __kernel_standard_l is being called for a domain error.
This patch fixes this by using feholdexcept / fesetenv to avoid
exceptions from the conversion, replacing the previous special-case
logic for overflow and underflow (this covers all functions using
__kernel_standard_l, not just those that actually need a change, since
there doesn't seem to be much point in restricting things just to the
functions that mustn't get "inexact" here).
Tested for x86_64 and x86.
[BZ #18245]
[BZ #18583]
* sysdeps/ieee754/k_standardl.c: Include <fenv.h>.
(__kernel_standard_l): Use feholdexcept and fesetenv around
conversion to double instead of special-casing overflow and
underflow.
* math/libm-test.inc (fmod_test_data): Add more tests.
(remainder_test_data): Likewise.
(sqrt_test_data): Likewise.
This fixes BZ #17403 by defining atomic_full_barrier,
atomic_read_barrier, and atomic_write_barrier on x86 and x86_64. A full
barrier is implemented through an atomic idempotent modification to the
stack and not through using mfence because the latter can supposedly be
somewhat slower due to having to provide stronger guarantees wrt.
self-modifying code, for example.
The csqrt implementations in glibc can cause spurious underflows in
some cases as a side-effect of the scaling for large arguments (when
underflow is correct for the square root of the argument that was
scaled down to avoid overflow, but not for the original argument).
This patch arranges to avoid the underflowing intermediate computation
(eliminating a multiplication in 0.5 in the problem cases where a
subsequent scaling by 2 would follow).
Tested for x86_64 and x86 and ulps updated accordingly (only needed
for x86).
[BZ #18371]
* math/s_csqrt.c (__csqrt): Avoid multiplication by 0.5 where
intermediate but not final result might underflow.
* math/s_csqrtf.c (__csqrtf): Likewise.
* math/s_csqrtl.c (__csqrtl): Likewise.
* math/auto-libm-test-in: Add more tests of csqrt.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The dbl-64 and flt-32 implementations of exp2 functions produce
spurious underflow exceptions. The underlying reason is the same in
both cases: the computation works as (2^a - 1)*2^b + 2^b for suitably
chosen a and b, where a has small magnitude so 2^a - 1 can be computed
with a low-degree polynomial approximation, and (2^a - 1)*2^b can
underflow even when the final result does not. This patch fixes this
by adjusting the threshold for when scaling is used to avoid
intermediate underflow so it works for any possible value of a where
the final result would not underflow.
Tested for x86_64 and x86.
[BZ #18219]
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Reduce
threshold on absolute value of exponent for which scaling is used.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
When "reorder" resolver option is enabled, threads of a multi-threaded process
could hang in gethostbyaddr_r, gethostbyname_r, or gethostbyname2_r.
Due to a trivial bug in _res_hconf_reorder_addrs, simultaneous
invocations of this function in a multi-threaded process could result to
_res_hconf_reorder_addrs returning without releasing the lock it holds,
causing other threads to block indefinitely while waiting for the lock
that is not going to be released.
[BZ #17977]
* resolv/res_hconf.c (_res_hconf_reorder_addrs): Fix unlocking
when initializing interface list, based on the bug analysis
and the patch proposed by Eric Newton.
* resolv/tst-res_hconf_reorder.c: New test.
* resolv/Makefile [$(have-thread-library) = yes] (tests): Add
tst-res_hconf_reorder.
($(objpfx)tst-res_hconf_reorder): Depend on $(libdl)
and $(shared-thread-library).
(tst-res_hconf_reorder-ENV): New variable.
Similar to various other bugs in this area, some expm1 implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
(The issue does not apply to the ldbl-* implementations or to those
for x86 / x86_64 long double. The change to
sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c is one I missed when
previously fixing bug 16354; the bug in that implementation was
previously latent, but the expm1 fixes stopped it being latent and so
required it to be fixed to avoid spurious underflows from cosh.)
Tested for x86_64 and x86.
[BZ #16353]
* sysdeps/i386/fpu/s_expm1.S (dbl_min): New object.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/i386/fpu/s_expm1f.S (flt_min): New object.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/s_expm1.c: Include <float.h>.
(__expm1): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_expm1f.c: Include <float.h>.
(__expm1f): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/dbl-64/wordsize-64/e_cosh.c (__ieee754_cosh):
Check for small arguments before calling __expm1.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16353.
* math/auto-libm-test-out: Regenerated.
In the x86 / x86_64 implementations of expm1l, when expm1l's result
should underflow to 0 (argument minus the least subnormal, in some
rounding modes), it can be a zero of the wrong sign. This patch fixes
this by returning the argument with underflow forced in that case
(this is a 1ulp error relative to the correctly rounded result of -0,
which is OK in terms of the documented accuracy goals, whereas a
result with the wrong sign never is).
Tested for x86_64 and x86.
[BZ #18569]
* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]: Force
underflow and return argument in case of subnormal argument.
* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [USE_AS_EXPM1L]:
Likewise.
* math/auto-libm-test-in: Add more tests of expm1.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, the x86 and x86_64
implementations of expl / exp10l can fail to produce underflow
exceptions when the unscaled result has trailing 0 bits so the scaling
down to subnormal precision is exact. This patch fixes this by
forcing the exception in the case of tiny results.
Tested for x86_64 and x86.
[BZ #16361]
* sysdeps/i386/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* sysdeps/x86_64/fpu/e_expl.S [!USE_AS_EXPM1L] (cmin): New object.
[!USE_AS_EXPM1L] (IEEE754_EXPL): Force underflow exception for
tiny results.
* math/auto-libm-test-in: Add more tests of exp and exp10. Do not
mark underflow exceptions as possibly missing for bug 16361.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asinh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86 and mips64.
[BZ #16350]
* sysdeps/i386/fpu/s_asinh.S (__asinh): Force underflow exception
for arguments with small absolute value.
* sysdeps/i386/fpu/s_asinhf.S (__asinhf): Likewise.
* sysdeps/i386/fpu/s_asinhl.S (__asinhl): Likewise.
* sysdeps/ieee754/dbl-64/s_asinh.c: Include <float.h>.
(__asinh): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/flt-32/s_asinhf.c: Include <float.h>.
(__asinhf): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* sysdeps/ieee754/ldbl-96/s_asinhl.c: Include <float.h>.
(__asinhl): Force underflow exception for arguments with small
absolute value.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16350.
* math/auto-libm-test-out: Regenerated.
sysdeps/unix/sysv/linux/bits/in.h (as included in netinet/in.h, and
via that in netdb.h and arpa/inet.h) defines a series of MCAST_*
macros, both under __USE_MISC and then again unconditionally. These
are not POSIX macros, nor in any of the namespaces listed in POSIX as
reserved for this header, so should not be defined unconditionally.
This patch duly removes the unconditional definitions, leaving the
ones conditional on __USE_MISC.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18558]
* sysdeps/unix/sysv/linux/bits/in.h (MCAST_JOIN_GROUP): Remove
unconditional definition.
(MCAST_BLOCK_SOURCE): Likewise.
(MCAST_UNBLOCK_SOURCE): Likewise.
(MCAST_LEAVE_GROUP): Likewise.
(MCAST_JOIN_SOURCE_GROUP): Likewise.
(MCAST_LEAVE_SOURCE_GROUP): Likewise.
(MCAST_MSFILTER): Likewise.
* conform/Makefile (test-xfail-XOPEN2K/arpa/inet.h/conform):
Remove variable.
(test-xfail-XOPEN2K/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K/netinet/in.h/conform): Likewise.
(test-xfail-XOPEN2K8/arpa/inet.h/conform): Likewise.
(test-xfail-XOPEN2K8/netdb.h/conform): Likewise.
(test-xfail-XOPEN2K8/netinet/in.h/conform): Likewise.
nice (XPG3) calls getpriority and setpriority (in XPG4 but not XPG3,
i.e. UX-shaded in XPG4). This patch fixes this by making those
functions into weak aliases of __* functions and calling the __*
versions as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by this patch).
This completes cleaning up the unsorted linknamespace test XFAILs.
[BZ #18553]
* resource/getpriority.c (getpriority): Rename to __getpriority
and define as weak alias of __getpriority.
* resource/setpriority.c (setpriority): Rename to __setpriority
and define as weak alias of __setpriority.
* sysdeps/mach/hurd/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* sysdeps/mach/hurd/setpriority.c (setpriority): Rename to
__setpriority and define as weak alias of __setpriority.
* sysdeps/unix/syscalls.list (getpriority): Use __getpriority as
strong name.
(setpriority): Use __setpriority as strong name.
* sysdeps/unix/sysv/linux/getpriority.c (getpriority): Rename to
__getpriority and define as weak alias of __getpriority.
* include/sys/resource.h (__getpriority): Declare. Use
libc_hidden_proto.
(__setpriority): Likewise.
(getpriority): Don't use libc_hidden_proto.
(setpriority): Likewise.
* sysdeps/posix/nice.c (nice): Call __getpriority instead of
getpriority. Call __setpriority instead of setpriority.
* conform/Makefile (test-xfail-XPG3/unistd.h/linknamespace):
Remove variable.
ttyslot (XPG4) calls the non-XPG4 functions endttyent, getttyent and
setttyent, which in turn bring in references to fgets_unlocked and
getttynam. This patch fixes this by making these functions into weak
aliases and calling the __* names as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18547]
* misc/getttyent.c (getttynam): Rename to __getttynam and define
as weak alias of __getttynam. Use prototype function definition.
Call __setttyent, __getttyent and __endttyent instead of
setttyent, getttyent and endttyent.
(getttyent): Rename to __getttyent and define as weak alias of
__getttyent. Call __setttyent instead of setttyent. Call
__fgets_unlocked instead of fgets_unlocked.
(setttyent): Rename to __setttyent and define as weak alias of
__setttyent.
(endttyent): Rename to __endttyent and define as weak alias of
__endttyent.
* include/ttyent.h (__getttyent): Declare. Use libc_hidden_proto.
(__setttyent): Likewise.
(__endttyent): Likewise.
(getttyent): Don't use libc_hidden_proto.
(setttyent): Likewise.
(endttyent): Likewise.
* misc/ttyslot.c (ttyslot): Call __setttyent, __getttyent and
__endttyent instead of setttyent, getttyent and endttyent.
* conform/Makefile (test-xfail-XPG4/unistd.h/linknamespace):
Remove variable.
mq_notify (in the 1996 edition of POSIX) brings in references to recv
and socket (not in POSIX until the 2001 edition). This patch fixes
this by using __recv and __socket, exporting them from libc at version
GLIBC_PRIVATE.
Tested for x86_64 and x86 (testsuite and comparison of installed
stripped shared libraries; PLT / dynamic symbol table changes render
the comparison not particularly useful for libc).
[BZ #18546]
* socket/recv.c (__recv): Use libc_hidden_def.
* socket/socket.c (__socket): Likewise.
* sysdeps/mach/hurd/recv.c (__recv): Likewise.
* sysdeps/mach/hurd/socket.c (__socket): Likewise.
* sysdeps/unix/sysv/linux/generic/recv.c (__recv): Likewise.
* sysdeps/unix/sysv/linux/recv.c (__recv): Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/socket.c (__socket): Use
libc_hidden_def.
* sysdeps/unix/sysv/linux/x86_64/recv.c (__recv): Use
libc_hidden_weak.
* include/sys/socket.h (__socket): Do not use attribute_hidden.
Use libc_hidden_proto.
(__recv): Likewise.
* socket/Versions (libc): Export __recv and __socket at version
GLIBC_PRIVATE.
* sysdeps/unix/sysv/linux/mq_notify.c (helper_thread): Call __recv
instead of recv.
(init_mq_netlink): Call __socket instead of socket.
* conform/Makefile (test-xfail-POSIX/mqueue.h/linknamespace):
Remove variable.
mq_receive calls mq_timedreceive, and mq_send calls mq_timedsend. But
mq_receive and mq_send were in POSIX by 1996, while mq_timed* were
added in the 2001 edition of POSIX. This patch fixes this by making
mq_timed* into weak aliases for __mq_timed* and calling the
__mq_timed* names.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18545]
* rt/mq_timedreceive.c (mq_timedreceive): Rename to
__mq_timedreceive and define as alias of __mq_timedreceive. Use
hidden_weak.
* rt/mq_timedsend.c (mq_timedsend): Rename to __mq_timedsend and
define as alias of __mq_timedsend. Use hidden_weak.
* sysdeps/unix/sysv/linux/syscalls.list (mq_timedsend): Use
__mq_timedsend as strong name.
(mq_timedreceive): Use __mq_timedreceive as strong name.
* include/mqueue.h (__mq_timedsend): Declare. Use hidden_proto.
(__mq_timedreceive): Likewise.
* sysdeps/unix/sysv/linux/mq_receive.c (mq_receive): Call
__mq_timedreceive instead of mq_timedreceive.
* sysdeps/unix/sysv/linux/mq_send.c (mq_send): Call __mq_timedsend
instead of mq_timedsend.
* conform/Makefile (test-xfail-UNIX98/mqueue.h/linknamespace):
Remove variable.
mq_notify (present in POSIX by 1996) brings in references to
pthread_barrier_init and pthread_barrier_wait (new in the 2001 edition
of POSIX). This patch fixes this by making those functions into weak
aliases of __pthread_barrier_*, exporting the __pthread_barrier_*
names at version GLIBC_PRIVATE and using them in mq_notify.
Tested for x86_64 and x86 (testsuite, and comparison of installed
stripped shared libraries). Changes in addresses from dynamic symbol
table / PLT changes render most comparisons not particularly useful,
but when the addresses of subsequent code don't change there's no sign
of unexpected changes there. This patch does not remove any
linknamespace XFAILs because of other namespace issues remaining with
mqueue.h functions.
[BZ #18544]
* nptl/pthread_barrier_init.c (pthread_barrier_init): Rename to
__pthread_barrier_init and define as weak alias of
__pthread_barrier_init.
* sysdeps/sparc/nptl/pthread_barrier_init.c
(pthread_barrier_init): Likewise.
* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Rename to
__pthread_barrier_wait and define as weak alias of
__pthread_barrier_wait.
* sysdeps/sparc/nptl/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/sparc/sparc32/pthread_barrier_wait.c
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S
(pthread_barrier_wait): Likewise.
* nptl/Versions (libpthread): Export __pthread_barrier_init and
__pthread_barrier_wait at version GLIBC_PRIVATE.
* include/pthread.h (__pthread_barrier_init): Declare.
(__pthread_barrier_wait): Likewise.
* sysdeps/unix/sysv/linux/mq_notify.c (notification_function):
Call __pthread_barrier_wait instead of pthread_barrier_wait.
(helper_thread): Likewise.
(init_mq_netlink): Call __pthread_barrier_init instead of
pthread_barrier_init.
swscanf (added in C90 Amendment 1, present in UNIX98) calls vswscanf
(added in C99, not in C90 Amendment 1 or UNIX98). This patch fixes
this by using __vswscanf instead and making vswscanf into a weak
alias.
(I intend to add conform/ test support for C90 Amendment 1 - and
various other standard versions supported by glibc but not yet by
conform/ tests - at some point, once the results for currently tested
standards are cleaner.)
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18542]
* libio/iovswscanf.c (__vswscanf): Use libc_hidden_def.
(vswscanf): Use ldbl_weak_alias instead of ldbl_strong_alias
* include/wchar.h (__vswscanf): Declare. Use libc_hidden_proto.
* libio/swscanf.c (__swscanf): Call __vswscanf instead of
vswscanf.
* conform/Makefile (test-xfail-UNIX98/wchar.h/linknamespace):
Remove variable.
The getpass function (XPG3 / XPG4 / UNIX98) calls fflush_unlocked (not
in any of those standards). This patch fixes this by making
fflush_unlocked into a weak alias for __fflush_unlocked and calling
__fflush_unlocked from getpass.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18540]
* libio/iofflush.c [!_IO_MTSAFE_IO] (__fflush_unlocked): Define as
strong alias of _IO_fflush. Use libc_hidden_def.
* libio/iofflush_u.c (fflush_unlocked): Rename to
__fflush_unlocked and define as weak alias of __fflush_unlocked.
Use libc_hidden_weak.
* include/stdio.h (__fflush_unlocked): Declare. Use
libc_hidden_proto.
* misc/getpass.c (getpass): Call __fflush_unlocked instead of
fflush_unlocked.
* conform/Makefile (test-xfail-UNIX98/unistd.h/linknamespace):
Remove variable.
Use of fmtmsg (XSI POSIX) brings in addseverity (non-POSIX). This
patch fixes this by making addseverity into a weak alias for
__addseverity.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18539]
* stdlib/fmtmsg.c (addseverity): Rename to __addseverity and
define as weak alias of __addseverity.
* conform/Makefile (test-xfail-XPG4/fmtmsg.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/fmtmsg.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/fmtmsg.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/fmtmsg.h/linknamespace): Likewise.
The sem_* functions bring in references to tdelete, tfind, tsearch and
twalk. But the t* functions are XSI-shaded, while sem_* aren't. This
patch fixes this by using __t* instead, exporting those functions from
libc at version GLIBC_PRIVATE (since sem_* are in libpthread) and
using libc_hidden_* for the benefit of calls within libc.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). libpthread gets changes from
PLT reordering; addresses in libc change because of PLT / dynamic
symbol table changes.
[BZ #18536]
* misc/tsearch.c (__tsearch): Use libc_hidden_def.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* misc/Versions (libc): Add __tdelete, __tfind, __tsearch and
__twalk to GLIBC_PRIVATE.
* include/search.h (__tsearch): Use libc_hidden_proto.
(__tfind): Likewise.
(__tdelete): Likewise.
(__twalk): Likewise.
* nptl/sem_close.c (sem_close): Call __twalk instead of twalk.
Call __tdelete instead of tdelete.
* nptl/sem_open.c (check_add_mapping): Call __tfind instead of
tfind. Call __tsearch instead of tsearch.
* sysdeps/sparc/sparc32/sem_open.c (check_add_mapping): Likewise.
* conform/Makefile (test-xfail-POSIX/semaphore.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/semaphore.h/linknamespace): Likewise.
syslog functions bring in references to dprintf, which wasn't added to
POSIX until the 2008 edition and so isn't in various standards
containing the syslog functions. This patch fixes this by making
dprintf into a weak alias of __dprintf and using __dprintf as
appropriate.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18534]
* stdio-common/dprintf.c (__dprintf): Use libc_hidden_def.
(dprintf): Define as a weak alias of __dprintf, not a strong
alias.
* include/stdio.h (__dprintf): Declare. Use libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __dprintf instead of
dprintf.
* conform/Makefile (test-xfail-XPG4/syslog.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/syslog.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/syslog.h/linknamespace): Likewise.
syslog functions (in POSIX) bring in the strong symbol vsyslog (not in
POSIX). This patch fixes this by changing this symbol from a strong
alias to a weak alias.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch). (vsyslog becomes weak
in the static libraries, which is what's needed; the particular macro
sequence in use leaves it as strong in the shared libraries, hence
those libraries being completely unchanged, but it doesn't generally
matter whether symbols exported from the shared libraries are weak or
strong.)
[BZ #18533]
* misc/syslog.c (vsyslog): Define as a weak alias of __vsyslog,
not a strong alias.
* conform/Makefile (test-xfail-XOPEN2K8/syslog.h/linknamespace):
Remove variable.
gethostbyaddr brings in references to in6addr_any and thereby
in6addr_loopback, which aren't in all the standards containing
gethostbyaddr (gethostbyaddr is in XPG4 and UNIX98, in6addr_any and
in6addr_loopback are new in POSIX.1:2001). This patch fixes this by
making those symbols into weak aliases (safe in this case, unlike for
most data symbols, because these data symbols are const).
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). Disassembly is unchanged for
x86_64; for x86, I see some changes of stack offsets, but no other
code generation changes or code size differences.
[BZ #18532]
* inet/in6_addr.c (in6addr_any): Rename to __in6addr_any and
define as weak alias of __in6addr_any. Use libc_hidden_data_weak.
(in6addr_loopback): Rename to __in6addr_loopback and define as
weak alias of __in6addr_loopback. Use libc_hidden_data_weak.
* include/netinet/in.h (__in6addr_loopback): Declare. Use
libc_hidden_proto.
(__in6addr_any): Likewise.
* inet/gethstbyad_r.c (PREPROCESS): Use __in6addr_any instead of
in6addr_any.
* conform/Makefile (test-xfail-XPG4/netdb.h/linknamespace): Remove
variable.
(test-xfail-UNIX98/netdb.h/linknamespace): Likewise.
Lazy TLSDESC initialization needs to be synchronized with concurrent TLS
accesses. The TLS descriptor contains a function pointer (entry) and an
argument that is accessed from the entry function. With lazy initialization
the first call to the entry function updates the entry and the argument to
their final value. A final entry function must make sure that it accesses an
initialized argument, this needs synchronization on systems with weak memory
ordering otherwise the writes of the first call can be observed out of order.
There are at least two issues with the current code:
tlsdesc.c (i386, x86_64, arm, aarch64) uses volatile memory accesses on the
write side (in the initial entry function) instead of C11 atomics.
And on systems with weak memory ordering (arm, aarch64) the read side
synchronization is missing from the final entry functions (dl-tlsdesc.S).
This patch only deals with aarch64.
* Write side:
Volatile accesses were replaced with C11 relaxed atomics, and a release
store was used for the initialization of entry so the read side can
synchronize with it.
* Read side:
TLS access generated by the compiler and an entry function code is roughly
ldr x1, [x0] // load the entry
blr x1 // call it
entryfunc:
ldr x0, [x0,#8] // load the arg
ret
Various alternatives were considered to force the ordering in the entry
function between the two loads:
(1) barrier
entryfunc:
dmb ishld
ldr x0, [x0,#8]
(2) address dependency (if the address of the second load depends on the
result of the first one the ordering is guaranteed):
entryfunc:
ldr x1,[x0]
and x1,x1,#8
orr x1,x1,#8
ldr x0,[x0,x1]
(3) load-acquire (ARMv8 instruction that is ordered before subsequent
loads and stores)
entryfunc:
ldar xzr,[x0]
ldr x0,[x0,#8]
Option (1) is the simplest but slowest (note: this runs at every TLS
access), options (2) and (3) do one extra load from [x0] (same address
loads are ordered so it happens-after the load on the call site),
option (2) clobbers x1 which is problematic because existing gcc does
not expect that, so approach (3) was chosen.
A new _dl_tlsdesc_return_lazy entry function was introduced for lazily
relocated static TLS, so non-lazy static TLS can avoid the synchronization
cost.
[BZ #18034]
* sysdeps/aarch64/dl-tlsdesc.h (_dl_tlsdesc_return_lazy): Declare.
* sysdeps/aarch64/dl-tlsdesc.S (_dl_tlsdesc_return_lazy): Define.
(_dl_tlsdesc_undefweak): Guarantee TLSDESC entry and argument load-load
ordering using ldar.
(_dl_tlsdesc_dynamic): Likewise.
(_dl_tlsdesc_return_lazy): Likewise.
* sysdeps/aarch64/tlsdesc.c (_dl_tlsdesc_resolve_rela_fixup): Use
relaxed atomics instead of volatile and synchronize with release store.
(_dl_tlsdesc_resolve_hold_fixup): Use relaxed atomics instead of
volatile.
* elf/tlsdeschtab.h (_dl_tlsdesc_resolve_early_return_p): Likewise.
syslog (XSI POSIX) brings in references to fputs_unlocked (not
POSIX). This patch fixes this by making fputs_unlocked into a weak
alias for __fputs_unlocked and using __fputs_unlocked as needed. (No
linknamespace test XFAILs are removed because there are other failures
from syslog as well.)
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed stripped shared libraries). Disassembly of installed
stripped shared libraries is unchanged on x86_64; on x86, I see some
small changes to instruction ordering and register choice, with no
apparent reason for such changes to be related to this patch, but they
also seem completely harmless with no change to code size.
[BZ #18530]
* libio/iofputs.c [!_IO_MTSAFE_IO] (__fputs_unlocked): Define as
strong alias of _IO_fputs. Use libc_hidden_def.
* libio/iofputs_u.c (fputs_unlocked): Rename to __fputs_unlocked
and define as weak alias of __fputs_unlocked. Use
libc_hidden_weak.
* include/stdio.h (__fputs_unlocked): Declare. Use
libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __fputs_unlocked instead of
fputs_unlocked.
netdb.h declares interfaces such as getaddrinfo if __USE_POSIX,
i.e. POSIX.1:1990 or later. However, these interfaces were new in the
2001 edition of POSIX, although the header was in XPG4 and UNIX98, so
they should not be declared for XPG4 or UNIX98. (This produces
spurious linknamespace test failures, although there are other
failures for this header as well for the same standards so this patch
doesn't remove any XFAILs.) This patch corrects the condition, and
the conform/ test expectations which were similarly wrong.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18529]
* resolv/netdb.h [__USE_POSIX]: Change condition to
[__USE_XOPEN2K].
* conform/data/netdb.h-data [XPG4 || UNIX98] (struct addrinfo): Do
not expect.
[XPG4 || UNIX98] (AI_PASSIVE): Likewise.
[XPG4 || UNIX98] (AI_CANONNAME): Likewise.
[XPG4 || UNIX98] (AI_NUMERICHOST): Likewise.
[XPG4 || UNIX98] (AI_V4MAPPED): Likewise.
[XPG4 || UNIX98] (AI_ALL): Likewise.
[XPG4 || UNIX98] (AI_ADDRCONFIG): Likewise.
[XPG4 || UNIX98] (AI_NUMERICSERV): Likewise.
[XPG4 || UNIX98] (NI_NOFQDN): Likewise.
[XPG4 || UNIX98] (NI_NUMERICHOST): Likewise.
[XPG4 || UNIX98] (NI_NAMEREQD): Likewise.
[XPG4 || UNIX98] (NI_NUMERICSERV): Likewise.
[XPG4 || UNIX98] (NI_DGRAM): Likewise.
[XPG4 || UNIX98] (EAI_AGAIN): Likewise.
[XPG4 || UNIX98] (EAI_BADFLAGS): Likewise.
[XPG4 || UNIX98] (EAI_FAIL): Likewise.
[XPG4 || UNIX98] (EAI_FAMILY): Likewise.
[XPG4 || UNIX98] (EAI_MEMORY): Likewise.
[XPG4 || UNIX98] (EAI_NONAME): Likewise.
[XPG4 || UNIX98] (EAI_SERVICE): Likewise.
[XPG4 || UNIX98] (EAI_SOCKTYPE): Likewise.
[XPG4 || UNIX98] (EAI_SYSTEM): Likewise.
[XPG4 || UNIX98] (EAI_SYSTEM): Likewise.
[XPG4 || UNIX98] (freeaddrinfo): Likewise.
[XPG4 || UNIX98] (gai_strerror): Likewise.
[XPG4 || UNIX98] (getaddrinfo): Likewise.
[XPG4 || UNIX98] (getnameinfo): Likewise.
grp.h declares endgrent and getgrent if __USE_XOPEN2K8 (i.e. 2008
edition of POSIX, non-XSI). However, the 2013 Technical Corrigendum
corrected the grp.h specification to XSI-shade these functions as in
previous editions (see <http://austingroupbugs.net/view.php?id=24>),
so they should not be declared for non-XSI POSIX. This patch corrects
the conditions - using __USE_MISC || __USE_XOPEN_EXTENDED to match
setgrent - and the conform/ test expectations for this header, thereby
fixing the conform tests for this header for XPG3 (where the
expectations were wrong) and the linknamespace tests for it for
POSIX2008 (where the header bug meant it was wrongly considered a
problem for endgrent to bring in a reference to setgrent).
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18528]
* grp/grp.h (endgrent): Condition on [__USE_MISC ||
__USE_XOPEN_EXTENDED], not [__USE_XOPEN_EXTENDED ||
__USE_XOPEN2K8].
(getgrent): Likewise.
* conform/data/grp.h-data [XPG3 || POSIX2008] (getgrent): Do not
expect.
[XPG3 || POSIX2008] (endgrent): Likewise.
[XPG3] (setgrent): Likewise.
* conform/Makefile (test-xfail-XPG3/grp.h/conform): Remove
variable.
(test-xfail-POSIX2008/grp.h/linknamespace): Likewise.
Various functions in XPG4 bring in references to getlogin_r, which is
not in XPG4; this is also a bug for some older POSIX versions which
aren't yet covered by the linknamespace tests. This patch fixes this
by making getlogin_r into a weak alias for __getlogin_r and using
__getlogin_r as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18527]
* login/getlogin_r.c (getlogin_r): Rename to __getlogin_r and
define as weak alias of __getlogin_r. Use libc_hidden_weak.
* sysdeps/mach/hurd/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/getlogin_r.c (getlogin_r): Likewise.
* sysdeps/unix/sysv/linux/getlogin_r.c (getlogin_r): Likewise.
* include/unistd.h (__getlogin_r): Declare. Use
libc_hidden_proto.
* posix/glob.c (glob): Call __getlogin_r instead of getlogin_r.
* conform/Makefile (test-xfail-XPG3/glob.h/linknamespace): Remove
variable.
(test-xfail-XPG3/wordexp.h/linknamespace): Likewise.
(test-xfail-XPG4/glob.h/linknamespace): Likewise.
(test-xfail-XPG4/wordexp.h/linknamespace): Likewise.
a non-standard directory specified by the prefix make variable
fails with an error. Since this is an unsupported use case,
this change makes make install fail early and with a descriptive
error message when either the prefix or the exec_prefix make
variable is overridden on the command line.
aio_* bring in references to pread, which isn't in all the standards
containing aio_* (as a reference from one library to another, this is
a bug for dynamic as well as static linking). This patch fixes this
by using __libc_pread instead, exporting that function from libc at
symbol version GLIBC_PRIVATE; the code, with conditionals that may
call either __pread64 or __libc_pread, becomes exactly analogous to
that elsewhere in the same file that may call either __pwrite64 or
__libc_pwrite.
Tested for x86_64 and x86 (testsuite, and comparison of disassembly of
installed shared libraries). libc changes because of the PLT entry
for the newly exported __libc_pread; librt changes because of
assertion line numbers and PLT rearrangement; other stripped installed
shared libraries do not change.
[BZ #18519]
* posix/Versions (libc): Export __libc_pread at version
GLIBC_PRIVATE.
* sysdeps/pthread/aio_misc.c (handle_fildes_io): Call __libc_pread
instead of pread.
* conform/Makefile (test-xfail-POSIX/aio.h/linknamespace): Remove
variable.
The functions ecvt, fcvt and gcvt, in some standards, bring in
references to ecvt_r and fcvt_r, which aren't in any of those
standards. The calls are correctly to __ecvt_r and __fcvt_r, but then
the names ecvt_r and fcvt_r are defined as strong aliases; this patch
changes them to weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed stripped shared libraries is unchanged by the patch).
[BZ #18522]
* misc/efgcvt_r.c
[LONG_DOUBLE_COMPAT (libc, GLIBC_2_0) && !LONG_DOUBLE_CVT]
(cvt_symbol): Use weak_alias instead of strong_alias.
[LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)] (cvt_symbol): Likewise.
* conform/Makefile (test-xfail-XPG4/stdlib.h/linknamespace):
Remove variable.
(test-xfail-UNIX98/stdlib.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/stdlib.h/linknamespace): Likewise.
The 2008 edition of POSIX removed h_errno, but some functions still
bring in references to the h_errno external symbol. As this symbol is
not a part of the public ABI (only __h_errno_location is), this patch
fixes this by renaming the GLIBC_PRIVATE TLS symbol to __h_errno.
Tested for x86_64 and x86 (testsuite, and comparison of installed
shared libraries). Disassembly of all shared libraries using h_errno
changes because of the renaming (and changes to associated TLS / GOT
offsets in some cases); disassembly of libpthread on x86_64 changes
more substantially because the enlargement of .dynsym affects
subsequent addresses.
[BZ #18520]
* inet/herrno.c (h_errno): Rename to __h_errno.
(__libc_h_errno): Define as alias of __h_errno not h_errno.
* include/netdb.h [IS_IN_LIB && !IS_IN (libc)] (h_errno): Define
to __h_errno instead of h_errno.
* nptl/herrno.c (h_errno): Rename to __h_errno.
(__h_errno_location): Refer to __h_errno not h_errno.
* resolv/Versions (h_errno): Rename to __h_errno.
* conform/Makefile (test-xfail-XOPEN2K8/grp.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/pwd.h/linknamespace): Likewise.
Here is implementation of vectorized sin containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* bits/libm-simd-decl-stubs.h: Added stubs for sin.
* math/bits/mathcalls.h: Added sin declaration with __MATHCALL_VEC.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: SIMD declaration for sin.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/libm-test-ulps: Regenerated.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_sin8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_sin_data.h: New file.
* sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c: Added vector sin test.
* sysdeps/x86_64/fpu/test-double-vlen2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-avx2.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen4.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c: Likewise.
* sysdeps/x86_64/fpu/test-double-vlen8.c: Likewise.
* NEWS: Mention addition of x86_64 vector sin.
In commit 02657da2cf, .interp section
was removed from libpthread.so. This led to an error:
$ /lib64/libpthread.so.0
Native POSIX Threads Library by Ulrich Drepper et al
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Forced unwind support included.
Segmentation fault
(gdb) bt
#0 0x00000000000055a6 in _exit@plt ()
Unfortunately, there is no way to add a regression test for the bug
because .interp specifies the path to dynamic linker of the target
system.
[BZ #18479]
* nptl/pt-interp.c: New file.
* nptl/Makefile (libpthread-routines, libpthread-shared-only-routines):
Add pt-interp.
[$(build-shared) = yes] ($(objpfx)pt-interp.os): Depend on
$(common-objpfx)runtime-linker.h.
regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp. In turn, wcscoll brings in references
to wcscmp, also not in all those standards. This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18497]
* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
of wcscmp.
(wcscmp): Define as weak alias of WCSCMP.
* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
wcscoll.
(USE_HIDDEN_DEF): Define.
[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
__wcscoll. Don't use libc_hidden_weak.
* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
wcscmp.
* sysdeps/i386/i686/multiarch/wcscmp-c.c
[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
__GI_wcscmp.
(weak_alias): Undefine and redefine.
* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
__wcscmp and define as weak alias of __wcscmp.
* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
* include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto.
(__wcscoll): Likewise.
(wcscmp): Don't use libc_hidden_proto.
(wcscoll): Likewise.
* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
wcscoll.
* posix/regexec.c (check_node_accept_bytes): Likewise.
* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
variable.
(test-xfail-XPG4/regex.h/linknamespace): Likewise.
(test-xfail-POSIX/regex.h/linknamespace): Likewise.
pathconf uses __statvfs64, and fpathconf uses __fstatvfs64. On
systems using sysdeps/unix/sysv/linux/wordsize-64, __statvfs64 then
brings in the strong symbol statvfs, and __fstatvfs64 brings in the
strong symbol fstatvfs, which are not in all the standards that have
pathconf and fpathconf. This patch fixes this by making those symbols
into weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18507]
* sysdeps/unix/sysv/linux/fstatvfs.c (fstatvfs): Rename to
__fstatvfs and define as weak alias of __fstatvfs. Use
libc_hidden_weak.
* sysdeps/unix/sysv/linux/statvfs.c (statvs): Rename to __statvfs
and define as weak alias of __statvfs. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/wordsize-64/fstatvfs.c (__fstatvfs64):
Define as alias of __fstatvfs, not fstatvfs.
(fstatvfs64): Likewise.
* sysdeps/unix/sysv/linux/wordsize-64/statvfs.c (__statvfs64):
Define as alias of __statvfs, not statvfs.
(statvfs64): Likewise.
* conform/Makefile (test-xfail-POSIX/unistd.h/linknamespace):
Remove variable.
Here is implementation of vectorized cosf containing SSE, AVX,
AVX2 and AVX512 versions according to Vector ABI
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
* sysdeps/x86_64/fpu/Makefile (libmvec-support): Added new files.
* sysdeps/x86_64/fpu/Versions: New versions added.
* sysdeps/x86_64/fpu/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf4_core_sse4.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf8_core_avx2.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_s_cosf16_core_avx512.S: New file.
* sysdeps/x86_64/fpu/svml_s_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.S: New file.
* sysdeps/x86_64/fpu/svml_s_cosf_data.h: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New versions added.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cosf.
* NEWS: Mention addition of x86_64 vector cosf.
Here is implementation of cos containing SSE, AVX, AVX2 and AVX512
versions according to Vector ABI which had been discussed in
<https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.
Vector math library build and ABI testing enabled by default for x86_64.
* sysdeps/x86_64/fpu/Makefile: New file.
* sysdeps/x86_64/fpu/Versions: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos_data.h: New file.
* sysdeps/x86_64/fpu/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos4_core_avx.S: New file.
* sysdeps/x86_64/fpu/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/svml_d_wrapper_impl.h: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos2_core_sse4.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos4_core_avx2.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core.S: New file.
* sysdeps/x86_64/fpu/multiarch/svml_d_cos8_core_avx512.S: New file.
* sysdeps/x86_64/fpu/multiarch/Makefile (libmvec-sysdep_routines): Added
build of SSE, AVX2 and AVX512 IFUNC versions.
* sysdeps/x86/fpu/bits/math-vector.h: Added SIMD declaration for cos.
* math/bits/mathcalls.h: Added cos declaration with __MATHCALL_VEC.
* sysdeps/x86_64/configure.ac: Options for libmvec build.
* sysdeps/x86_64/configure: Regenerated.
* sysdeps/x86_64/sysdep.h (cfi_offset_rel_rsp): New macro.
* sysdeps/unix/sysv/linux/x86_64/libmvec.abilist: New file.
* manual/install.texi (Configuring and compiling): Document
--disable-mathvec.
* INSTALL: Regenerated.
* NEWS: Mention addition of libmvec and x86_64 vector cos.
open_memstream is new in the 2008 edition of POSIX. However, the
older functions getopt, closelog and fmtmsg all bring in references to
it. This patch fixes this in the usual way, making open_memstream
into a weak alias of __open_memstream and calling __open_memstream
from the relevant places.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch). 32-bit builds
produce an XPASS for conform/POSIX/unistd.h/linknamespace after this
patch (because the only cause of failure left there now is 64-bit
specific); that will disappear once the 64-bit failure is resolved and
the XFAIL removed at that time.
[BZ #18498]
* libio/memstream.c (open_memstream): Rename to __open_memstream
and define as weak alias of __open_memstream.
* include/stdio.h (__open_memstream): Declare. Use
libc_hidden_proto.
(open_memstream): Don't use libc_hidden_proto.
* misc/syslog.c (__vsyslog_chk): Call __open_memstream instead of
open_memstream.
* posix/getopt.c (_getopt_internal_r): Likewise.
* conform/Makefile (test-xfail-XPG3/stdio.h/linknamespace): Remove
variable.
(test-xfail-XPG4/stdio.h/linknamespace): Likewise.
(test-xfail-UNIX98/stdio.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/unistd.h/linknamespace): Likewise.
The regex code brings in references to wcrtomb, which isn't in all the
standards that contain regex. This patch makes it call __wcrtomb
instead (in fact some places already called __wcrtomb, so this patch
makes it internally consistent about which name is used).
Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by the patch.
[BZ #18496]
* posix/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb
instead of wcrtomb.
signal.h declares psignal and psiginfo if __USE_XOPEN2K - that is, for
the 2001 edition of POSIX. These functions were actually added in the
2008 edition (as indicated in the header comments). This patch fixes
the header conditionals. This fixes some linknamespace test failures
because psiginfo uses fmemopen, which is also new in the 2008 edition,
so before the header fix this appeared to the linknamespace tests as a
2001 function bringing in references to a 2008 function. The problem
also appeared in conformtest header namespace test results (the
conformtest data has correct conditionals for when these functions
should be visible), but the affected headers still have other
namespace problems so this doesn't fix any of those XFAILs.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18483]
* signal/signal.h [__USE_XOPEN2K] (psignal): Change condition to
[__USE_XOPEN2K8]. Remove redundant #endif.
[__USE_XOPEN2K] (psiginfo): Change condition to [__USE_XOPEN2K8].
Remove redundant #if.
* conform/Makefile (test-xfail-XOPEN2K/signal.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K/sys/wait.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/ucontext.h/linknamespace): Likewise.
regcomp brings in references to various wctype functions that aren't
in all the standards including regcomp. This patch fixes this in the
usual way by using the __* versions of these functions (which already
exist, but some didn't have libc_hidden_proto / libc_hidden_def
before).
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch). (Other wide character
function references from the regex code mean that this patch by itself
doesn't fix any XFAILed linknamespace test failures; further patches
will be needed for that.)
[BZ #18495]
* wctype/wcfuncs.c (__iswalnum): Use libc_hidden_def.
(__iswlower): Likewise.
* include/wctype.h (__iswalnum): Declare. Use libc_hidden_proto.
(__iswlower): Likewise.
* posix/regcomp.c (re_compile_fastmap_iter): Call __towlower
instead of towlower.
* posix/regex_internal.c (build_wcs_upper_buffer): Call __iswlower
instead of iswlower. Call __towupper instead of towupper.
* posix/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum
instead of iswalnum.
This adds wake-ups that would be missing if assuming that for a
non-writer-preferring rwlock, if one thread has acquired a rdlock and
does not release it, another thread will eventually acquire a rdlock too
despite concurrent write lock acquisition attempts. BZ 14958 is about
supporting this assumption. Strictly speaking, this isn't a valid
test case, but nonetheless worth supporting (see comment 7 of BZ 14958).
If we set up a rwlock to prefer writers (and disallow recursive rdlock
acquisitions), then readers will block for writers that are blocked to
acquire the lock (otherwise, readers could constantly enter and exit,
and the writer would never get the lock). However, the existing
implementation did not wake such readers when the writer timed out.
This patch adds the missing wake-up.
There's no similar case for writers being blocked on readers.
fnmatch brings in references to strnlen, which isn't in all the
standards that contain fnmatch (not added until the 2008 edition of
POSIX), resulting in linknamespace test failures. (This is contrary
to glibc conventions, rather than a standards conformance issue,
because of the str* reservation.) This patch fixes this in the usual
way, using __strnlen instead of strnlen.
Tested for x86_64 and x86 (testsuite, and that installed stripped
shared libraries are unchanged by the patch).
[BZ #18470]
* posix/fnmatch.c (fnmatch) [_LIBC]: Call __strnlen instead of
strnlen.
* conform/Makefile (test-xfail-XPG3/fnmatch.h/linknamespace):
Remove variable.
(test-xfail-XPG4/fnmatch.h/linknamespace): Likewise.
(test-xfail-POSIX/fnmatch.h/linknamespace): Likewise.
(test-xfail-POSIX/glob.h/linknamespace): Likewise.
(test-xfail-POSIX/wordexp.h/linknamespace): Likewise.
(test-xfail-UNIX98/fnmatch.h/linknamespace): Likewise.
(test-xfail-UNIX98/glob.h/linknamespace): Likewise.
(test-xfail-UNIX98/wordexp.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/fnmatch.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/glob.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/wordexp.h/linknamespace): Likewise.
fnmatch brings in references to wmemchr, which isn't in all the
standards that contain fnmatch, resulting in linknamespace test
failures. This patch fixes this in the usual way, making wmemchr into
a weak alias for __wmemchr.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18468]
* wcsmbs/wmemchr.c (wmemchr): Rename to __wmemchr and define as
weak alias of __wmemchr. Use libc_hidden_weak.
* include/wchar.h (__wmemchr): Declare. Use libc_hidden_proto.
* posix/fnmatch.c [HANDLE_MULTIBYTE] (MEMCHR): Use __wmemchr
instead of wmemchr.
fnmatch brings in references to towlower (and thereby towupper), which
isn't in all the standards that contain fnmatch, resulting in
linknamespace test failures. (This is contrary to glibc conventions,
rather than a standards conformance issue, because of the to*
reservation.) This patch fixes this in the usual way, making those
functions into weak aliases.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch). This is on top
of <https://sourceware.org/ml/libc-alpha/2015-06/msg00019.html>, but
the two patches should be independent.
(The __attribute_pure__ on the declarations in include/wctype.h comes
from GCC's built-in attributes for towlower and towupper, and is
needed to get the same code generation for fnmatch before and after
the patch. It seems likely there are cases where the declaration of
__foo in the internal headers is missing attributes from foo in the
public headers, built-in to GCC or both, but I don't know a good way
to detect such missing attributes.)
[BZ #18469]
* wctype/wcfuncs.c (towlower): Rename to __towlower and define as
weak alias of __towlower. Use libc_hidden_weak.
(towupper): Rename to __towupper and define as weak alias of
__towupper. Use libc_hidden_weak.
* include/wctype.h (__towlower): Declare. Use libc_hidden_proto.
(__towupper): Likewise.
* posix/fnmatch.c [HANDLE_MULTIBYTE && _LIBC] (FOLD): Use
__towlower instead of towlower.
PLT relocations aren't required when -z now used. Linker on master with:
commit 25070364b0ce33eed46aa5d78ebebbec6accec7e
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sat May 16 07:00:21 2015 -0700
Don't generate PLT relocations for now binding
There is no need for PLT relocations with -z now. We can use GOT
relocations, which take less space, instead and replace 16-byte .plt
entres with 8-byte .plt.got entries.
bfd/
* elf32-i386.c (elf_i386_check_relocs): Create .plt.got section
for now binding.
(elf_i386_allocate_dynrelocs): Use .plt.got section for now
binding.
* elf64-x86-64.c (elf_x86_64_check_relocs): Create .plt.got
section for now binding.
(elf_x86_64_allocate_dynrelocs): Use .plt.got section for now
binding.
won't generate PLT relocations with -z now. elf/tst-audit2.c expect
certain order of execution in ld.so. With PLT relocations, the GOTPLT
entry of calloc is update to calloc defined in tst-audit2:
(gdb) bt
skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
version=<optimized out>, sym=<optimized out>, map=<optimized out>)
at ../sysdeps/i386/dl-machine.h:329
out>,
nrelative=<optimized out>, relsize=<optimized out>,
reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
reloc_mode=reloc_mode@entry=0,
consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfb0,
dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)
and then calloc is called:
(gdb) c
Continuing.
Breakpoint 4, calloc (n=n@entry=20, m=4) at tst-audit2.c:18
18 {
(gdb) bt
reloc_mode=reloc_mode@entry=0, consider_profiling=1,
consider_profiling@entry=0) at dl-reloc.c:272
user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfb0,
dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)
With GOT relocation, calloc in ld.so is called first:
(gdb) bt
consider_profiling=1) at dl-reloc.c:272
user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2074
start_argptr=start_argptr@entry=0xffffcfa0,
dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)
and then the GOT entry of calloc is updated:
(gdb) bt
skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
version=<optimized out>, sym=<optimized out>, map=<optimized out>)
at ../sysdeps/i386/dl-machine.h:329
out>,
nrelative=<optimized out>, relsize=<optimized out>,
reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
reloc_mode=reloc_mode@entry=0,
consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2133
start_argptr=start_argptr@entry=0xffffcfa0,
dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at
../elf/dl-sysdep.c:249
from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)
After that, since calloc isn't called from ld.so nor any other modules,
magic in tst-audit2 isn't updated. Both orders are correct. This patch
makes sure that calloc in tst-audit2.c is called at least once from ld.so.
[BZ #18422]
* Makefile ($(objpfx)tst-audit2): Depend on $(libdl).
($(objpfx)tst-audit2.out): Also depend on
$(objpfx)tst-auditmod9b.so.
* elf/tst-audit2.c: Include <dlfcn.h>.
(calloc_called): New.
(calloc): Allow to be called more than once.
(do_test): dllopen/dlclose $ORIGIN/tst-auditmod9b.so.
In the introduction for the official orthography rules for Ukrainian
language (http://spelling.ulif.org.ua/peredmova.htm) there's a note
that only apostrophe does not affect order of the words when sorting.
As could be seen from the official alphabet the soft sign
(U+044C/U+042C) has its hard position and thus affects the order and
also letters "е" and "є" (CYR-IE: U+0435/U+0415 and UKR-IE:
U+0454/U+0404) have their own positions and should have separate place
when sorting.
This also corresponds to official Unicode collation chart for these
letters: http://unicode.org/charts/collation/chart_Cyrillic.html
On 21/05/15 05:29, Siddhesh Poyarekar wrote:
> On Wed, May 20, 2015 at 06:55:02PM +0100, Szabolcs Nagy wrote:
>> i guess it's ok for consistency if i fix struct stat64
>> too to use __USE_XOPEN2K8.
>>
>> i will run some tests and come back with a patch
>
> I also think it would be appropriate to change this code in other
> architectures (microblaze and nacl IIRC) to make all of them
> consistent. It is a mechanical enough change IMO that all arch
> maintainer acks is not necessary.
>
here is the patch with consistent __USE_XOPEN2K8
ok to commit?
2015-05-21 Szabolcs Nagy <szabolcs.nagy@arm.com>
[BZ #18234]
* conform/data/sys/stat.h-data (struct stat): Add tests for st_atim,
st_mtim and st_ctim members.
* sysdeps/nacl/bits/stat.h (struct stat, struct stat64): Make
st_atim, st_ctim, st_mtim visible under __USE_XOPEN2K8 only.
* sysdeps/unix/sysv/linux/generic/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/ia64/bits/stat.h (struct stat,):
(struct stat64): Likewise.
* sysdeps/unix/sysv/linux/microblaze/bits/stat.h (struct stat,):
(struct stat64): Likewise.
A shared object doesn't need PLT if there are no PLT relocations. It
shouldn't be an error if DT_PLTRELSZ is missing.
[BZ #18410]
* elf/dl-reloc.c (_dl_relocate_object): Don't issue an error
for missing DT_PLTRELSZ.
[BZ #18412]
* intl/locale.alias: Remove obsolete aliases "bokmål" and "français"
which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
breaking some applications that use 'locale -a' output.
Change the encoding of this file from Latin-1 to ASCII to avoid
other potential problems with people grepping this file.
My review of conformtest expectations for POSIX showed up that the
_POSIX2_C_VERSION macro, required by POSIX and XPG standards before
2001, was missing in unistd.h, having been removed on 2003-04-03
despite those standards still being supported. This patch adds it
back. As it's in the implementation namespace, there's no need for it
to be conditional, and other such macros aren't conditional in this
header either.
Tested for x86_64 and x86 (testsuite). Note that this *does* change
the installed libraries, because it affects the sysconf support
(present all along) for _SC_2_C_VERSION.
[BZ #438]
* posix/unistd.h (_POSIX2_C_VERSION): New macro.
* conform/Makefile (test-xfail-POSIX/unistd.h/conform): Remove
variable.
pathconf (sysdeps/unix/sysv/linux/pathconf.c) uses basename. But
pathconf is in POSIX back to 1990 while basename is only reserved with
external linkage in those standards including XPG functions. This
patch fixes this namespace issue in the usual way, renaming basename
to __basename and making it into a weak alias.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
[BZ #18444]
* string/basename.c (basename): Rename to __basename and define as
weak alias of __basename. Use libc_hidden_weak.
* include/string.h (__basename): Declare. Use libc_hidden_proto.
* sysdeps/unix/sysv/linux/pathconf.c (distinguish_extX): Call
__basename instead of basename.
* conform/Makefile (test-xfail-POSIX2008/unistd.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/unistd.h/linknamespace): Likewise.
Remove use of ext.nsmap member of struct __res_state and always use
an identity mapping betwen the nsaddr_list array and the ext.nsaddrs
array. The fact that a nameserver has an IPv6 address is signalled by
setting nsaddr_list[].sin_family to zero.
ldbl-96 remquol wrongly handles the case where the first argument is
finite and the second infinite, because the check for the second
argument being a NaN fails to disregard the explicit high mantissa bit
and so wrongly interprets an infinity as being a NaN. This patch
fixes this by masking off that bit, and improves test coverage for
both remainder and remquo (various cases were missing tests, or, as in
the case of the bug, were tested only for one of the two functions).
Tested for x86_64 and x86.
[BZ #18244]
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Ignore explicit
high mantissa bit when testing whether P is a NaN.
* math/libm-test.inc (remainder_test_data): Add more tests.
(remquo_test_data): Likewise.
The i386 implementation of atanhl, for small arguments, does a
calculation that involves computing twice the square of the argument,
resulting in spurious underflows for some arguments. This patch fixes
this by just returning the argument when its exponent is below -32,
with underflow being forced as needed for subnormal arguments.
Tested for x86 and x86_64.
[BZ #18049]
* sysdeps/i386/fpu/e_atanhl.S (__ieee754_atanhl): For exponents
below -32, return the argument, with underflow if subnormal.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
[BZ #17581] The checking chain of unused chunks was terminated by a hash of
the block pointer, which was sometimes confused with the chunk length byte.
We now avoid using a length byte equal to the magic byte.
When the malloc subsystem detects some kind of memory corruption,
depending on the configuration it prints the error, a backtrace, a
memory map and then aborts the process. In this process, the
backtrace() call may result in a call to malloc, resulting in
various kinds of problematic behavior.
In one case, the malloc it calls may detect a corruption and call
backtrace again, and a stack overflow may result due to the infinite
recursion. In another case, the malloc it calls may deadlock on an
arena lock with the malloc (or free, realloc, etc.) that detected the
corruption. In yet another case, if the program is linked with
pthreads, backtrace may do a pthread_once initialization, which
deadlocks on itself.
In all these cases, the program exit is not as intended. This is
avoidable by marking the arena that malloc detected a corruption on,
as unusable. The following patch does that. Features of this patch
are as follows:
- A flag is added to the mstate struct of the arena to indicate if the
arena is corrupt.
- The flag is checked whenever malloc functions try to get a lock on
an arena. If the arena is unusable, a NULL is returned, causing the
malloc to use mmap or try the next arena.
- malloc_printerr sets the corrupt flag on the arena when it detects a
corruption
- free does not concern itself with the flag at all. It is not
important since the backtrace workflow does not need free. A free
in a parallel thread may cause another corruption, but that's not
new
- The flag check and set are not atomic and may race. This is fine
since we don't care about contention during the flag check. We want
to make sure that the malloc call in the backtrace does not trip on
itself and all that action happens in the same thread and not across
threads.
I verified that the test case does not show any regressions due to
this patch. I also ran the malloc benchmarks and found an
insignificant difference in timings (< 2%).
* malloc/Makefile (tests): New test case tst-malloc-backtrace.
* malloc/arena.c (arena_lock): Check if arena is corrupt.
(reused_arena): Find a non-corrupt arena.
(heap_trim): Pass arena to unlink.
* malloc/hooks.c (malloc_check_get_size): Pass arena to
malloc_printerr.
(top_check): Likewise.
(free_check): Likewise.
(realloc_check): Likewise.
* malloc/malloc.c (malloc_printerr): Add arena argument.
(unlink): Likewise.
(munmap_chunk): Adjust.
(ARENA_CORRUPTION_BIT): New macro.
(arena_is_corrupt): Likewise.
(set_arena_corrupt): Likewise.
(sysmalloc): Use mmap if there are no usable arenas.
(_int_malloc): Likewise.
(__libc_malloc): Don't fail if arena_get returns NULL.
(_mid_memalign): Likewise.
(__libc_calloc): Likewise.
(__libc_realloc): Adjust for additional argument to
malloc_printerr.
(_int_free): Likewise.
(malloc_consolidate): Likewise.
(_int_realloc): Likewise.
(_int_memalign): Don't touch corrupt arenas.
* malloc/tst-malloc-backtrace.c: New test case.
Similar to various other bugs in this area, some atanh implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (No change in this regard is needed
for the i386 implementation; special handling to force underflows in
these cases will only be needed there when the spurious underflows,
bug 18049, get fixed.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16352]
* sysdeps/i386/fpu/e_atanh.S (dbl_min): New object.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/i386/fpu/e_atanhf.S (flt_min): New object.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_atanh.c: Include <float.h>.
(__ieee754_atanh): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/flt-32/e_atanhf.c: Include <float.h>.
(__ieee754_atanhf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_atanhl.c: Include <float.h>.
(__ieee754_atanhl): Force underflow exception for results with
small absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from atanh.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of tanf produces spurious underflow
exceptions for some small arguments, through computing values on the
order of x^5. This patch fixes this by adjusting the threshold for
returning x (or, as applicable, +/- 1/x) to 2**-13 (the next term in
the power series being x^3/3).
Tested for x86_64 and x86.
[BZ #18221]
* sysdeps/ieee754/flt-32/k_tanf.c (__kernel_tanf): Use 2**-13 not
2**-28 as threshold for returning x or +/- 1/x.
* math/auto-libm-test-in: Add more tests of tan.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of lgammaf produces spurious underflow
exceptions for some large arguments, because of calculations involving
x^-2 multiplied by small constants. This patch fixes this by
adjusting the threshold for a simpler computation to 2**26 (the error
in the simpler computation is on the order of 0.5 * log (x), for a
result on the order of x * log (x)).
Tested for x86_64 and x86.
[BZ #18220]
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r): Use
2**26 not 2**58 as threshold for returning x * (log (x) - 1).
* math/auto-libm-test-in: Add another test of lgamma.
* math/auto-libm-test-out: Regenerated.
The flt-32 implementation of erfcf produces spurious underflow
exceptions for some arguments close to 0, because of calculations
squaring the argument and then multiplying by small constants. This
patch fixes this by adjusting the threshold for arguments for which
the result is so close to 1 that 1 - x will give the right result from
2**-56 to 2**-26. (If 1 - x * 2/sqrt(pi) were used, the errors would be
on the order of x^3 and a much larger threshold could be used.)
Tested for x86_64 and x86.
[BZ #18217]
* sysdeps/ieee754/flt-32/s_erff.c (__erfcf): Use 2**-26 not 2**-56
as threshold for returning 1 - x.
* math/auto-libm-test-in: Add more tests of erfc.
* math/auto-libm-test-out: Regenerated.
The sysdeps/ieee754/flt-32 version of atanf produces spurious
underflow exceptions for some large arguments, because of computations
that compute x^-4. This patch fixes this by adjusting the threshold
for large arguments (for which +/- pi/2 can just be returned, the
correct result being roughly +/- pi/2 - 1/x) from 2^34 to 2^25.
Tested for x86_64 and x86.
[BZ #18196]
* sysdeps/ieee754/flt-32/s_atanf.c (__atanf): Use 2^25 not 2^34 as
threshold for large arguments.
* math/auto-libm-test-in: Add another test of atan.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some log1p implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes. (The ldbl-128ibm implementation
doesn't currently need any change as it already generates this
exception, albeit through code that would generate spurious exceptions
in other cases; special code for this issue will only be needed there
when fixing the spurious exceptions.)
Tested for x86_64, x86, powerpc and mips64.
[BZ #16339]
* sysdeps/i386/fpu/s_log1p.S (dbl_min): New object.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/s_log1pf.S (flt_min): New object.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/dbl-64/s_log1p.c: Include <float.h>.
(__log1p): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/s_log1pf.c: Include <float.h>.
(__log1pf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_log1pl.c: Include <float.h>.
(__log1pl): Force underflow exception for results with small
absolute value.
* math/auto-libm-test-in: Do not allow missing underflow
exceptions from log1p.
* math/auto-libm-test-out: Regenerated.
Programs are supposed to be able to define the __fpu_control variable,
overriding the library's version to cause the floating-point control
word to be set to the chosen value at startup.
This is broken for mips16 for static linking because the library's
__fpu_control variable is in the same object file as the helper
functions used by fpu_control.h for mips16, so test-fpucw-ieee-static
fails to link with multiple definitions of __fpu_control.
This patch fixes this by putting the helpers in a separate file rather
than overriding fpu_control.c. Tested for mips16 that this fixes the
link failure and the ABI tests still pass.
[BZ #18397]
* sysdeps/mips/mips32/fpu/fpu_control.c: Move to ....
* sysdeps/mips/mips32/fpu/fpucw-helpers.c: ... here. Include
<fpu_control.h> instead of <math/fpu_control.c>.
* sysdeps/mips/mips32/fpu/Makefile: New file.
There appears to be a discrepancy among the implementations
of setcontext with regards to the function called once the last
linked-to context has finished executing via setcontext.
The POSIX standard says:
~~~
If the uc_link member of the ucontext_t structure pointed to by
the ucp argument is equal to 0, then this context is the main
context, and the thread will exit when this context returns.
~~~
It says "exit" not "exit immediately" nor "exit without running
functions registered with atexit or on_exit."
Therefore the AArch64, ARM, hppa and NIOS II implementations are
wrong and no test detects it.
It is questionable if this should even be fixed or just documented
that the above 4 targets are wrong. The functions are deprecated
and nobody should be using them, but at the same time it silly to
have cross-target differences that make it hard to port old
applications from say x86_64 to AArch64.
Therefore I will ix the 4 arches, and checkin a regression
test to prevent it from changing again.
https://sourceware.org/ml/libc-alpha/2015-03/msg00720.html
Robin Hack discovered Samba would enter an infinite loop processing
certain quota-related requests. We eventually tracked this down to a
glibc issue.
Running a (simplified) test case under strace shows that /etc/passwd
is continuously opened and closed:
…
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
read(3, "root❌0:0:root:/root:/bin/bash\n"..., 4096) = 2717
lseek(3, 2717, SEEK_SET) = 2717
close(3) = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
lseek(3, 0, SEEK_SET) = 0
read(3, "root❌0:0:root:/root:/bin/bash\n"..., 4096) = 2717
lseek(3, 2717, SEEK_SET) = 2717
close(3) = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
lseek(3, 0, SEEK_CUR) = 0
…
The lookup function implementation in
nss/nss_files/files-XXX.c:DB_LOOKUP has code to prevent that. It is
supposed skip closing the input file if it was already open.
/* Reset file pointer to beginning or open file. */ \
status = internal_setent (keep_stream); \
\
if (status == NSS_STATUS_SUCCESS) \
{ \
/* Tell getent function that we have repositioned the file pointer. */ \
last_use = getby; \
\
while ((status = internal_getent (result, buffer, buflen, errnop \
H_ERRNO_ARG EXTRA_ARGS_VALUE)) \
== NSS_STATUS_SUCCESS) \
{ break_if_match } \
\
if (! keep_stream) \
internal_endent (); \
} \
keep_stream is initialized from the stayopen flag in internal_setent.
internal_setent is called from the set*ent implementation as:
status = internal_setent (stayopen);
However, for non-host database, this flag is always 0, per the
STAYOPEN magic in nss/getXXent_r.c.
Thus, the fix is this:
- status = internal_setent (stayopen);
+ status = internal_setent (1);
This is not a behavioral change even for the hosts database (where the
application can specify the stayopen flag) because with a call to
sethostent(0), the file handle is still not closed in the
implementation of gethostent.
The implementation of roundl for ldbl-128 involves undefined behavior
for arguments with exponents from 31 to 47 inclusive, from the shift:
u_int64_t i = -1ULL >> (j0 - 48);
For example, on mips64, this means roundl (0xffffffffffff.8p0L)
wrongly returns its argument, which is not an integer. A condition
checking for exponents < 31 should actually be checking for exponents
< 48, and this patch makes it do so. (That condition is for whether
the bit representing 0.5 is in the high 64-bit half of the
floating-point number. The value 31 might have arisen from an
incorrect conversion of the ldbl-96 version to handle ldbl-128.)
This was originally reported as a GCC libquadmath bug
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65757>.
Tested for mips64; also tested for x86_64 and x86 to make sure the new
tests pass there.
[BZ #18346]
* sysdeps/ieee754/ldbl-128/s_roundl.c (__roundl): Handle all
exponents less than 48 as cases where high part of mantissa needs
examining to determine whether argument is integral.
* math/libm-test.inc (round_test_data): Add more tests.
add_temp_file now makes a copy which is freed by delete_temp_files.
Callers to create_temp_file can now free the returned file name to
avoid the memory leak. These changes do not affect the leak behavior
of existing code.
Also address a NULL pointer derefence in tzset after a memoru allocation
failure, found during testing.
This patch adds support to query cache information on s390
via sysconf() function - e.g. with _SC_LEVEL1_ICACHE_SIZE.
The attributes size, linesize and assoc can be queried
for cache level 1 - 4 via "extract cpu attribute" instruction,
which was first available with z10.
* NEWS: Mention sysconf() cache information support for s390.
* sysdeps/unix/sysv/linux/s390/sysconf.c: New File.
[BZ #18206]
* wcsmbs/wcsncmp.c (wcsncmp): Compare as wchar_t, not wint_t.
Use signed comparision instead of substraction to avoid
overflow bug.
* localedata/tests-mbwc/tst_wcsncmp.c (tst_wcsncmp):
Take the sign of ret.
* localedata/tests-mbwc/dat_wcsncmp.c (tst_wcsncmp_loc):
Do not expect precise return values. Only the sign matters.
* wcsmbs/Makefile (strop-tests): Add wcsncmp.
* wcsmbs/test-wcsncmp.c: New File.
* string/test-strncmp.c: Add wcsncmp support.
According to bug 6792, errno is not set to ERANGE/EDOM
by calling log1p/log1pf/log1pl with x = -1 or x < -1.
This patch adds a wrapper which sets errno in those cases
and returns the value of the existing __log1p function.
The log1p is now an alias to the wrapper function
instead of __log1p.
The files in sysdeps are reflecting these changes.
The ia64 implementation sets errno by itself,
thus the wrapper-file is empty.
The libm-test is adjusted for log1p-tests to check errno.
[BZ #6792]
* math/w_log1p.c: New file.
* math/w_log1pf.c: Likewise.
* math/w_log1pl.c: Likewise.
* math/Makefile (libm-calls): Add w_log1p.
* math/s_log1pl.c (log1pl): Remove weak_alias.
* sysdeps/i386/fpu/s_log1p.S (log1p): Likewise.
* sysdeps/i386/fpu/s_log1pf.S (log1pf): Likewise.
* sysdeps/i386/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/x86_64/fpu/s_log1pl.S (log1pl): Likewise.
* sysdeps/ieee754/dbl-64/s_log1p.c (log1p): Likewise.
[NO_LONG_DOUBLE] (log1pl): Likewise.
* sysdeps/ieee754/flt-32/s_log1pf.c (log1pf): Likewise.
* sysdeps/ieee754/ldbl-128/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/s_log1pl.c
(log1p): Remove long_double_symbol.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (log1pl): Likewise.
* sysdeps/ieee754/ldbl-64-128/w_log1pl.c: New file.
* sysdeps/ieee754/ldbl-128ibm/w_log1pl.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1p.c: Define empty weak_alias to
remove weak_alias for corresponding log1p function.
* sysdeps/m68k/m680x0/fpu/s_log1pf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_log1pl.c: Likewise.
* sysdeps/ia64/fpu/w_log1p.c: New file.
* sysdeps/ia64/fpu/w_log1pf.c: Likewise.
* sysdeps/ia64/fpu/w_log1pl.c: Likewise.
* math/libm-test.inc (log1p_test_data): Add errno expectations.
Bug 18247 is an off-by-one error in strtof's determination of a
decimal exponent such that any value with that decimal exponent is at
most half the least subnormal and so the appropriate underflowing
value for the rounding mode can be determined with no
multiple-precision computations. (Whether the value is in fact safe
despite the off-by-one depends on the floating-point format in
question. It's wrong for float and for m68k ldbl-96 but not for other
supported formats.) This patch corrects the computation of the
exponent in question to be safe in general, adding a comment
explaining the new computation.
Tested for x86_64.
[BZ #18247]
* stdlib/strtod_l.c (____STRTOF_INTERNAL): Decrease minimum
decimal exponent by 1.
* stdlib/tst-strtod-round-data: Add more tests.
* stdlib/tst-strtod-round.c (tests): Regenerated.
The dbl-64 implementation of atan2 does computations that expect to
run in round-to-nearest mode, and in other modes the errors can
accumulate to more than the maximum accepted 9ulp. This patch makes
it use FE_TONEAREST internally, similar to other functions with such
issues. Tests that previously produced large errors are added for
atan2 and the closely related carg, clog and clog10 functions.
Tested for x86_64 and x86 and ulps updated accordingly.
[BZ #18210]
[BZ #18211]
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <fenv.h>.
(__ieee754_atan2): Set FE_TONEAREST mode for internal
computations.
* math/auto-libm-test-in: Add more tests of atan2, carg, clog and
clog10.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The dbl-64 implementation of atan does computations that expect to run
in round-to-nearest mode, and in other modes the errors can accumulate
to more than the maximum accepted 9ulp. This patch makes it use
FE_TONEAREST internally, similar to other functions with such issues.
Tested for x86_64 and x86; no ulps updates needed.
[BZ #18197]
* sysdeps/ieee754/dbl-64/s_atan.c: Include <fenv.h>.
(atan): Set FE_TONEAREST mode for internal computations.
* math/auto-libm-test-in: Add more tests of atan.
* math/auto-libm-test-out: Regenerated.
Trimming heaps is a balance between saving memory and the system overhead
required to update page tables and discard allocated pages. The malloc
option M_TRIM_THRESHOLD is a tunable that users are meant to use to decide
where this balance point is but it is only applied to the main arena.
For scalability reasons, glibc malloc has per-thread heaps but these are
shrunk with madvise() if there is one page free at the top of the heap.
In some circumstances this can lead to high system overhead if a thread
has a control flow like
while (data_to_process) {
buf = malloc(large_size);
do_stuff();
free(buf);
}
For a large size, the free() will call madvise (pagetable teardown, page
free and TLB flush) every time followed immediately by a malloc (fault,
kernel page alloc, zeroing and charge accounting). The kernel overhead
can dominate such a workload.
This patch allows the user to tune when madvise gets called by applying
the trim threshold to the per-thread heaps and using similar logic to the
main arena when deciding whether to shrink. Alternatively if the dynamic
brk/mmap threshold gets adjusted then the new values will be obeyed by
the per-thread heaps.
Bug 17195 was a test case motivated by a problem encountered in scientific
applications written in python that performance badly due to high page fault
overhead. The basic operation of such a program was posted by Julian Taylor
https://sourceware.org/ml/libc-alpha/2015-02/msg00373.html
With this patch applied, the overhead is eliminated. All numbers in this
report are in seconds and were recorded by running Julian's program 30
times.
pyarray
glibc madvise
2.21 v2
System min 1.81 ( 0.00%) 0.00 (100.00%)
System mean 1.93 ( 0.00%) 0.02 ( 99.20%)
System stddev 0.06 ( 0.00%) 0.01 ( 88.99%)
System max 2.06 ( 0.00%) 0.03 ( 98.54%)
Elapsed min 3.26 ( 0.00%) 2.37 ( 27.30%)
Elapsed mean 3.39 ( 0.00%) 2.41 ( 28.84%)
Elapsed stddev 0.14 ( 0.00%) 0.02 ( 82.73%)
Elapsed max 4.05 ( 0.00%) 2.47 ( 39.01%)
glibc madvise
2.21 v2
User 141.86 142.28
System 57.94 0.60
Elapsed 102.02 72.66
Note that almost a minutes worth of system time is eliminted and the
program completes 28% faster on average.
To illustrate the problem without python this is a basic test-case for
the worst case scenario where every free is a madvise followed by a an alloc
/* gcc bench-free.c -lpthread -o bench-free */
static int num = 1024;
void __attribute__((noinline,noclone)) dostuff (void *p)
{
}
void *worker (void *data)
{
int i;
for (i = num; i--;)
{
void *m = malloc (48*4096);
dostuff (m);
free (m);
}
return NULL;
}
int main()
{
int i;
pthread_t t;
void *ret;
if (pthread_create (&t, NULL, worker, NULL))
exit (2);
if (pthread_join (t, &ret))
exit (3);
return 0;
}
Before the patch, this resulted in 1024 calls to madvise. With the patch applied,
madvise is called twice because the default trim threshold is high enough to avoid
this.
This a more complex case where there is a mix of frees. It's simply a different worker
function for the test case above
void *worker (void *data)
{
int i;
int j = 0;
void *free_index[num];
for (i = num; i--;)
{
void *m = malloc ((i % 58) *4096);
dostuff (m);
if (i % 2 == 0) {
free (m);
} else {
free_index[j++] = m;
}
}
for (; j >= 0; j--)
{
free(free_index[j]);
}
return NULL;
}
glibc 2.21 calls malloc 90305 times but with the patch applied, it's
called 13438. Increasing the trim threshold will decrease the number of
times it's called with the option of eliminating the overhead.
ebizzy is meant to generate a workload resembling common web application
server workloads. It is threaded with a large working set that at its core
has an allocation, do_stuff, free loop that also hits this case. The primary
metric of the benchmark is records processed per second. This is running on
my desktop which is a single socket machine with an I7-4770 and 8 cores.
Each thread count was run for 30 seconds. It was only run once as the
performance difference is so high that the variation is insignificant.
glibc 2.21 patch
threads 1 10230 44114
threads 2 19153 84925
threads 4 34295 134569
threads 8 51007 183387
Note that the saving happens to be a concidence as the size allocated
by ebizzy was less than the default threshold. If a different number of
chunks were specified then it may also be necessary to tune the threshold
to compensate
This is roughly quadrupling the performance of this benchmark. The difference in
system CPU usage illustrates why.
ebizzy running 1 thread with glibc 2.21
10230 records/s 306904
real 30.00 s
user 7.47 s
sys 22.49 s
22.49 seconds was spent in the kernel for a workload runinng 30 seconds. With the
patch applied
ebizzy running 1 thread with patch applied
44126 records/s 1323792
real 30.00 s
user 29.97 s
sys 0.00 s
system CPU usage was zero with the patch applied. strace shows that glibc
running this workload calls madvise approximately 9000 times a second. With
the patch applied madvise was called twice during the workload (or 0.06
times per second).
2015-02-10 Mel Gorman <mgorman@suse.de>
[BZ #17195]
* malloc/arena.c (free): Apply trim threshold to per-thread heaps
as well as the main arena.
Silvermont and Knights Landing have a modular system design with two cores
sharing an L2 cache. If more than 2 cores are detected to shared L2 cache,
it should be adjusted for Silvermont and Knights Landing.
[BZ #18185]
* sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Limit threads
sharing L2 cache to 2 for Silvermont/Knights Landing.
This patch is glibc support for a PowerPC TLS optimization, inspired
by Alexandre Oliva's TLS optimization for other processors,
http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt
In essence, this optimization uses a zero module id in the tls_index
GOT entry to indicate that a TLS variable is allocated space in the
static TLS area. A special plt call linker stub for __tls_get_addr
checks for such a tls_index and if found, returns the offset
immediately. The linker communicates the fact that the special
__tls_get_addr stub is used by setting a bit in the dynamic tag
DT_PPC64_OPT/DT_PPC_OPT. glibc communicates to the linker that this
optimization is available by the presence of __tls_get_addr_opt.
tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for
tst-tls-dlinfo, which otherwise would fail since it tests that no
static tls is allocated. The ld option --no-tls-get-addr-optimize has
been available since binutils-2.20 so doesn't need a configure test.
* NEWS: Advertise TLS optimization.
* elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define.
(DT_PPC_NUM): Increment.
* elf/dynamic-link.h (HAVE_STATIC_TLS): Define.
(CHECK_STATIC_TLS): Use here.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize
TLS descriptors.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/powerpc/dl-tls.c: New file.
* sysdeps/powerpc/Versions: Add __tls_get_addr_opt.
* sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test.
* sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test.
Build tst-tlsmod2.so with --no-tls-get-addr-optimize.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
sem_timedwait converts absolute timeouts to relative to pass them to
the futex syscall. (Before the recent reimplementation, on x86_64 it
used FUTEX_CLOCK_REALTIME, but not on other architectures.)
Correctly implementing POSIX requirements, however, requires use of
FUTEX_CLOCK_REALTIME; passing a relative timeout to the kernel does
not conform to POSIX. The POSIX specification for sem_timedwait says
"The timeout shall be based on the CLOCK_REALTIME clock.". The POSIX
specification for clock_settime says "If the value of the
CLOCK_REALTIME clock is set via clock_settime(), the new value of the
clock shall be used to determine the time of expiration for absolute
time services based upon the CLOCK_REALTIME clock. This applies to the
time at which armed absolute timers expire. If the absolute time
requested at the invocation of such a time service is before the new
value of the clock, the time service shall expire immediately as if
the clock had reached the requested time normally.". If a relative
timeout is passed to the kernel, it is interpreted according to the
CLOCK_MONOTONIC clock, and so fails to meet that POSIX requirement in
the event of clock changes.
This patch makes sem_timedwait use lll_futex_timed_wait_bitset with
FUTEX_CLOCK_REALTIME when possible, as done in some other places in
NPTL. FUTEX_CLOCK_REALTIME is always available for supported Linux
kernel versions; unavailability of lll_futex_timed_wait_bitset is only
an issue for hppa (an issue noted in
<https://sourceware.org/glibc/wiki/PortStatus>, and fixed by the
unreviewed
<https://sourceware.org/ml/libc-alpha/2014-12/msg00655.html> that
removes the hppa lowlevellock.h completely).
In the FUTEX_CLOCK_REALTIME case, the glibc code still needs to check
for negative tv_sec and handle that as timeout, because the Linux
kernel returns EINVAL not ETIMEDOUT for that case, so resulting in
failures of nptl/tst-abstime and nptl/tst-sem13 in the absence of that
check. If we're trying to distinguish between Linux-specific and
generic-futex NPTL code, I suppose having this in an nptl/ file isn't
ideal, but there doesn't seem to be any better place at present.
It's not possible to add a testcase for this issue to the testsuite
because of the requirement to change the system clock as part of a
test (this is a case where testing would require some form of
container, with root in that container, and one whose CLOCK_REALTIME
is isolated from that of the host; I'm not sure what forms of
containers, short of a full virtual machine, provide that clock
isolation).
Tested for x86_64. Also tested for powerpc with the testcase included
in the bug.
[BZ #18138]
* nptl/sem_waitcommon.c: Include <kernel-features.h>.
(futex_abstimed_wait)
[__ASSUME_FUTEX_CLOCK_REALTIME && lll_futex_timed_wait_bitset]:
Use lll_futex_timed_wait_bitset with FUTEX_CLOCK_REALTIME instead
of lll_futex_timed_wait.
If xports is NULL in xprt_register we malloc it but if sock >
_rpc_dtablesize() that memory does not get initialised and may in theory
contain any value. Later we make a conditional jump in svc_getreq_common
based on the uninitialised memory and this caused a general protection
fault in rpc.statd on an older version of glibc but this code has not
changed since that version.
Following is the valgrind warning.
==26802== Conditional jump or move depends on uninitialised value(s)
==26802== at 0x5343A25: svc_getreq_common (in /lib64/libc-2.5.so)
==26802== by 0x534357B: svc_getreqset (in /lib64/libc-2.5.so)
==26802== by 0x10DE1F: ??? (in /sbin/rpc.statd)
==26802== by 0x10D0EF: main (in /sbin/rpc.statd)
==26802== Uninitialised value was created by a heap allocation
==26802== at 0x4C2210C: malloc (vg_replace_malloc.c:195)
==26802== by 0x53438BE: xprt_register (in /lib64/libc-2.5.so)
==26802== by 0x53450DF: svcudp_bufcreate (in /lib64/libc-2.5.so)
==26802== by 0x10FE32: ??? (in /sbin/rpc.statd)
==26802== by 0x10D13E: main (in /sbin/rpc.statd)
for ChangeLog
[BZ #17090]
[BZ #17620]
[BZ #17621]
[BZ #17628]
* NEWS: Update.
* elf/dl-tls.c (_dl_update_slotinfo): Clean up outdated DTV
entries with Static TLS too. Skip entries past the end of the
allocated DTV, from Alan Modra.
(tls_get_addr_tail): Update to glibc_likely/unlikely. Move
Static TLS DTV entry set up from...
(_dl_allocate_tls_init): ... here (fix modid assertion), ...
* elf/dl-reloc.c (_dl_nothread_init_static_tls): ... here...
* nptl/allocatestack.c (init_one_static_tls): ... and here...
* elf/dlopen.c (dl_open_worker): Drop l_tls_modid upper bound
for Static TLS.
* elf/tlsdeschtab.h (map_generation): Return size_t. Check
that the slot we find is associated with the given map before
using its generation count.
* nptl_db/db_info.c: Include ldsodefs.h.
(rtld_global, dtv_slotinfo_list, dtv_slotinfo): New typedefs.
* nptl_db/structs.def (DB_RTLD_VARIABLE): New macro.
(DB_MAIN_VARIABLE, DB_RTLD_GLOBAL_FIELD): Likewise.
(link_map::l_tls_offset): New struct field.
(dtv_t::counter): Likewise.
(rtld_global): New struct.
(_rtld_global): New rtld variable.
(dl_tls_dtv_slotinfo_list): New rtld global field.
(dtv_slotinfo_list): New struct.
(dtv_slotinfo): Likewise.
* nptl_db/td_symbol_list.c: Drop gnu/lib-names.h include.
(td_lookup): Rename to...
(td_mod_lookup): ... this. Use new mod parameter instead of
LIBPTHREAD_SO.
* nptl_db/td_thr_tlsbase.c: Include link.h.
(dtv_slotinfo_list, dtv_slotinfo): New functions.
(td_thr_tlsbase): Check DTV generation. Compute Static TLS
addresses even if the DTV is out of date or missing them.
* nptl_db/fetch-value.c (_td_locate_field): Do not refuse to
index zero-length arrays.
* nptl_db/thread_dbP.h: Include gnu/lib-names.h.
(td_lookup): Make it a macro implemented in terms of...
(td_mod_lookup): ... this declaration.
* nptl_db/db-symbols.awk (DB_RTLD_VARIABLE): Override.
(DB_MAIN_VARIABLE): Likewise.
In bug 14906 the user complains that the inotify support in nscd
is not sufficient when it comes to detecting changes in the
configurationfiles that should be watched for the various databases.
The current nscd implementation uses inotify to watch for changes in
the configuration files, but adds watches only for IN_DELETE_SELF and
IN_MODIFY. These watches are insufficient to cover even the most basic
uses by a system administrator. For example using emacs or vim to edit
a configuration file should trigger a reload but it might not if
the editors use move to atomically update the file. This atomic update
changes the inode and thus removes the notification on the file (as
inotify is based on inodes). Thus the inotify support in nscd for
configuration files is insufficient to account for the average use
cases of system administrators and users.
The inotify support is significantly enhanced and described here:
https://www.sourceware.org/ml/libc-alpha/2015-02/msg00504.html
Tested on x86_64 with and without inotify support.
ldconfig is using an aux-cache to speed up the ld.so.cache update. It
is read by mmaping the file to a structure which contains data offsets
used as pointers. As they are not checked, it is not hard to get
ldconfig to segfault with a corrupted file. This happens for instance if
the file is truncated, which is common following a filesystem check
following a system crash.
This can be reproduced for example by truncating the file to roughly
half of it's size.
There is already some code in elf/cache.c (load_aux_cache) to check
for a corrupted aux cache, but it happens to be broken and not enough.
The test (aux_cache->nlibs >= aux_cache_size) compares the number of
libs entry with the cache size. It's a non sense, as it basically
assumes that each library entry is a 1 byte... Instead this commit
computes the theoretical cache size using the headers and compares it
to the real size.
The function feupdateenv has been fixed to correctly handle FE_DFL_ENV
and FE_NOMASK_ENV.
The fesetexceptflag function has been fixed to correctly handle setting
the new flags instead of just OR-ing the existing flags.
This fixes the test-fenv-return and test-fenvinline failures on hppa.
The constraints in the inline assembly in feholdexcept and fesetenv
are incorrect. The assembly modifies the buffer pointer, but doesn't
express that in the constraints. The simple fix is to remove the
modification of the buffer pointer which is no longer required by
the existing code, and adjust the one constraint that did express
the modification of bufptr.
The change fixes test-fenv when glibc is compiled with recent gcc.
This patch fixes the inline feraiseexcept and feclearexcept macros for
powerpc by casting the input argument to integer before operation on it.
It fixes BZ#17776.
Since 2014-11-24 binutils git commit bb4d2ac2, readelf has appended
the symbol version to symbols shown in reloc dumps.
[BZ #16512]
* scripts/localplt.awk: Strip off symbol version.
* NEWS: Mention bug fix.
__ASSUME_PRLIMIT64 is defined in kernel-features.h for kernels 2.6.36
and later, but hppa, microblaze and sh did not add the prlimit64
syscall until 2.6.37. This patch adds corresponding undefines of
__ASSUME_PRLIMIT64 to those architectures' kernel-features.h files.
(This concludes the kernel-features.h fixes arising out of the review
- limited to macros defined in the architecture-independent
kernel-features.h file - I did in connection with the move to 2.6.32
minimum kernel version. For that subset of macros - I didn't check
any purely architecture-specific macros - I think they are now defined
for the correct kernel versions on each architecture after this
patch.)
[BZ #17779]
* sysdeps/unix/sysv/linux/hppa/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Undefine.
* sysdeps/unix/sysv/linux/microblaze/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
* sysdeps/unix/sysv/linux/sh/kernel-features.h
[__LINUX_KERNEL_VERSION < 0x020625] (__ASSUME_PRLIMIT64):
Likewise.
Protocted symbol in shared library can only be accessed from PIE
or shared library. Linker in binutils 2.26 enforces it. We must
compile vismain with -fPIE and link it with -pie.
[BZ #17711]
* elf/Makefile (tests): Add vismain only if PIE is enabled.
(tests-pie): Add vismain.
(CFLAGS-vismain.c): New.
* elf/vismain.c: Add comments for PIE requirement.
The threshold in ldbl-96 atanhl for when to return the argument,
0x1p-28, is a bit too big, and that in ldbl-128ibm atanhl is much too
big (the relevant condition being x^3/3 being < 0.5ulp of x),
resulting in errors a bit above the limits of those considered
acceptable in glibc in the ldbl-96 case, and in large errors in the
ldbl-128ibm case. This patch changes those implementations to use
more appropriate thresholds and adds tests around the thresholds for
various formats.
Tested for x86_64, x86 and powerpc. x86_64 and x86 ulps updated
accordingly.
[BZ #18046]
[BZ #18047]
* sysdeps/ieee754/ldbl-128ibm/e_atanhl.c (__ieee754_atanhl): Use
0x1p-56L as threshold for just returning the argument.
* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Use
0x1p-32L as threshold for just returning the argument.
* math/auto-libm-test-in: Add more tests of atanh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulp: Likewise.
The ldbl-128 and ldbl-128ibm implementations of acosl have similar
bugs, using a threshold of 0x1p-57L to determine when they just return
pi/2. Since the result pi/2 - asinl (x) is roughly pi/2 - x for small
x, the relevant cut-off is actually x being < 0.5ulp of 1. This patch
fixes the implementations to use that cut-off and adds tests of small
acos arguments.
Tested for powerpc and mips64. Also tested for x86_64 and x86; no
ulps updates needed.
[BZ #18038]
[BZ #18039]
* sysdeps/ieee754/ldbl-128/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-113L.
* sysdeps/ieee754/ldbl-128ibm/e_acosl.c (__ieee754_acosl): Only
return pi/2 for arguments below 0x1p-106L.
* math/auto-libm-test-in: Add more tests of acos.
* math/auto-libm-test-out: Regenerated.
Similar to various other bugs in this area, some asin implementations
do not raise the underflow exception for subnormal arguments, when the
result is tiny and inexact. This patch forces the exception in a
similar way to previous fixes.
Tested for x86_64, x86, powerpc and mips64.
[BZ #16351]
* sysdeps/i386/fpu/e_asin.S (dbl_min): New object.
(MO): New macro.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/i386/fpu/e_asinf.S (flt_min): New object.
(MO): New macro.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/e_asin.c: Include <float.h> and <math.h>.
(__ieee754_asin): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/flt-32/e_asinf.c: Include <float.h>.
(__ieee754_asinf): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-128ibm/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/ldbl-96/e_asinl.c: Include <float.h>.
(__ieee754_asinl): Force underflow exception for results with
small absolute value.
* sysdeps/x86_64/fpu/multiarch/e_asin.c [HAVE_FMA4_SUPPORT]:
Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 16351.
* math/auto-libm-test-out: Regenerated.
The ldbl-128ibm implementation of logbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, logbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and (fixed) bug 18029 for
ilogbl.) This patch adds checks for that case.
Tested for powerpc.
[BZ #18030]
* sysdeps/ieee754/ldbl-128ibm/s_logbl.c (__logbl): Adjust exponent
of power of 2 down when low part has opposite sign.
* math/libm-test.inc (logb_test_data): Add more tests.
The ldbl-128ibm implementation of ilogbl produces incorrect results
when the high part of the argument is a power of 2 and the low part a
nonzero number with the opposite sign (and so the returned exponent
should be 1 less than that of the high part). For example, ilogbl
(0x1.ffffffffffffffp1L) returns 2 but should return 1. (This is
similar to (fixed) bug 16740 for frexpl, and bug 18030 for logbl.)
This patch adds checks for that case.
Tested for powerpc.
[BZ #18029]
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl):
Adjust exponent of power of 2 down when low part has opposite
sign.
* math/libm-test.inc (ilogb_test_data): Add more tests.
If a locale alias is defined in locale.alias but not in an archive,
and the referenced locale is only present in the archive, setlocale
will fail if given the alias name. This is unintuitive. This patch
fixes it, arranging for the locale archive to be searched again after
alias expansion.
for ChangeLog
[BZ #15969]
* locale/findlocale.c (_nl_find_locale): Retry archive search
after alias expansion.
The ldbl-128ibm implementation of asinhl uses cut-offs of 0x1p28 and
0x1p-29 to determine when to use simpler formulas that avoid possible
overflow / underflow. Both those cut-offs are inappropriate for this
format, resulting in large errors. This patch changes the code to use
more appropriate cut-offs of 0x1p56 and 0x1p-56, adding tests around
the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18020]
* sysdeps/ieee754/ldbl-128ibm/s_asinhl.c (__asinhl): Use 2**56 and
2**-56 not 2**28 and 2**-29 as thresholds for simpler formulas.
* math/auto-libm-test-in: Add more tests of asinh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
The ldbl-128ibm implementation of acoshl uses a cut-off of 0x1p28 to
determine when to use log(x) + log(2) as a formula. That cut-off is
too small for this format, resulting in large errors. This patch
changes it to a more appropriate cut-off of 0x1p56, adding tests
around the cut-offs for various floating-point formats.
Tested for powerpc. Also tested for x86_64 and x86 and updated ulps.
[BZ #18019]
* sysdeps/ieee754/ldbl-128ibm/e_acoshl.c (__ieee754_acoshl): Use
2**56 not 2**28 as threshold for log (2x) formula.
* math/auto-libm-test-in: Add more tests of acosh.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
Various x86 / x86_64 versions of scalb / scalbf / scalbl produce
spurious "invalid" exceptions for (qNaN, -Inf) arguments, because this
is wrongly handled like (+/-Inf, -Inf) which *should* raise such an
exception. (In fact the NaN case of the code determining whether to
quietly return a zero or a NaN for second argument -Inf was
accidentally dead since the code had been made to return a NaN with
exception.) This patch fixes the code to do the proper test for an
infinity as distinct from a NaN.
(Since the existing code does nothing to distinguish qNaNs and sNaNs
here, this patch doesn't either. If in future we systematically
implement proper sNaN semantics following TS 18661-1:2014, there will
be lots of bugs to address - Thomas found lots of issues with his
patch <https://sourceware.org/ml/libc-ports/2013-04/msg00008.html> to
add SNaN tests (which never went in and would now require significant
reworking).)
Tested for x86_64 and x86. Committed.
[BZ #16783]
* sysdeps/i386/fpu/e_scalb.S (__ieee754_scalb): Do not handle
arguments (NaN, -Inf) the same as (+/-Inf, -Inf).
* sysdeps/i386/fpu/e_scalbf.S (__ieee754_scalbf): Likewise.
* sysdeps/i386/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* sysdeps/x86_64/fpu/e_scalbl.S (__ieee754_scalbl): Likewise.
* math/libm-test.inc (scalb_test_data): Add more tests.
Both open and openat load their last argument 'mode' lazily, using
va_arg() only if O_CREAT is found in oflag. This is wrong, mode is also
necessary if O_TMPFILE is in oflag.
By chance on x86_64, the problem wasn't evident when using O_TMPFILE
with open, as the 3rd argument of open, even when not loaded with
va_arg, is left untouched in RDX, where the syscall expects it.
However, openat was not so lucky, and O_TMPFILE couldn't be used: mode
is the 4th argument, in RCX, but the syscall expects its 4th argument in
a different register than the glibc wrapper, in R10.
Introduce a macro __OPEN_NEEDS_MODE (oflag) to test if either O_CREAT or
O_TMPFILE is set in oflag.
Tested on Linux x86_64.
[BZ #17523]
* io/fcntl.h (__OPEN_NEEDS_MODE): New macro.
* io/bits/fcntl2.h (open): Use it.
(openat): Likewise.
* io/open.c (__libc_open): Likewise.
* io/open64.c (__libc_open64): Likewise.
* io/open64_2.c (__open64_2): Likewise.
* io/open_2.c (__open_2): Likewise.
* io/openat.c (__openat): Likewise.
* io/openat64.c (__openat64): Likewise.
* io/openat64_2.c (__openat64_2): Likewise.
* io/openat_2.c (__openat_2): Likewise.
* sysdeps/mach/hurd/open.c (__libc_open): Likewise.
* sysdeps/mach/hurd/openat.c (__openat): Likewise.
* sysdeps/posix/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/dl-openat64.c (openat64): Likewise.
* sysdeps/unix/sysv/linux/generic/open.c (__libc_open): Likewise.
(__open_nocancel): Likewise.
* sysdeps/unix/sysv/linux/generic/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/open64.c (__libc_open64): Likewise.
* sysdeps/unix/sysv/linux/openat.c (__OPENAT): Likewise.
DNSSEC defines a number of response types that one me expect when the
DO bit is set. We don't process any of them, but since we do allow
setting the DO bit, skip them without logging an error since it is
only a nuisance.
Tested on x86_64.
[BZ #14841]
* resolv/gethnamaddr.c (getanswer): Skip logging if
RES_USE_DNSSEC is set.
* resolv/nss_dns/dns-host.c (getanswer_r): Likewise.
We compile gcrt1.o with -fPIC to support both "gcc -pg" and "gcc -pie -pg".
[BZ #17836]
* csu/Makefile (extra-objs): Add gmon-start.o if not builing
shared library. Add gmon-start.os otherwise.
($(objpfx)g$(start-installed-name)): Use $(objpfx)S%
$(objpfx)gmon-start.os if builing shared library.
($(objpfx)g$(static-start-installed-name)): Likewise.
for localedata/ChangeLog
[BZ #17588]
[BZ #13064]
[BZ #14094]
[BZ #17998]
* unicode-gen/Makefile: New.
* unicode-gen/unicode-license.txt: New, from Unicode.
* unicode-gen/UnicodeData.txt: New, from Unicode.
* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
* unicode-gen/EastAsianWidth.txt: New, from Unicode.
* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
FABIAN <mfabian@redhat.com>.
* unicode-gen/ctype_compatibility.py: New verifier, from
Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
* unicode-gen/ctype_compatibility_test_cases.py: New verifier
module, from Mike FABIAN.
* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
and Mike FABIAN.
* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
Satpute and Mike FABIAN.
* charmaps/UTF-8: Update.
* locales/i18n: Update.
* gen-unicode-ctype.c: Remove.
* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
true for ordinal indicators.
The POSIX function scandir calls scandirat, which is not a POSIX
function. This patch fixes this by making it use __scandirat and
making scandirat a weak alias. There are no changes for scandir64 /
scandirat64 because those are both _GNU_SOURCE-only functions so no
namespace issue arises for them.
Tested for x86_64 that the disassembly of installed shared libraries
is unchanged by this patch.
[BZ #17999]
* dirent/scandir.c [!SCANDIR] (SCANDIRAT): Define to __scandirat
instead of scandirat.
* dirent/scandirat.c [!SCANDIRAT] (SCANDIRAT): Likewise.
[!SCANDIRAT] (SCANDIRAT_WEAK_ALIAS): Define.
[SCANDIRAT_WEAK_ALIAS] (scandirat): Define as weak alias of
__scandirat.
* include/dirent.h (scandirat): Do not use libc_hidden_proto.
(__scandirat): Declare. Use libc_hidden_proto.
* conform/Makefile (test-xfail-POSIX2008/dirent.h/linknamespace):
Remove variable.
(test-xfail-XOPEN2K8/dirent.h/linknamespace): Likewise.
This patch fixes bug 15319, missing underflows from atan / atan2 when
the result of atan is very close to its small argument (or that of
atan2 is very close to the ratio of its arguments, which may be an
exact division).
The usual approach of doing an underflowing computation if the
computed result is subnormal is followed. For 32-bit x86, there are
extra complications: the inline __ieee754_atan2 in bits/mathinline.h
needs to be disabled for float and double because other libm functions
using it generally rely on getting proper underflow exceptions from
it, while the out-of-line functions have to remove excess range and
precision from the underflowing result so as to return an exact 0 in
the case where errno should be set for underflow to 0. (The failures
I saw without that are similar to those Carlos reported for other
functions, where I haven't seen a response to
<https://sourceware.org/ml/libc-alpha/2015-01/msg00485.html>
confirming if my diagnosis is correct. Arguably all libm functions
with float and double returns should remove excess range and
precision, but that's a separate matter.)
The x86_64 long double case reported in a comment in bug 15319 is not
a bug (it's an argument of LDBL_MIN, and x86_64 is an after-rounding
architecture so the correct IEEE result is not to raise underflow in
the given rounding mode, in addition to treating the result as an
exact LDBL_MIN being within the newly clarified documentation of
accuracy goals). I'm presuming that the fpatan instruction can be
trusted to raise appropriate exceptions when the (long double) result
underflows (after rounding) and so no changes are needed for x86 /
x86_64 long double functions here; empirically this is the case for
the cases covered in the testsuite, on my system.
Tested for x86_64, x86, powerpc and mips64. Only 32-bit x86 needs
ulps updates (for the changes to inlines meaning some functions no
longer get excess precision from their __ieee754_atan2* calls).
[BZ #15319]
* sysdeps/i386/fpu/e_atan2.S (dbl_min): New object.
(MO): New macro.
(__ieee754_atan2): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/e_atan2f.S (flt_min): New object.
(MO): New macro.
(__ieee754_atan2f): For results with small absolute value, force
underflow exception and remove excess range and precision from
return value.
* sysdeps/i386/fpu/s_atan.S (dbl_min): New object.
(MO): New macro.
(__atan): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/i386/fpu/s_atanf.S (flt_min): New object.
(MO): New macro.
(__atanf): For results with small absolute value, force underflow
exception and remove excess range and precision from return value.
* sysdeps/ieee754/dbl-64/e_atan2.c: Include <float.h> and
<math.h>.
(__ieee754_atan2): Force underflow exception for results with
small absolute value.
* sysdeps/ieee754/dbl-64/s_atan.c: Include <float.h> and
<math_private.h>.
(atan): Force underflow exception for results with small absolute
value.
* sysdeps/ieee754/flt-32/s_atanf.c: Include <float.h>.
(__atanf): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128/s_atanl.c: Include <float.h> and
<math.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/ieee754/ldbl-128ibm/s_atanl.c: Include <float.h>.
(__atanl): Force underflow exception for results with small
absolute value.
* sysdeps/x86/fpu/bits/mathinline.h
[!__SSE2_MATH__ && !__x86_64__ && __LIBC_INTERNAL_MATH_INLINES]
(__ieee754_atan2): Only define inline for long double.
* sysdeps/x86_64/fpu/multiarch/e_atan2.c
[HAVE_FMA4_SUPPORT || HAVE_AVX_SUPPORT]: Include <math.h>.
* math/auto-libm-test-in: Do not mark underflow exceptions as
possibly missing for bug 15319. Add more tests of atan2.
* math/auto-libm-test-out: Regenerated.
* math/libm-test.inc (casin_test_data): Do not mark underflow
exceptions as possibly missing for bug 15319.
(casinh_test_data): Likewise.
* sysdeps/i386/fpu/libm-test-ulps: Update.
The implementation of the (XSI POSIX) functions hsearch / hcreate /
hdestroy uses hsearch_r / hcreate_r / hdestroy_r, which are not POSIX
functions. This patch makes those into weak aliases for __*_r and
uses those names for the calls within libc.
Tested for x86_64 that the disassembly of installed shared libraries
is unchanged by this patch.
[BZ #17996]
* include/search.h (hcreate_r): Don't use libc_hidden_proto.
(hdestroy_r): Likewise.
(hsearch_r): Likewise.
(__hcreate_r): Declare and use libc_hidden_proto.
(__hdestroy_r): Likewise.
(__hsearch_r): Likewise.
* misc/hsearch.c (hsearch): Call __hsearch_r instead of hsearch_r.
(hcreate): Call __hcreate_r instead of hcreate_r.
(__hdestroy): Call __hdestroy_r instead of hdestroy_r.
* misc/hsearch_r.c (hcreate_r): Rename to __hcreate_r and define
as weak alias of __hcreate_r.
(hdestroy_r): Rename to __hdestroy_r and define as weak alias of
__hdestroy_r.
(hsearch_r): Rename to __hsearch_r and define as weak alias of
__hsearch_r.
* conform/Makefile (test-xfail-XPG3/search.h/linknamespace):
Remove variable.
(test-xfail-XPG4/search.h/linknamespace): Likewise.
(test-xfail-UNIX98/search.h/linknamespace): Likewise.
(test-xfail-XOPEN2K/search.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/search.h/linknamespace): Likewise.
posix_spawn (a standard POSIX function) brings in a use of getrlimit64
(not a standard POSIX function). This patch fixes this by using
__getrlimit64 and making getrlimit64 a weak alias.
This is more complicated than some such changes because of files that
define getrlimit64 in their own way using symbol versioning after
including the main sysdeps/unix/sysv/linux/getrlimit64.c with a
getrlimit macro defined. There are various existing patterns for such
cases in glibc; the one I've used here is that a getrlimit64 macro
disables the weak_alias / libc_hidden_weak calls, leaving it to the
including file to define the getrlimit64 name in whatever way is
appropriate.
Tested for x86_64 and x86 that installed stripped shared libraries are
unchanged by this patch.
[BZ #17991]
* include/sys/resource.h (__getrlimit64): Declare. Use
libc_hidden_proto.
* resource/getrlimit64.c (getrlimit64): Rename to __getrlimit64
and define as weak alias of __getrlimit64. Use libc_hidden_weak.
* sysdeps/posix/spawni.c (__spawni): Call __getrlimit64 instead of
getrlimit64.
* sysdeps/unix/sysv/linux/getrlimit64.c (getrlimit64): Rename to
__getrlimit64.
[!getrlimit64] (getrlimit64): Define as weak alias of
__getrlimit64. Use libc_hidden_weak.
* sysdeps/unix/sysv/linux/i386/getrlimit64.c (getrlimit64): Define
using __getrlimit64 not __new_getrlimit64.
(__GI_getrlimit64): Likewise.
* sysdeps/unix/sysv/linux/mips/getrlimit64.c (getrlimit64):
Likewise.
(__GI_getrlimit64): Likewise.
(__old_getrlimit64): Use __getrlimit64 not __new_getrlimit64.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/syscalls.list
(getrlimit): Add __getrlimit64 alias.
* sysdeps/unix/sysv/linux/wordsize-64/syscalls.list (getrlimit):
Likewise.
* conform/Makefile (test-xfail-XOPEN2K/spawn.h/linknamespace):
Remove variable.
(test-xfail-POSIX2008/spawn.h/linknamespace): Likewise.
(test-xfail-XOPEN2K8/spawn.h/linknamespace): Likewise.
Various remquo implementations produce a zero remainder with the wrong
sign (a zero remainder should always have the sign of the first
argument, as specified in IEEE 754) in round-downward mode, resulting
from the sign of 0 - 0. This patch checks for zero results and fixes
their sign accordingly.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17987]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Ensure sign of
zero result does not depend on the sign resulting from
subtraction.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Various remquo implementations, when computing the last three bits of
the quotient, have spurious overflows when 4 times the second argument
to remquo overflows. These overflows can in turn cause bad results in
rounding modes where that overflow results in a finite value. This
patch adds tests to avoid the problem multiplications in cases where
they would overflow, similar to those that control an earlier
multiplication by 8.
Tested for x86_64, x86, mips64 and powerpc.
[BZ #17978]
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Do not form
products 4 * y and 2 * y where those would overflow.
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* math/libm-test.inc (remquo_test_data): Add more tests.
Remove IA64 PAGE_SIZE related macros as PAGE_SIZE is not defined.
Also remove macros that are only used for BFD's trad-core support
which is not relavant for IA64 according to the thread starting
here:
https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but is equivalent to a MIPS
patch for the same fix.
The dbl-64/wordsize-64 remquo implementation follows similar logic to
various other implementations, but where that logic computes some
absolute values, it wrongly uses a previously computed bit-pattern for
the absolute value of the first argument, where actually it needs the
absolute value of the first argument mod 8 times the second. This
patch fixes it to compute the correct absolute value.
The integer quotient result of remquo is only specified mod 8
(including its sign); architecture-specific versions may well vary in
what results they give for higher bits of that result (and indeed bug
17569 gives an example correct result from __builtin_remquo giving 9
for that result, where the particular glibc implementation used in
that bug report would give 1 after this fix). Thus, this patch adapts
the tests of remquo to test that result only mod 8, to allow for such
variation when tests with higher quotient are included.
Tested for x86_64 and x86.
[BZ #17569]
* sysdeps/ieee754/dbl-64/wordsize-64/s_remquo.c (__remquo):
Compute absolute value of x as modified by fmod, not original
value of x.
* math/libm-test.inc (RUN_TEST_ffI_f1): Rename to
RUN_TEST_ffI_f1_mod8. Check extra return value mod 8.
(RUN_TEST_LOOP_ffI_f1): Rename to RUN_TEST_LOOP_ffI_f1_mod8. Call
RUN_TEST_ffI_f1_mod8.
(remquo_test_data): Add more tests.
Similarly to sqrt in
<https://sourceware.org/ml/libc-alpha/2015-02/msg00353.html>, the
powerpc sqrtf implementation for when _ARCH_PPCSQ is not defined also
relies on a * b + c being contracted into a fused multiply-add.
Although this contraction is not explicitly disabled for e_sqrtf.c, it
still seems appropriate to make the file explicit about its
requirements by using __builtin_fmaf; this patch does so.
Furthermore, it turns out that doing so fixes the observed inaccuracy
and missing exceptions (that is, that without explicit __builtin_fmaf
usage, it was not being compiled as intended).
Tested for powerpc32 (hard float).
[BZ #17967]
* sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Use
__builtin_fmaf instead of relying on contraction of a * b + c.
As Adhemerval noted in
<https://sourceware.org/ml/libc-alpha/2015-01/msg00451.html>, the
powerpc sqrt implementation for when _ARCH_PPCSQ is not defined is
inaccurate in some cases.
The problem is that this code relies on fused multiply-add, and relies
on the compiler contracting a * b + c to get a fused operation. But
sysdeps/ieee754/dbl-64/Makefile disables contraction for e_sqrt.c,
because the implementation in that directory relies on *not* having
contracted operations.
While it would be possible to arrange makefiles so that an earlier
sysdeps directory can disable the setting in
sysdeps/ieee754/dbl-64/Makefile, it seems a lot cleaner to make the
dependence on fused operations explicit in the .c file. GCC 4.6
introduced support for __builtin_fma on powerpc and other
architectures with such instructions, so we can rely on that; this
patch duly makes the code use __builtin_fma for all such fused
operations.
Tested for powerpc32 (hard float).
2015-02-12 Joseph Myers <joseph@codesourcery.com>
[BZ #17964]
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
__builtin_fma instead of relying on contraction of a * b + c.
The tv_sec is of type time_t in both struct timeval and struct timespec.
This matches the implementation and also the relevant standard (checked
C11 for timespec and opengroup for timeval).
This patch fixes the remaining part of bug 16560, spurious underflows
from exp2 of arguments close to 0 (when the result is close to 1, so
should not underflow), by just using 1+x instead of a more complicated
calculation when the argument is sufficiently small.
Tested for x86_64, x86 and mips64.
[BZ #16560]
* math/e_exp2l.c [LDBL_MANT_DIG == 106] (LDBL_EPSILON): Undefine
and redefine.
(__ieee754_exp2l): Do not multiply small fractional parts by
M_LN2l.
* sysdeps/i386/fpu/e_exp2l.S (__ieee754_exp2l): Just add 1 to
small argument.
* sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c (__ieee754_exp2f): Likewise.
* sysdeps/x86_64/fpu/e_exp2l.S (__ieee754_exp2l): Likewise.
* math/auto-libm-test-in: Add more tests of exp2.
* math/auto-libm-test-out: Regenerated.
pthread_mutexattr_settype adds PTHREAD_MUTEX_NO_ELISION_NP to kind,
which is an internal flag that pthread_mutexattr_gettype shouldn't
expose, since pthread_mutexattr_settype wouldn't accept it.
This patch makes sincos set errno to EDOM when passed an infinity,
similarly to sin and cos.
Tested for x86_64, x86, powerpc and mips64. I don't know if the
architecture-specific implementations for ia64 and m68k might need
corresponding fixes.
2015-02-11 Joseph Myers <joseph@codesourcery.com>
[BZ #15467]
* sysdeps/ieee754/dbl-64/s_sincos.c: Include <errno.h>.
(__sincos): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/flt-32/s_sincosf.c: Include <errno.h>.
(SINCOSF_FUNC): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-128ibm/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* sysdeps/ieee754/ldbl-96/s_sincosl.c: Include <errno.h>.
(__sincosl): Set errno to EDOM for infinite argument.
* math/libm-test.inc (sincos_test_data): Test errno setting.
soft-fp's _FP_FMA fails to set the result's exponent for cases where
the result of the multiplication is 0, yielding incorrect (arbitrary,
depending on uninitialized values) results for those cases. This
affects libm for architectures using soft-fp to implement fma. This
patch adds the exponent setting and tests for this case.
Tested for ARM soft-float (which uses soft-fp fma), x86_64 and x86 (to
verify not introducing new libm test failures there).
(This bug showed up in testing my patch to move the Linux kernel to
current soft-fp. math/Makefile has "override CFLAGS +=
-Wno-uninitialized" which would have stopped compiler warnings from
showing up this problem, although I wouldn't be surprised if removing
that shows spurious warnings from this code, if the compiler fails to
follow that various cases where the exponent is uninitialized don't
need it initialized because the class is set to a value meaning the
uninitialized exponent isn't used.)
[BZ #17932]
* soft-fp/op-common.h (_FP_FMA): Set exponent of result in case
where multiplication results in zero and third argument is finite
and nonzero.
* math/auto-libm-test-in: Add more tests of fma.
* math/auto-libm-test-out: Regenerated.
BZ #16618
Under certain conditions wscanf can allocate too little memory for the
to-be-scanned arguments and overflow the allocated buffer. The
implementation now correctly computes the required buffer size when
using malloc.
A regression test was added to tst-sscanf.
memcpy with unaligned 256-bit AVX register loads/stores are slow on older
processorsl like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load
and sets it only when AVX2 is available.
[BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load):
New.
(index_AVX_Fast_Unaligned_Load): Likewise.
(HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the
bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace
HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
The padding bytes in the statsdata struct are not initialized, due to
which valgrind throws a warning:
==11384== Memcheck, a memory error detector
==11384== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11384== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11384== Command: nscd -d
==11384==
Fri 25 Apr 2014 10:34:53 AM CEST - 11384: handle_request: request received (Version = 2) from PID 11396
Fri 25 Apr 2014 10:34:53 AM CEST - 11384: GETSTAT
==11384== Thread 6:
==11384== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==11384== at 0x4E4ACDC: send (in /lib64/libpthread-2.12.so)
==11384== by 0x11AF6B: send_stats (in /usr/sbin/nscd)
==11384== by 0x112F75: nscd_run_worker (in /usr/sbin/nscd)
==11384== by 0x4E439D0: start_thread (in /lib64/libpthread-2.12.so)
==11384== by 0x599AB6C: clone (in /lib64/libc-2.12.so)
==11384== Address 0x15708395 is on thread 6's stack
Fix the warning by initializing the structure.
This patch fixes a bug introduced by 18f2945ae9, where it optimizes
the FPSCR set by just issuing a mtfs instruction if new flag is different
from older one. The issue is a typo, where the new flag should the the
new value, instead of the old one.
It fixes BZ#17885.
Some powerpc64 processors (e5500 core for instance) does not provide the
fsqrt instruction, however current check to use in math_private.h is
__WORDSIZE and _ARCH_PWR4 (ISA 2.02). This is patch change it to use
the compiler flag _ARCH_PPCSQ (which is the same condition GCC uses to
decide whether to generate fsqrt instruction).
It fixes BZ#16576.
GLIBC memset optimization for POWER8 uses the '.machine power8'
directive, which is only supported officially on binutils 2.24+. This
causes a build failure on older binutils.
Since the requirement of .machine power8 is to correctly assembly the
'mtvsrd' instruction and it is already handled by the MTVSRD_V1_R4
macro, there is no really needed of using it.
The patch replaces the power8 with power7 for .machine directive.
It fixes BZ#17869.
This patch fix the elf/ifuncmain6pie failure when building with GCC
4.9+. For some reason, the compiler removes the branch taken code at
resolve_ifunc (sysdeps/powerpc/powerpc64/dl-machine.h) as dead-code
and thus the testcase fails because the ifunc resolves branches to an
invalid memory location. It fixes by explicit adding a dependency of
value based on odp variable to avoid compiler optimization.
It fixes BZ#17868.
This patch replaces unsigned long int and 1UL with uint64_t and
(uint64_t) 1 to support ILP32 targets like x32.
[BZ #17870]
* nptl/sem_post.c (__new_sem_post): Replace unsigned long int
with uint64_t.
* nptl/sem_waitcommon.c (__sem_wait_cleanup): Replace 1UL with
(uint64_t) 1.
(__new_sem_wait_slow): Replace unsigned long int with uint64_t.
Replace 1UL with (uint64_t) 1.
* sysdeps/nptl/internaltypes.h (new_sem): Replace unsigned long
int with uint64_t.
This patch fix powerpc __get_clockfreq racy and cancel-safe issues by
dropping internal static cache and by using nocancel file operations.
The vDSO failure check is also removed, since kernel code does not
return an error (it cleans cr0.so bit on function return) and the static
code (to read value /proc) now uses non-cancellable calls.
The ability to recursively call dlopen is useful for malloc
implementations that wish to load other dynamic modules that
implement reentrant/AS-safe functions to use in their own
implementation.
Given that a user malloc implementation may be called by an
ongoing dlopen to allocate memory the user malloc
implementation interrupts dlopen and if it calls dlopen again
that's a reentrant call.
This patch fixes the issues with the ld.so.cache mapping
and the _r_debug assertion which prevent this from working
as expected.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00446.html
This commit fixes semaphore destruction by either using 64b atomic
operations (where available), or by using two separate fields when only
32b atomic operations are available. In the latter case, we keep a
conservative estimate of whether there are any waiting threads in one
bit of the field that counts the number of available tokens, thus
allowing sem_post to atomically both add a token and determine whether
it needs to call futex_wake.
See:
https://sourceware.org/ml/libc-alpha/2014-12/msg00155.html
This patch adds an optimized POWER8 strncmp. The implementation focus
on speeding up unaligned cases follwing the ideas of power8 strcmp.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
(where sources alignment are different).
This patch adds an optimized POWER8 strcmp using unaligned accesses.
The algorithm first check the initial 16 bytes, then align the first
function source and uses unaligned loads on second argument only.
Aditional checks for page boundaries are done for unaligned cases
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows 10%-80% improvement over the optimized POWER7 one that uses
only aligned accesses, specially on unaligned inputs.
The algorithm first read and check 16 bytes (if inputs do not cross a 4K
page size). The it realign source to 16-bytes and issue a 16 bytes read
and compare loop to speedup null byte checks for large strings. Also,
different from POWER7 optimization, the null pad is done inline in the
implementation using possible unaligned accesses, instead of realying on
a memset call. Special case is added for page cross reads.
This patch adds an optimized POWER8 strcpy using unaligned accesses.
For strings up to 16 bytes the implementation first calculate the
string size, like strlen, and issues a memcpy. For larger strings,
source is first aligned to 16 bytes and then tested over a loop that
reads 16 bytes am combine the cmpb results for speedup. Special case is
added for page cross reads.
It shows 30%-60% improvement over the optimized POWER7 one that uses
only aligned accesses.
[Modified from the original email by Siddhesh Poyarekar]
This patch solves bug #16009 by implementing an additional path in
strxfrm that does not depend on caching the weight and rule indices.
In detail the following changed:
* The old main loop was factored out of strxfrm_l into the function
do_xfrm_cached to be able to alternativly use the non-caching version
do_xfrm.
* strxfrm_l allocates a a fixed size array on the stack. If this is not
sufficiant to store the weight and rule indices, the non-caching path is
taken. As the cache size is not dependent on the input there can be no
problems with integer overflows or stack allocations greater than
__MAX_ALLOCA_CUTOFF. Note that malloc-ing is not possible because the
definition of strxfrm does not allow an oom errorhandling.
* The uncached path determines the weight and rule index for every char
and for every pass again.
* Passing all the locale data array by array resulted in very long
parameter lists, so I introduced a structure that holds them.
* Checking for zero src string has been moved a bit upwards, it is
before the locale data initialization now.
* To verify that the non-caching path works correct I added a test run
to localedata/sort-test.sh & localedata/xfrm-test.c where all strings
are patched up with spaces so that they are too large for the caching path.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) uses
a condition k <= -63 to determine when a standard underflowing result
tiny*__copysignl(tiny,x) should be returned. However, that condition
corresponds to values with exponent -16446 or less, and in the case of
-16446, the correct result for round-to-nearest depends on whether the
value is exactly 0x1p-16446 (half the least subnormal) or more than
that. This patch fixes the bug by changing the condition to k <= -64
and accordingly adjusting the exponent by 64 not 63 when converting to
a normal value.
Tested for x86_64.
[BZ #17803]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (twom63): Rename to
twom64. Adjust value to 0x1p-64L.
(__scalblnl): Only return standard underflowing result for K <=
-64 not K <= -63; adjust exponent for underflowing result by 64
not 63.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
The ldbl-96 implementation of scalblnl (used for x86_64 and ia64) is
incorrect for subnormal arguments (this is a separate bug from bug
17803, which is about underflowing results). There are two problems
with the adjustments of subnormal arguments: the "two63" variable
multiplied by is actually 0x1p52L not 0x1p63L, so is insufficient to
make values normal, and then GET_LDOUBLE_EXP(es,x), used to extract
the new exponent, extracts it into a variable that isn't used, while
the value taken to by the new exponent is wrongly taken from the high
part of the mantissa before the adjustment (hx). This patch fixes
both those problems and adds appropriate tests.
Tested for x86_64.
[BZ #17834]
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (two63): Change value to
0x1p63L.
(__scalblnl): Get new exponent of adjusted subnormal value from ES
not HX.
* math/libm-test.inc (scalbn_test_data): Add more tests.
(scalbln_test_data): Likewise.
This patch adds support for lock elision using ISA 2.07 hardware
transactional memory instructions for pthread_mutex primitives.
Similar to s390 version, the for elision logic defined in
'force-elision.h' is only enabled if ENABLE_LOCK_ELISION is defined.
Also, the lock elision code should be able to be built even with
a compiler that does not provide HTM support with builtins.
However I have noted the performance is sub-optimal due scheduling
pressures.
Microblaze apparently has a variable page size (see thread below) and
should not hard-code any page-size related macros.
Also remove macros that are only used for BFD's trad-core support
which is not relavant for microblaze also according to the thread
starting here:
https://sourceware.org/ml/libc-ports/2013-11/msg00028.html
This patch is neither built nor tested but mirrors a MIPS patch that
fixes the same issue.
Thanks,
Matthew
* sysdepsysdeps/unix/sysv/linux/microblaze/sys/user.h
(PAGE_SHIFT, PAGE_SIZE, PAGE_MASK, NBPG, UPAGES): Remove.
(HOST_TEXT_START_ADDR, HOST_STACK_END_ADDR): Remove.
Signed-off-by: David Holsgrove <david.holsgrove@xilinx.com>
Concluding the fixes for C90 libm functions calling C99 fe* functions,
this patch fixes the case of feupdateenv by making it a weak alias for
__feupdateenv and making the affected code call __feupdateenv.
Tested for x86_64 (testsuite, and that installed stripped shared
libraries are unchanged by the patch). Also tested for ARM
(soft-float) that the math.h linknamespace tests now pass.
[BZ #17748]
* include/fenv.h (__feupdateenv): Use libm_hidden_proto.
* math/feupdateenv.c (__feupdateenv): Use libm_hidden_def.
* sysdeps/aarch64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/alpha/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/arm/feupdateenv.c (feupdateenv): Rename to __feupdateenv
and define as weak alias of __feupdateenv. Use libm_hidden_weak.
* sysdeps/hppa/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/i386/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/ia64/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/m68k/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/mips/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/powerpc/nofpu/feupdateenv.c (__feupdateenv): Likewise.
* sysdeps/powerpc/powerpc32/e500/nofpu/feupdateenv.c
(__feupdateenv): Likewise.
* sysdeps/s390/fpu/feupdateenv.c (feupdateenv): Rename to
__feupdateenv and define as weak alias of __feupdateenv. Use
libm_hidden_weak.
* sysdeps/sh/sh4/fpu/feupdateenv.c (feupdateenv): Likewise.
* sysdeps/sparc/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/tile/math_private.h (__feupdateenv): New inline
function.
* sysdeps/x86_64/fpu/feupdateenv.c (__feupdateenv): Use
libm_hidden_def.
* sysdeps/generic/math_private.h (default_libc_feupdateenv): Call
__feupdateenv instead of feupdateenv.
(default_libc_feupdateenv_test): Likewise.
(libc_feresetround_ctx): Likewise.
glibc maintains a binary tree of environment strings it malloc()ed
itself. However, it's possible for it to malloc() a string, then find
that an identical string is already in the tree. In this case, the
memory is leaked and is not freed if the application later calls
__libc_freeres(). Fix this by freeing 'new_value' when it's unneeded.
Test case:
#include <stdlib.h>
#include <string.h>
int main()
{
char *p = calloc(100000, 1);
memset(p, 'A', 99999);
setenv("TESTVAR", p, 1);
setenv("TESTVAR", p, 1);
free(p);
}
Leak that was reported by valgrind:
100,008 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C29F90: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4E6B3D4: __add_to_environ (setenv.c:176)
by 0x4C31B8F: setenv (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x400642: main (in /mnt/tmpfs/a.out)
When mount entry contains only four fields and have more then one space or
tab at the and, mp.mnt_freq and mp.mnt_passno will be set to some specific
values as side effect from parsing of previus mount entry. It is because
sscanf(""," %d %d ", &a, &b) returns -1, but this case is unprocessed.
Values of mp.mnt_freq and mp.mnt_passno stays unchanged. This patch is
attempt to fix described issue by removing trailing tabs and spaces.