glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-11-28 12:13:37 +08:00

Author	SHA1	Message	Date
Adhemerval Zanella	e760874ee3	linux: Consolidate time implementation The IFUNC bypass to vDSO is used when USE_IFUNC_TIME is set. Currently powerpc and x86 define it. Otherwise the generic implementation is used, which calls clock_gettime. Checked on powerpc64le-linux-gnu, powerpc64-linux-gnu, powerpc-linux-gnu-power4, x86_64-linux-gnu, and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 11:22:04 -03:00
Adhemerval Zanella	c701bcc6f4	linux: Consolidate Linux gettimeofday The IFUNC bypass to vDSO is used when USE_IFUNC_GETTIMEOFDAY is set. Currently aarch64, powerpc*, and x86 define it. Otherwise the generic implementation is used, which calls clock_gettime. Checked on aarch64-linux-gnu, powerpc64le-linux-gnu, powerpc64-linux-gnu, powerpc-linux-gnu-power4, x86_64-linux-gnu, and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 11:21:50 -03:00
Adhemerval Zanella	7bcaf77574	linux: Update mips vDSO symbols The clock_getres is a new implementation added on Linux 5.4 (abed3d826f2f). Checked with a build against mips-linux-gnu and mips64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:05 -03:00
Adhemerval Zanella	eca6aec6a3	linux: Update x86 vDSO symbols Add the missing time and clock_getres vDSO symbol names on x86. For time, the iFUNC already uses expected name so it affects only the static build. The clock_getres is a new implementation added on Linux 5.3 (f66501dc53e72). Checked on x86-linux-gnu and i686-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:05 -03:00
Adhemerval Zanella	2822aaf4f7	Remove vDSO support from make-syscall.sh The auto-generated vDSO call shows some issues: - It requires syncing the auto-generated C file with current glibc implementation; - It still uses symbol redirections hacks where libc-symbols.h provides macros that use compiler builtins (libc_ifunc_redirected for instance); - It does not handle all required compiler handling (inhibit_stack_protector on iFUNC resolver). - No architecture uses it. Checked with a build against all major ABIs. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:05 -03:00
Adhemerval Zanella	bc36727be9	x86: Make x32 use x86 time implementation This is the only use of auto-generation syscall which uses a vDSO plus IFUNC and the current x86 generic implementation already covers the expected semantic. Checked on x86_64-linux-gnu-x32. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:05 -03:00
Adhemerval Zanella	d0def09ff6	linux: Fix vDSO macros build with time64 interfaces As indicated on libc-help [1] the `ec138c67cb` commit broke 32-bit builds when configured with --enable-kernel=5.1 or higher. The scenario 10 from [2] might also occur in this configuration and INLINE_VSYSCALL will try to use the vDSO symbol and HAVE_CLOCK_GETTIME64_VSYSCALL does not set HAVE_VSYSCALL prior its usage. Also, there is no easy way to just enable the code to use one vDSO symbol since the macro INLINE_VSYSCALL is redefined if HAVE_VSYSCALL is set. Instead of adding more pre-processor handling and making the code even more convoluted, this patch removes the requirement of defining HAVE_VSYSCALL before including sysdep-vdso.h to enable vDSO usage. The INLINE_VSYSCALL is now expected to be issued inside a HAVE_*_VSYSCALL check, since it will try to use the internal vDSO pointers. Both clock_getres and clock_gettime vDSO code for time64_t were removed since there is no vDSO setup code for the symbol (an architecture can not set HAVE_CLOCK_GETTIME64_VSYSCALL). Checked on i686-linux-gnu (default and with --enable-kernel=5.1), x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. I also checked against a build to mips64-linux-gnu and sparc64-linux-gnu. [1] https://sourceware.org/ml/libc-help/2019-12/msg00014.html [2] https://sourceware.org/ml/libc-alpha/2019-12/msg00142.html Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:05 -03:00
Adhemerval Zanella	b03688bfbb	Linux: Fix clock_nanosleep time64 check The result of INTERNAL_SYSCALL_CANCEL should be checked with macros INTERNAL_SYSCALL_ERROR_P and INTERNAL_SYSCALL_ERRNO instead of comparing the result directly. Checked on powerpc-linux-gnu.	2020-01-03 10:02:05 -03:00
Wilco Dijkstra	220622dde5	Add libm_alias_finite for _finite symbols This patch adds a new macro, libm_alias_finite, to define all _finite symbols. It sets all _finite symbols as compat symbols based on their first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redefinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-03 10:02:04 -03:00
Siddhesh Poyarekar	cf4dfd4617	Update libc.pot for 2.31 release	2020-01-02 20:11:36 +05:30
Rafał Lużyński	75ba929987	Multiple locales: Add date_fmt (bug 24054) It is not specified what should be the content of d_t_fmt and date_fmt but in the built-in C locale those fields have only one difference: date_fmt contains "%Z" (the current time zone) while d_t_fmt does not. For most of the locales this commit does the following operation: copy d_t_fmt to date_fmt, and then remove "%Z" from d_t_fmt. If "%Z" was originally missing from d_t_fmt add it to date_fmt. It also corrects comments where necessary. Exceptions: * In bo_CN, dz_BT, and km_KH "%Z" has not been added to date_fmt because it was too difficult. In these locales date_fmt has been set to the copy of d_t_fmt. * In en_DK "%Z" has not been removed from d_t_fmt in order to preserve the conformance with the standard mentioned in the comment. The command to identify and initially edit the locales that need the update was: for i in `grep -lw d_t_fmt *` do if ! grep -qw date_fmt $i ; then awk '/d_t_fmt/ { print $0; gsub("d_t_fmt", "date_fmt"); } //{ print $0 }' < $i > $i.next mv $i.next $i fi done and then each file was further edited manually.	2020-01-02 11:45:45 +01:00
Florian Weimer	cc47d5c5f5	build-many-glibcs.py: Fix “glibcs i686-gnu --strip” Hurd uses an empty prefix, so the linker scripts end up in /lib, the find command picked them up, and stripping them failed because they are not ELF files.	2020-01-02 10:18:42 +01:00
Florian Weimer	0933a4678c	Linux: Remove pread/pread64, pwrite/pwrite64 kludges from <sysdep.h> Since the switch away from auto-generated wrappers for these system calls, the kludge is already included in the C source file of the system call wrapper.	2020-01-02 10:18:37 +01:00
Florian Weimer	07a44d2392	build-many-glibcs.py: Implement update-syscalls command This command uses pre-built compilers to re-install the Linux headers from the current sources into a temporary location and runs glibc's “make update-syscalls-lists” against that. This updates the glibc source tree with the current system call numbers. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-02 10:18:34 +01:00
Florian Weimer	857c7d7397	build-many-glibcs.py: Introduce glibc build policy classes The new classes GlibcPolicyForCompiler and GlibcPolicyForBuild allow customization of the Glibc.build_glibc method, replacing the existing for_compiler flag. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-02 10:18:31 +01:00
Florian Weimer	65b6c9b02b	build-many-glibcs.py: Introduce LinuxHeadersPolicyForBuild And move install_linux_headers to the top level. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-02 10:18:26 +01:00
Florian Weimer	a1bd5f8673	Linux: Use system call tables during build Use <arch-syscall.h> instead of <asm/unistd.h> to obtain the system call numbers. A few direct includes of <asm/unistd.h> need to be removed (if the system call numbers are already provided indirectly by <sysdep.h>) or replaced with <sys/syscall.h>. Current Linux headers for alpha define the required system call names, so most of the _NR_* hacks are no longer needed. For the 32-bit arm architecture, eliminate the INTERNAL_SYSCALL_ARM macro, now that we have regular system call names for cacheflush and set_tls. There are more such cleanup opportunities for other architectures, but these cleanups are required to avoid macro redefinition errors during the build. For ia64, it is desirable to use <asm/break.h> directly to obtain the break number for system calls (which is not a system call number itself). This requires replacing __BREAK_SYSCALL with __IA64_BREAK_SYSCALL because the former is defined as an alias in <asm/unistd.h>, but not in <asm/break.h>. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-02 10:18:23 +01:00
Florian Weimer	4cf0d22305	Linux: Add tables with system call numbers The new tables are currently only used for consistency checks with the installed kernel headers and the architecture-independent system call names table. They are based on Linux 5.4. The goal is to use these architecture-specific tables to ensure that system call wrappers are available irrespective of the version of the installed kernel headers. The tables are formatted in the form of C header files so that they can be used directly in an #include directive, without external preprocessing. (External preprocessing of a plain table file would introduce cross-subdirectory dependency issues.) However, the intent is that they can still be treated as tables and can be processed by simple tools. The irregular system call names on 32-bit arm add a complication. The <fixup-asm-unistd.h> header is introduced to work around that, and the system calls are listed under regular names in the <arch-syscall.h> file. A make target, update-syscalls-list, is added to patch the glibc sources with data from the current kernel headers. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2020-01-02 10:18:10 +01:00
Joseph Myers	5f72f9800b	Update copyright dates not handled by scripts/update-copyrights. I've updated copyright dates in glibc for 2020. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus libc.texinfo which previously had to be handled manually but is now successfully updated by update-copyrights), there is a fix to sysdeps/unix/sysv/linux/powerpc/bits/termios-c_lflag.h where a typo in the copyright notice meant it failed to be updated automatically. Please remember to include 2020 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).	2020-01-01 00:21:22 +00:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Adhemerval Zanella	09153638cf	alpha: Set wait4 as cancellation entrypoint Since both wait and waitpid are implemented on top of wait4. It fixes nptl/tst-cancel{x}{4,5,7}. Checked on alpha-linux-gnu.	2019-12-30 11:05:28 -03:00
Rafał Lużyński	d99b500e3d	lv_LV locale: Correct the time part of d_t_fmt (bug 25324) Currently d_t_fmt formats time as "plkst. %H un %M". A quick Google search says that "plkst." means "o’clock" and "un" means "and". Also this format does not display seconds. CLDR does not mention anything like that. We have no reason to use anything different than "%H:%M:%S".	2019-12-30 11:48:20 +01:00
Rafał Lużyński	20a740b2b2	km_KH locale: Use "%M" instead of "m" in d_t_fmt (bug 25323) A quick analysis suggests that the original author meant "%M" (minutes format specifier) instead of "m" which is just a literal "m" letter.	2019-12-30 11:48:19 +01:00
Jeremie Koenig	653d74f12a	hurd: Global signal disposition This adds _hurd_sigstate_set_global_rcv used by libpthread to enable POSIX-conforming behavior of signals on a per-thread basis. This also provides a sigstate destructor _hurd_sigstate_delete, and a global process signal state, which needs to be locked and checked when global disposition is enabled, thus the addition of _hurd_sigstate_lock _hurd_sigstate_actions _hurd_sigstate_pending _hurd_sigstate_unlock helpers. This also updates all the glibc code accordingly. This also drops support for get_int(INIT_SIGMASK), which did not make sense any more since we do not have a single signal thread any more. During fork/spawn, this also reinitializes the child global sigstate's lock. That cures an issue that would very rarely cause a deadlock in the child in fork, tries to unlock ss' critical section lock at the end of fork. This will typically (always?) be observed in /bin/sh, which is not surprising as that is the foremost caller of fork. To reproduce an intermediate state, add an endless loop if _hurd_global_sigstate is locked after __proc_dostop (cast through volatile); that is, while still being in the fork's parent process. When that triggers (use the libtool testsuite), the signal thread has already locked ss (which is _hurd_global_sigstate), and is stuck at hurdsig.c:685 in post_signal, trying to lock _hurd_siglock (which the main thread already has locked and keeps locked until after __task_create). This is the case that ss->thread == MACH_PORT_NULL, that is, a global signal. In the main thread, between __proc_dostop and __task_create is the __thread_abort call on the signal thread which would abort any current kernel operation (but leave ss locked). Later in fork, in the parent, when _hurd_siglock is unlocked in fork, the parent's signal thread can proceed and will unlock eventually the global sigstate. 
In the client, _hurd_siglock will likewise be unlocked, but the global sigstate never will be, as the client's signal thread has been configured to restart execution from _hurd_msgport_receive. Thus, when the child tries to unlock ss' critical section lock at the end of fork, it will first lock the global sigstate, will spin trying to lock it, which can never be successful, and we get our deadlock. Options seem to be: * Move the locking of _hurd_siglock earlier in post_signal -- but that may generally impact performance, if this locking isn't generally needed anyway? On the other hand, would it actually make sense to wait here until we are not any longer in a critical section (which is meant to disable signal delivery anyway (but not for preempted signals?))? * Clear the global sigstate in the fork's child with the rationale that we're anyway restarting the signal thread from a clean state. This has now been implemented. Why has this problem not been observed before Jérémie's patches? (Or has it? Perhaps even more rarely?) In _S_msg_sig_post, the signal is now posted to a global receiver thread, whereas previously it was posted to the designated signal-receiving thread. The latter one was in a critical section in fork, so didn't try to handle the signal until after leaving the critical section? (Not completely analyzed and verified.) Another question is what the signal is that is being received during/around the time __proc_dostop executes.	2019-12-29 18:32:49 +01:00
Samuel Thibault	eb87a46c56	hurd sendmsg: Fix warning on calling CMSG_*HDR	2019-12-29 17:49:41 +01:00
Jeremie Koenig	4288c548da	hurd: Signal code refactoring This should not change the current behavior, although this fixes a few minor bugs which were made apparent in the process of global signal disposition work: - Split into more functions - Scope variables more restrictively - Split out inner functions - refactor check_pending_signals - make sigsuspend POSIX-conformant. - fix uninitialized act value.	2019-12-29 17:18:04 +01:00
Thomas Schwinge	a678c13b8f	hurd: Add getcontext, makecontext, setcontext, swapcontext Adapted from the Linux x86 functions. Not thoroughly tested, but manual testing as well as glibc tests look fine, and manual -lpthread testing also looks fine (within the given bounds for a new stack to be used with makecontext). This has also been in use in Debian since 2013.	2019-12-29 16:54:08 +01:00
Emilio Pozuelo Monfort	344e755248	hurd: Support sending file descriptors over Unix sockets	2019-12-29 16:34:20 +01:00
Gabriel F. T. Gomes	9ae967bf45	ldbl-128ibm-compat: Do not mix -mabi=longdouble and -mlong-double-128 Some compiler versions, e.g. GCC 7, complain when -mlong-double-128 is used together with -mabi=ibmlongdouble or -mabi=ieeelongdouble, producing the following error message: cc1: error: ‘-mabi=ibmlongdouble’ requires ‘-mlong-double-128’ This patch removes -mlong-double-128 from the compilation lines that explicitly request -mabi=longdouble. Tested for powerpc64le. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho	5d73c96f64	ldbl-128ibm-compat: Compiler flags for stdio functions Some of the files that provide stdio.h and wchar.h functions have a filename prefixed with 'io', such as 'iovsprintf.c'. On platforms that imply ldbl-128ibm-compat, these files must be compiled with the flag -mabi=ibmlongdouble. This patch adds this flag to their compilation. Notice that this is not required for the other files that provide similar functions, because filenames that are not prefixed with 'io' have ldbl-128ibm-compat counterparts in the Makefile, which already adds -mabi=ibmlongdouble to them. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>	2019-12-27 15:02:10 -03:00
Tulio Magno Quites Machado Filho	1ef9b6e0bf	Do not redirect calls to __GI_* symbols, when redirecting to ieee128 On platforms where long double has IEEE binary128 format as a third option (initially, only powerpc64le), many exported functions are redirected to their __ieee128 equivalents. This redirection is provided by installed headers such as stdio-ldbl.h, and is supposed to work correctly with user code. However, during the build of glibc, similar redirections are employed, in internal headers, such as include/stdio.h, in order to avoid extra PLT entries. These redirections conflict with the redirections to __*ieee128, and must be avoided during the build. This patch protects the second redirections with a test for __LONG_DOUBLE_USES_FLOAT128, a new macro that is defined to 1 when functions that deal with long double typed values reuses the _Float128 implementation (this is currently only true for powerpc64le). Tested for powerpc64le, x86_64, and with build-many-glibcs.py. Co-authored-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2019-12-27 15:02:10 -03:00
Xuelei Zhang	863d775c48	aarch64: add default memcpy version for kunpeng920 Checked on aarch64-linux-gnu.	2019-12-27 11:59:37 -03:00
Xuelei Zhang	10df95cdaf	aarch64: ifunc rename for kunpeng Rename ifunc for kunpeng to kunpeng920, and modify the corresponding function files including IS_KUNPENG920 judgement. Checked on aarch64-linux-gnu.	2019-12-27 11:59:51 -03:00
Xuelei Zhang	64297d49b3	aarch64: Modify error-shown comments for strcpy Checked on aarch64-linux-gnu.	2019-12-27 11:59:37 -03:00
Adhemerval Zanella	dc86199477	linux: Consolidate sigprocmask All architectures now use the Linux generic implementation which uses __NR_rt_sigprocmask. Checked on x86_64-linux-gnu, sparc64-linux-gnu, ia64-linux-gnu, s390x-linux-gnu, and alpha-linux-gnu.	2019-12-27 11:18:23 -03:00
Adhemerval Zanella	58bd592536	Fix return code for __libc_signal_* functions The functions do not fail regardless of the argument value. Also, for Linux the return value is not correct on some platforms due to the missing usage of INTERNAL_SYSCALL_ERROR_P / INTERNAL_SYSCALL_ERRNO macros. Checked on x86_64-linux-gnu, i686-linux-gnu, and sparc64-linux-gnu.	2019-12-27 11:18:23 -03:00
Adhemerval Zanella	11519fd0c9	nptl: Remove duplicate internal __SIZEOF_PTHREAD_MUTEX_T (BZ#25241) Checked on x86_64-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu-x32.	2019-12-26 17:04:50 -03:00
Rafał Lużyński	b8c210bcc7	mnw_MM, my_MM, and shn_MM locales: Do not use %Op The "O" modifier does nothing when used with "%p" so let's better not use it at all and replace "%Op" with "%p".	2019-12-23 23:49:22 +01:00
Gabriel F. T. Gomes	f8cd102081	Avoid compat symbols for totalorder in powerpc64le IEEE long double On powerpc64le, the libm_alias_float128_other_r_ldbl macro is used to create an alias between totalorderf128 and __totalorderlieee128, as well as between the totalordermagf128 and __totalordermaglieee128. However, the totalorder* and totalordermag* functions changed their parameter type since commit ID `42760d7646` and got compat symbols for their old versions. With this change, the aforementioned macro would create two conflicting aliases for __totalorderlieee128 and __totalordermaglieee128. This patch avoids the creation of the alias between the IEEE long double symbols (__totalorderl*ieee128) and the compat symbols, because the IEEE long double functions have never been exported thus don't need such compat symbol. Tested for powerpc64le. Reviewed-by: Joseph Myers <joseph@codesourcery.com>	2019-12-23 16:32:20 -03:00
Gabriel F. T. Gomes	3021e78178	ldbl-128ibm-compat: Add cvt functions This patch adds IEEE long double versions of qcvt* functions for powerpc64le. Unlike all other long double to/from string conversion functions, these do not rely on internal functions that can take floating-point numbers with different formats and act on them accordingly, instead, the related files are rebuilt with the -mabi=ieeelongdouble compiler flag set. Having -mabi=ieeelongdouble passed to the compiler causes the object files to be marked with a .gnu_attribute that is incompatible with the .gnu_attribute in files built with -mabi=ibmlongdouble (the default). The difference causes error messages similar to the following: ld: libc_pic.a(s_isinfl.os) uses IBM long double, libc_pic.a(ieee128-qefgcvt_r.os) uses IEEE long double. collect2: error: ld returned 1 exit status make[2]: *** [../Makerules:649: libc_pic.os] Error 1 Although this warning is useful in other situations, the library actually needs to have functions with different long double formats, so .gnu_attribute generation is explicitly disabled for these files with the use of -mno-gnu-attribute. Tested for powerpc64le on the branch that actually enables the sysdeps/ieee754/ldbl-128ibm-compat for powerpc64le. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-23 16:32:20 -03:00
Gabriel F. T. Gomes	dce4253411	Refactor cvt functions implementation (2/2) This patch refactors the cvt functions implementation in a way that makes it easier to re-use them for implementing the IEEE long double on powerpc64le. By removing the macros that generate the function names (APPEND combined with FUNC_PREFIX), the new code makes it easier to define new function names, such as __qecvtieee128. Tested that installed stripped binaries for all build-many-glibcs targets remain identical before and after this patch. Also tested for powerpc64le and x86_64. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-23 16:32:07 -03:00
Gabriel F. T. Gomes	e18a305777	Refactor cvt functions implementation (1/2) This patch refactors the cvt functions implementation in a way that makes it easier to re-use them for implementing the IEEE long double on powerpc64le. By splitting the implementation per se in one file (efgcvt-template.c) and the alias definitions in others (e.g. efgcvt.c), the new code makes it easier to define new function names, such as __qecvtieee128. Tested that installed stripped binaries for all build-many-glibcs targets remain identical before and after this patch. Also tested for powerpc64le and x86_64. Reviewed-by: Paul E. Murphy <murphyp@linux.ibm.com>	2019-12-23 16:30:24 -03:00
Adhemerval Zanella	57e687c6d4	Add exception-based flags for wait4 It fixes the tst-cancelx4 and tst-cancelx5 on sparc{64,v9}. Checked on sparc64-linux-gnu and sparcv9-linux-gnu.	2019-12-20 09:59:11 -03:00
Xuelei Zhang	525de033a9	aarch64: Optimized memset for Kunpeng processor. Due to the branch prediction issue of the Kunpeng processor, we found memset_generic has poor performance when setting middle sizes, and so we reconstructed the logic and expanded the loop by 4 times in set_long to solve the problem; even setting sizes below 1K benefits. Another change is that DZ_ZVA seems not to work when setting zero, so we discarded it and used set_long to set zero instead. Fewer branches and predictions also give the zero case a slight improvement. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	c2150769d0	aarch64: Optimized strlen for strlen_asimd Optimize the strlen implementation by using vector operations and loop unrolling in main loop. Compared to __strlen_generic, it reduces latency of cases in bench-strlen by 7%~18% when the length of src is greater than 128 bytes, with gains throughout the benchmark. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	0db8e7b366	aarch64: Add Huawei Kunpeng to tunable cpu list Kunpeng processor is a 64-bit Arm-compatible CPU released by Huawei, and we have already signed a copyright assignment with the FSF. This patch adds it to the cpu list, and the related macro for IFUNC. Checked on aarch64-linux-gnu. Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	a7611806d5	aarch64: Optimized implementation of memrchr Considering the excellent performance of memchr.S on glibc 2.30, the same algorithm is used to find chrin. Compared to memrchr.c, this method with memrchr.S achieves an average performance improvement of 58% based on benchtest and its extension cases. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	2911cb68ed	aarch64: Optimized implementation of strnlen Optimize the strlen implementation by using vector operations and loop unrolling in main loop. Compared to aarch64/strnlen.S, it reduces latency of cases in bench-strnlen by 11%~24% when the length of src is greater than 64 bytes, with gains throughout the benchmark. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	0237b61526	aarch64: Optimized implementation of strcpy Optimize the strcpy implementation by using vector loads and operations in main loop. Compared to aarch64/strcpy.S, it reduces latency of cases in bench-strlen by 5%~18% when the length of src is greater than 64 bytes, with gains throughout the benchmark. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
Xuelei Zhang	233efd433d	aarch64: Optimized implementation of memcmp The loop body is expanded from a 16-byte comparison to a 64-byte comparison, and the usage of ldp is changed from the Post-index mode to the Base plus offset mode. Hence, comparison is around 18% faster for sizes above 128 bytes overall. Checked on aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2019-12-19 16:31:04 -03:00
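
Note on e760874ee3 and c701bcc6f4: both messages say that, without the IFUNC bypass to the vDSO, the generic implementation calls clock_gettime. The following is a minimal sketch of that fallback idea only, not the glibc source. The names sketch_time and sketch_gettimeofday are hypothetical, and the choice of CLOCK_REALTIME is an assumption made for portability; the clock id glibc actually uses is not stated in the log.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper, not a glibc symbol: time() expressed on top of
   clock_gettime.  CLOCK_REALTIME is an assumption of this sketch.  */
static time_t
sketch_time (time_t *timer)
{
  struct timespec ts;
  if (clock_gettime (CLOCK_REALTIME, &ts) != 0)
    return (time_t) -1;
  if (timer != NULL)
    *timer = ts.tv_sec;
  return ts.tv_sec;
}

/* Hypothetical helper, not a glibc symbol: gettimeofday() expressed on
   top of clock_gettime, converting nanoseconds to microseconds.  */
static int
sketch_gettimeofday (struct timeval *tv)
{
  struct timespec ts;
  if (clock_gettime (CLOCK_REALTIME, &ts) != 0)
    return -1;
  tv->tv_sec = ts.tv_sec;
  tv->tv_usec = ts.tv_nsec / 1000;
  return 0;
}
```

Routing both functions through one clock source means only clock_gettime needs a fast path on architectures that do not define USE_IFUNC_TIME or USE_IFUNC_GETTIMEOFDAY.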


35310 Commits