This is an alternative implementation for GH-10406 that resets the
has_buffered_data flag after finishing stream read so it does not impact
other ops->read use like for example php_stream_get_line.
Closes GH-11421
This adds support for the completed event. Since the read handler could
be entered twice towards the end of the stream we remember what the eof
flag was before reading so we can emit the completed event when the flag
changes to true.
Closes GH-10505.
| 732 | ts->mode = mode && mode[0] == 'r' && mode[1] != '+' ? TEMP_STREAM_READONLY : 0;
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
Although mode is already dereference on line 723 in the call to strlen()
I grepped for php_printf cases in main/ and sapi/ and converted the
cases which clearly indicate errors to fprintf(stderr, ...), like
suggested in the linked issue.
Closes GH-11163.
The TSRM keeps a hashtable mapping the thread IDs to the thread resource pointers.
It's possible that the thread disappears without us knowing, and then another thread
gets spawned some time later with the same ID as the disappeared thread.
Note that since it's a new thread the TSRM key pointer and cached pointer will be NULL.
The Apache request handler `php_handler()` will try to fetch some fields from the SAPI globals.
It uses a lazy thread resource allocation by calling `ts_resource(0);`.
This allocates a thread resource and sets up the TSRM pointers if they haven't been set up yet.
At least, that's what's supposed to happen. But since we are in a situation where the thread ID
still has the resources of the *old* thread associated in the hashtable,
the loop in `ts_resource_ex` will find that thread resource and assume the thread has been setup
already. But this is not the case since this thread is actually a new thread, just reusing the ID
of the old one, without any relation whatsoever to the old thread.
Because of this assumption, the TSRM pointers will not be setup, leading to a
NULL pointer dereference when trying to access the SAPI globals.
We can easily detect this scenario: if we're in the fallback path, and the pointer is NULL,
and we're looking for our own thread resource, we know we're actually reusing a thread ID.
In that case, we'll free up the old thread resources gracefully (gracefully because
there might still be resources open like database connection which need to be
shut down cleanly). After freeing the resources, we'll create the new resources for
this thread as if the stale resources never existed in the first place.
From that point forward, it is as if that situation never occurred.
The fact that this situation happens isn't that bad because a child process containing
threads will eventually be respawned anyway by the SAPI, so the stale thread resources
won't remain forever.
Note that we can't simply assign our own TSRM pointers to the existing
thread resource for our ID, since it was actually from a different thread
(just with the same ID!). Furthermore, the dynamically loaded extensions
have their own pointer, which is only set when their constructor is
called, so we'd have to call their constructor anyway...
I also tried to call the dtor and then the ctor again for those resources
on the pre-existing thread resource to reuse storage, but that didn't work properly
because other code doesn't expect something like that to happen, which breaks assumptions,
and this in turn caused Valgrind to (rightfully) complain about memory bugs.
Note 2: I also had to fix a bug in the core globals destruction because it
always assumed that the thread destroying them was the owning thread,
which on TSRM shutdown isn't always the case. A similar bug was fixed
recently with the JIT globals.
Closes GH-10863.
This change restores the old behaviour for the server socket streams
that don't support IO. This is now stored in the stream flags so it can
be later used to do some other decisions and possibly introduce some
better error reporting.
Closes GH-10877
Struct members require some alignment based on their type. This means
that if a struct member is not aligned, there will be a hole created by
the compiler in the struct, which is wasted space. This patch reorders
some of the most commonly used structs, but in such a way that the
fields which were in the same cache line still belong together.
The only exception to this is exception_ignore_args, which was
temporally not close to nearby members, and as such I placed
it further up to close a hole.
On 64-bit Linux this gives us the following shrinks:
* zend_op_array: 248 -> 240
* zend_ssa_var: 56 -> 48
* zend_ssa_var_info: 48 -> 40
* php_core_globals: 672 -> 608
* zend_executor_globals: 1824 -> 1792
On 32-bit, the sizes will either remain the same or will result in
smaller shrinks.
The tsrm_startup() function is currently always called with expected_threads = 1.
This means that the hashtable used in the TSRM will only contain a single bucket,
and all thread resources will therefore be in the same linked list.
So it's not really a hashtable right now, even though it's supposed to be.
This patch adds a function tsrm_startup_ex() which takes the expected
thread count as an argument. It also keeps the tsrm_startup() function
so there are no BC breaks.
In the Apache SAPI we query how many threads we have, and pass that to
the tsrm_startup_ex() function.
`zend_rc_debug` is not a type and does not really belong in
`zend_types.h`; this allows using `ZEND_RC_MOD_CHECK()` without
including the huge `zend_types.h` header and allows decoupling
circular header dependencies.
Fixes GH-10692
php_fopen_primary_script() does not initialize all fields of
zend_file_handle. So when it fails and when fastcgi is true, the
zend_destroy_file_handle() function will try to free uninitialized
pointers, causing a segmentation fault. Fix it by zero-initializing file
handles just like the zend_stream_init_fp() counterpart does.
Closes GH-10697.
`zend_uchar` suggests that the value is an ASCII character, but here,
it's about very small integers. This is misleading, so let's use a
C99 integer instead.
On all architectures currently supported by PHP, `zend_uchar` and
`uint8_t` are identical. This change is only about code readability.
* Zend/zend_enum: make `forbidden_methods` static+const
* main/php_syslog: make `xdigits` static
* sapi/fpm: make several globals `const`
* sapi/phpdbg: make `OPTIONS` static
* sapi/phpdbg/help: make help texts const
* sapi/cli: make `template_map` const
* ext/ffi: make `zend_ffi_types` static
* ext/bcmath: make `ref_str` const
* ext/phar: make several globals static+const
Fix it by extending the array sizes by one character. As the input is
limited to the maximum path length, there will always be place to append
the slash. As the php_check_specific_open_basedir() simply uses the
strings to compare against each other, no new failures related to too
long paths are introduced.
We'll let the DOM and XML case handle a potentially too long path in the
library code.
On some filesystems, the copy operation fails if we specify a size
larger than the file size in certain circumstances and configurations.
In those cases EIO will be returned as errno and we will therefore fall
back to other methods.
copy_file_range can return early without copying all the data. This is
legal behaviour and worked properly, unless the mmap fallback was used.
The mmap fallback would read too much data into the destination,
corrupting the destination file. Furthermore, if the mmap fallback would
fail and have to fallback to the regular file copying mechanism, a
similar issue would occur because both maxlen and haveread are modified.
Furthermore, there was a mmap-resource in one of the failure paths of
the mmap fallback code.
This patch fixes these issues. This also adds regression tests using the
new copy_file_range early-return simulation added in the previous
commit.
The precision "minimum" for spprintf was changed in
3f23e6bca9 with the cryptic comment "Enable 0
mode for echo/print". Since then the behaviour of spprintf and snprintf has not
been the same. This results in some APIs handling precision differently than
others, which then resulted in the following Xdebug issue:
https://bugs.xdebug.org/view.php?id=2151
The "manpage" for snprinf says about precision:
An optional precision, in the form of a period ('.') followed by an
optional decimal digit string. Instead of a decimal digit string one
may write "*" or "*m$" (for some decimal integer m) to specify that the
precision is given in the next argument, or in the m-th argument, re‐
spectively, which must be of type int. If the precision is given as
just '.', the precision is taken to be zero. A negative precision is
taken as if the precision were omitted.
However, the snprintf implementation never supported this "negative precision",
which is what PHP's default setting is in PG(precision). However, in
3f23e6bca9 spprintf was made to support this.
Although this techinically can break BC, there is clearly a bug here, and I
could not see any failing tests locally.
When this INI option is enabled, it reverts the line separator for
headers and message to LF which was a non conformant behavior in PHP 7.
It is done because some non conformant MTAs fail to parse CRLF line
separator for headers and body.
This is used for mail and mb_send_mail functions.
* main: Fix comment for php_safe_bcmp
* main: Include note about php_safe_bcmp being security sensitive
This is taken from the implementation of `hash_equals()`.
The last parameter which is named zval *args, does NOT set the FCI params field. It is expected to be a PHP array which gets looped over to set the arguments which is also a zval pointer...
Since PHP 8.0, the named_params field is a more appropriate way of doing this.
This is a result of checking GH-8800 which assumed potential
memory leaks here. Even though it was not the case in reality,
the function deserves a bit of clarification to prevent similar
attempts in the future.
This change primarily splits SAPI deactivation to module and destroy
parts. The reason is that currently some SAPIs might bail out
on deactivation. One of those SAPI is PHP-FPM that can bail out on
request end if for example the connection is closed by the client
(web sever). The problem is that in such case the resources are not
freed and some values reset. The most visible impact can have not
resetting the PG(headers_sent) which can cause erorrs in the next
request. One such issue is described in #77780 bug which this fixes
and is also cover by a test in this commit. It seems reasonable
to separate deactivation and destroying of the resource which means
that the bail out will not impact it.
This fix is another solution to replace d0527427be, use zend_try and zend_catch to make sure persistent stream will be released when error occurred.
Closes GH-9332.
This reverts commit d0527427be.
This patch makes Swoole/Swow can not work anymore, because Coroutine will yield to another one during socket operation, EG(record_errors) assertion will always fail, and zend_begin_record_errors() was only used during compile time before.
Note: zend_emit_recorded_errors() and the typo fix are reserved.
This is not actually related to SSL handshake but stream socket creation
which does not clean errors if the error handler is set. This fix
prevents emitting errors until the stream is freed.
This PR changes the glob stream wrapper so it impacts "glob://"
streamsas well. The idea is to do a check for each found path instead
of the pattern which was not working correctly.
Implements https://wiki.php.net/rfc/partially-supported-callables-expand-deprecation-notices
so that uses of "self" and "parent" in is_callable() and callable
type constraints now raise a deprecation notice, independent of the
one raised when and if the callable is actually invoked.
A new flag is added to the existing check_flags parameter of
zend_is_callable / zend_is_callable_ex, for use in internal calls
that would otherwise repeat the notice multiple times. In particular,
arguments to internal function calls are checked first based on
arginfo, and then again during ZPP, so the former suppresses the
deprecation notice.
Some existing tests which raised this deprecation have been updated
to avoid the syntax, but the existing version retained for maximum
regression coverage until it is made an error.
With thanks to Juliette Reinders Folmer for the RFC and initial
investigation.
Closes GH-8823.
On Windows, closing a file which is locked may not immediately remove
the lock. The `LockFileEx()` documentation states:
| Therefore, it is recommended that your process explicitly unlock all
| files it has locked when it terminates.
We comply, and also use the macro `LOCK_EX` instead of the magic number
`2`.
Closes GH-8925.
Fix targeted for oses defining those flags as enums (like Linux/glibc).
`error: converting the enum constant to a boolean [-Werror,-Wint-in-bool-context]
} else if ((!sslsock->ssl_active && value == 0 && (MSG_DONTWAIT || !sslsock->s.is_blocked)) ||`
Closes#8895.
A file that has just been opened is known to be at offset zero, and
the lseek(SEEK_CUR) system call to determine the current offset can be
skipped.
Closes#8540.
In a similar model as _safe_*alloc api but for the `userland` it guards
against overflow before (re)allocation, usage concealed in fpm for now.
Modern Linux and most of BSD already have it.
Closes#8871.
Nothing new but to refactor usage b/w hash and password
extensions but using volatile pointers to be a bit safer,
allowing to expand its usage eventually.
If there is a zero timeout and MSG_DONTWAIT is available (or the
socket is non-blocking), the poll() call is not necessary, and we can
just call recv() right away.
Before this change:
poll([{fd=4, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "HTTP/1.1 301 Moved Permanently\r\n"..., 8192, MSG_DONTWAIT, NULL, NULL) = 348
poll([{fd=4, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "", 1, MSG_PEEK, NULL, NULL) = 0
After this change:
recvfrom(4, 0x7ffe0cc719a0, 1, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "HTTP/1.1 301 Moved Permanently\r\n"..., 8192, MSG_DONTWAIT, NULL, NULL) = 348
recvfrom(4, "", 1, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 0
The first poll() is replaced by recvfrom(), and the third poll() is
omitted completely.
ext/openssl/xp_ssl: eliminate poll() when MSG_DONTWAIT is available
If there is a zero timeout and MSG_DONTWAIT is available (or the
socket is non-blocking), the poll() call is not necessary, and we can
just call recv() right away.
Closes GH-8092.
Add zend_ini_parse_quantity() and deprecate zend_atol(), zend_atoi()
zend_atol() and zend_atoi() don't just do number parsing.
They also check for a 'K', 'M', or 'G' at the end of the string,
and multiply the parsed value out accordingly.
Unfortunately, they ignore any other non-numerics between the
numeric component and the last character in the string.
This means that numbers such as the following are both valid
and non-intuitive in their final output.
* "123KMG" is interpreted as "123G" -> 132070244352
* "123G " is interpreted as "123 " -> 123
* "123GB" is interpreted as "123B" -> 123
* "123 I like tacos." is also interpreted as "123." -> 123
Currently, in php-src these functions are used only for parsing ini values.
In this change we deprecate zend_atol(), zend_atoi(), and introduce a new
function with the same behavior, but with the ability to report invalid inputs
to the caller. The function's name also makes the behavior less unexpected:
zend_ini_parse_quantity().
Co-authored-by: Sara Golemon <pollita@php.net>
* Fix regression from GH-8587
Streams hold a reference to the stream wrapper. User stream wrappers
must not be released until the streams themselves are closed.
* Add test for directories
Because the UID= and PWD= values are appended to the SQLDriverConnect
case when credentials are passed, we have to append them to the string
in case users are relying on this behaviour. However, they must be
quoted, or the arguments will be invalid (or possibly more injected).
This means users had to quote arguments or append credentials to the raw
connection string themselves.
It seems that ODBC quoting rules are consistent enough (and that
Microsoft trusts them enough to encode into the .NET BCL) that we can
actually check if the string is already quoted (in case a user is
already quoting because of this not being fixed), and if not, apply the
appropriate ODBC quoting rules.
This is because the code exists in main/, and are shared between
both ODBC extensions, so it doesn't make sense for it to only exist
in one or the other. There may be a better spot for it.
Closes GH-8307.
* Use arena in DCE instead of multiple alloca()
This requires passing the optimizer context
* Use our do_alloca() instead of alloca()
* Use emalloc in DEBUG builds instead of stack allocations for do_alloca()
This helps detecting that we correctly free do_alloca()
This issue might happen if there is change of the fcgi stream when
the buffer is full. Then the empty record is created which signals
end of stream which is incorrect.
The actual fix without a test was contributed by GitHub user @loveharmful
in GH-3198.
When you choose to bind() to a local address and connect to a remote host
the kernel does not know if socket is listener or whatnot and has to reserve
a port within a ~32k range on which you are also competing with other
applications bound to the same address, this is easy to exhaust and get a
EADDRINUSE.
The linux kernel implemented IP_BIND_ADDRESS_NO_PORT on
90c337da15
which delays the port allocation until the 4-tuple is known in case source port
is 0.
CVS date not changed as in fact the actual version is related to an earier date
in reality.
They ditched the `uintptr_t`cast finally. While at it
updating the C style definitions.
Closes GH-8389.
copy_file_range() is a Linux-specific system call which allows
efficient copying between two file descriptors, eliminating the need
to transfer data from the kernel to userspace and back. For
networking file systems like NFS and Ceph, it even eliminates copying
data to the client, and local filesystems like Btrfs and XFS can
create shared extents.
Apparently, this has been forgotten when PHP 8.0.17RC1 and 8.0.18RC1
had been tagged.
We also fix the version of the fix for GH-8253, which didn't make it
into PHP 8.0.18RC1.
* ext/oci8: use zend_string_equals()
Eliminate duplicate code.
* main/php_variables: use zend_string_equals_literal()
Eliminate duplicate code.
* Zend/zend_string: add zend_string_equals_cstr()
Allows eliminating duplicate code.
* Zend, ext/{opcache,standard}, main/output: use zend_string_equals_cstr()
Eliminate duplicate code.
* Zend/zend_string: add zend_string_starts_with()
* ext/{opcache,phar,spl,standard}: use zend_string_starts_with()
This adds missing length checks to several callers, e.g. in
cache_script_in_shared_memory(). This is important when the
zend_string is shorter than the string parameter, when memcmp()
happens to check backwards; this can result in an out-of-bounds memory
access.
This is achieved by tracking the observers on the run_time_cache (with a fixed amount of slots, 2 for each observer).
That way round, if the run_time_cache is freed all associated observer data is as well.
This approach has been chosen, as to avoid any ABI or API breakage.
Future versions may for example choose to provide a hookable API for run_time_cache freeing or similar.
Closes GH-7847
Closes GH-7852
Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.
The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
When bug 77574[1] has been fixed, the fix only catered to variables
retrieved via `getenv()` with a `$varname` passed, but neither to
`getenv()` without arguments nor to the general import of environment
variables into `$_ENV` and `$_SERVER`. We catch up on this by using
`GetEnvironmentStringsW()` in `_php_import_environment_variables()` and
converting the encoding to whatever had been chosen by the user.
[1] <https://bugs.php.net/bug.php?id=75574>
Closes GH-7928.
Unfortunately, libedit is locale based and does not accept UTF-8
input when the C locale is used. This patch switches the default
locale to C.UTF-8 instead (if it is available). This makes libedit
work and I believe it shouldn't affect behavior of single-byte
locale-dependent functions that PHP otherwise uses.
Closes GH-7635.
Since we're going to read from the current stream position anyway, the
`max_len` should be the size of the file minus the current position
(still catering to potentially filtered streams). We must, however,
make sure to cater to the file position being beyond the actual file
size.
While we're at, we also fix the step size in the comment, which is 8K.
A further optimization could be done for unfiltered streams, thus
saving that step size, but 8K might not be worth it.
Closes GH-7693.
On x86_64 glibc memrchr() uses SSE/AVX CPU extensions and works much
faster then naive loop. On x86 32-bit we still use inlined version.
memrchr() is a GNU extension. Its prototype becomes available when
<string.h> is included with defined _GNU_SOURCE macro. Previously, we
defined it in "php_config.h", but some sources may include <string.h>
befire it. To avod mess we also pass -D_GNU_SOURCE to C compiler.
While testing the cPanel usage of PHP-FPM, we stumbled on this bug.
Without the fix, the zend_string is corrupted and getting odd filenames
When using FPM we kept getting "No input file specified".
I work for cPanel and we use PHP extensively.
- for packed arrays we store just an array of zvals without keys.
- the elements of packed array are accessible throuf as ht->arPacked[i]
instead of ht->arData[i]
- in addition to general ZEND_HASH_FOREACH_* macros, we introduced similar
familied for packed (ZEND_HASH_PACKED_FORECH_*) and real hashes
(ZEND_HASH_MAP_FOREACH_*)
- introduced an additional family of macros to access elements of array
(packed or real hashes) ZEND_ARRAY_ELEMET_SIZE, ZEND_ARRAY_ELEMET_EX,
ZEND_ARRAY_ELEMET, ZEND_ARRAY_NEXT_ELEMENT, ZEND_ARRAY_PREV_ELEMENT
- zend_hash_minmax() prototype was changed to compare only values
Because of smaller data set, this patch may show performance improvement
on some apps and benchmarks that use packed arrays. (~1% on PHP-Parser)
TODO:
- sapi/phpdbg needs special support for packed arrays (WATCH_ON_BUCKET).
- zend_hash_sort_ex() may require converting packed arrays to hash.
Use ASCII case conversion instead of locale-dependent case conversion in
the following places:
* grapheme_stripos() and grapheme_strripos() in the "fast" path
* ldap_get_entries()
* oci_pconnect() for case folding of parameters when constructing a key
into the connection or session pool
* SoapClient: case folding of function names
* get_meta_tags(): case conversion of property names
* http stream wrapper: header names
* phpinfo(): anchor names
* php_verror(): docref URLs
* rfc1867.c: Content-Type boundary parameter name
* streams.c: stream protocol names
Using locale-dependent case folding for these cases is either
unnecessary or actively incorrect. These functions could have
misbehaved when used with certain locales (e.g. Turkish).
Closes GH-7511.
In particular, this allows using the hook without server_context.
The apache2handler implementation now checks that server_context
is available itself, as that's the implementation that cares
about it.
We need to avoid storing it in the first place, as we don't
really have a good place to release it later. If headers haven't
been sent yet, send_headers will do this. sapi_deactive happens
too late in the shutdown sequence and will result in leak reports.