In some places, we need to make sure that no warnings are thrown
due to unknown encoding. The error reporting code tried to avoid
this by determining a "safe charset", but this introduces subtle
discrepancies in which charset is picked (normally
internal_encoding takes precedence). Avoid this by suppressing
the warning in the first place.
While here, use the fallback logic to print error messages with
substitution characters more consistently, to avoid skipping
parts of the error message entirely.
Treatment of locales in PHP is currently inconsistent: The LC_ALL
locale is set to "C", as is standard behavior on program startup.
The LC_CTYPE locale is set to "", which will inherit it from the
environment. However, the inherited LC_CTYPE locale will only be
used in some cases, while in other cases it is necessary to perform
an explicit setlocale() call in PHP first. This is the case for
the locale-sensitive handling in the PCRE extension.
Make things consistent by *never* inheriting any locales from the
environment. LC_ALL, including LC_CTYPE will be "C" on startup.
A locale can be set or inherited through an explicit setlocale()
call, at which point the behavior will be fully consistent and
predictable.
Closes GH-5488.
Even if default_charset is set to "", we should still return
"UTF-8" as the default value here. Setting default_charset to ""
suppresses the header emission, but shouldn't change anything
about our encoding defaults.
This is a backport of fcdc0a6db0
to the PHP-7.3 branch. We need to make sure that OnUpdateString
is also called for a NULL value, otherwise the reset of the encoding
at the end of the request will not work.
I believe I already tried to land this before once, but it didn't
actually end up on the PHP-7.3 branch due to a push conflict that
I only noticed just now.
We need to update the value even if new_value is NULL. In particular,
it should be reset back to NULL after each request if the setting was
not specified on startup. Otherwise we leave dangling pointers.
There are a few parts here:
* opcache should not be blocking signals while invoking compile_file,
otherwise signals may remain blocked on a compile error. While at
it, also protect SHM memory during compile_file.
* We should deactivate Zend signals at the end of the request, to make
sure that we gracefully recover from a missing unblock and signals
don't remain blocked forever.
* We don't use a critical section in deactivation, because it should
not be necessary. Additionally we want to clean up the signal queue,
if it is non-empty.
* Enable SIGG(check) in debug builds so we notice issues in the future.
If we're including a file via PHP streams, we're not going to trust
the reported file size anyway and populate in a loop -- so don't
bother determining the file size in the first place. Only do this
for non-tty HANDLE_FP now, which is the only case where this
information was used.
Disable buffering in PHP streams, to avoid storing and copying the
file contents twice.
This will call stream_set_option() on custom stream wrapper as
well, so the method needs to be implemented to avoid a warning.
Using mmap() is unsafe under concurrent modification. If the file
is truncated, access past the end of the file may occur, which will
generate a SIGBUS error. Even if the length does not change, the
contents may, which is a situation that the lexer certainly is not
prepared to deal with either.
Reproduce with test.php:
<?php
file_put_contents(__DIR__ . '/test.tpl',
'AAA<?php $string = "' .
str_repeat('A', mt_rand(1, 256 * 1024)) .
'"; ?>BBB' . "\r\n");
require_once __DIR__ . '/test.tpl';
And:
for ((n=0;n<100;n++)); do sapi/cli/php test.php & done
Instead of handling shebang lines by adjusting the file pointer in
individual SAPIs, move the handling into the lexer, where this is
both a lot simpler and more robust. Whether the shebang should be
skipped is controlled by CG(skip_shebang) -- we might want to do
that in more cases.
This fixed bugs #60677 and #78066.