Internal functions error when too many arguments are passed. Make
this part of the verification we do in debug builds. This will
help avoid cases where an argument is missing in the stubs,
as recently encountered in 6d96f0f.
Bug that regularly sneaks in: ZEND_ACC_FINAL is set before calling
zend_register_internal_class() and promptly gets ignored. Remove
this footgun by preserving flags from the original CE.
In line with usual rules, give untyped properties a null default
value. Otherwise constructor promotion would give you a property
declaration that cannot be achieved through any other means.
Currently, unexpected tokens in the parser are shown as the text
found, plus the internal token name, including the notorious
"unexpected '::' (T_PAAMAYIM_NEKUDOTAYIM)".
This commit replaces that with a more user-friendly format, with
two main types of token:
* Tokens which always represent the same text are shown like
'unexpected token "::"' and 'expected "::"'
* Tokens which have variable text are given a user-friendly
name, and show like 'unexpected identifier "foo"', and
'expected identifer'.
A few tokens have special cases:
* unexpected token """ -> unexpected double-quote mark
* unexpected quoted string "'foo'" -> unexpected single-quoted
string "foo"
* unexpected quoted string ""foo"" -> unexpected double-quoted
string "foo"
* unexpected illegal character "_" -> unexpected character 0xNN
(where _ is almost certainly a control character, and NN is the
hexadecimal value of the byte)
The \ token has a special case in the implementation just to stop
bison making a mess of escaping it and it coming out as \\
When performing an RW modification of an array offset, the undefined
offset warning may call an error handler / OB callback, which may
destroy the array we're supposed to change. Detect this by temporarily
incrementing the reference count. If we find that the array has been
modified/destroyed in the meantime, we do nothing -- the execution
model here would be that the modification has happened on the destroyed
version of the array.
Some code paths were checking this manually, but we can turn this
into a general assertion to avoid surprises (functions returning
failure without throwing).
I haven't tracked down in detail where the interaction with
increment_function comes from, but the root problem here is failure
to handle the illegal offset type exception.
In the interest of avoiding side-effects during dumping, I'm
replacing the value with a <constant ast> string instead of
performing an update constant operation.
As part of my work on typed class constants, I wanted to make a
separate pull request for unrelated changes.
These include:
* Moving from ast->child[0]->attr to ast->attr
* Making zend_ast_export_ex() export class constants' visibility
* Extracting an additional zend_compile_class_const_group() function
Closes GH-5812.
Fixes a use-after-free encountered in Symfony's SecurityBundle.
I don't have a reproducer for this, and believe the issue can only
occur if we leak an iterator (the leak is a separate issue).
We should not free the generator iterator here, because we do not
own it. The code that fetched the iterator is responsible for
releasing it. In the rare case where we do hit this code-path,
we cause a use-after-free.
exit() is now internally implemented by throwing an exception,
performing a normal stack unwind and a clean shutdown. This ensures
that no persistent resource leaks occur.
The exception is internal, cannot be caught and does not result in
the execution of finally blocks. This may be relaxed in the future.
Closes GH-5768.
Don't include a trailing newline in T_COMMENT tokens, instead leave
it for a following T_WHITESPACE token. The newline does not belong
to the comment logically, and this makes for an ugly special case,
as other tokens do not include trailing newlines.
Whitespace-sensitive tooling will want to either forward or backward
emulate this change.
Closes GH-5182.
Make user-exposed sorts stable, by storing the position of elements
in the original array, and using those positions as a fallback
comparison criterion. The base sort is still hybrid q/insert.
The use of true/false comparison functions is deprecated (but still
supported) and should be replaced by -1/0/1 comparison functions,
driven by the <=> operator.
RFC: https://wiki.php.net/rfc/stable_sorting
Closes GH-5236.
This avoids ubsan warnings. Alternatively we could always initialize
iterator_funcs_ptr for aggregates, instead of doing so only for
non-internal ones.
Userland classes that implement Traversable must do so either
through Iterator or IteratorAggregate. The same requirement does
not exist for internal classes: They can implement the internal
get_iterator mechanism, without exposing either the Iterator or
IteratorAggregate APIs. This makes them usable in get_iterator(),
but incompatible with any Iterator based APIs.
A lot of internal classes do this, because exposing the userland
APIs is simply a lot of work. This patch alleviates this issue by
providing a generic InternalIterator class, which acts as an
adapater between get_iterator and Iterator, and can be easily
used by many internal classes. At the same time, we extend the
requirement that Traversable implies Iterator or IteratorAggregate
to internal classes as well.
Closes GH-5216.
While performing resource -> object migrations, we're adding
defensive classes that are final, non-serializable and non-clonable
(unless they are, of course). This path adds a ZEND_ACC_NO_DYNAMIC_PROPERTIES
flag, that also forbids the creation of dynamic properties on these objects.
This is a subset of #3931 and targeted at internal usage only
(though may be extended to userland at some point in the future).
It's already possible to achieve this (what the removed
WeakRef/WeakMap code does), but there's some caveats: First, this
simple approach is only possible if the class has no declared
properties, otherwise it's necessary to special-case those
properties. Second, it's easy to make it overly strict, e.g. by
forbidding isset($obj->prop) as well. And finally, it requires a
lot of boilerplate code for each class.
Closes GH-5572.
The hash is used to check whether the arginfo file needs to be
regenerated. PHP-Parser will only be downloaded if this is actually
necessary.
This ensures that release artifacts will never try to regenerate
stubs and thus fetch PHP-Parser, as long as you do not modify any
files.
Closes GH-5739.
- Fix typo in build/php.m4
- Nothing uses HAVE_INTTYPES_H; so remove check for header file
- Nothing defines ZEND_ACCONFIG_H_NO_C_PROTOS; so remove #ifndef
- `format_money` was removed in 2019, so <monetary.h> no longer needed
- Nothing uses HAVE_NETDB_H; so remove check for header file
- Nothing checks HAVE_TERMIOS_H; so remove check for header file
(This was actually added when Wez Furlong was adding the original implementation of
PTY support in `proc_open`, since replaced.)
- Nothing checks HAVE_SYS_AUXV_H; so remove check for header file
- PHP_BUILD_DATE variable is not used for anything, so remove it
This variable was added to the Makefile, but from there, was not used for anything.
The comments suggest it was intended to allow 'reproducible builds'. Presumably,
this means that if a bug is found in a PHP binary somewhere, one could look at the
Makefile which it was built from, see the date, and then could check the same
code version out from source control. But... there can easily be multiple commits
to the repo in the same day. Also, what makes us think that the Makefile which a
binary was built from will be easily available?
Besides, ext/standard/info.c already embeds the build date and time in each binary...
but it does it using `__DATE__` and `__TIME__` (see `php_print_info`).
- Nothing checks HAVE_FINITE; so don't check for function
- Grammar fix to comment in build/php.m4
- Nothing sets $php_ldflags_add_usr_lib variable in configure, so remove conditional
This was added in 2002, when Rasmus was having difficulty building PHP on some
host and needed to have /usr/lib in the rpath. It was never documented and
probably has never been used by anyone else.
One of the weirdest pieces of PHP code I've ever seen. In terms
of tokens, this gets internally translated to
use x as y; echo as my_echo;
On master it crashes because this "echo" does not have attached
identifier metadata. Make sure it is added and then reject the
use of "<?=" as an identifier inside zend_lex_tstring.
Fixes oss-fuzz #23547.
In 1999, inline optimization was turned off by default. The commit log indicates this was
done because GCC was running out of memory on some hosts when building the Zend executor.
In 2003, inline optimization was re-enabled by default, but a build option was added to
turn it off if one runs out of memory when building.
Computing hardware has come a long way since 2003 and I doubt that anyone is running out
of memory when building PHP now.
Interestingly, this code set an unused variable called `INLINE_CFLAGS`. It actually
disabled inline optimization by adding -O0 to the build command, not using `INLINE_CFLAGS`.
Just to see how much memory GCC/Make are using when building PHP, I tried building with
successively higher values of `ulimit -v` until it succeeded. Interestingly, while most
of the codebase can be built with about 400MB of memory, ext/fileinfo/libmagic/apprentice.c
requires 1.2GB, doubtless because it includes ext/fileinfo/data_file.c, which is more
than 350,000 lines long. That is with GCC 7.5.0.
Most users get PHP as a binary package anyways, so the question is, are *packagers*
of PHP trying to build on machines with just 1GB RAM? And would they want to package
a PHP interpreter built with *no optimizations*? I can't imagine either being true.
We should be doing this anyway to prevent stack overflow, but on
master this is important for an additional reason: The temporary
GC buffer provided for get_gc handlers may get reused if the scan
is performed recursively instead of indirected via the GC stack.
This fixes oss-fuzz #23350.
The (void)_dummy is apparently considered a read of an uninitialized
variable. As it is a _Bool now, which has trap representations, this
is no longer considered legal and results in somewhat odd ubsan
warnings of the form:
runtime error: load of value 0, which is not a valid value for type 'zend_bool' (aka 'bool')
Similar to 097043db2a, but for the
zend_call_method() API. I don't think we ever use this for
static methods, but this logic shouldn't be there. If you want
to inherit the active LSB scope for some reason, do so explicitly.
Replace EG(autoload_func) with a C level zend_autoload hook.
This avoids having to do one indirection through PHP function
calls. The need for EG(autoload_func) was a leftover from the
__autoload() implementation.
Additionally, drop special-casing of spl_autoload(), and instead
register it just like any other autoloading function. This fixes
bug #71236 as a side-effect.
Finally, change spl_autoload_functions() to always return an array.
The distinction between false and an empty array no longer makes
sense here.
Closes GH-5696.
Formerly, this had to be enabled by passing the configuration flag
`--enable-crt-debug`; now it can be enabled by setting the environment
variable `PHP_WIN32_DEBUG_HEAP`. The advantage is that it is no longer
necessary to do separate builds, at the cost of a very minor
performance penalty during process startup.
In module startup stage, we should not initiliaze
EG(modified_ini_directives) as it use zend MM, the zend MM will be
restart at the end of modules startup stage,
by say "partial", because this issue still exists if altering ZEND_USER
inis, we should add a zend_ini_deactive at the end of modules startup
stage, but it brings some new cost, and I think no one would do things
like that
We regularly find new places where we forgot to reset fake_scope.
Instead of having to handle this for each caller of zend_call_function()
and similar APIs, handle it directly in zend_call_function().
This adds the following APIs:
void zend_call_known_function(
zend_function *fn, zend_object *object, zend_class_entry *called_scope,
zval *retval_ptr, int param_count, zval *params);
void zend_call_known_instance_method(
zend_function *fn, zend_object *object, zval *retval_ptr, int param_count, zval *params);
void zend_call_known_instance_method_with_0_params(
zend_function *fn, zend_object *object, zval *retval_ptr);
void zend_call_known_instance_method_with_1_params(
zend_function *fn, zend_object *object, zval *retval_ptr, zval *param);
void zend_call_known_instance_method_with_2_params(
zend_function *fn, zend_object *object, zval *retval_ptr, zval *param1, zval *param2);
These are used to perform a call if you already have the
zend_function you want to call. zend_call_known_function()
is the base API, the rest are just really thin wrappers around
it for the common case of instance method calls.
Closes GH-5692.
Don't treat the !fn_proxy && !obj_ce case differently. There doesn't
seem to be any need for it, and it will result in subtly different
behavior (e.g. it will accept "Foo::bar" syntax, but break as soon
as you pass in an fn_proxy cache).
Add ZVAL_CHAR/RETVAL_CHAR/RETURN_CHAR as a shortcut for using
ZVAL_INTERNED_STRING and ZSTR_CHAR.
Add zend_string_init_fast() as a helper for the empty string /
one char interned string / zend_string_init() pattern.
Also add corresponding ZVAL_STRINGL_FAST etc macros.
Closes GH-5684.
This is a bit tricky: In this cases we have "namespace as", which
means that we will only recognize "namespace" as an identifier when
the lookahead token is already at the "as". This means that
zend_lex_tstring picks up the wrong identifier.
We solve this by actually assigning the identifier as the semantic
value on the parser stack -- as in almost all cases we will not
actually need the identifier, this is just an (offset, size)
reference, not a copy of the string.
Additionally, we need to teach the lexer feedback mechanism used
by tokenizer TOKEN_PARSE mode to apply feedback to something
other than the very last token. To that purpose we pass through
the token text and check the tokens in reverse order to find the
right one.
Closes GH-5668.