This version of libxml introduced quite a few changes. Most of
them are differences in error reporting, while some also change
behavior, e.g. null bytes are no longer supported and xinclude
recursion is limited.
Closes GH-7030. Closes GH-7046.
Co-authored-by: Nikita Popov <nikic@php.net>
According to the DOM standard, elements may only contain element, text,
processing instruction and comment nodes[1]. It is also specified that
a HierarchyRequestError should be thrown if a document is to be
inserted[2]. We follow that standard, and prevent the use-after-free
this way.
[1] <https://dom.spec.whatwg.org/#node-trees>
[2] <https://dom.spec.whatwg.org/#mutation-algorithms>
Closes GH-6765.
According to the DOM specification, this argument should be
nullable. It's also supposed to be a required argument, but
not changing that at this point.
This is an unavoidable breaking change to both the type and
parameter name.
The assertion that was supposed to prevent this was overly lax
and accepted any object type for string parameters.
libxml2 has no particular issues parsing HTML strings with NUL bytes;
these just cause truncation of the current text content, but parsing
continues generally. Since `::loadHTMLFile()` already supports NUL
bytes, `::loadHTML()` should as well.
Note that this is different from XML, which does not allow any NUL
bytes.
Closes GH-6368.
Not all extensions consistently throw exceptions when the user passes
a path name containing null bytes. Also, some extensions would throw
a ValueError while others would throw a TypeError. Error messages
also varied.
Now a ValueError is thrown after all failed path checks, at least for
as far as these occur in functions that are exposed to userland.
Closes GH-6216.
In preparation for generating method signatures for the manual.
This change gets rid of bogus false|null return types, a few unnecessary trailing backslashes, and settles on using ? when possible for nullable types.
From an engine perspective, named parameters mainly add three
concepts:
* The SEND_* opcodes now accept a CONST op2, which is the
argument name. For now, it is looked up by linear scan and
runtime cached.
* This may leave UNDEF arguments on the stack. To avoid having
to deal with them in other places, a CHECK_UNDEF_ARGS opcode
is used to either replace them with defaults, or error.
* For variadic functions, EX(extra_named_params) are collected
and need to be freed based on ZEND_CALL_HAS_EXTRA_NAMED_PARAMS.
RFC: https://wiki.php.net/rfc/named_params
Closes GH-5357.
Userland classes that implement Traversable must do so either
through Iterator or IteratorAggregate. The same requirement does
not exist for internal classes: They can implement the internal
get_iterator mechanism, without exposing either the Iterator or
IteratorAggregate APIs. This makes them usable in get_iterator(),
but incompatible with any Iterator based APIs.
A lot of internal classes do this, because exposing the userland
APIs is simply a lot of work. This patch alleviates this issue by
providing a generic InternalIterator class, which acts as an
adapater between get_iterator and Iterator, and can be easily
used by many internal classes. At the same time, we extend the
requirement that Traversable implies Iterator or IteratorAggregate
to internal classes as well.
Closes GH-5216.
The hash is used to check whether the arginfo file needs to be
regenerated. PHP-Parser will only be downloaded if this is actually
necessary.
This ensures that release artifacts will never try to regenerate
stubs and thus fetch PHP-Parser, as long as you do not modify any
files.
Closes GH-5739.