It's possible to categorise the failures into 2 categories:
- Changed error message. In this case we either duplicate the test and
modify the error message. Or if the change in error message is
small, we use the EXPECTF matchers to make the test compatible with both
old and new versions of libxml2.
- Missing warnings. This is caused by a change in libxml2 where the
parser started using SAX APIs internally [1]. In this case the
error_type passed to php_libxml_internal_error_handler() changed from
PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
started to use the SAX handlers instead of the generic handlers.
However, for the SAX handlers the current input stack is empty, so
nothing is actually printed. I fixed this by falling back to a
regular warning without a filename & line number reference, which
mimicks the old behaviour. Furthermore, this change now also shows
an additional warning in a test which was previously hidden.
[1] 9a82b94a94
Closes GH-11162.
The Docbook parser module has been removed completely. Support for
XPointer locations (ranges and points) is disabled by default, and will
eventually be removed completely. Given that the maintainer comments
on the latter: "Be warned that this part of the code base is buggy and
had many security issues in the past", it seems to be prudent to no
longer build with XPointer locations support right away.
To be able to build against libxml2 2.10.0, we remove the export
definitions for Windows.
Closes GH-9358.
Add libxml_get_external_entity_loader(), which returns the currently
installed external entity loader, i.e. the value which was passed to
libxml_set_external_entity_loader() or null if no loader was installed
and the default entity loader will be used.
This allows libraries to save and restore the loader, controlling entity
expansion without interfering with the rest of the application.
Add macro Z_PARAM_FUNC_OR_NULL_WITH_ZVAL(). This allows us to get the
zval for a callable parameter without duplicating callable argument
parsing.
The saved zval keeps the object needed for fcc/fci alive, simplifying
memory management.
Fixes#76763.
@cname currently refers to the constant name in C. However, it is not always a (constant) name, but sometimes a function invocation, so naming it as @cvalue would be more appropriate.
Closes GH-7847
Closes GH-7852
Previously stripos/stristr would lowercase both the haystack and the
needle to reuse strpos. The approach in this PR is similar to strpos.
memchr is highly optimized so we're using it to search for the first
character of the needle in the haystack. If we find it we compare the
remaining characters of the needle manually.
The new implementation seems to perform about half as well as strpos (as
two memchr calls are necessary to find the next candidate).
The libxml based XML functions accepting a filename actually accept
URIs with possibly percent-encoded characters. Percent-encoded NUL
bytes lead to truncation, like non-encoded NUL bytes would. We catch
those, and let the functions fail with a respective warning.
This version of libxml introduced quite a few changes. Most of
them are differences in error reporting, while some also change
behavior, e.g. null bytes are no longer supported and xinclude
recursion is limited.
Closes GH-7030. Closes GH-7046.
Co-authored-by: Nikita Popov <nikic@php.net>
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
The `encoding` attribute of the XML declaration is optional; it is good
practice to use external encoding information where available if it is
missing. Thus, we check for `charset` info of `Content-Type` headers,
and see whether the encoding is supported.
We cater to trailing parameters and quoted-strings, but not to escaped
backslashes and quotes in quoted-strings, since no known character
encoding contains these anyway.
Co-authored-by: Michael Wallner <mike@php.net>
Closes GH-6747.
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.
Of course, zend_bool is retained as an alias.
This method was used to protect code against XXE processing attacks.
Since PHP now requires libxml >= 2.9.0 external entity loading no longer
needs to be disabled to prevent these attacks. It is disabled by default.
Also, the method has an unwanted side effect that causes a lot of
confusion: Parsing XML data from resources like files is no longer possible.
Closes GH-5867.
Since libxml version 2.9.0 external entity loading is disabled by default.
Bumping the version requirement means that XML processing in PHP is no
longer vulnerable to XXE processing attacks by default.