If buf_len is zero, this would leave behind a dangling pointer
to an already released header.str. Make sure this can't happen
by always overwriting the pointer.
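A minimal sketch of the pattern, not the actual patch: the header_buf
type and header_buf_set() are hypothetical stand-ins for the real
header structure, and only the "always overwrite the pointer" idea is
taken from the description above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the structure that owns header.str. */
typedef struct {
    char *str;
} header_buf;

/* Replace the stored header with a copy of the first buf_len bytes of buf.
 * The pointer is reset on every path, so it can never keep referring to
 * the memory that was just released. */
static void header_buf_set(header_buf *h, const char *buf, size_t buf_len)
{
    free(h->str);
    h->str = NULL;              /* always overwrite, even if buf_len == 0 */

    if (buf_len > 0) {
        h->str = malloc(buf_len + 1);
        if (h->str != NULL) {
            memcpy(h->str, buf, buf_len);
            h->str[buf_len] = '\0';
        }
    }
}
```

Without the unconditional reset, a call with buf_len == 0 would free the
old buffer and leave h->str pointing at it.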
Closes GH-7376.
The CURLOPT_DEBUGDATA will point to the old curl handle after
copying. Update it to point to the new handle.
We don't separately store whether CURLINFO_HEADER_OUT is enabled,
so I'm doing this unconditionally. It should be harmless if
CURLOPT_DEBUGFUNCTION is not used.
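To illustrate the problem without depending on libcurl, here is a
sketch with a hypothetical handle type whose debug_data member models
the pointer registered through CURLOPT_DEBUGDATA. The names are
assumptions made for the example, not the extension's real structures.

```c
#include <stdlib.h>

/* Hypothetical wrapper: debug_data models the CURLOPT_DEBUGDATA pointer,
 * which refers back to the wrapper that owns it. */
typedef struct handle {
    int header_out;
    struct handle *debug_data;
} handle;

static handle *handle_new(void)
{
    handle *h = calloc(1, sizeof(*h));
    if (h != NULL) {
        h->debug_data = h;
    }
    return h;
}

static handle *handle_copy(const handle *src)
{
    handle *dup = malloc(sizeof(*dup));
    if (dup == NULL) {
        return NULL;
    }
    *dup = *src;            /* at this point debug_data still refers to src */
    dup->debug_data = dup;  /* re-point it unconditionally at the copy */
    return dup;
}
```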
Currently, resource IDs are limited to 32 bits. As resource IDs
are not reused, resource ID overflow is a realistic possibility
for long-running processes.
This patch switches resource IDs to use zend_long instead, which
means that on 64-bit systems, 64-bit resource IDs will be used.
This makes resource ID overflow practically impossible.
The tradeoff is an 8 byte increase in zend_resource size.
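A rough back-of-the-envelope check of "practically impossible". The
allocation rate is a hypothetical parameter, and the helpers below are
only illustrative arithmetic, not code from the patch.

```c
#include <stdint.h>

/* Seconds until an ID space with max_id usable values is exhausted,
 * assuming a steady (hypothetical) allocation rate per second. */
static double seconds_to_exhaust(double max_id, double ids_per_second)
{
    return max_id / ids_per_second;
}

/* Signed 32-bit ID space. */
static double exhaust_32bit(double ids_per_second)
{
    return seconds_to_exhaust((double)INT32_MAX, ids_per_second);
}

/* Signed 64-bit ID space, as used by zend_long on 64-bit systems. */
static double exhaust_64bit(double ids_per_second)
{
    return seconds_to_exhaust((double)INT64_MAX, ids_per_second);
}
```

At an assumed one million resources per second, the 32-bit space lasts
about 36 minutes, while the 64-bit space lasts roughly 292,000 years.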
Closes GH-7436.
mbstring has always had the conversion tables to support CP932 codes
in ku 115-119, and the conversion code for CP5022x has an 'if' clause
specifically to handle such characters... but that 'if' clause was dead
code, since a guard clause earlier in the same function prevented it
from accepting 2-byte characters with a starting byte of 0x93-0x97.
Adjust the guard clause so that these characters can be converted as
the original author apparently intended.
The code which handles ku 115-119 is the part which reads:
    } else if (s >= cp932ext3_ucs_table_min && s < cp932ext3_ucs_table_max) {
        w = cp932ext3_ucs_table[s - cp932ext3_ucs_table_min];
Previously, mbstring had a special mode whereby it would convert
erroneous input byte sequences to output like "BAD+XXXX", where "XXXX"
would be the erroneous bytes expressed in hexadecimal. This mode could
be enabled by calling `mb_substitute_character("long")`.
However, accurately reproducing input byte sequences from the cached
state of a conversion filter is often tricky, and this significantly
complicates the implementation. Further, the means used for passing
the erroneous bytes through to where the "BAD+XXXX" text is generated
only allows for up to 3 bytes to be passed, meaning that some erroneous
byte sequences are truncated anyway.
More to the point, a search of publicly available PHP code indicates
that nobody is really using this feature.
Incidentally, this feature also provided error output like "JIS+XXXX"
if the input 'should have' represented a JISX 0208 codepoint, but it
decodes to a codepoint which does not exist in the JISX 0208 charset.
Similarly, specific error output was provided for non-existent
JISX 0212 codepoints, and likewise for JISX 0213, CP932, and a few
other charsets. All of that is now consigned to the flames.
However, "long" error markers also include a somewhat more useful
"U+XXXX" marker for Unicode codepoints which were successfully
decoded from the input text, but cannot be represented in the output
encoding. Those are still supported.
With this change, there is no need to use a variety of special values
in the high bits of a wchar to represent different types of error
values. We can (and will) just use a single error value. This will be
equal to -1.
One complicating factor: Text conversion functions return an integer to
indicate whether the conversion operation should be immediately
aborted, and the magic 'abort' marker is -1. Also, almost all of these
functions would return the received byte/codepoint to indicate success.
That doesn't work with the new error value; if an input filter detects
an error and passes -1 to the output filter, and the output filter
returns it back, that would be taken to mean 'abort'.
Therefore, amend all these functions to return 0 for success.
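A small sketch of why the old return convention clashes with the new
error value. WCHAR_ERROR, FILTER_ABORT, the sink type and both output
functions are hypothetical names for illustration, not mbstring's real
identifiers.

```c
/* Hypothetical names; both values are -1, as described above. */
#define WCHAR_ERROR  (-1)   /* single error value passed between filters */
#define FILTER_ABORT (-1)   /* return value meaning "stop converting now" */

typedef struct {
    int errors;    /* number of error markers received */
    int written;   /* number of ordinary codepoints received */
} sink;

static void sink_consume(sink *s, int c)
{
    if (c == WCHAR_ERROR) {
        s->errors++;
    } else {
        s->written++;
    }
}

/* Old convention: echo the received value back on success.
 * For c == WCHAR_ERROR this is indistinguishable from FILTER_ABORT. */
static int output_old(int c, sink *s)
{
    sink_consume(s, c);
    return c;
}

/* New convention: 0 always means success, so -1 can only mean abort. */
static int output_new(int c, sink *s)
{
    sink_consume(s, c);
    return 0;
}
```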
When more than INT_MAX resources are created, throw a fatal error,
rather than reusing already allocated IDs, which will result in
assertion failures or crashes down the line.
This doesn't fix the fundamental problem, but makes the failure
more graceful with an obvious cause.
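The check can be sketched as follows. The id_allocator type and
id_alloc() are hypothetical; the real code lives in the engine's
resource list handling and raises a fatal error instead of returning
a status code.

```c
#include <limits.h>

/* Hypothetical allocator that hands out IDs 1..INT_MAX exactly once. */
typedef struct {
    int last_id;
} id_allocator;

/* Stores the next ID in *out and returns 0, or returns -1 once the ID
 * space is exhausted. The comparison happens before the increment, so
 * the counter never overflows and no ID is ever handed out twice. */
static int id_alloc(id_allocator *a, int *out)
{
    if (a->last_id == INT_MAX) {
        return -1;  /* the caller would raise the fatal error here */
    }
    *out = ++a->last_id;
    return 0;
}
```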
Inspired by https://bugs.php.net/bug.php?id=81399.
Closes GH-7428.
If we assemble a zend_string manually, we need to end it with a NUL
byte ourselves.
We also fix the size calculation for that zend_string; there is no need
for the extra byte for each part, and we don't have to multiply by two,
since we're using DnsQuery_A(), not DnsQuery_W() (in which case we
would have to do the character set conversion, anyway). This avoids
over-allocation, and the need to explicitly set the string length.
Finally, we use the proper access macro for zend_strings.
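The sizing and termination rules can be sketched with a hypothetical
length-prefixed string type. lstring and lstring_join() are stand-ins
for illustration and do not use the real zend_string API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string, loosely modelled on the idea of
 * a zend_string: a stored length followed by the bytes and one NUL. */
typedef struct {
    size_t len;
    char val[];
} lstring;

/* Join count single-byte parts into one string. The allocation is the
 * sum of the part lengths plus one byte for the terminator: no extra
 * byte per part and no doubling for wide characters. */
static lstring *lstring_join(const char *const *parts, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += strlen(parts[i]);
    }

    lstring *s = malloc(sizeof(*s) + total + 1);
    if (s == NULL) {
        return NULL;
    }
    s->len = total;

    char *p = s->val;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(parts[i]);
        memcpy(p, parts[i], n);
        p += n;
    }
    *p = '\0';  /* a manually assembled string must be terminated by hand */
    return s;
}
```

Because the length is computed up front, it can be stored once at
allocation time and does not need to be patched afterwards.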
Closes GH-7427.