1396 Commits

Author SHA1 Message Date
Gabriel Caruso
2238403892 Trailing whitespaces on ext/*
Signed-off-by: Gabriel Caruso <carusogabriel34@gmail.com>
2018-01-04 02:38:32 -02:00
Gabriel Caruso
6400264856 Trailing whitespaces
Signed-off-by: Gabriel Caruso <carusogabriel34@gmail.com>
2018-01-03 14:38:00 +01:00
Xinchen Hui
a76eeea736 Merge branch 'PHP-7.2'
* PHP-7.2:
  Happy new year (Update copyright to 2018)

Conflicts:
	ext/phar/LICENSE
2018-01-03 16:02:15 +08:00
Xinchen Hui
0e62639d28 Merge branch 'PHP-7.1' into PHP-7.2
* PHP-7.1:
  Happy new year (Update copyright to 2018)
2018-01-03 16:00:34 +08:00
Lior Kaplan
fbfdd1e1c4 Happy new year (Update copyright to 2018) 2018-01-02 23:42:29 +02:00
Xinchen Hui
a6519d0514 year++ 2018-01-02 12:57:58 +08:00
Xinchen Hui
7a7ec01a49 year++ 2018-01-02 12:55:14 +08:00
Xinchen Hui
ccd4716ec7 year++ 2018-01-02 12:53:31 +08:00
Dmitry Stogov
b864e6b58c Move constants into read-only data segment 2017-12-15 01:55:00 +03:00
Dmitry Stogov
83e495e0fd Move constants into read-only data segment 2017-12-14 22:14:36 +03:00
Dmitry Stogov
9e709e2fa0 Move constants into read-only data segment 2017-12-14 18:43:44 +03:00
Dmitry Stogov
185478d07e Use cheaper SEPARATE macros 2017-12-07 22:35:17 +03:00
Dmitry Stogov
6a9d2b2190 Cleanup type conversion 2017-12-07 19:24:55 +03:00
Nikita Popov
d21c902841 Fix cp950 pua check
One set of parentheses was missing, causing a legitimate compiler
warning. In the end it doesn't actually matter, because it just
ends up doing an unnecessary check in the w > 0 case.

This fixes the logic and moves it out into a separate function,
to be a bit more readable.
2017-11-22 23:47:18 +01:00
Colin O'Dell
201930106d Add test for negative lengths in mb_strcut() 2017-11-22 22:47:55 +01:00
Colin O'Dell
830d87b86e Add tests for mb_language() 2017-11-22 22:47:55 +01:00
Joe Watkins
21e4ab1977
Merge branch 'PHP-7.2'
* PHP-7.2:
  Fix proto documents for new global functions
2017-11-06 07:24:51 +00:00
Tyson Andre
5cdf37e603
Fix proto documents for new global functions
See NEWS and UPGRADING (or arginfo/implementation) for details.
2017-11-06 07:24:42 +00:00
Dmitry Stogov
3b2e858304 Overload functions once in MINIT (instead of on each request in RINIT) 2017-11-02 14:09:06 +03:00
Dmitry Stogov
ed5b4d5c99 Use Zend MM heap 2017-11-01 02:38:26 +03:00
Nikita Popov
251c1b1a44 Fix invalid read in mb_ord() 2017-10-28 16:44:32 +02:00
Peter Kokot
5c5bd30339 Remove --with-libmbfl configure option
The bundled libmbfl library is no longer API or ABI compatible with
the (currently unmaintained) upstream library. As such, building
against an external libmbfl is no longer possible.
2017-10-28 16:11:30 +02:00
Dmitry Stogov
9cf87aa196 Avoid HashTable allocations for empty arrays (using zend_empty_array). 2017-10-24 17:27:31 +03:00
Peter Kokot
3ed3bc3a0c Update README information for the libmbfl library
The libmbfl library is bundled with PHP and has its own repository for
development and bug fixes. To avoid confusion and speed up development, the
README has been updated to include information about the original library and
to describe the bundled library as a fork of the upstream repository.
2017-10-08 17:51:02 +02:00
Peter Kokot
a57de26c3d Refactor mbstring READMEs 2017-10-08 17:51:02 +02:00
Dmitry Stogov
45ee78e040 mb_convert_variables() refactored to use simple recursion.
Fixed incorrect recursion protection (previous implementation kept protection flag or apply counter in non-zero state).
2017-10-06 12:08:55 +03:00
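As a rough illustration of the idea in this entry (not the actual C implementation), the sketch below converts a nested structure by simple recursion and always clears its protection marker on the way out, so a guard cannot be left in a non-zero state. The function name `convert_variables`, the `_active` set, and raising `ValueError` on a cycle are all assumptions made for the example.

```python
def convert_variables(value, convert, _active=None):
    """Recursively apply `convert` to every string inside lists and dicts.

    `_active` holds the ids of containers currently being visited; it is a
    hypothetical stand-in for a recursion protection flag.
    """
    if _active is None:
        _active = set()
    if isinstance(value, str):
        return convert(value)
    if isinstance(value, (list, dict)):
        key = id(value)
        if key in _active:
            # The container is already on the current path: a real cycle.
            raise ValueError("cannot handle recursive references")
        _active.add(key)
        try:
            if isinstance(value, list):
                return [convert_variables(v, convert, _active) for v in value]
            return {k: convert_variables(v, convert, _active)
                    for k, v in value.items()}
        finally:
            # Always release the guard, even if an error occurred.
            _active.discard(key)
    return value
```

Because the guard is released in `finally`, a container that merely appears twice (without being cyclic) is converted both times instead of being mistaken for recursion.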
Dmitry Stogov
cb9d81ef4f Refactored recursion protection 2017-10-06 01:34:50 +03:00
Peter Kokot
39ea632f74 Join untracked files to root .gitignore 2017-10-05 12:36:47 +02:00
Dmitry Stogov
44e0b79ac6 Refactored array creation API. array_init() and array_init_size() are converted into macros calling zend_new_array(). They are not functions anymore and don't return any values. 2017-09-20 02:25:56 +03:00
Joe Watkins
c898349e16
fixes PR #2722, no clue how it broke ... 2017-09-06 11:13:27 +01:00
shinemotec@gmail.com
9b77615608
Fix broken compilation of the mbstring extension on Arch Linux 2017-09-06 09:50:08 +01:00
Nikita Popov
fea7957d08 Optimize mb_chr()
By avoiding an unnecessary copy between a string and a zend_string.
2017-08-04 22:38:54 +02:00
Nikita Popov
f24db7686e Optimize mb_ord()
Don't perform a full encoding conversion into UCS4-BE, instead only
perform an input conversion into a wchar device.
2017-08-04 22:22:58 +02:00
Nikita Popov
633a471ba0 Store input and output filters in mbfl encodings
For functions like mb_chr() and mb_ord() just looking up the
input/output filter for the encoding dominates the runtime. This
commit stores the input/output filter for an encoding in the
mbfl encoding structure, so it can be looked up directly, rather
than scanning through filter function lists.
2017-08-04 22:22:58 +02:00
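The difference between scanning a filter list and reading the filter straight from the encoding can be sketched as follows. This is only an illustration of the lookup pattern; `Encoding`, `FILTER_LIST`, and the filter functions are hypothetical names, not the mbfl structures.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Filter = Callable[[bytes], str]


def ascii_input(data: bytes) -> str:
    return data.decode("ascii")


def utf8_input(data: bytes) -> str:
    return data.decode("utf-8")


# Before: filters live in a separate list that must be scanned by name.
FILTER_LIST = [("ASCII", ascii_input), ("UTF-8", utf8_input)]


def find_input_filter_by_scan(name: str) -> Optional[Filter]:
    for filter_name, func in FILTER_LIST:
        if filter_name == name:
            return func
    return None


# After: the encoding record carries its own filter, so lookup is direct.
@dataclass
class Encoding:
    name: str
    input_filter: Optional[Filter] = None


UTF8 = Encoding("UTF-8", input_filter=utf8_input)
```

With the second layout, a caller that already holds `UTF8` reads `UTF8.input_filter` in constant time, which matters for short calls where the lookup would otherwise dominate.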
Nikita Popov
e20fbd43ba Separate mbfl filters into three categories
Input filters, output filters and special filters.
2017-08-04 22:22:58 +02:00
Nikita Popov
840b77c02e Merge branch 'PHP-7.2' 2017-08-04 22:20:11 +02:00
Nikita Popov
6b73b2d6eb Check for empty string in mb_ord() 2017-08-04 22:20:05 +02:00
Nikita Popov
4e4ec31e2e Merge branch 'PHP-7.2' 2017-08-04 13:02:44 +02:00
Nikita Popov
353f7bf461 Also check for invalid codepoints in mb_ord()
And return false in that case, instead of returning 0x3f...
2017-08-04 13:01:03 +02:00
Nikita Popov
5caf05f6c5 Merge branch 'PHP-7.2' 2017-08-03 22:41:15 +02:00
Nikita Popov
e53162a32b Return false on invalid codepoint in mb_chr()
Instead of returning the encoding of the current substitution
character. This allows a robust check for the failure case. The
substitution character (especially the default of "?") is also
a valid output of mb_chr() for a valid input (for "?" that would be
0x3f), so it's a bad choice for an error value.
2017-08-03 22:36:42 +02:00
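The reasoning in this entry can be sketched in a few lines: if failure is reported with a distinct value, a result of "?" can only mean that codepoint 0x3f was requested. The function `chr_or_none` is a hypothetical analogue, and the exact set of inputs it treats as invalid is an assumption.

```python
from typing import Optional


def chr_or_none(cp: int, encoding: str = "utf-8") -> Optional[bytes]:
    """Encode a Unicode codepoint, or return None when that is impossible."""
    # Assumed validity rule: inside the Unicode range and not a surrogate.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return None
    try:
        return chr(cp).encode(encoding)
    except UnicodeEncodeError:
        # Assumption: a codepoint the target encoding cannot represent
        # is also reported as a failure.
        return None
```

A caller can then test `result is None` without confusing an error with a legitimate question mark.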
Nikita Popov
41e9ba6333 Always use Unicode codepoints in mb_ord() and mb_chr()
Previously mb_chr() had two different encoding-dependent behaviors:
 * For "Unicode-encodings" it took a Unicode codepoint and returned
   its encoded representation.
 * Otherwise it returned a big-endian binary encoding of the passed
   integer.

Now the input is always interpreted as a Unicode codepoint. If
a big-endian binary encoding is what you want, you don't need
mbstring to implement that.
2017-08-03 22:14:00 +02:00
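To make the two behaviors described above concrete, here is a small sketch using Python's codecs as a stand-in for mbstring. The function names are hypothetical, and the minimal byte length used for the big-endian variant is an assumption, since the log does not state how the old code sized its output.

```python
def chr_unicode(cp: int, encoding: str) -> bytes:
    """Interpret cp as a Unicode codepoint and encode it in `encoding`."""
    return chr(cp).encode(encoding)


def chr_big_endian(cp: int) -> bytes:
    """Return the plain big-endian bytes of the integer itself."""
    length = max(1, (cp.bit_length() + 7) // 8)  # assumed: minimal length
    return cp.to_bytes(length, "big")
```

For U+20AC, `chr_unicode(0x20AC, "cp1252")` yields the single byte `0x80`, whereas `chr_big_endian(0x20AC)` yields the two bytes `0x20 0xAC`, which is not a character in that encoding at all. As the entry notes, the second form needs no string library.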
Nikita Popov
c98714f19e Merge branch 'PHP-7.2' 2017-08-03 21:57:35 +02:00
Nikita Popov
fb9bf5b64b Revert/fix substitution character fallback
The introduced checks were not correct in two respects:
 * It was checked whether the source encoding of the string matches
   the internal encoding, while the actually relevant encoding is
   the *target* encoding.
 * Even if the correct encoding is used, the checks are still too
   conservative. Just because something is not a "Unicode-encoding"
   does not mean that it does not map any non-ASCII characters.

I've reverted the added checks and instead adjusted mbfl_convert
to first try to use the provided substitution character and if
that fails, perform the fallback to '?' at that point. This means
that any codepoint mapped in the target encoding should now be
correctly supported and anything else should fall back to '?'.
2017-08-03 21:53:59 +02:00
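The fallback order described in this entry (the character itself, then the provided substitution character, then '?') can be sketched as below. Python's codecs stand in for the target encoding tables; `convert` is a hypothetical name and this is not the mbfl_convert code.

```python
def convert(text: str, target: str, substitute: str = "?") -> bytes:
    """Encode text into `target`, substituting unmappable characters."""
    out = bytearray()
    for ch in text:
        # Try the character, then the caller's substitute, then '?'.
        for candidate in (ch, substitute, "?"):
            try:
                out += candidate.encode(target)
                break
            except UnicodeEncodeError:
                continue
        # If even '?' cannot be encoded, the character is dropped
        # (an assumption made only to keep the sketch short).
    return bytes(out)
```

Note that the decision depends only on what the target encoding can represent, which is the point of the fix: the source or internal encoding plays no role in choosing the fallback.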
Nikita Popov
3d948d77d1 Merge branch 'PHP-7.2' 2017-08-03 21:17:26 +02:00
Nikita Popov
a8a9e93e9a Revert/fix mb_substitute_character() codepoint checks
The introduced checks did not treat "non-Unicode" encodings correctly,
because they treated the passed integer as encoded in the internal
encoding in that case, while in actuality the substitute character
is always a Unicode codepoint.

Additionally checking the codepoint against the internal encoding
is not correct in any case, because the substitution character must
be mapped in the *target* encoding of the conversion, which does
not necessarily coincide with the internal encoding (the internal
encoding is the default *source* encoding, not *target* encoding).

This reverts the checks back to simple range checks, but in a way
that still resolves #69079: Characters outside the Basic
Multilingual Plane are now accepted and Surrogate Codepoints are
rejected. A distinction between UTF-8 and non-UTF-8 encodings is
not made for surrogate checks (as in the original patch), as
surrogates are always illegal on their own. Specifying a surrogate
as substitution character would only make sense if you could
specify a substitution string with more than one character --
however we do not support that.
2017-08-03 21:12:41 +02:00
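A simple range check of the kind described above might look like the following sketch. The function name is hypothetical, and rejecting zero and negative values is an assumption; the entry itself only states that characters outside the Basic Multilingual Plane are accepted and surrogates are rejected.

```python
def is_valid_substitute_codepoint(cp: int) -> bool:
    """Range check for a substitution character given as a Unicode codepoint."""
    if cp <= 0 or cp > 0x10FFFF:  # lower bound is an assumption
        return False
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogates are never valid on their own, in any encoding.
        return False
    return True
```

The check deliberately ignores the internal encoding: whether the codepoint can actually be written is a property of the target encoding and is decided at conversion time.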
Nikita Popov
94fe629992 Merge branch 'PHP-7.2' 2017-08-02 18:11:17 +02:00
Nikita Popov
91240073ea Merge branch 'PHP-7.1' into PHP-7.2 2017-08-02 18:11:12 +02:00
Nikita Popov
63607375f5 Merge branch 'PHP-7.0' into PHP-7.1 2017-08-02 18:09:09 +02:00
Fabien Villepinte
2cc1cbf2f4 Fix Bug #75001: Wrong reflection on mb_eregi_replace 2017-08-02 18:08:42 +02:00