Merge branch 'PHP-5.5' of git.php.net:php-src into PHP-5.5

Pierre Joye 2013-03-04 14:06:09 +01:00
commit e9a2642c89
53 changed files with 13054 additions and 7633 deletions

15
NEWS

@ -10,6 +10,9 @@ PHP NEWS
. Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
(Gustavo)
- PCRE:
. Merged PCRE 8.32. (Anatol)
21 Feb 2013, PHP 5.5.0 Alpha 5
- Core:
@ -60,6 +63,18 @@ PHP NEWS
- Filter:
. Implemented FR #49180 - added MAC address validation. (Martin)
- Phar:
. Fixed timestamp update on Phar contents modification. (Dmitry)
- SPL:
. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
(patch by kriss@krizalys.com, Laruence)
. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
(Nikita Popov)
. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
(Mike Willbanks)
- SNMP:
. Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)

344
NEWS-5.5 Normal file

@ -0,0 +1,344 @@
PHP NEWS
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
?? ??? 201?, PHP 5.5.0 Beta 1
- Core:
. Fixed bug #49348 (Uninitialized ++$foo->bar; does not cause a notice).
(Stas)
- Sockets:
. Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
(Gustavo)
- PCRE:
. Merged PCRE 8.32. (Anatol)
21 Feb 2013, PHP 5.5.0 Alpha 5
- Core:
. Implemented FR #64175 (Added HTTP codes as of RFC 6585). (Jonh Wendell)
. Fixed bug #64135 (Exceptions from set_error_handler are not always
propagated). (Laruence)
. Fixed bug #63830 (Segfault on undefined function call in nested generator).
(Nikita Popov)
. Fixed bug #60833 (self, parent, static behave inconsistently
case-sensitive). (Stas, mario at include-once dot org)
. Implemented FR #60524 (specify temp dir by php.ini). (ALeX Kazik).
. Fixed bug #64142 (dval to lval different behavior on ppc64). (Remi)
. Added ARMv7/v8 versions of various Zend arithmetic functions that are
implemented using inline assembler (Ard Biesheuvel)
. Fix undefined behavior when converting double variables to integers.
The double is now always rounded towards zero, the remainder of its division
by 2^32 or 2^64 (depending on sizeof(long)) is calculated and it's made
signed assuming a two's complement representation. (Gustavo)
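The conversion rule in the entry above can be sketched in C. This is an illustration only, not the Zend implementation: the name wrap_double_to_int64 is hypothetical, the sketch assumes sizeof(long) == 8 and IEEE-754 binary64 doubles, and mapping NaN and infinities to 0 is an assumption that the entry does not state.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: truncate toward zero, reduce modulo 2^64, and
 * reinterpret the result as a two's complement signed integer. */
int64_t wrap_double_to_int64(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);              /* read the raw IEEE-754 bits */

    int negative = (int)(bits >> 63);
    int exponent = (int)((bits >> 52) & 0x7FF);  /* biased exponent */
    uint64_t mantissa = bits & ((UINT64_C(1) << 52) - 1);

    if (exponent == 0x7FF) {
        return 0;                                /* assumption: NaN/Inf become 0 */
    }
    if (exponent < 1023) {
        return 0;                                /* |d| < 1 truncates to zero */
    }

    mantissa |= UINT64_C(1) << 52;               /* restore the implicit leading 1 */
    int shift = exponent - 1023 - 52;            /* |d| == mantissa * 2^shift */

    uint64_t magnitude;
    if (shift >= 64) {
        magnitude = 0;                           /* an exact multiple of 2^64 */
    } else if (shift >= 0) {
        magnitude = mantissa << shift;           /* unsigned shift wraps modulo 2^64 */
    } else {
        magnitude = mantissa >> -shift;          /* drops the fraction: toward zero */
    }

    /* Unsigned negation is also modulo 2^64. */
    uint64_t wrapped = negative ? (UINT64_C(0) - magnitude) : magnitude;

    int64_t result;
    memcpy(&result, &wrapped, sizeof result);    /* two's complement reinterpretation */
    return result;
}
```

Working on the bit pattern keeps every step in well-defined unsigned arithmetic, so no out-of-range floating-point to integer cast is ever performed.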
- CLI server:
. Fixed bug #64128 (built-in web server is broken on ppc64). (Remi)
- cURL:
. Implemented FR #46439 - added CURLFile for safer file uploads.
(Stas)
- Intl:
. Cherry-picked UConverter wrapper, which had accidentally been committed only
to master.
- mysqli
. Added mysqli_begin_transaction()/mysqli::begin_transaction(). Implemented
all options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
and ROLLBACK through options to mysqli_commit()/mysqli_rollback() and their
respective OO counterparts. They work in libmysql and mysqlnd mode. (Andrey)
. Added mysqli_savepoint(), mysqli_release_savepoint(). (Andrey)
- mysqlnd
. Add new begin_transaction() call to the connection object. Implemented all
options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
and ROLLBACK. (Andrey)
. Added mysqlnd_savepoint(), mysqlnd_release_savepoint(). (Andrey)
- Sockets:
. Added recvmsg() and sendmsg() wrappers. (Gustavo)
See https://wiki.php.net/rfc/sendrecvmsg
- Filter:
. Implemented FR #49180 - added MAC address validation. (Martin)
- Phar:
. Fixed timestamp update on Phar contents modification. (Dmitry)
- SPL:
. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
(patch by kriss@krizalys.com, Laruence)
. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
(Nikita Popov)
. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
(Mike Willbanks)
- SNMP:
. Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)
24 Jan 2013, PHP 5.5.0 Alpha 4
- Core:
. Fixed bug #63980 (object members get trimmed by zero bytes). (Laruence)
. Implemented RFC for Class Name Resolution As Scalar Via "class" Keyword.
(Ralph Schindler, Nikita Popov, Lars)
- DateTime
. Added DateTimeImmutable - a variant of DateTime that only returns the
modified state instead of changing itself. (Derick)
- FPM:
. Fixed bug #63999 (php with fpm fails to build on Solaris 10 or 11). (Adam)
- pgsql:
. Bug #46408: Locale number format settings can cause pg_query_params to
break with numerics. (asmecher, Lars)
- dba:
. Bug #62489: dba_insert not working as expected.
(marc-bennewitz at arcor dot de, Lars)
- Reflection:
. Fixed bug #64007 (There is an ability to create instance of Generator by
hand). (Laruence)
10 Jan 2013, PHP 5.5.0 Alpha 3
- General improvements:
. Fixed bug #63874 (Segfault if php_strip_whitespace has heredoc). (Pierrick)
. Fixed bug #63822 (Crash when using closures with ArrayAccess).
(Nikita Popov)
. Add Generator::throw() method. (Nikita Popov)
. Bug #23955: allow specifying Max-Age attribute in setcookie() (narfbg, Lars)
. Bug #52126: timestamp for mail.log (Martin Jansen, Lars)
- mysqlnd
. Fixed return value of mysqli_stmt_affected_rows() in the time after
prepare() and before execute(). (Andrey)
- cURL:
. Added new functions curl_escape, curl_multi_setopt, curl_multi_strerror,
curl_pause, curl_reset, curl_share_close, curl_share_init,
curl_share_setopt, curl_strerror and curl_unescape. (Pierrick)
. Added new curl options CURLOPT_TELNETOPTIONS, CURLOPT_GSSAPI_DELEGATION,
CURLOPT_ACCEPTTIMEOUT_MS, CURLOPT_SSL_OPTIONS, CURLOPT_TCP_KEEPALIVE,
CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL. (Pierrick)
18 Dec 2012, PHP 5.5.0 Alpha 2
- General improvements:
. Added systemtap support by enabling systemtap compatible dtrace probes on
linux. (David Soria Parra)
. Added support for using empty() on the result of function calls and
other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
(Nikita Popov)
. Optimized access to temporary and compiled VM variables. 8% less memory
reads. (Dmitry)
. The VM stacks for passing function arguments and syntactically nested calls
were merged into a single stack. The stack size needed for op_array
execution is calculated at compile time and preallocated at once. As a result
all the stack push operations don't require checks for stack overflow
any more. (Dmitry)
- MySQL
. This extension is now deprecated, and deprecation warnings will be generated
when connections are established to databases via mysql_connect(),
mysql_pconnect(), or through implicit connection: use MySQLi or PDO_MySQL
instead (https://wiki.php.net/rfc/mysql_deprecation). (Adam)
- Fileinfo:
. Fixed bug #63590 (Different results in TS and NTS under Windows).
(Anatoliy)
- Apache2 Handler SAPI:
. Enabled Apache 2.4 configure option for Windows (Pierre, Anatoliy)
13 Nov 2012, PHP 5.5.0 Alpha 1
- General improvements:
. Added generators and coroutines (https://wiki.php.net/rfc/generators).
(Nikita Popov)
. Added "finally" keyword (https://wiki.php.net/rfc/finally). (Laruence)
. Add simplified password hashing API
(https://wiki.php.net/rfc/password_hash). (Anthony Ferrara)
. Added support for list in foreach (https://wiki.php.net/rfc/foreachlist).
(Laruence)
. Added support for using empty() on the result of function calls and
other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
(Nikita Popov)
. Added support for constant array/string dereferencing. (Laruence)
. Improve set_exception_handler while doing reset. (Laruence)
. Remove php_logo_guid(), php_egg_logo_guid(), php_real_logo_guid(),
zend_logo_guid(). (Andrew Faulds)
. Drop Windows XP and 2003 support. (Pierre)
- Calendar:
. Fixed bug #54254 (cal_from_jd returns month = 6 when there is only one Adar)
(Stas, Eitan Mosenkis)
- Core:
. Added boolval(). (Jille Timmermans)
. Added "Z" option to pack/unpack. (Gustavo)
. Implemented FR #60738 (Allow 'set_error_handler' to handle NULL).
(Laruence, Nikita Popov)
. Added optional second argument for assert() to specify custom message. Patch
by Lonny Kapelushnik (lonny@lonnylot.com). (Lars)
. Fixed bug #18556 (Engine uses locale rules to handle class names). (Stas)
. Fixed bug #61681 (Malformed grammar). (Nikita Popov, Etienne, Laruence)
. Fixed bug #61038 (unpack("a5", "str\0\0") does not work as expected).
(srgoogleguy, Gustavo)
. Return previous handler when passing NULL to set_error_handler and
set_exception_handler. (Nikita Popov)
- cURL:
. Added support for CURLOPT_FTP_RESPONSE_TIMEOUT, CURLOPT_APPEND,
CURLOPT_DIRLISTONLY, CURLOPT_NEW_DIRECTORY_PERMS, CURLOPT_NEW_FILE_PERMS,
CURLOPT_NETRC_FILE, CURLOPT_PREQUOTE, CURLOPT_KRBLEVEL, CURLOPT_MAXFILESIZE,
CURLOPT_FTP_ACCOUNT, CURLOPT_COOKIELIST, CURLOPT_IGNORE_CONTENT_LENGTH,
CURLOPT_CONNECT_ONLY, CURLOPT_LOCALPORT, CURLOPT_LOCALPORTRANGE,
CURLOPT_FTP_ALTERNATIVE_TO_USER, CURLOPT_SSL_SESSIONID_CACHE,
CURLOPT_FTP_SSL_CCC, CURLOPT_HTTP_CONTENT_DECODING,
CURLOPT_HTTP_TRANSFER_DECODING, CURLOPT_PROXY_TRANSFER_MODE,
CURLOPT_ADDRESS_SCOPE, CURLOPT_CRLFILE, CURLOPT_ISSUERCERT,
CURLOPT_USERNAME, CURLOPT_PASSWORD, CURLOPT_PROXYUSERNAME,
CURLOPT_PROXYPASSWORD, CURLOPT_NOPROXY, CURLOPT_SOCKS5_GSSAPI_NEC,
CURLOPT_SOCKS5_GSSAPI_SERVICE, CURLOPT_TFTP_BLKSIZE,
CURLOPT_SSH_KNOWNHOSTS, CURLOPT_FTP_USE_PRET, CURLOPT_MAIL_FROM,
CURLOPT_MAIL_RCPT, CURLOPT_RTSP_CLIENT_CSEQ, CURLOPT_RTSP_SERVER_CSEQ,
CURLOPT_RTSP_SESSION_ID, CURLOPT_RTSP_STREAM_URI, CURLOPT_RTSP_TRANSPORT,
CURLOPT_RTSP_REQUEST, CURLOPT_RESOLVE, CURLOPT_ACCEPT_ENCODING,
CURLOPT_TRANSFER_ENCODING, CURLOPT_DNS_SERVERS and CURLOPT_USE_SSL.
(Pierrick)
. Fixed bug #55635 (CURLOPT_BINARYTRANSFER no longer used. The constant
still exists for backward compatibility but is doing nothing). (Pierrick)
. Fixed bug #54995 (Missing CURLINFO_RESPONSE_CODE support). (Pierrick)
- Datetime
. Fixed bug #61642 (modify("+5 weekdays") returns Sunday).
(Dmitri Iouchtchenko)
- Hash
. Added support for PBKDF2 via hash_pbkdf2(). (Anthony Ferrara)
- Intl
. The intl extension now requires ICU 4.0+.
. Added intl.use_exceptions INI directive, which controls what happens when
global errors are set together with intl.error_level. (Gustavo)
. MessageFormatter::format() and related functions now accept named
arguments and mixed numeric/named arguments in ICU 4.8+. (Gustavo)
. MessageFormatter::format() and related functions now don't error out when
an insufficient argument count is provided. Instead, the placeholders will
remain unsubstituted. (Gustavo)
. MessageFormatter::parse() and MessageFormatter::format() (and their static
equivalents) no longer discard sub-second precision in the arguments.
(Gustavo)
. IntlDateFormatter::__construct and datefmt_create() now accept for the
$timezone argument time zone identifiers, IntlTimeZone objects, DateTimeZone
objects and NULL. (Gustavo)
. IntlDateFormatter::__construct and datefmt_create() no longer accept invalid
timezone identifiers or empty strings. (Gustavo)
. The default time zone used in IntlDateFormatter::__construct and
datefmt_create() (when the corresponding argument is not passed or NULL is
passed) is now the one given by date_default_timezone_get(), not the
default ICU time zone. (Gustavo)
. The time zone passed to the IntlDateFormatter is ignored if it is NULL and
if the calendar passed is an IntlCalendar object -- in this case, the
IntlCalendar's time zone will be used instead. Otherwise, the time zone
specified in the $timezone argument is used instead. This does not affect
old code, as IntlCalendar was introduced in this version. (Gustavo)
. IntlDateFormatter::__construct and datefmt_create() now accept for the
$calendar argument also IntlCalendar objects. (Gustavo)
. IntlDateFormatter::getCalendar() and datefmt_get_calendar() return false
if the IntlDateFormatter was set up with an IntlCalendar instead of the
constants IntlDateFormatter::GREGORIAN/TRADITIONAL. IntlCalendar did not
exist before this version. (Gustavo)
. IntlDateFormatter::setCalendar() and datefmt_set_calendar() now also accept
an IntlCalendar object, in which case its time zone is taken. Passing a
constant is still allowed, and still keeps the time zone. (Gustavo)
. IntlDateFormatter::setTimeZoneID() and datefmt_set_timezone_id() are
deprecated. Use IntlDateFormatter::setTimeZone() or datefmt_set_timezone()
instead. (Gustavo)
. IntlDateFormatter::format() and datefmt_format() now also accept an
IntlCalendar object for formatting. (Gustavo)
. Added the classes: IntlCalendar, IntlGregorianCalendar, IntlTimeZone,
IntlBreakIterator, IntlRuleBasedBreakIterator and
IntlCodePointBreakIterator. (Gustavo)
. Added the functions: intlcal_get_keyword_values_for_locale(),
intlcal_get_now(), intlcal_get_available_locales(), intlcal_get(),
intlcal_get_time(), intlcal_set_time(), intlcal_add(),
intlcal_set_time_zone(), intlcal_after(), intlcal_before(), intlcal_set(),
intlcal_roll(), intlcal_clear(), intlcal_field_difference(),
intlcal_get_actual_maximum(), intlcal_get_actual_minimum(),
intlcal_get_day_of_week_type(), intlcal_get_first_day_of_week(),
intlcal_get_greatest_minimum(), intlcal_get_least_maximum(),
intlcal_get_locale(), intlcal_get_maximum(),
intlcal_get_minimal_days_in_first_week(), intlcal_get_minimum(),
intlcal_get_time_zone(), intlcal_get_type(),
intlcal_get_weekend_transition(), intlcal_in_daylight_time(),
intlcal_is_equivalent_to(), intlcal_is_lenient(), intlcal_is_set(),
intlcal_is_weekend(), intlcal_set_first_day_of_week(),
intlcal_set_lenient(), intlcal_equals(),
intlcal_get_repeated_wall_time_option(),
intlcal_get_skipped_wall_time_option(),
intlcal_set_repeated_wall_time_option(),
intlcal_set_skipped_wall_time_option(), intlcal_from_date_time(),
intlcal_to_date_time(), intlcal_get_error_code(),
intlcal_get_error_message(), intlgregcal_create_instance(),
intlgregcal_set_gregorian_change(), intlgregcal_get_gregorian_change() and
intlgregcal_is_leap_year(). (Gustavo)
. Added the functions: intltz_create_time_zone(), intltz_create_default(),
intltz_get_id(), intltz_get_gmt(), intltz_get_unknown(),
intltz_create_enumeration(), intltz_count_equivalent_ids(),
intltz_create_time_zone_id_enumeration(), intltz_get_canonical_id(),
intltz_get_region(), intltz_get_tz_data_version(),
intltz_get_equivalent_id(), intltz_use_daylight_time(), intltz_get_offset(),
intltz_get_raw_offset(), intltz_has_same_rules(), intltz_get_display_name(),
intltz_get_dst_savings(), intltz_from_date_time_zone(),
intltz_to_date_time_zone(), intltz_get_error_code(),
intltz_get_error_message(). (Gustavo)
. Added the methods: IntlDateFormatter::formatObject(),
IntlDateFormatter::getCalendarObject(), IntlDateFormatter::getTimeZone(),
IntlDateFormatter::setTimeZone(). (Gustavo)
. Added the functions: datefmt_format_object(), datefmt_get_calendar_object(),
datefmt_get_timezone(), datefmt_set_timezone(),
intlcal_create_instance(). (Gustavo)
- MCrypt
. mcrypt_ecb(), mcrypt_cbc(), mcrypt_cfb() and mcrypt_ofb() now throw
E_DEPRECATED. (GoogleGuy)
- MySQLi
. Dropped support for LOAD DATA LOCAL INFILE handlers when using libmysql.
Known for stability problems. (Andrey)
. Added support for SHA256 authentication available with MySQL 5.6.6+.
(Andrey)
- PCRE:
. Deprecated the /e modifier
(https://wiki.php.net/rfc/remove_preg_replace_eval_modifier). (Nikita Popov)
. Fixed bug #63284 (Upgrade PCRE to 8.31). (Anatoliy)
- pgsql
. Added pg_escape_literal() and pg_escape_identifier() (Yasuo)
- SPL
. Fix bug #60560 (SplFixedArray un-/serialize, getSize(), count() return 0,
keys are strings). (Adam)
- Tokenizer:
. Fixed bug #60097 (token_get_all fails to lex nested heredoc). (Nikita Popov)
- Zip:
. Upgraded libzip to 0.10.1 (Anatoliy)
- Fileinfo:
. Fixed bug #63248 (Load multiple magic files from a directory under Windows).
(Anatoliy)
- General improvements:
. Implemented FR #46487 (Dereferencing process-handles no longer waits on
those processes). (Jille Timmermans)
<<< NOTE: Insert NEWS from last stable release here prior to actual release! >>>

File diff suppressed because it is too large


@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
$string = '1 2 3 4 5';
var_dump(split($pattern, $string, 0));
var_dump(split($pattern, $string, -10));
var_dump(split($pattern, $string, 10E20));
echo "Done";
@ -35,9 +34,4 @@ array(1) {
[0]=>
string(9) "1 2 3 4 5"
}
Error: 8192 - Function split() is deprecated, %s(18)
array(1) {
[0]=>
string(9) "1 2 3 4 5"
}
Done


@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
$string = '1 2 3 4 5';
var_dump(spliti($pattern, $string, 0));
var_dump(spliti($pattern, $string, -10));
var_dump(spliti($pattern, $string, 10E20));
echo "Done";
@ -35,9 +34,4 @@ array(1) {
[0]=>
string(9) "1 2 3 4 5"
}
Error: 8192 - Function spliti() is deprecated, %s(18)
array(1) {
[0]=>
string(9) "1 2 3 4 5"
}
Done


@ -10,3 +10,4 @@ AC_DEFINE('HAVE_BUNDLED_PCRE', 1, 'Using bundled PCRE library');
AC_DEFINE('HAVE_PCRE', 1, 'Have PCRE library');
PHP_PCRE="yes";
PHP_INSTALL_HEADERS("ext/pcre", "php_pcre.h pcrelib/");
ADD_FLAG("CFLAGS_PCRE", " /D HAVE_CONFIG_H");


@ -59,7 +59,8 @@ PHP_ARG_WITH(pcre-regex,,
pcrelib/pcre_ord2utf8.c pcrelib/pcre_refcount.c pcrelib/pcre_study.c \
pcrelib/pcre_tables.c pcrelib/pcre_valid_utf8.c \
pcrelib/pcre_version.c pcrelib/pcre_xclass.c"
PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,-I@ext_srcdir@/pcrelib)
PHP_PCRE_CFLAGS="-DHAVE_CONFIG_H -I@ext_srcdir@/pcrelib"
PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,$PHP_PCRE_CFLAGS)
PHP_ADD_BUILD_DIR($ext_builddir/pcrelib)
PHP_INSTALL_HEADERS([ext/pcre], [php_pcre.h pcrelib/])
AC_DEFINE(HAVE_BUNDLED_PCRE, 1, [ ])


@ -1,6 +1,170 @@
ChangeLog for PCRE
------------------
Version 8.32 30-November-2012
-----------------------------
1. Improved JIT compiler optimizations for first character search and single
character iterators.
2. Supporting IBM XL C compilers for PPC architectures in the JIT compiler.
Patch by Daniel Richard G.
3. Single character iterator optimizations in the JIT compiler.
4. Improved JIT compiler optimizations for character ranges.
5. Rename the "leave" variable names to "quit" to improve WinCE compatibility.
Reported by Giuseppe D'Angelo.
6. The PCRE_STARTLINE bit, indicating that a match can occur only at the start
of a line, was being set incorrectly in cases where .* appeared inside
atomic brackets at the start of a pattern, or where there was a subsequent
*PRUNE or *SKIP.
7. Improved instruction cache flush for POWER/PowerPC.
Patch by Daniel Richard G.
8. Fixed a number of issues in pcregrep, making it more compatible with GNU
grep:
(a) There is now no limit to the number of patterns to be matched.
(b) An error is given if a pattern is too long.
(c) Multiple uses of --exclude, --exclude-dir, --include, and --include-dir
are now supported.
(d) --exclude-from and --include-from (multiple use) have been added.
(e) Exclusions and inclusions now apply to all files and directories, not
just to those obtained from scanning a directory recursively.
(f) Multiple uses of -f and --file-list are now supported.
(g) In a Windows environment, the default for -d has been changed from
"read" (the GNU grep default) to "skip", because otherwise the presence
of a directory in the file list provokes an error.
(h) The documentation has been revised and clarified in places.
9. Improve the matching speed of capturing brackets.
10. Changed the meaning of \X so that it now matches a Unicode extended
grapheme cluster.
11. Patch by Daniel Richard G to the autoconf files to add a macro for sorting
out POSIX threads when JIT support is configured.
12. Added support for PCRE_STUDY_EXTRA_NEEDED.
13. In the POSIX wrapper regcomp() function, setting re_nsub field in the preg
structure could go wrong in environments where size_t is not the same size
as int.
14. Applied user-supplied patch to pcrecpp.cc to allow PCRE_NO_UTF8_CHECK to be
set.
15. The EBCDIC support had decayed; later updates to the code had included
explicit references to (e.g.) \x0a instead of CHAR_LF. There has been a
general tidy up of EBCDIC-related issues, and the documentation was also
not quite right. There is now a test that can be run on ASCII systems to
check some of the EBCDIC-related things (but it is not a full test).
16. The new PCRE_STUDY_EXTRA_NEEDED option is now used by pcregrep, resulting
in a small tidy to the code.
17. Fix JIT tests when UTF is disabled and both 8 and 16 bit mode are enabled.
18. If the --only-matching (-o) option in pcregrep is specified multiple
times, each one causes appropriate output. For example, -o1 -o2 outputs the
substrings matched by the 1st and 2nd capturing parentheses. A separating
string can be specified by --om-separator (default empty).
19. Improving the first n character searches.
20. Turn case lists for horizontal and vertical white space into macros so that
they are defined only once.
21. This set of changes together give more compatible Unicode case-folding
behaviour for characters that have more than one other case when UCP
support is available.
(a) The Unicode property table now has offsets into a new table of sets of
three or more characters that are case-equivalent. The MultiStage2.py
script that generates these tables (the pcre_ucd.c file) now scans
CaseFolding.txt instead of UnicodeData.txt for character case
information.
(b) The code for adding characters or ranges of characters to a character
class has been abstracted into a generalized function that also handles
case-independence. In UTF-mode with UCP support, this uses the new data
to handle characters with more than one other case.
(c) A bug that is fixed as a result of (b) is that codepoints less than 256
whose other case is greater than 256 are now correctly matched
caselessly. Previously, the high codepoint matched the low one, but not
vice versa.
(d) The processing of \h, \H, \v, and \V in character classes now makes use
of the new class addition function, using character lists defined as
macros alongside the case definitions of 20 above.
(e) Caseless back references now work with characters that have more than
one other case.
(f) General caseless matching of characters with more than one other case
is supported.
22. Unicode character properties were updated from Unicode 6.2.0
23. Improved CMake support under Windows. Patch by Daniel Richard G.
24. Add support for 32-bit character strings, and UTF-32
25. Major JIT compiler update (code refactoring and bugfixing).
Experimental Sparc 32 support is added.
26. Applied a modified version of Daniel Richard G's patch to create
pcre.h.generic and config.h.generic by "make" instead of in the
PrepareRelease script.
27. Added a definition for CHAR_NULL (helpful for the z/OS port), and use it in
pcre_compile.c when checking for a zero character.
28. Introducing a native interface for JIT. Through this interface, the compiled
machine code can be directly executed. The purpose of this interface is to
provide fast pattern matching, so several sanity checks are not performed.
However, feature tests are still performed. The new interface provides
1.4x speedup compared to the old one.
29. If pcre_exec() or pcre_dfa_exec() was called with a negative value for
the subject string length, the error given was PCRE_ERROR_BADOFFSET, which
was confusing. There is now a new error PCRE_ERROR_BADLENGTH for this case.
30. In 8-bit UTF-8 mode, pcretest failed to give an error for data codepoints
greater than 0x7fffffff (which cannot be represented in UTF-8, even under
the "old" RFC 2279). Instead, it ended up passing a negative length to
pcre_exec().
31. Add support for GCC's visibility feature to hide internal functions.
32. Running "pcretest -C pcre8" or "pcretest -C pcre16" gave a spurious error
"unknown -C option" after outputting 0 or 1.
33. There is now support for generating a code coverage report for the test
suite in environments where gcc is the compiler and lcov is installed. This
is mainly for the benefit of the developers.
34. If PCRE is built with --enable-valgrind, certain memory regions are marked
unaddressable using valgrind annotations, allowing valgrind to detect
invalid memory accesses. This is mainly for the benefit of the developers.
35. (*UTF) can now be used to start a pattern in any of the three libraries.
36. Give configure error if --enable-cpp but no C++ compiler found.
Version 8.31 06-July-2012
-------------------------


@ -49,16 +49,17 @@ complexity in Perl regular expressions, I couldn't do this. In any case, a
first pass through the pattern is helpful for other reasons.
Support for 16-bit data strings
-------------------------------
Support for 16-bit and 32-bit data strings
-------------------------------------------
From release 8.30, PCRE supports 16-bit as well as 8-bit data strings, by being
compilable in either 8-bit or 16-bit modes, or both. Thus, two different
libraries can be created. In the description that follows, the word "short" is
From release 8.30, PCRE supports 16-bit as well as 8-bit data strings; and from
release 8.32, PCRE supports 32-bit data strings. The library can be compiled
in any combination of 8-bit, 16-bit or 32-bit modes, creating different
libraries. In the description that follows, the word "short" is
used for a 16-bit data quantity, and the word "unit" is used for a quantity
that is a byte in 8-bit mode and a short in 16-bit mode. However, so as not to
over-complicate the text, the names of PCRE functions are given in 8-bit form
only.
that is a byte in 8-bit mode, a short in 16-bit mode and a 32-bit unsigned
integer in 32-bit mode. However, so as not to over-complicate the text, the
names of PCRE functions are given in 8-bit form only.
Computing the memory requirement: how it was
@ -138,9 +139,10 @@ Format of compiled patterns
---------------------------
The compiled form of a pattern is a vector of units (bytes in 8-bit mode, or
shorts in 16-bit mode), containing items of variable length. The first unit in
an item contains an opcode, and the length of the item is either implicit in
the opcode or contained in the data that follows it.
shorts in 16-bit mode, 32-bit unsigned integers in 32-bit mode), containing
items of variable length. The first unit in an item contains an opcode, and
the length of the item is either implicit in the opcode or contained in the
data that follows it.
In many cases listed below, LINK_SIZE data values are specified for offsets
within the compiled pattern. LINK_SIZE always specifies a number of bytes. The
@ -207,7 +209,8 @@ Matching literal characters
The OP_CHAR opcode is followed by a single character that is to be matched
casefully. For caseless matching, OP_CHARI is used. In UTF-8 or UTF-16 modes,
the character may be more than one unit long.
the character may be more than one unit long. In UTF-32 mode, characters
are always exactly one unit long.
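
The difference in unit counts can be sketched in C. The helper names below are hypothetical and are not PCRE functions; the sketch assumes valid code points no greater than 0x10FFFF and the 1 to 4 byte UTF-8 forms of RFC 3629.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the PCRE API. */

/* Number of 8-bit units needed to encode one code point in UTF-8. */
int utf8_units(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Number of 16-bit units in UTF-16: two (a surrogate pair) above the BMP. */
int utf16_units(uint32_t cp)
{
    return cp < 0x10000 ? 1 : 2;
}

/* Number of 32-bit units in UTF-32: always exactly one. */
int utf32_units(uint32_t cp)
{
    (void)cp;
    return 1;
}
```

For U+20AC (the euro sign) this gives 3, 1 and 1 units respectively, which is why an item holding a literal character has a variable length in the 8-bit and 16-bit libraries but a fixed length in the 32-bit one.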
Repeating single characters
@ -228,7 +231,8 @@ following opcodes, which come in caseful and caseless versions:
OP_POSQUERY OP_POSQUERYI
Each opcode is followed by the character that is to be repeated. In ASCII mode,
these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable.
these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable; in
UTF-32 mode these are one-unit items.
Those with "MIN" in their names are the minimizing versions. Those with "POS"
in their names are possessive versions. Other repeats make use of these
opcodes:
@ -299,7 +303,7 @@ bit map containing a 1 bit for every character that is acceptable. The bits are
counted from the least significant end of each unit. In caseless mode, bits for
both cases are set.
The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16 mode,
The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16/32 mode,
subject characters with values greater than 255 can be handled correctly. For
OP_CLASS they do not match, whereas for OP_NCLASS they do.
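
A minimal C sketch of the bit map test described above. The names are hypothetical and this is not PCRE's matching code; it assumes the 8-bit library, where the map covers character values 0 to 255 in 32 bytes, and it models the OP_CLASS/OP_NCLASS distinction with a flag.

```c
#include <stdint.h>

#define CLASS_MAP_BYTES 32  /* assumption: 256 bits, one per value 0..255 */

/* Hypothetical helper: mark character c as acceptable. Bits are counted
 * from the least significant end of each byte. */
void class_map_add(uint8_t map[CLASS_MAP_BYTES], uint8_t c)
{
    map[c / 8] |= (uint8_t)(1u << (c % 8));
}

/* Hypothetical helper: test a subject character against the map.
 * Characters above 255 have no bit in the map, so the result depends only
 * on the opcode: no match for OP_CLASS, a match for OP_NCLASS. */
int class_matches(const uint8_t map[CLASS_MAP_BYTES], uint32_t c, int is_nclass)
{
    if (c > 255) {
        return is_nclass ? 1 : 0;
    }
    return (map[c / 8] >> (c % 8)) & 1;
}
```

With a map that contains only 'a', class_matches() accepts 'a' and rejects 'b' under either opcode, while a character such as U+0100 is accepted only when is_nclass is set.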
@ -412,7 +416,8 @@ OP_ASSERTBACK and OP_ASSERTBACK_NOT, and the first opcode inside the assertion
is OP_REVERSE, followed by a two byte (one short) count of the number of
characters to move back the pointer in the subject string. In ASCII mode, the
count is a number of units, but in UTF-8/16 mode each character may occupy more
than one unit. A separate count is present in each alternative of a lookbehind
than one unit; in UTF-32 mode each character occupies exactly one unit.
A separate count is present in each alternative of a lookbehind
assertion, allowing them to have different fixed lengths.


@ -1,6 +1,46 @@
News about PCRE releases
------------------------
Release 8.32 30-November-2012
-----------------------------
This release fixes a number of bugs, but also has some new features. These are
the highlights:
. There is now support for 32-bit character strings and UTF-32. Like the
16-bit support, this is done by compiling a separate 32-bit library.
. \X now matches a Unicode extended grapheme cluster.
. Case-independent matching of Unicode characters that have more than one
"other case" now makes all three (or more) characters equivalent. This
applies, for example, to Greek Sigma, which has two lowercase versions.
. Unicode character properties are updated to Unicode 6.2.0.
. The EBCDIC support, which had decayed, has had a spring clean.
. A number of JIT optimizations have been added, which give faster JIT
execution speed. In addition, a new direct interface to JIT execution is
available. This bypasses some of the sanity checks of pcre_exec() to give a
noticeable speed-up.
. A number of issues in pcregrep have been fixed, making it more compatible
with GNU grep. In particular, --exclude and --include (and variants) apply
to all files now, not just those obtained from scanning a directory
recursively. In Windows environments, the default action for directories is
now "skip" instead of "read" (which provokes an error).
. If the --only-matching (-o) option in pcregrep is specified multiple
times, each one causes appropriate output. For example, -o1 -o2 outputs the
substrings matched by the 1st and 2nd capturing parentheses. A separating
string can be specified by --om-separator (default empty).
. When PCRE is built via Autotools using a version of gcc that has the
"visibility" feature, it is used to hide internal library functions that are
not part of the public API.
Release 8.31 06-July-2012
-------------------------
@ -9,7 +49,7 @@ This is mainly a bug-fixing release, with a small number of developments:
. The JIT compiler now supports partial matching and the (*MARK) and
(*COMMIT) verbs.
. PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehing in a
. PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehind in a
pattern.
. There should be a performance improvement when using the heap instead of the


@ -35,9 +35,10 @@ The contents of this README file are:
The PCRE APIs
-------------
PCRE is written in C, and it has its own API. There are two sets of functions,
one for the 8-bit library, which processes strings of bytes, and one for the
16-bit library, which processes strings of 16-bit values. The distribution also
PCRE is written in C, and it has its own API. There are three sets of functions,
one for the 8-bit library, which processes strings of bytes, one for the
16-bit library, which processes strings of 16-bit values, and one for the 32-bit
library, which processes strings of 32-bit values. The distribution also
includes a set of C++ wrapper functions (see the pcrecpp man page for details),
courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
C++.
@ -183,8 +184,10 @@ library. They are also documented in the pcrebuild man page.
(See also "Shared libraries on Unix-like systems" below.)
. By default, only the 8-bit library is built. If you add --enable-pcre16 to
the "configure" command, the 16-bit library is also built. If you want only
the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8".
the "configure" command, the 16-bit library is also built. If you add
--enable-pcre32 to the "configure" command, the 32-bit library is also built.
If you want only the 16-bit or 32-bit library, use --disable-pcre8 to disable
building the 8-bit library.
. If you are building the 8-bit library and want to suppress the building of
the C++ wrapper library, you can add --disable-cpp to the "configure"
@ -203,23 +206,24 @@ library. They are also documented in the pcrebuild man page.
. If you want to make use of the support for UTF-8 Unicode character strings in
the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
you must add --enable-utf to the "configure" command. Without it, the code
for handling UTF-8 and UTF-16 is not included in the relevant library. Even
or UTF-32 Unicode character strings in the 32-bit library, you must add
--enable-utf to the "configure" command. Without it, the code for handling
UTF-8, UTF-16 and UTF-32 is not included in the relevant library. Even
when --enable-utf is included, the use of a UTF encoding still has to be
enabled by an option at run time. When PCRE is compiled with this option, its
input can only either be ASCII or UTF-8/16, even when running on EBCDIC
input can only either be ASCII or UTF-8/16/32, even when running on EBCDIC
platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
the same time.
. There are no separate options for enabling UTF-8 and UTF-16 independently
because that would allow ridiculous settings such as requesting UTF-16
support while building only the 8-bit library. However, the option
. There are no separate options for enabling UTF-8, UTF-16 and UTF-32
independently because that would allow ridiculous settings such as requesting
UTF-16 support while building only the 8-bit library. However, the option
--enable-utf8 is retained for backwards compatibility with earlier releases
that did not support 16-bit character strings. It is synonymous with
that did not support 16-bit or 32-bit character strings. It is synonymous with
--enable-utf. It is not possible to configure one library with UTF support
and the other without in the same configuration.
. If, in addition to support for UTF-8/16 character strings, you want to
. If, in addition to support for UTF-8/16/32 character strings, you want to
include support for the \P, \p, and \X sequences that recognize Unicode
character properties, you must add --enable-unicode-properties to the
"configure" command. This adds about 30K to the size of the library (in the
@ -281,7 +285,8 @@ library. They are also documented in the pcrebuild man page.
library, PCRE then uses three bytes instead of two for offsets to different
parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
the same as --with-link-size=4, which (in both libraries) uses four-byte
offsets. Increasing the internal link size reduces performance.
offsets. Increasing the internal link size reduces performance. In the 32-bit
library, the only supported link size is 4.
. You can build PCRE so that its internal match() function that is called from
pcre_exec() does not call itself recursively. Instead, it uses memory blocks
@ -310,13 +315,34 @@ library. They are also documented in the pcrebuild man page.
pcre_chartables.c.dist. See "Character tables" below for further information.
. It is possible to compile PCRE for use on systems that use EBCDIC as their
character code (as opposed to ASCII) by specifying
character code (as opposed to ASCII/Unicode) by specifying
--enable-ebcdic
This automatically implies --enable-rebuild-chartables (see above). However,
when PCRE is built this way, it always operates in EBCDIC. It cannot support
both EBCDIC and UTF-8/16.
both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
which specifies that the code value for the EBCDIC NL character is 0x25
instead of the default 0x15.
. In environments where valgrind is installed, if you specify
--enable-valgrind
PCRE will use valgrind annotations to mark certain memory regions as
unaddressable. This allows it to detect invalid memory accesses, and is
mostly useful for debugging PCRE itself.
. In environments where the gcc compiler is used and lcov version 1.6 or above
is installed, if you specify
--enable-coverage
the build process implements a code coverage report for the test suite. The
report is generated by running "make coverage". If ccache is installed on
your system, it must be disabled when building PCRE for coverage reporting.
You can do this by setting the environment variable CCACHE_DISABLE=1 before
running "make" to build PCRE.
. The pcregrep program currently supports only 8-bit data files, and so
requires the 8-bit PCRE library. It is possible to compile pcregrep to use
@ -366,6 +392,7 @@ The "configure" script builds the following files for the basic C library:
that were set for "configure"
. libpcre.pc ) data for the pkg-config command
. libpcre16.pc )
. libpcre32.pc )
. libpcreposix.pc )
. libtool script that builds shared and/or static libraries
@ -385,8 +412,8 @@ The "configure" script also creates config.status, which is an executable
script that can be run to recreate the configuration, and config.log, which
contains compiler output from tests that "configure" runs.
Once "configure" has run, you can run "make". This builds either or both of the
libraries libpcre and libpcre16, and a test program called pcretest. If you
Once "configure" has run, you can run "make". This builds the libraries
libpcre, libpcre16 and/or libpcre32, and a test program called pcretest. If you
enabled JIT support with --enable-jit, a test program called pcre_jit_test is
built as well.
@ -410,12 +437,14 @@ system. The following are installed (file names are all relative to the
Libraries (lib):
libpcre16 (if 16-bit support is enabled)
libpcre32 (if 32-bit support is enabled)
libpcre (if 8-bit support is enabled)
libpcreposix (if 8-bit support is enabled)
libpcrecpp (if 8-bit and C++ support is enabled)
Configuration information (lib/pkgconfig):
libpcre16.pc
libpcre32.pc
libpcre.pc
libpcreposix.pc
libpcrecpp.pc (if C++ support is enabled)
@ -596,7 +625,7 @@ The RunTest script runs the pcretest test program (which is documented in its
own man page) on each of the relevant testinput files in the testdata
directory, and compares the output with the contents of the corresponding
testoutput files. Some tests are relevant only when certain build-time options
were selected. For example, the tests for UTF-8/16 support are run only if
were selected. For example, the tests for UTF-8/16/32 support are run only if
--enable-utf was used. RunTest outputs a comment when it skips a test.
Many of the tests that are not skipped are run up to three times. The second
@ -605,9 +634,9 @@ tests that are marked "never study" (see the pcretest program for how this is
done). If JIT support is available, the non-DFA tests are run a third time,
this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
When both 8-bit and 16-bit support is enabled, the entire set of tests is run
twice, once for each library. If you want to run just one set of tests, call
RunTest with either the -8 or -16 option.
The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
libraries that are enabled. If you want to run just one set of tests, call
RunTest with either the -8, -16 or -32 option.
RunTest uses a file called testtry to hold the main output from pcretest.
Other files whose names begin with "test" are used as working files in some
@ -658,13 +687,13 @@ RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
Windows versions of test 2. More info on using RunTest.bat is included in the
document entitled NON-UNIX-USE.]
The fourth and fifth tests check the UTF-8/16 support and error handling and
The fourth and fifth tests check the UTF-8/16/32 support and error handling and
internal UTF features of PCRE that are not relevant to Perl, respectively. The
sixth and seventh tests do the same for Unicode character properties support.
The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
matching function, in non-UTF-8/16 mode, UTF-8/16 mode, and UTF-8/16 mode with
Unicode property support, respectively.
matching function, in non-UTF-8/16/32 mode, UTF-8/16/32 mode, and UTF-8/16/32
mode with Unicode property support, respectively.
The eleventh test checks some internal offsets and code size features; it is
run only when the default "link size" of 2 is set (in other cases the sizes
@ -675,16 +704,21 @@ test is run only when JIT support is not available. They test some JIT-specific
features such as information output from pcretest about JIT compilation.
The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit mode.
These are tests that generate different output in the two modes. They are for
general cases, UTF-8/16 support, and Unicode property support, respectively.
general cases, UTF-8/16/32 support, and Unicode property support, respectively.
The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
features of the DFA matching engine.
The twentieth test is run only in 16/32-bit mode. It tests some specific
16/32-bit features of the DFA matching engine.
The twenty-first and twenty-second tests are run only in 16-bit mode, when the
link size is set to 2. They test reloading pre-compiled patterns.
The twenty-first and twenty-second tests are run only in 16/32-bit mode, when the
link size is set to 2 for the 16-bit library. They test reloading pre-compiled patterns.
The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are for
general cases, and UTF-16 support, respectively.
The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are for
general cases, and UTF-32 support, respectively.
Character tables
----------------
@ -744,8 +778,8 @@ File manifest
-------------
The distribution should contain the files listed below. Where a file name is
given as pcre[16]_xxx it means that there are two files, one with the name
pcre_xxx and the other with the name pcre16_xxx.
given as pcre[16|32]_xxx it means that there are three files, one with the name
pcre_xxx, one with the name pcre16_xxx, and a third with the name pcre32_xxx.
(A) Source files of the PCRE library functions and their headers:
@ -757,31 +791,33 @@ pcre_xxx and the other with the name pcre16_xxx.
specified, by copying to pcre[16]_chartables.c
pcreposix.c )
pcre[16]_byte_order.c )
pcre[16]_compile.c )
pcre[16]_config.c )
pcre[16]_dfa_exec.c )
pcre[16]_exec.c )
pcre[16]_fullinfo.c )
pcre[16]_get.c ) sources for the functions in the library,
pcre[16]_globals.c ) and some internal functions that they use
pcre[16]_jit_compile.c )
pcre[16]_maketables.c )
pcre[16]_newline.c )
pcre[16]_refcount.c )
pcre[16]_string_utils.c )
pcre[16]_study.c )
pcre[16]_tables.c )
pcre[16]_ucd.c )
pcre[16]_version.c )
pcre[16]_xclass.c )
pcre[16|32]_byte_order.c )
pcre[16|32]_compile.c )
pcre[16|32]_config.c )
pcre[16|32]_dfa_exec.c )
pcre[16|32]_exec.c )
pcre[16|32]_fullinfo.c )
pcre[16|32]_get.c ) sources for the functions in the library,
pcre[16|32]_globals.c ) and some internal functions that they use
pcre[16|32]_jit_compile.c )
pcre[16|32]_maketables.c )
pcre[16|32]_newline.c )
pcre[16|32]_refcount.c )
pcre[16|32]_string_utils.c )
pcre[16|32]_study.c )
pcre[16|32]_tables.c )
pcre[16|32]_ucd.c )
pcre[16|32]_version.c )
pcre[16|32]_xclass.c )
pcre_ord2utf8.c )
pcre_valid_utf8.c )
pcre16_ord2utf16.c )
pcre16_utf16_utils.c )
pcre16_valid_utf16.c )
pcre32_utf32_utils.c )
pcre32_valid_utf32.c )
pcre[16]_printint.c ) debugging function that is used by pcretest,
pcre[16|32]_printint.c ) debugging function that is used by pcretest,
) and can also be #included in pcre_compile()
pcre.h.in template for pcre.h when built by "configure"
@ -847,6 +883,7 @@ pcre_xxx and the other with the name pcre16_xxx.
doc/perltest.txt plain text documentation of Perl test program
install-sh a shell script for installing files
libpcre16.pc.in template for libpcre16.pc for pkg-config
libpcre32.pc.in template for libpcre32.pc for pkg-config
libpcre.pc.in template for libpcre.pc for pkg-config
libpcreposix.pc.in template for libpcreposix.pc for pkg-config
libpcrecpp.pc.in template for libpcrecpp.pc for pkg-config
@ -895,4 +932,4 @@ pcre_xxx and the other with the name pcre16_xxx.
Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 18 June 2012
Last updated: 27 October 2012

@ -31,16 +31,17 @@
/* config.h.in. Generated from configure.ac by autoheader. */
/* On Unix-like systems config.h.in is converted by "configure" into config.h.
Some other environments also support the use of "configure". PCRE is written in
Standard C, but there are a few non-standard things it can cope with, allowing
it to run on SunOS4 and other "close to standard" systems.
/* PCRE is written in Standard C, but there are a few non-standard things it
can cope with, allowing it to run on SunOS4 and other "close to standard"
systems.
If you are going to build PCRE "by hand" on a system without "configure" you
should copy the distributed config.h.generic to config.h, and then set up the
macro definitions the way you need them. You must then add -DHAVE_CONFIG_H to
all of your compile commands, so that config.h is included at the start of
every source.
In environments that support the facilities, config.h.in is converted by
"configure", or config-cmake.h.in is converted by CMake, into config.h. If you
are going to build PCRE "by hand" without using "configure" or CMake, you
should copy the distributed config.h.generic to config.h, and then edit the
macro definitions to be the way you need them. You must then add
-DHAVE_CONFIG_H to all of your compile commands, so that config.h is included
at the start of every source.
Alternatively, you can avoid editing by using -D on the compiler command line
to set the macro values. In this case, you do not have to set -DHAVE_CONFIG_H.
@ -50,19 +51,27 @@ HAVE_BCOPY is set to 1. If your system has neither bcopy() nor memmove(), set
them both to 0; an emulation function will be used. */
/* By default, the \R escape sequence matches any Unicode line ending
character or sequence of characters. If BSR_ANYCRLF is defined, this is
changed so that backslash-R matches only CR, LF, or CRLF. The build- time
default can be overridden by the user of PCRE at runtime. On systems that
support it, "configure" can be used to override the default. */
/* #undef BSR_ANYCRLF */
character or sequence of characters. If BSR_ANYCRLF is defined (to any
value), this is changed so that backslash-R matches only CR, LF, or CRLF.
The build-time default can be overridden by the user of PCRE at runtime. */
#undef BSR_ANYCRLF
/* If you are compiling for a system that uses EBCDIC instead of ASCII
character codes, define this macro as 1. On systems that can use
"configure", this can be done via --enable-ebcdic. PCRE will then assume
that all input strings are in EBCDIC. If you do not define this macro, PCRE
will assume input strings are ASCII or UTF-8 Unicode. It is not possible to
build a version of PCRE that supports both EBCDIC and UTF-8. */
/* #undef EBCDIC */
character codes, define this macro to any value. You must also edit the
NEWLINE macro below to set a suitable EBCDIC newline, commonly 21 (0x15).
On systems that can use "configure" or CMake to set EBCDIC, NEWLINE is
automatically adjusted. When EBCDIC is set, PCRE assumes that all input
strings are in EBCDIC. If you do not define this macro, PCRE will assume
input strings are ASCII or UTF-8/16/32 Unicode. It is not possible to build
a version of PCRE that supports both EBCDIC and UTF-8/16/32. */
#undef EBCDIC
/* In an EBCDIC environment, define this macro to any value to arrange for the
NL character to be 0x25 instead of the default 0x15. NL plays the role that
LF does in an ASCII/Unicode environment. The value must also be set in the
NEWLINE macro below. On systems that can use "configure" or CMake to set
EBCDIC_NL25, the adjustment of NEWLINE is automatic. */
#undef EBCDIC_NL25
/* Define to 1 if you have the `bcopy' function. */
#ifndef HAVE_BCOPY
@ -87,6 +96,12 @@ them both to 0; an emulation function will be used. */
#define HAVE_DLFCN_H 1
#endif
/* Define to 1 if you have the <editline/readline.h> header file. */
/*#undef HAVE_EDITLINE_READLINE_H*/
/* Define to 1 if you have the <edit/readline/readline.h> header file. */
/* #undef HAVE_EDIT_READLINE_READLINE_H */
/* Define to 1 if you have the <inttypes.h> header file. */
#ifndef HAVE_INTTYPES_H
#define HAVE_INTTYPES_H 1
@ -112,6 +127,11 @@ them both to 0; an emulation function will be used. */
#define HAVE_MEMORY_H 1
#endif
/* Define if you have POSIX threads libraries and header files. */
#undef HAVE_PTHREAD
/* Have PTHREAD_PRIO_INHERIT. */
#undef HAVE_PTHREAD_PRIO_INHERIT
/* Define to 1 if you have the <readline/history.h> header file. */
#ifndef HAVE_READLINE_HISTORY_H
#define HAVE_READLINE_HISTORY_H 1
@ -186,6 +206,10 @@ them both to 0; an emulation function will be used. */
#define HAVE_UNSIGNED_LONG_LONG 1
#endif
/* Define to 1 or 0, depending whether the compiler supports simple visibility
declarations. */
/* #undef HAVE_VISIBILITY */
/* Define to 1 if you have the <windows.h> header file. */
/* #undef HAVE_WINDOWS_H */
@ -254,22 +278,28 @@ them both to 0; an emulation function will be used. */
#define MAX_NAME_SIZE 32
#endif
/* The value of NEWLINE determines the newline character sequence. On systems
that support it, "configure" can be used to override the default, which is
10. The possible values are 10 (LF), 13 (CR), 3338 (CRLF), -1 (ANY), or -2
(ANYCRLF). */
/* The value of NEWLINE determines the default newline character sequence.
PCRE client programs can override this by selecting other values at run
time. In ASCII environments, the value can be 10 (LF), 13 (CR), or 3338
(CRLF); in EBCDIC environments the value can be 21 or 37 (LF), 13 (CR), or
3349 or 3365 (CRLF) because there are two alternative codepoints (0x15 and
0x25) that are used as the NL line terminator that is equivalent to ASCII
LF. In both ASCII and EBCDIC environments the value can also be -1 (ANY),
or -2 (ANYCRLF). */
#ifndef NEWLINE
#define NEWLINE 10
#endif
/* Define to 1 if your C compiler doesn't accept -c and -o together. */
/* #undef NO_MINUS_C_MINUS_O */
/* PCRE uses recursive function calls to handle backtracking while matching.
This can sometimes be a problem on systems that have stacks of limited
size. Define NO_RECURSE to get a version that doesn't use recursion in the
match() function; instead it creates its own stack by steam using
pcre_recurse_malloc() to obtain memory from the heap. For more detail, see
the comments and other stuff just above the match() function. On systems
that support it, "configure" can be used to set this in the Makefile (use
--disable-stack-for-recursion). */
size. Define NO_RECURSE to any value to get a version that doesn't use
recursion in the match() function; instead it creates its own stack by
steam using pcre_recurse_malloc() to obtain memory from the heap. For more
detail, see the comments and other stuff just above the match() function.
*/
/* #undef NO_RECURSE */
/* Name of package */
@ -282,7 +312,7 @@ them both to 0; an emulation function will be used. */
#define PACKAGE_NAME "PCRE"
/* Define to the full name and version of this package. */
#define PACKAGE_STRING "PCRE 8.31"
#define PACKAGE_STRING "PCRE 8.32"
/* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME "pcre"
@ -291,21 +321,46 @@ them both to 0; an emulation function will be used. */
#define PACKAGE_URL ""
/* Define to the version of this package. */
#define PACKAGE_VERSION "8.31"
#define PACKAGE_VERSION "8.32"
/* to make a symbol visible */
/* #undef PCRECPP_EXP_DECL */
/* to make a symbol visible */
/* #undef PCRECPP_EXP_DEFN */
/* The value of PCREGREP_BUFSIZE determines the size of buffer used by
pcregrep to hold parts of the file it is searching. This is also the
minimum value. The actual amount of memory used by pcregrep is three times
this number, because it allows for the buffering of "before" and "after"
lines. */
/* #undef PCREGREP_BUFSIZE */
/* to make a symbol visible */
/* #undef PCREPOSIX_EXP_DECL */
/* to make a symbol visible */
/* #undef PCREPOSIX_EXP_DEFN */
/* to make a symbol visible */
/* #undef PCRE_EXP_DATA_DEFN */
/* to make a symbol visible */
/* #undef PCRE_EXP_DECL */
/* If you are compiling for a system other than a Unix-like system or
Win32, and it needs some magic to be inserted before the definition
of a function that is exported by the library, define this macro to
contain the relevant magic. If you do not define this macro, it
defaults to "extern" for a C compiler and "extern C" for a C++
compiler on non-Win32 systems. This macro apears at the start of
every exported function that is part of the external API. It does
not appear on functions that are "external" in the C sense, but
which are internal to the library. */
contain the relevant magic. If you do not define this macro, a suitable
__declspec value is used for Windows systems; in other environments
"extern" is used for a C compiler and "extern C" for a C++ compiler.
This macro appears at the start of every exported function that is part
of the external API. It does not appear on functions that are "external"
in the C sense, but which are internal to the library. */
/* #undef PCRE_EXP_DEFN */
/* Define if linking statically (TODO: make nice with Libtool) */
/* Define to any value if linking statically (TODO: make nice with Libtool) */
/* #undef PCRE_STATIC */
/* When calling PCRE via the POSIX interface, additional working storage is
@ -314,40 +369,68 @@ them both to 0; an emulation function will be used. */
only two. If the number of expected substrings is small, the wrapper
function uses space on the stack, because this is faster than using
malloc() for each call. The threshold above which the stack is no longer
used is defined by POSIX_MALLOC_THRESHOLD. On systems that support it,
"configure" can be used to override this default. */
used is defined by POSIX_MALLOC_THRESHOLD. */
#ifndef POSIX_MALLOC_THRESHOLD
#define POSIX_MALLOC_THRESHOLD 10
#endif
/* Define to necessary symbol if this constant uses a non-standard name on
your system. */
/* #undef PTHREAD_CREATE_JOINABLE */
/* Define to 1 if you have the ANSI C header files. */
#ifndef STDC_HEADERS
#define STDC_HEADERS 1
#endif
/* Define to allow pcregrep to be linked with libbz2, so that it is able to
handle .bz2 files. */
/* Define to allow pcretest and pcregrep to be linked with gcov, so that they
are able to generate code coverage reports. */
#undef SUPPORT_GCOV
/* Define to any value to enable support for Just-In-Time compiling. */
#undef SUPPORT_JIT
/* Define to any value to allow pcregrep to be linked with libbz2, so that it
is able to handle .bz2 files. */
/* #undef SUPPORT_LIBBZ2 */
/* Define to allow pcretest to be linked with libreadline. */
/* Define to any value to allow pcretest to be linked with libedit. */
#undef SUPPORT_LIBEDIT
/* Define to any value to allow pcretest to be linked with libreadline. */
/* #undef SUPPORT_LIBREADLINE */
/* Define to allow pcregrep to be linked with libz, so that it is able to
handle .gz files. */
/* Define to any value to allow pcregrep to be linked with libz, so that it is
able to handle .gz files. */
/* #undef SUPPORT_LIBZ */
/* Define to any value to enable the 16 bit PCRE library. */
/* #undef SUPPORT_PCRE16 */
/* Define to any value to enable the 32 bit PCRE library. */
/* #undef SUPPORT_PCRE32 */
/* Define to any value to enable the 8 bit PCRE library. */
/* #undef SUPPORT_PCRE8 */
/* Define to any value to enable JIT support in pcregrep. */
/* #undef SUPPORT_PCREGREP_JIT */
/* Define to enable support for Unicode properties */
/* #undef SUPPORT_UCP */
/* Define to enable support for the UTF-8 Unicode encoding. This will work
even in an EBCDIC environment, but it is incompatible with the EBCDIC
macro. That is, PCRE can support *either* EBCDIC code *or* ASCII/UTF-8, but
not both at once. */
/* Define to any value to enable support for the UTF-8/16/32 Unicode encoding.
This will work even in an EBCDIC environment, but it is incompatible with
the EBCDIC macro. That is, PCRE can support *either* EBCDIC code *or*
ASCII/UTF-8/16/32, but not both at once. */
/* #undef SUPPORT_UTF8 */
/* Valgrind support to find invalid memory reads. */
/* #undef SUPPORT_VALGRIND */
/* Version number of package */
#ifndef VERSION
#define VERSION "8.31"
#define VERSION "8.32"
#endif
/* Define to empty if `const' does not conform to ANSI C. */

@ -43,7 +43,9 @@ character tables for PCRE. The tables are built according to the current
locale. Now that pcre_maketables is a function visible to the outside world, we
make use of its code from here in order to be consistent. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include <ctype.h>
#include <stdio.h>
@ -106,11 +108,24 @@ fprintf(f,
"library and dead code stripping is activated. This leads to link errors.\n"
"Pulling in the header ensures that the array gets flagged as \"someone\n"
"outside this compilation unit might reference this\" and so it will always\n"
"be supplied to the linker. */\n\n"
"be supplied to the linker. */\n\n");
/* Force config.h in z/OS */
#if defined NATIVE_ZOS
fprintf(f,
"/* For z/OS, config.h is forced */\n"
"#ifndef HAVE_CONFIG_H\n"
"#define HAVE_CONFIG_H 1\n"
"#endif\n\n");
#endif
fprintf(f,
"#ifdef HAVE_CONFIG_H\n"
"#include \"config.h\"\n"
"#endif\n\n"
"#include \"pcre_internal.h\"\n\n");
fprintf(f,
"const pcre_uint8 PRIV(default_tables)[] = {\n\n"
"/* This table is a lower casing table. */\n\n");

File diff suppressed because it is too large

@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* The current PCRE version information. */
#define PCRE_MAJOR 8
#define PCRE_MINOR 31
#define PCRE_MINOR 32
#define PCRE_PRERELEASE
#define PCRE_DATE 2012-07-06
#define PCRE_DATE 2012-11-30
/* When an application links to a PCRE DLL in Windows, the symbols that are
imported have to be identified as such. When building PCRE, the appropriate
@ -95,54 +95,70 @@ it is needed here for malloc. */
extern "C" {
#endif
/* Options. Some are compile-time only, some are run-time only, and some are
both, so we keep them all distinct. However, almost all the bits in the options
word are now used. In the long run, we may have to re-use some of the
compile-time only bits for runtime options, or vice versa. In the comments
below, "compile", "exec", and "DFA exec" mean that the option is permitted to
be set for those functions; "used in" means that an option may be set only for
compile, but is subsequently referenced in exec and/or DFA exec. Any of the
/* Public options. Some are compile-time only, some are run-time only, and some
are both, so we keep them all distinct. However, almost all the bits in the
options word are now used. In the long run, we may have to re-use some of the
compile-time only bits for runtime options, or vice versa. Any of the
compile-time options may be inspected during studying (and therefore JIT
compiling). */
compiling).
#define PCRE_CASELESS 0x00000001 /* Compile */
#define PCRE_MULTILINE 0x00000002 /* Compile */
#define PCRE_DOTALL 0x00000004 /* Compile */
#define PCRE_EXTENDED 0x00000008 /* Compile */
#define PCRE_ANCHORED 0x00000010 /* Compile, exec, DFA exec */
#define PCRE_DOLLAR_ENDONLY 0x00000020 /* Compile, used in exec, DFA exec */
#define PCRE_EXTRA 0x00000040 /* Compile */
#define PCRE_NOTBOL 0x00000080 /* Exec, DFA exec */
#define PCRE_NOTEOL 0x00000100 /* Exec, DFA exec */
#define PCRE_UNGREEDY 0x00000200 /* Compile */
#define PCRE_NOTEMPTY 0x00000400 /* Exec, DFA exec */
/* The next two are also used in exec and DFA exec */
#define PCRE_UTF8 0x00000800 /* Compile (same as PCRE_UTF16) */
#define PCRE_UTF16 0x00000800 /* Compile (same as PCRE_UTF8) */
#define PCRE_NO_AUTO_CAPTURE 0x00001000 /* Compile */
/* The next two are also used in exec and DFA exec */
#define PCRE_NO_UTF8_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF16_CHECK) */
#define PCRE_NO_UTF16_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF8_CHECK) */
#define PCRE_AUTO_CALLOUT 0x00004000 /* Compile */
#define PCRE_PARTIAL_SOFT 0x00008000 /* Exec, DFA exec */
#define PCRE_PARTIAL 0x00008000 /* Backwards compatible synonym */
#define PCRE_DFA_SHORTEST 0x00010000 /* DFA exec */
#define PCRE_DFA_RESTART 0x00020000 /* DFA exec */
#define PCRE_FIRSTLINE 0x00040000 /* Compile, used in exec, DFA exec */
#define PCRE_DUPNAMES 0x00080000 /* Compile */
#define PCRE_NEWLINE_CR 0x00100000 /* Compile, exec, DFA exec */
#define PCRE_NEWLINE_LF 0x00200000 /* Compile, exec, DFA exec */
#define PCRE_NEWLINE_CRLF 0x00300000 /* Compile, exec, DFA exec */
#define PCRE_NEWLINE_ANY 0x00400000 /* Compile, exec, DFA exec */
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* Compile, exec, DFA exec */
#define PCRE_BSR_ANYCRLF 0x00800000 /* Compile, exec, DFA exec */
#define PCRE_BSR_UNICODE 0x01000000 /* Compile, exec, DFA exec */
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* Compile, used in exec */
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* Compile, exec, DFA exec */
#define PCRE_NO_START_OPTIMISE 0x04000000 /* Synonym */
#define PCRE_PARTIAL_HARD 0x08000000 /* Exec, DFA exec */
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* Exec, DFA exec */
#define PCRE_UCP 0x20000000 /* Compile, used in exec, DFA exec */
Some options for pcre_compile() change its behaviour but do not affect the
behaviour of the execution functions. Other options are passed through to the
execution functions and affect their behaviour, with or without affecting the
behaviour of pcre_compile().
Options that can be passed to pcre_compile() are tagged Cx below, with these
variants:
C1 Affects compile only
C2 Does not affect compile; affects exec, dfa_exec
C3 Affects compile, exec, dfa_exec
C4 Affects compile, exec, dfa_exec, study
C5 Affects compile, exec, study
Options that can be set for pcre_exec() and/or pcre_dfa_exec() are flagged with
E and D, respectively. They take precedence over C3, C4, and C5 settings passed
from pcre_compile(). Those that are compatible with JIT execution are flagged
with J. */
#define PCRE_CASELESS 0x00000001 /* C1 */
#define PCRE_MULTILINE 0x00000002 /* C1 */
#define PCRE_DOTALL 0x00000004 /* C1 */
#define PCRE_EXTENDED 0x00000008 /* C1 */
#define PCRE_ANCHORED 0x00000010 /* C4 E D */
#define PCRE_DOLLAR_ENDONLY 0x00000020 /* C2 */
#define PCRE_EXTRA 0x00000040 /* C1 */
#define PCRE_NOTBOL 0x00000080 /* E D J */
#define PCRE_NOTEOL 0x00000100 /* E D J */
#define PCRE_UNGREEDY 0x00000200 /* C1 */
#define PCRE_NOTEMPTY 0x00000400 /* E D J */
#define PCRE_UTF8 0x00000800 /* C4 ) */
#define PCRE_UTF16 0x00000800 /* C4 ) Synonyms */
#define PCRE_UTF32 0x00000800 /* C4 ) */
#define PCRE_NO_AUTO_CAPTURE 0x00001000 /* C1 */
#define PCRE_NO_UTF8_CHECK 0x00002000 /* C1 E D J ) */
#define PCRE_NO_UTF16_CHECK 0x00002000 /* C1 E D J ) Synonyms */
#define PCRE_NO_UTF32_CHECK 0x00002000 /* C1 E D J ) */
#define PCRE_AUTO_CALLOUT 0x00004000 /* C1 */
#define PCRE_PARTIAL_SOFT 0x00008000 /* E D J ) Synonyms */
#define PCRE_PARTIAL 0x00008000 /* E D J ) */
#define PCRE_DFA_SHORTEST 0x00010000 /* D */
#define PCRE_DFA_RESTART 0x00020000 /* D */
#define PCRE_FIRSTLINE 0x00040000 /* C3 */
#define PCRE_DUPNAMES 0x00080000 /* C1 */
#define PCRE_NEWLINE_CR 0x00100000 /* C3 E D */
#define PCRE_NEWLINE_LF 0x00200000 /* C3 E D */
#define PCRE_NEWLINE_CRLF 0x00300000 /* C3 E D */
#define PCRE_NEWLINE_ANY 0x00400000 /* C3 E D */
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* C3 E D */
#define PCRE_BSR_ANYCRLF 0x00800000 /* C3 E D */
#define PCRE_BSR_UNICODE 0x01000000 /* C3 E D */
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* C5 */
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* C2 E D ) Synonyms */
#define PCRE_NO_START_OPTIMISE 0x04000000 /* C2 E D ) */
#define PCRE_PARTIAL_HARD 0x08000000 /* E D J */
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* E D J */
#define PCRE_UCP 0x20000000 /* C3 */
/* Exec-time and get/set-time error codes */
@ -156,8 +172,9 @@ compiling). */
#define PCRE_ERROR_NOSUBSTRING (-7)
#define PCRE_ERROR_MATCHLIMIT (-8)
#define PCRE_ERROR_CALLOUT (-9) /* Never used by PCRE itself */
#define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16 */
#define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16 */
#define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF32 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF8_OFFSET (-11) /* Same for 8/16 */
#define PCRE_ERROR_BADUTF16_OFFSET (-11) /* Same for 8/16 */
#define PCRE_ERROR_PARTIAL (-12)
@ -180,6 +197,8 @@ compiling). */
#define PCRE_ERROR_BADMODE (-28)
#define PCRE_ERROR_BADENDIANNESS (-29)
#define PCRE_ERROR_DFA_BADRESTART (-30)
#define PCRE_ERROR_JIT_BADOPTION (-31)
#define PCRE_ERROR_BADLENGTH (-32)
/* Specific error codes for UTF-8 validity checks */
@ -205,6 +224,7 @@ compiling). */
#define PCRE_UTF8_ERR19 19
#define PCRE_UTF8_ERR20 20
#define PCRE_UTF8_ERR21 21
#define PCRE_UTF8_ERR22 22
/* Specific error codes for UTF-16 validity checks */
@ -214,6 +234,13 @@ compiling). */
#define PCRE_UTF16_ERR3 3
#define PCRE_UTF16_ERR4 4
/* Specific error codes for UTF-32 validity checks */
#define PCRE_UTF32_ERR0 0
#define PCRE_UTF32_ERR1 1
#define PCRE_UTF32_ERR2 2
#define PCRE_UTF32_ERR3 3
/* Request types for pcre_fullinfo() */
#define PCRE_INFO_OPTIONS 0
@ -236,6 +263,10 @@ compiling). */
#define PCRE_INFO_JIT 16
#define PCRE_INFO_JITSIZE 17
#define PCRE_INFO_MAXLOOKBEHIND 18
#define PCRE_INFO_FIRSTCHARACTER 19
#define PCRE_INFO_FIRSTCHARACTERFLAGS 20
#define PCRE_INFO_REQUIREDCHAR 21
#define PCRE_INFO_REQUIREDCHARFLAGS 22
/* Request types for pcre_config(). Do not re-arrange, in order to remain
compatible. */
@ -252,6 +283,7 @@ compatible. */
#define PCRE_CONFIG_JIT 9
#define PCRE_CONFIG_UTF16 10
#define PCRE_CONFIG_JITTARGET 11
#define PCRE_CONFIG_UTF32 12
/* Request types for pcre_study(). Do not re-arrange, in order to remain
compatible. */
@ -259,8 +291,9 @@ compatible. */
#define PCRE_STUDY_JIT_COMPILE 0x0001
#define PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE 0x0002
#define PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE 0x0004
#define PCRE_STUDY_EXTRA_NEEDED 0x0008
/* Bit flags for the pcre[16]_extra structure. Do not re-arrange or redefine
/* Bit flags for the pcre[16|32]_extra structure. Do not re-arrange or redefine
these bits, just add new ones on the end, in order to remain compatible. */
#define PCRE_EXTRA_STUDY_DATA 0x0001
@ -279,12 +312,18 @@ typedef struct real_pcre pcre;
struct real_pcre16; /* declaration; the definition is private */
typedef struct real_pcre16 pcre16;
struct real_pcre32; /* declaration; the definition is private */
typedef struct real_pcre32 pcre32;
struct real_pcre_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre_jit_stack pcre_jit_stack;
struct real_pcre16_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre16_jit_stack pcre16_jit_stack;
struct real_pcre32_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre32_jit_stack pcre32_jit_stack;
/* If PCRE is compiled with 16 bit character support, PCRE_UCHAR16 must contain
a 16 bit wide signed data type. Otherwise it can be a dummy data type since
pcre16 functions are not implemented. There is a check for this in pcre_internal.h. */
@ -296,6 +335,17 @@ pcre16 functions are not implemented. There is a check for this in pcre_internal
#define PCRE_SPTR16 const PCRE_UCHAR16 *
#endif
/* If PCRE is compiled with 32 bit character support, PCRE_UCHAR32 must contain
a 32 bit wide signed data type. Otherwise it can be a dummy data type since
pcre32 functions are not implemented. There is a check for this in pcre_internal.h. */
#ifndef PCRE_UCHAR32
#define PCRE_UCHAR32 unsigned int
#endif
#ifndef PCRE_SPTR32
#define PCRE_SPTR32 const PCRE_UCHAR32 *
#endif
/* When PCRE is compiled as a C++ library, the subject pointer type can be
replaced with a custom type. For conventional use, the public interface is a
const char *. */
@ -332,6 +382,19 @@ typedef struct pcre16_extra {
void *executable_jit; /* Contains a pointer to a compiled jit code */
} pcre16_extra;
/* Same structure as above, but with 32 bit char pointers. */
typedef struct pcre32_extra {
unsigned long int flags; /* Bits for which fields are set */
void *study_data; /* Opaque data from pcre_study() */
unsigned long int match_limit; /* Maximum number of calls to match() */
void *callout_data; /* Data passed back in callouts */
const unsigned char *tables; /* Pointer to character tables */
unsigned long int match_limit_recursion; /* Max recursive calls to match() */
PCRE_UCHAR32 **mark; /* For passing back a mark pointer */
void *executable_jit; /* Contains a pointer to a compiled jit code */
} pcre32_extra;
/* The structure for passing out data via the pcre_callout_function. We use a
structure so that new fields can be added on the end in future versions,
without changing the API of the function, thereby allowing old clients to work
@ -379,6 +442,28 @@ typedef struct pcre16_callout_block {
/* ------------------------------------------------------------------ */
} pcre16_callout_block;
/* Same structure as above, but with 32 bit char pointers. */
typedef struct pcre32_callout_block {
int version; /* Identifies version of block */
/* ------------------------ Version 0 ------------------------------- */
int callout_number; /* Number compiled into pattern */
int *offset_vector; /* The offset vector */
PCRE_SPTR32 subject; /* The subject being matched */
int subject_length; /* The length of the subject */
int start_match; /* Offset to start of this match attempt */
int current_position; /* Where we currently are in the subject */
int capture_top; /* Max current capture */
int capture_last; /* Most recently closed capture */
void *callout_data; /* Data passed in with the call */
/* ------------------- Added for Version 1 -------------------------- */
int pattern_position; /* Offset to next item in the pattern */
int next_item_length; /* Length of next item in the pattern */
/* ------------------- Added for Version 2 -------------------------- */
const PCRE_UCHAR32 *mark; /* Pointer to current mark or NULL */
/* ------------------------------------------------------------------ */
} pcre32_callout_block;
/* Indirection for store get and free functions. These can be set to
alternative malloc/free functions if required. Special ones are used in the
non-recursive case for "frames". There is also an optional callout function
@ -397,6 +482,12 @@ PCRE_EXP_DECL void (*pcre16_free)(void *);
PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *);
PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_free)(void *);
PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_stack_free)(void *);
PCRE_EXP_DECL int (*pcre32_callout)(pcre32_callout_block *);
#else /* VPCOMPAT */
PCRE_EXP_DECL void *pcre_malloc(size_t);
PCRE_EXP_DECL void pcre_free(void *);
@ -409,12 +500,19 @@ PCRE_EXP_DECL void pcre16_free(void *);
PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
PCRE_EXP_DECL void pcre16_stack_free(void *);
PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *);
PCRE_EXP_DECL void *pcre32_malloc(size_t);
PCRE_EXP_DECL void pcre32_free(void *);
PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
PCRE_EXP_DECL void pcre32_stack_free(void *);
PCRE_EXP_DECL int pcre32_callout(pcre32_callout_block *);
#endif /* VPCOMPAT */
/* User defined callback which provides a stack just before the match starts. */
typedef pcre_jit_stack *(*pcre_jit_callback)(void *);
typedef pcre16_jit_stack *(*pcre16_jit_callback)(void *);
typedef pcre32_jit_stack *(*pcre32_jit_callback)(void *);
/* Exported PCRE functions */
@ -422,83 +520,131 @@ PCRE_EXP_DECL pcre *pcre_compile(const char *, int, const char **, int *,
const unsigned char *);
PCRE_EXP_DECL pcre16 *pcre16_compile(PCRE_SPTR16, int, const char **, int *,
const unsigned char *);
PCRE_EXP_DECL pcre32 *pcre32_compile(PCRE_SPTR32, int, const char **, int *,
const unsigned char *);
PCRE_EXP_DECL pcre *pcre_compile2(const char *, int, int *, const char **,
int *, const unsigned char *);
PCRE_EXP_DECL pcre16 *pcre16_compile2(PCRE_SPTR16, int, int *, const char **,
int *, const unsigned char *);
PCRE_EXP_DECL pcre32 *pcre32_compile2(PCRE_SPTR32, int, int *, const char **,
int *, const unsigned char *);
PCRE_EXP_DECL int pcre_config(int, void *);
PCRE_EXP_DECL int pcre16_config(int, void *);
PCRE_EXP_DECL int pcre32_config(int, void *);
PCRE_EXP_DECL int pcre_copy_named_substring(const pcre *, const char *,
int *, int, const char *, char *, int);
PCRE_EXP_DECL int pcre16_copy_named_substring(const pcre16 *, PCRE_SPTR16,
int *, int, PCRE_SPTR16, PCRE_UCHAR16 *, int);
PCRE_EXP_DECL int pcre32_copy_named_substring(const pcre32 *, PCRE_SPTR32,
int *, int, PCRE_SPTR32, PCRE_UCHAR32 *, int);
PCRE_EXP_DECL int pcre_copy_substring(const char *, int *, int, int,
char *, int);
PCRE_EXP_DECL int pcre16_copy_substring(PCRE_SPTR16, int *, int, int,
PCRE_UCHAR16 *, int);
PCRE_EXP_DECL int pcre32_copy_substring(PCRE_SPTR32, int *, int, int,
PCRE_UCHAR32 *, int);
PCRE_EXP_DECL int pcre_dfa_exec(const pcre *, const pcre_extra *,
const char *, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre16_dfa_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre32_dfa_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR,
int, int, int, int *, int);
PCRE_EXP_DECL int pcre16_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int);
PCRE_EXP_DECL int pcre32_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int);
PCRE_EXP_DECL int pcre_jit_exec(const pcre *, const pcre_extra *,
PCRE_SPTR, int, int, int, int *, int,
pcre_jit_stack *);
PCRE_EXP_DECL int pcre16_jit_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int,
pcre16_jit_stack *);
PCRE_EXP_DECL int pcre32_jit_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int,
pcre32_jit_stack *);
PCRE_EXP_DECL void pcre_free_substring(const char *);
PCRE_EXP_DECL void pcre16_free_substring(PCRE_SPTR16);
PCRE_EXP_DECL void pcre32_free_substring(PCRE_SPTR32);
PCRE_EXP_DECL void pcre_free_substring_list(const char **);
PCRE_EXP_DECL void pcre16_free_substring_list(PCRE_SPTR16 *);
PCRE_EXP_DECL void pcre32_free_substring_list(PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_fullinfo(const pcre *, const pcre_extra *, int,
void *);
PCRE_EXP_DECL int pcre16_fullinfo(const pcre16 *, const pcre16_extra *, int,
void *);
PCRE_EXP_DECL int pcre32_fullinfo(const pcre32 *, const pcre32_extra *, int,
void *);
PCRE_EXP_DECL int pcre_get_named_substring(const pcre *, const char *,
int *, int, const char *, const char **);
PCRE_EXP_DECL int pcre16_get_named_substring(const pcre16 *, PCRE_SPTR16,
int *, int, PCRE_SPTR16, PCRE_SPTR16 *);
PCRE_EXP_DECL int pcre32_get_named_substring(const pcre32 *, PCRE_SPTR32,
int *, int, PCRE_SPTR32, PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_get_stringnumber(const pcre *, const char *);
PCRE_EXP_DECL int pcre16_get_stringnumber(const pcre16 *, PCRE_SPTR16);
PCRE_EXP_DECL int pcre32_get_stringnumber(const pcre32 *, PCRE_SPTR32);
PCRE_EXP_DECL int pcre_get_stringtable_entries(const pcre *, const char *,
char **, char **);
PCRE_EXP_DECL int pcre16_get_stringtable_entries(const pcre16 *, PCRE_SPTR16,
PCRE_UCHAR16 **, PCRE_UCHAR16 **);
PCRE_EXP_DECL int pcre32_get_stringtable_entries(const pcre32 *, PCRE_SPTR32,
PCRE_UCHAR32 **, PCRE_UCHAR32 **);
PCRE_EXP_DECL int pcre_get_substring(const char *, int *, int, int,
const char **);
PCRE_EXP_DECL int pcre16_get_substring(PCRE_SPTR16, int *, int, int,
PCRE_SPTR16 *);
PCRE_EXP_DECL int pcre32_get_substring(PCRE_SPTR32, int *, int, int,
PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_get_substring_list(const char *, int *, int,
const char ***);
PCRE_EXP_DECL int pcre16_get_substring_list(PCRE_SPTR16, int *, int,
PCRE_SPTR16 **);
PCRE_EXP_DECL int pcre32_get_substring_list(PCRE_SPTR32, int *, int,
PCRE_SPTR32 **);
PCRE_EXP_DECL const unsigned char *pcre_maketables(void);
PCRE_EXP_DECL const unsigned char *pcre16_maketables(void);
PCRE_EXP_DECL const unsigned char *pcre32_maketables(void);
PCRE_EXP_DECL int pcre_refcount(pcre *, int);
PCRE_EXP_DECL int pcre16_refcount(pcre16 *, int);
PCRE_EXP_DECL int pcre32_refcount(pcre32 *, int);
PCRE_EXP_DECL pcre_extra *pcre_study(const pcre *, int, const char **);
PCRE_EXP_DECL pcre16_extra *pcre16_study(const pcre16 *, int, const char **);
PCRE_EXP_DECL pcre32_extra *pcre32_study(const pcre32 *, int, const char **);
PCRE_EXP_DECL void pcre_free_study(pcre_extra *);
PCRE_EXP_DECL void pcre16_free_study(pcre16_extra *);
PCRE_EXP_DECL void pcre32_free_study(pcre32_extra *);
PCRE_EXP_DECL const char *pcre_version(void);
PCRE_EXP_DECL const char *pcre16_version(void);
PCRE_EXP_DECL const char *pcre32_version(void);
/* Utility functions for byte order swaps. */
PCRE_EXP_DECL int pcre_pattern_to_host_byte_order(pcre *, pcre_extra *,
const unsigned char *);
PCRE_EXP_DECL int pcre16_pattern_to_host_byte_order(pcre16 *, pcre16_extra *,
const unsigned char *);
PCRE_EXP_DECL int pcre32_pattern_to_host_byte_order(pcre32 *, pcre32_extra *,
const unsigned char *);
PCRE_EXP_DECL int pcre16_utf16_to_host_byte_order(PCRE_UCHAR16 *,
PCRE_SPTR16, int, int *, int);
PCRE_EXP_DECL int pcre32_utf32_to_host_byte_order(PCRE_UCHAR32 *,
PCRE_SPTR32, int, int *, int);
/* JIT compiler related functions. */
PCRE_EXP_DECL pcre_jit_stack *pcre_jit_stack_alloc(int, int);
PCRE_EXP_DECL pcre16_jit_stack *pcre16_jit_stack_alloc(int, int);
PCRE_EXP_DECL pcre32_jit_stack *pcre32_jit_stack_alloc(int, int);
PCRE_EXP_DECL void pcre_jit_stack_free(pcre_jit_stack *);
PCRE_EXP_DECL void pcre16_jit_stack_free(pcre16_jit_stack *);
PCRE_EXP_DECL void pcre32_jit_stack_free(pcre32_jit_stack *);
PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *,
pcre_jit_callback, void *);
PCRE_EXP_DECL void pcre16_assign_jit_stack(pcre16_extra *,
pcre16_jit_callback, void *);
PCRE_EXP_DECL void pcre32_assign_jit_stack(pcre32_extra *,
pcre32_jit_callback, void *);
#ifdef __cplusplus
} /* extern "C" */
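
Reviewer note on the option table in this header: the PCRE_NEWLINE_* values are not independent bits. PCRE_NEWLINE_CRLF (0x00300000) is the same as CR|LF, and PCRE_NEWLINE_ANYCRLF (0x00500000) is CR|ANY, so a caller has to mask the whole field and compare it rather than test single bits. A minimal sketch, with the constants copied from the listing above; `NEWLINE_MASK` and `newline_name` are hypothetical names introduced here, not part of the public header:

```c
/* Values copied from the option list in the header above. */
#define PCRE_NEWLINE_CR      0x00100000
#define PCRE_NEWLINE_LF      0x00200000
#define PCRE_NEWLINE_CRLF    0x00300000
#define PCRE_NEWLINE_ANY     0x00400000
#define PCRE_NEWLINE_ANYCRLF 0x00500000

/* Hypothetical helper mask: the three bits shared by all PCRE_NEWLINE_* values. */
#define NEWLINE_MASK 0x00700000

/* Hypothetical helper: name the newline convention encoded in an options
   word. The field is compared as a whole, because CRLF and ANYCRLF are
   combinations of the CR, LF and ANY bits. */
const char *newline_name(unsigned long options)
{
  switch (options & NEWLINE_MASK)
    {
    case 0:                    return "default";
    case PCRE_NEWLINE_CR:      return "CR";
    case PCRE_NEWLINE_LF:      return "LF";
    case PCRE_NEWLINE_CRLF:    return "CRLF";
    case PCRE_NEWLINE_ANY:     return "ANY";
    case PCRE_NEWLINE_ANYCRLF: return "ANYCRLF";
    default:                   return "invalid";
    }
}
```

Testing `options & PCRE_NEWLINE_CR` alone would also be true for CRLF and ANYCRLF, which is why the sketch switches on the masked field.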


@ -20,11 +20,13 @@ and dead code stripping is activated. This leads to link errors. Pulling in the
header ensures that the array gets flagged as "someone outside this compilation
unit might reference this" and so it will always be supplied to the linker. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
const unsigned char _pcre_default_tables[] = {
const pcre_uint8 PRIV(default_tables)[] = {
/* This table is a lower casing table. */

File diff suppressed because it is too large


@ -41,7 +41,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* This module contains the external function pcre_config(). */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
/* Keep the original link size. */
static int real_link_size = LINK_SIZE;
@ -63,18 +65,21 @@ Arguments:
Returns: 0 if data returned, negative on error
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_config(int what, void *where)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_config(int what, void *where)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_config(int what, void *where)
#endif
{
switch (what)
{
case PCRE_CONFIG_UTF8:
#if defined COMPILE_PCRE16
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
*((int *)where) = 0;
return PCRE_ERROR_BADOPTION;
#else
@ -87,7 +92,20 @@ switch (what)
#endif
case PCRE_CONFIG_UTF16:
#if defined COMPILE_PCRE8
#if defined COMPILE_PCRE8 || defined COMPILE_PCRE32
*((int *)where) = 0;
return PCRE_ERROR_BADOPTION;
#else
#if defined SUPPORT_UTF
*((int *)where) = 1;
#else
*((int *)where) = 0;
#endif
break;
#endif
case PCRE_CONFIG_UTF32:
#if defined COMPILE_PCRE8 || defined COMPILE_PCRE16
*((int *)where) = 0;
return PCRE_ERROR_BADOPTION;
#else
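
Reviewer note on the hunks above: each library width answers only its own UTF query, and the other two queries store 0 and return PCRE_ERROR_BADOPTION. The real code selects the branch at compile time with COMPILE_PCRE8/16/32; the sketch below models the same decision with run-time parameters so it can be read in isolation. `model_config_utf`, `lib_width` and `utf_built` are hypothetical names, and the model covers only the three UTF queries:

```c
/* Values as defined in pcre.h. */
#define PCRE_ERROR_BADOPTION (-3)
#define PCRE_CONFIG_UTF8      0
#define PCRE_CONFIG_UTF16    10
#define PCRE_CONFIG_UTF32    12

/* Hypothetical model of the UTF cases in pcre_config().
   lib_width: 8, 16 or 32, standing in for COMPILE_PCRE8/16/32.
   utf_built: nonzero if the library was built with SUPPORT_UTF. */
int model_config_utf(int lib_width, int utf_built, int what, int *where)
{
  int asked_width;

  switch (what)
    {
    case PCRE_CONFIG_UTF8:  asked_width = 8;  break;
    case PCRE_CONFIG_UTF16: asked_width = 16; break;
    case PCRE_CONFIG_UTF32: asked_width = 32; break;
    default: return PCRE_ERROR_BADOPTION;   /* not a UTF query */
    }

  if (asked_width != lib_width)
    {
    *where = 0;                    /* wrong library for this query */
    return PCRE_ERROR_BADOPTION;
    }

  *where = utf_built ? 1 : 0;
  return 0;
}
```

A caller that links several widths therefore has to ask each library about its own encoding, for example pcre32_config() with PCRE_CONFIG_UTF32.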

File diff suppressed because it is too large


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
information about a compiled pattern. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -63,14 +65,18 @@ Arguments:
Returns: 0 if data returned, negative on error
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_fullinfo(const pcre *argument_re, const pcre_extra *extra_data,
int what, void *where)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_fullinfo(const pcre16 *argument_re, const pcre16_extra *extra_data,
int what, void *where)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_fullinfo(const pcre32 *argument_re, const pcre32_extra *extra_data,
int what, void *where)
#endif
{
const REAL_PCRE *re = (const REAL_PCRE *)argument_re;
@ -130,10 +136,21 @@ switch (what)
case PCRE_INFO_FIRSTBYTE:
*((int *)where) =
((re->flags & PCRE_FIRSTSET) != 0)? re->first_char :
((re->flags & PCRE_FIRSTSET) != 0)? (int)re->first_char :
((re->flags & PCRE_STARTLINE) != 0)? -1 : -2;
break;
case PCRE_INFO_FIRSTCHARACTER:
*((pcre_uint32 *)where) =
(re->flags & PCRE_FIRSTSET) != 0 ? re->first_char : 0;
break;
case PCRE_INFO_FIRSTCHARACTERFLAGS:
*((int *)where) =
((re->flags & PCRE_FIRSTSET) != 0) ? 1 :
((re->flags & PCRE_STARTLINE) != 0) ? 2 : 0;
break;
/* Make sure we pass back the pointer to the bit vector in the external
block, not the internal copy (with flipped integer fields). */
@ -157,7 +174,17 @@ switch (what)
case PCRE_INFO_LASTLITERAL:
*((int *)where) =
((re->flags & PCRE_REQCHSET) != 0)? re->req_char : -1;
((re->flags & PCRE_REQCHSET) != 0)? (int)re->req_char : -1;
break;
case PCRE_INFO_REQUIREDCHAR:
*((pcre_uint32 *)where) =
((re->flags & PCRE_REQCHSET) != 0) ? re->req_char : 0;
break;
case PCRE_INFO_REQUIREDCHARFLAGS:
*((int *)where) =
((re->flags & PCRE_REQCHSET) != 0);
break;
case PCRE_INFO_NAMEENTRYSIZE:
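
Reviewer note on the new request types: PCRE_INFO_FIRSTBYTE packs the character and the two "no character" states into one int (-1 and -2 as sentinels), while the new PCRE_INFO_FIRSTCHARACTER / PCRE_INFO_FIRSTCHARACTERFLAGS pair returns the character as an unsigned 32-bit value and the state separately (1 = first character set, 2 = match starts at a line start, 0 = neither). The likely motivation is that a full 32-bit character cannot share an int with negative sentinels, though the diff itself does not say so. A minimal model of the three cases; the flag values and the `model_*` names are hypothetical, since the real PCRE_FIRSTSET and PCRE_STARTLINE bits are private:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the private PCRE_FIRSTSET / PCRE_STARTLINE bits. */
#define MODEL_FIRSTSET  0x1u
#define MODEL_STARTLINE 0x2u

/* Hypothetical cut-down version of the compiled pattern header. */
struct model_re {
  unsigned int flags;
  uint32_t first_char;
};

/* Old style: character, or -1 (start of line), or -2 (nothing known). */
int model_info_firstbyte(const struct model_re *re)
{
  return (re->flags & MODEL_FIRSTSET) != 0 ? (int)re->first_char :
         (re->flags & MODEL_STARTLINE) != 0 ? -1 : -2;
}

/* New style, part 1: the character itself, or 0 when none is set. */
uint32_t model_info_firstcharacter(const struct model_re *re)
{
  return (re->flags & MODEL_FIRSTSET) != 0 ? re->first_char : 0;
}

/* New style, part 2: 1 = character set, 2 = start of line, 0 = neither. */
int model_info_firstcharacterflags(const struct model_re *re)
{
  return (re->flags & MODEL_FIRSTSET) != 0 ? 1 :
         (re->flags & MODEL_STARTLINE) != 0 ? 2 : 0;
}
```

With the new pair a caller checks the flags first and reads the character only when the flags value is 1.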


@ -43,7 +43,9 @@ from the subject string after a regex match has succeeded. The original idea
for these functions came from Scott Wimer. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -63,12 +65,15 @@ Returns: the number of the named parentheses, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringnumber(const pcre *code, const char *stringname)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringnumber(const pcre16 *code, PCRE_SPTR16 stringname)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringnumber(const pcre32 *code, PCRE_SPTR32 stringname)
#endif
{
int rc;
@ -96,6 +101,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
bot = 0;
while (top > bot)
@ -130,14 +145,18 @@ Returns: the length of each entry, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringtable_entries(const pcre *code, const char *stringname,
char **firstptr, char **lastptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringtable_entries(const pcre16 *code, PCRE_SPTR16 stringname,
PCRE_UCHAR16 **firstptr, PCRE_UCHAR16 **lastptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringtable_entries(const pcre32 *code, PCRE_SPTR32 stringname,
PCRE_UCHAR32 **firstptr, PCRE_UCHAR32 **lastptr)
#endif
{
int rc;
@ -165,6 +184,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
lastentry = nametable + entrysize * (top - 1);
bot = 0;
@ -190,12 +219,15 @@ while (top > bot)
(pcre_uchar *)(last + entrysize + IMM2_SIZE)) != 0) break;
last += entrysize;
}
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*firstptr = (char *)first;
*lastptr = (char *)last;
#else
#elif defined COMPILE_PCRE16
*firstptr = (PCRE_UCHAR16 *)first;
*lastptr = (PCRE_UCHAR16 *)last;
#elif defined COMPILE_PCRE32
*firstptr = (PCRE_UCHAR32 *)first;
*lastptr = (PCRE_UCHAR32 *)last;
#endif
return entrysize;
}
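
Reviewer note on the lookup functions above: they fetch PCRE_INFO_NAMECOUNT, PCRE_INFO_NAMEENTRYSIZE and PCRE_INFO_NAMETABLE and then binary-search the table between `bot` and `top`. The sketch below shows that search for the 8-bit library only, assuming the documented 8-bit table layout (two bytes of group number, most significant first, followed by a zero-terminated name, entries sorted by name). `lookup_name` and `sample_nametable` are hypothetical names, and the table contents are made up for illustration:

```c
#include <string.h>

#define PCRE_ERROR_NOSUBSTRING (-7)

/* Made-up table for a pattern with groups named day (2), month (1), year (3).
   Each entry is 8 bytes: 2-byte group number, then the name padded with zeros. */
const unsigned char sample_nametable[] = {
  0, 2, 'd', 'a', 'y', 0,   0,   0,
  0, 1, 'm', 'o', 'n', 't', 'h', 0,
  0, 3, 'y', 'e', 'a', 'r', 0,   0
};

/* Hypothetical helper: binary search of an 8-bit name table.
   Returns the group number, or PCRE_ERROR_NOSUBSTRING if the name is absent. */
int lookup_name(const unsigned char *nametable, int count, int entrysize,
  const char *name)
{
  int bot = 0;
  int top = count;

  while (top > bot)
    {
    int mid = (top + bot) / 2;
    const unsigned char *entry = nametable + entrysize * mid;
    int c = strcmp(name, (const char *)(entry + 2));
    if (c == 0) return (entry[0] << 8) | entry[1];
    if (c > 0) bot = mid + 1; else top = mid;
    }

  return PCRE_ERROR_NOSUBSTRING;
}
```

The 16-bit and 32-bit variants in the diff differ in the unit type of the entries, which is why the real code compares with an internal helper rather than strcmp().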
@ -224,31 +256,40 @@ Returns: the number of the first that is set,
or a negative number on error
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
static int
get_first_set(const pcre *code, const char *stringname, int *ovector)
#else
#elif defined COMPILE_PCRE16
static int
get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector)
#elif defined COMPILE_PCRE32
static int
get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector)
#endif
{
const REAL_PCRE *re = (const REAL_PCRE *)code;
int entrysize;
pcre_uchar *entry;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
char *first, *last;
#else
#elif defined COMPILE_PCRE16
PCRE_UCHAR16 *first, *last;
#elif defined COMPILE_PCRE32
PCRE_UCHAR32 *first, *last;
#endif
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre_get_stringnumber(code, stringname);
entrysize = pcre_get_stringtable_entries(code, stringname, &first, &last);
#else
#elif defined COMPILE_PCRE16
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre16_get_stringnumber(code, stringname);
entrysize = pcre16_get_stringtable_entries(code, stringname, &first, &last);
#elif defined COMPILE_PCRE32
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre32_get_stringnumber(code, stringname);
entrysize = pcre32_get_stringtable_entries(code, stringname, &first, &last);
#endif
if (entrysize <= 0) return entrysize;
for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
@ -289,14 +330,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, char *buffer, int size)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR32 *buffer, int size)
#endif
{
int yield;
@ -340,24 +385,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname,
char *buffer, int size)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_UCHAR32 *buffer, int size)
#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
#else
#elif defined COMPILE_PCRE16
return pcre16_copy_substring(subject, ovector, stringcount, n, buffer, size);
#elif defined COMPILE_PCRE32
return pcre32_copy_substring(subject, ovector, stringcount, n, buffer, size);
#endif
}
@ -384,14 +436,18 @@ Returns: if successful: 0
PCRE_ERROR_NOMEMORY (-6) failed to get store
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring_list(const char *subject, int *ovector, int stringcount,
const char ***listptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring_list(PCRE_SPTR16 subject, int *ovector, int stringcount,
PCRE_SPTR16 **listptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring_list(PCRE_SPTR32 subject, int *ovector, int stringcount,
PCRE_SPTR32 **listptr)
#endif
{
int i;
@ -406,10 +462,12 @@ for (i = 0; i < double_count; i += 2)
stringlist = (pcre_uchar **)(PUBL(malloc))(size);
if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*listptr = (const char **)stringlist;
#else
#elif defined COMPILE_PCRE16
*listptr = (PCRE_SPTR16 *)stringlist;
#elif defined COMPILE_PCRE32
*listptr = (PCRE_SPTR32 *)stringlist;
#endif
p = (pcre_uchar *)(stringlist + stringcount + 1);
@ -440,12 +498,15 @@ Argument: the result of a previous pcre_get_substring_list()
Returns: nothing
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring_list(const char **pointer)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring_list(PCRE_SPTR16 *pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring_list(PCRE_SPTR32 *pointer)
#endif
{
(PUBL(free))((void *)pointer);
@ -478,14 +539,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) substring not present
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, const char **stringptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR32 *stringptr)
#endif
{
int yield;
@ -498,10 +563,12 @@ substring = (pcre_uchar *)(PUBL(malloc))(IN_UCHARS(yield + 1));
if (substring == NULL) return PCRE_ERROR_NOMEMORY;
memcpy(substring, subject + ovector[stringnumber], IN_UCHARS(yield));
substring[yield] = 0;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*stringptr = (const char *)substring;
#else
#elif defined COMPILE_PCRE16
*stringptr = (PCRE_SPTR16)substring;
#elif defined COMPILE_PCRE32
*stringptr = (PCRE_SPTR32)substring;
#endif
return yield;
}
@ -535,24 +602,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname,
const char **stringptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_SPTR32 *stringptr)
#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
#else
#elif defined COMPILE_PCRE16
return pcre16_get_substring(subject, ovector, stringcount, n, stringptr);
#elif defined COMPILE_PCRE32
return pcre32_get_substring(subject, ovector, stringcount, n, stringptr);
#endif
}
@ -571,12 +645,15 @@ Argument: the result of a previous pcre_get_substring()
Returns: nothing
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring(const char *pointer)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring(PCRE_SPTR16 pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring(PCRE_SPTR32 pointer)
#endif
{
(PUBL(free))((void *)pointer);
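
Reviewer note on the substring functions in this file: they all read a capture from the offset vector as a pair of offsets, `ovector[2*n]` and `ovector[2*n + 1]`, and need room for one extra terminating zero. A minimal sketch of the copy variant for 8-bit subjects, following the documented return codes (length on success, PCRE_ERROR_NOSUBSTRING for a bad group number, PCRE_ERROR_NOMEMORY when the buffer is too small). `copy_capture` is a hypothetical name, not the library function:

```c
#include <string.h>

#define PCRE_ERROR_NOMEMORY    (-6)
#define PCRE_ERROR_NOSUBSTRING (-7)

/* Hypothetical helper modelled on pcre_copy_substring() for 8-bit subjects.
   ovector holds start/end offset pairs; stringcount is the number of pairs
   that were filled in by the match. */
int copy_capture(const char *subject, const int *ovector, int stringcount,
  int stringnumber, char *buffer, int size)
{
  int yield;

  if (stringnumber < 0 || stringnumber >= stringcount)
    return PCRE_ERROR_NOSUBSTRING;

  stringnumber *= 2;                              /* index of the start offset */
  yield = ovector[stringnumber + 1] - ovector[stringnumber];
  if (size < yield + 1) return PCRE_ERROR_NOMEMORY;   /* need room for the zero */

  memcpy(buffer, subject + ovector[stringnumber], (size_t)yield);
  buffer[yield] = 0;
  return yield;
}
```

The 16-bit and 32-bit versions in the diff do the same arithmetic in character units, which is what the IN_UCHARS() macro in the real code converts to bytes.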


@ -52,7 +52,9 @@ a local function is used.
Also, when compiling for Virtual Pascal, things are done differently, and
global variables are not used. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"

File diff suppressed because it is too large


@ -45,7 +45,9 @@ compilation of dftables.c, in which case the macro DFTABLES is defined. */
#ifndef DFTABLES
# ifdef HAVE_CONFIG_H
# include "config.h"
# endif
# include "pcre_internal.h"
#endif
@ -64,12 +66,15 @@ Arguments: none
Returns: pointer to the contiguous block of data
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
const unsigned char *
pcre_maketables(void)
#else
#elif defined COMPILE_PCRE16
const unsigned char *
pcre16_maketables(void)
#elif defined COMPILE_PCRE32
const unsigned char *
pcre32_maketables(void)
#endif
{
unsigned char *yield, *p;
@ -125,7 +130,7 @@ within regexes. */
for (i = 0; i < 256; i++)
{
int x = 0;
if (i != 0x0b && isspace(i)) x += ctype_space;
if (i != CHAR_VT && isspace(i)) x += ctype_space;
if (isalpha(i)) x += ctype_letter;
if (isdigit(i)) x += ctype_digit;
if (isxdigit(i)) x += ctype_xdigit;


@ -47,7 +47,9 @@ and NLTYPE_ANY. The full list of Unicode newline characters is taken from
http://unicode.org/unicode/reports/tr18/. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -74,7 +76,7 @@ BOOL
PRIV(is_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR endptr, int *lenptr,
BOOL utf)
{
int c;
pcre_uint32 c;
(void)utf;
#ifdef SUPPORT_UTF
if (utf)
@ -85,11 +87,13 @@ else
#endif /* SUPPORT_UTF */
c = *ptr;
/* Note that this function is called only for ANY or ANYCRLF. */
if (type == NLTYPE_ANYCRLF) switch(c)
{
case 0x000a: *lenptr = 1; return TRUE; /* LF */
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1;
return TRUE; /* CR */
case CHAR_LF: *lenptr = 1; return TRUE;
case CHAR_CR: *lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE;
default: return FALSE;
}
@ -97,20 +101,29 @@ if (type == NLTYPE_ANYCRLF) switch(c)
else switch(c)
{
case 0x000a: /* LF */
case 0x000b: /* VT */
case 0x000c: *lenptr = 1; return TRUE; /* FF */
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1;
return TRUE; /* CR */
#ifdef EBCDIC
case CHAR_NEL:
#endif
case CHAR_LF:
case CHAR_VT:
case CHAR_FF: *lenptr = 1; return TRUE;
case CHAR_CR:
*lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE;
#ifndef EBCDIC
#ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */
case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else
case 0x0085: /* NEL */
#else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case CHAR_NEL:
case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */
#endif /* Not EBCDIC */
default: return FALSE;
}
}
@ -138,7 +151,7 @@ BOOL
PRIV(was_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR startptr, int *lenptr,
BOOL utf)
{
int c;
pcre_uint32 c;
(void)utf;
ptr--;
#ifdef SUPPORT_UTF
@ -151,30 +164,45 @@ else
#endif /* SUPPORT_UTF */
c = *ptr;
/* Note that this function is called only for ANY or ANYCRLF. */
if (type == NLTYPE_ANYCRLF) switch(c)
{
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1;
return TRUE; /* LF */
case 0x000d: *lenptr = 1; return TRUE; /* CR */
case CHAR_LF:
*lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
return TRUE;
case CHAR_CR: *lenptr = 1; return TRUE;
default: return FALSE;
}
/* NLTYPE_ANY */
else switch(c)
{
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1;
return TRUE; /* LF */
case 0x000b: /* VT */
case 0x000c: /* FF */
case 0x000d: *lenptr = 1; return TRUE; /* CR */
case CHAR_LF:
*lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
return TRUE;
#ifdef EBCDIC
case CHAR_NEL:
#endif
case CHAR_VT:
case CHAR_FF:
case CHAR_CR: *lenptr = 1; return TRUE;
#ifndef EBCDIC
#ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */
case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else
case 0x0085: /* NEL */
#else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case CHAR_NEL:
case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */
#endif /* NotEBCDIC */
default: return FALSE;
}
}
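Editorial sketch, not part of the commit: the forward newline test in the hunk above can be reduced to the following for the 8-bit, non-UTF case. The `SK_*` constants and the name `sketch_is_newline` are hypothetical stand-ins for the `CHAR_*` macros and `PRIV(is_newline)`; ASCII code values are assumed, and the 0x2028/0x2029 cases are left out because they cannot occur in a single byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ASCII stand-ins for CHAR_LF, CHAR_VT, CHAR_FF, CHAR_CR, CHAR_NEL. */
enum { SK_LF = 0x0a, SK_VT = 0x0b, SK_FF = 0x0c, SK_CR = 0x0d, SK_NEL = 0x85 };

/* Returns 1 and stores the newline length in *len if a newline starts at ptr.
   any != 0 selects the wider "ANY" set; otherwise only CR, LF and CRLF count. */
static int sketch_is_newline(const unsigned char *ptr, const unsigned char *end,
                             int any, int *len)
{
  assert(ptr != NULL && end != NULL && len != NULL && ptr < end);
  switch (*ptr)
    {
    case SK_LF:
      *len = 1;
      return 1;
    case SK_CR:
      /* CR followed by LF is a single two-unit newline. */
      *len = (ptr < end - 1 && ptr[1] == SK_LF) ? 2 : 1;
      return 1;
    case SK_VT:
    case SK_FF:
    case SK_NEL:
      if (!any) return 0;
      *len = 1;
      return 1;
    default:
      return 0;
    }
}
```

Called on the bytes `0x0d 0x0a`, the sketch reports one newline of length 2 in either mode, while a lone `0x0b` counts as a newline only when `any` is non-zero.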


@ -41,17 +41,20 @@ POSSIBILITY OF SUCH DAMAGE.
/* This file contains a private PCRE function that converts an ordinal
character value into a UTF8 string. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#define COMPILE_PCRE8
#include "pcre_internal.h"
/*************************************************
* Convert character value to UTF-8 *
*************************************************/
/* This function takes an integer value in the range 0 - 0x10ffff
and encodes it as a UTF-8 character in 1 to 6 pcre_uchars.
and encodes it as a UTF-8 character in 1 to 4 pcre_uchars.
Arguments:
cvalue the character value
@ -60,6 +63,7 @@ Arguments:
Returns: number of characters placed in the buffer
*/
unsigned
int
PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
{
@ -67,11 +71,6 @@ PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
register int i, j;
/* Checking invalid cvalue character, encoded as invalid UTF-16 character.
Should never happen in practice. */
if ((cvalue & 0xf800) == 0xd800 || cvalue >= 0x110000)
cvalue = 0xfffe;
for (i = 0; i < PRIV(utf8_table1_size); i++)
if ((int)cvalue <= PRIV(utf8_table1)[i]) break;
buffer += i;
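Editorial sketch, not part of the commit: the corrected comment above says a value in the range 0 - 0x10ffff encodes to 1 to 4 code units. A standalone encoder under that assumption could look like this; `sketch_ord2utf8` and its local tables are hypothetical and only mirror the first four entries of `utf8_table1`, they are not the library's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes cvalue (0 .. 0x10ffff) as UTF-8 into buffer, which must have room
   for 4 bytes. Returns the number of bytes written (1 to 4). */
static unsigned int sketch_ord2utf8(uint32_t cvalue, unsigned char *buffer)
{
  /* Largest value that fits in 1, 2, 3 and 4 bytes respectively. */
  static const uint32_t limit[] = { 0x7f, 0x7ff, 0xffff, 0x1fffff };
  /* Leading-byte marker for each encoded length. */
  static const unsigned char lead[] = { 0x00, 0xc0, 0xe0, 0xf0 };
  unsigned int i, j;

  assert(buffer != NULL && cvalue <= 0x10ffff);

  for (i = 0; i < 4; i++)
    if (cvalue <= limit[i]) break;

  /* Fill continuation bytes from the end, six bits at a time. */
  for (j = i; j > 0; j--)
    {
    buffer[j] = (unsigned char)(0x80 | (cvalue & 0x3f));
    cvalue >>= 6;
    }
  buffer[0] = (unsigned char)(lead[i] | cvalue);
  return i + 1;
}
```

For example, U+20AC encodes to the three bytes `e2 82 ac`, and U+1F600 to the four bytes `f0 9f 98 80`.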


@ -44,7 +44,9 @@ pattern data block. This might be helpful in applications where the block is
shared by different users. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -66,12 +68,15 @@ Returns: the (possibly updated) count value (a non-negative number), or
a negative error number
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_refcount(pcre *argument_re, int adjust)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_refcount(pcre16 *argument_re, int adjust)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_refcount(pcre32 *argument_re, int adjust)
#endif
{
REAL_PCRE *re = (REAL_PCRE *)argument_re;


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
supporting functions. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -96,7 +98,7 @@ for (;;)
{
int d, min;
pcre_uchar *cs, *ce;
register int op = *cc;
register pcre_uchar op = *cc;
switch (op)
{
@ -321,15 +323,19 @@ for (;;)
/* Check a class for variable quantification */
#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
case OP_XCLASS:
cc += GET(cc, 1) - PRIV(OP_lengths)[OP_CLASS];
/* Fall through */
#endif
case OP_CLASS:
case OP_NCLASS:
#if defined SUPPORT_UTF || defined COMPILE_PCRE16 || defined COMPILE_PCRE32
case OP_XCLASS:
/* The original code caused an unsigned overflow in 64 bit systems,
so now we use a conditional statement. */
if (op == OP_XCLASS)
cc += GET(cc, 1);
else
cc += PRIV(OP_lengths)[OP_CLASS];
#else
cc += PRIV(OP_lengths)[OP_CLASS];
#endif
switch (*cc)
{
@ -536,7 +542,7 @@ Arguments:
p points to the character
caseless the caseless flag
cd the block with char table pointers
utf TRUE for UTF-8 / UTF-16 mode
utf TRUE for UTF-8 / UTF-16 / UTF-32 mode
Returns: pointer after the character
*/
@ -545,7 +551,7 @@ static const pcre_uchar *
set_table_bit(pcre_uint8 *start_bits, const pcre_uchar *p, BOOL caseless,
compile_data *cd, BOOL utf)
{
unsigned int c = *p;
pcre_uint32 c = *p;
#ifdef COMPILE_PCRE8
SET_BIT(c);
@ -562,18 +568,20 @@ if (utf && c > 127)
(void)PRIV(ord2utf)(c, buff);
SET_BIT(buff[0]);
}
#endif
#endif /* Not SUPPORT_UCP */
return p;
}
#endif
#else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */
/* Not UTF-8 mode, or character is less than 127. */
if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1;
#endif
#endif /* COMPILE_PCRE8 */
#ifdef COMPILE_PCRE16
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
if (c > 0xff)
{
c = 0xff;
@ -593,10 +601,12 @@ if (utf && c > 127)
c = 0xff;
SET_BIT(c);
}
#endif
#endif /* SUPPORT_UCP */
return p;
}
#endif
#else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */
if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1;
@ -626,10 +636,10 @@ Returns: nothing
*/
static void
set_type_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
set_type_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit == 32) return;
@ -668,10 +678,10 @@ Returns: nothing
*/
static void
set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= ~cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit != 32) for (c = 24; c < 32; c++) start_bits[c] = 0xff;
@ -695,7 +705,7 @@ function fails unless the result is SSB_DONE.
Arguments:
code points to an expression
start_bits points to a 32-byte table, initialized to 0
utf TRUE if in UTF-8 / UTF-16 mode
utf TRUE if in UTF-8 / UTF-16 / UTF-32 mode
cd the block with char table pointers
Returns: SSB_FAIL => Failed to find any starting bytes
@ -708,7 +718,7 @@ static int
set_start_bits(const pcre_uchar *code, pcre_uint8 *start_bits, BOOL utf,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
int yield = SSB_DONE;
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
int table_limit = utf? 16:32;
@ -984,8 +994,8 @@ do
identical. */
case OP_HSPACE:
SET_BIT(0x09);
SET_BIT(0x20);
SET_BIT(CHAR_HT);
SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF
if (utf)
{
@ -994,46 +1004,46 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */
#endif
#ifdef COMPILE_PCRE16
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
{
#ifndef EBCDIC
SET_BIT(0xA0);
#ifdef COMPILE_PCRE16
#endif /* Not EBCDIC */
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[16|32] */
}
try_next = FALSE;
break;
case OP_ANYNL:
case OP_VSPACE:
SET_BIT(0x0A);
SET_BIT(0x0B);
SET_BIT(0x0C);
SET_BIT(0x0D);
SET_BIT(CHAR_LF);
SET_BIT(CHAR_VT);
SET_BIT(CHAR_FF);
SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF
if (utf)
{
#ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif
#ifdef COMPILE_PCRE16
SET_BIT(0x85);
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(CHAR_NEL);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
{
SET_BIT(0x85);
#ifdef COMPILE_PCRE16
SET_BIT(CHAR_NEL);
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
}
@ -1056,7 +1066,8 @@ do
break;
/* The cbit_space table has vertical tab as whitespace; we have to
ensure it is set as not whitespace. */
ensure it is set as not whitespace. Luckily, the code value is the same
(0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate bit. */
case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@ -1064,8 +1075,9 @@ do
try_next = FALSE;
break;
/* The cbit_space table has vertical tab as whitespace; we have to
not set it from the table. */
/* The cbit_space table has vertical tab as whitespace; we have to not
set it from the table. Luckily, the code value is the same (0x0b) in
ASCII and EBCDIC, so we can just adjust the appropriate bit. */
case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */
@ -1119,8 +1131,8 @@ do
return SSB_FAIL;
case OP_HSPACE:
SET_BIT(0x09);
SET_BIT(0x20);
SET_BIT(CHAR_HT);
SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF
if (utf)
{
@ -1129,38 +1141,38 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */
#endif
#ifdef COMPILE_PCRE16
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
#ifndef EBCDIC
SET_BIT(0xA0);
#endif /* Not EBCDIC */
break;
case OP_ANYNL:
case OP_VSPACE:
SET_BIT(0x0A);
SET_BIT(0x0B);
SET_BIT(0x0C);
SET_BIT(0x0D);
SET_BIT(CHAR_LF);
SET_BIT(CHAR_VT);
SET_BIT(CHAR_FF);
SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF
if (utf)
{
#ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif
#ifdef COMPILE_PCRE16
SET_BIT(0x85);
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(CHAR_NEL);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE16 */
}
else
#endif /* SUPPORT_UTF */
SET_BIT(0x85);
SET_BIT(CHAR_NEL);
break;
case OP_NOT_DIGIT:
@ -1172,7 +1184,9 @@ do
break;
/* The cbit_space table has vertical tab as whitespace; we have to
ensure it gets set as not whitespace. */
ensure it gets set as not whitespace. Luckily, the code value is the
same (0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate
bit. */
case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@ -1180,7 +1194,8 @@ do
break;
/* The cbit_space table has vertical tab as whitespace; we have to
avoid setting it. */
avoid setting it. Luckily, the code value is the same (0x0b) in ASCII
and EBCDIC, so we can just adjust the appropriate bit. */
case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */
@ -1214,7 +1229,7 @@ do
memset(start_bits+25, 0xff, 7); /* Bits for 0xc9 - 0xff */
}
#endif
#ifdef COMPILE_PCRE16
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
/* Fall through */
@ -1310,12 +1325,15 @@ Returns: pointer to a pcre[16]_extra block, with study_data filled in and
NULL on error or if no optimization possible
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION
pcre_study(const pcre *external_re, int options, const char **errorptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN pcre16_extra * PCRE_CALL_CONVENTION
pcre16_study(const pcre16 *external_re, int options, const char **errorptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN pcre32_extra * PCRE_CALL_CONVENTION
pcre32_study(const pcre32 *external_re, int options, const char **errorptr)
#endif
{
int min;
@ -1338,10 +1356,12 @@ if (re == NULL || re->magic_number != MAGIC_NUMBER)
if ((re->flags & PCRE_MODE) == 0)
{
#ifdef COMPILE_PCRE8
*errorptr = "argument is compiled in 16 bit mode";
#else
*errorptr = "argument is compiled in 8 bit mode";
#if defined COMPILE_PCRE8
*errorptr = "argument not compiled in 8 bit mode";
#elif defined COMPILE_PCRE16
*errorptr = "argument not compiled in 16 bit mode";
#elif defined COMPILE_PCRE32
*errorptr = "argument not compiled in 32 bit mode";
#endif
return NULL;
}
@ -1368,14 +1388,18 @@ if ((re->options & PCRE_ANCHORED) == 0 &&
tables = re->tables;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
if (tables == NULL)
(void)pcre_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables));
#else
#elif defined COMPILE_PCRE16
if (tables == NULL)
(void)pcre16_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables));
#elif defined COMPILE_PCRE32
if (tables == NULL)
(void)pcre32_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables));
#endif
compile_block.lcc = tables + lcc_offset;
@ -1406,20 +1430,20 @@ switch(min = find_minlength(code, code, re->options, 0))
}
/* If a set of starting bytes has been identified, or if the minimum length is
greater than zero, or if JIT optimization has been requested, get a
pcre[16]_extra block and a pcre_study_data block. The study data is put in the
latter, which is pointed to by the former, which may also get additional data
set later by the calling program. At the moment, the size of pcre_study_data
is fixed. We nevertheless save it in a field for returning via the
pcre_fullinfo() function so that if it becomes variable in the future,
we don't have to change that code. */
greater than zero, or if JIT optimization has been requested, or if
PCRE_STUDY_EXTRA_NEEDED is set, get a pcre[16]_extra block and a
pcre_study_data block. The study data is put in the latter, which is pointed to
by the former, which may also get additional data set later by the calling
program. At the moment, the size of pcre_study_data is fixed. We nevertheless
save it in a field for returning via the pcre_fullinfo() function so that if it
becomes variable in the future, we don't have to change that code. */
if (bits_set || min > 0
if (bits_set || min > 0 || (options & (
#ifdef SUPPORT_JIT
|| (options & (PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
| PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE)) != 0
PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE |
PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE |
#endif
)
PCRE_STUDY_EXTRA_NEEDED)) != 0)
{
extra = (PUBL(extra) *)(PUBL(malloc))
(sizeof(PUBL(extra)) + sizeof(pcre_study_data));
@ -1473,7 +1497,8 @@ if (bits_set || min > 0
/* If JIT support was compiled and requested, attempt the JIT compilation.
If no starting bytes were found, and the minimum length is zero, and JIT
compilation fails, abandon the extra block and return NULL. */
compilation fails, abandon the extra block and return NULL, unless
PCRE_STUDY_EXTRA_NEEDED is set. */
#ifdef SUPPORT_JIT
extra->executable_jit = NULL;
@ -1484,13 +1509,15 @@ if (bits_set || min > 0
if ((options & PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE) != 0)
PRIV(jit_compile)(re, extra, JIT_PARTIAL_HARD_COMPILE);
if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0)
if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0 &&
(options & PCRE_STUDY_EXTRA_NEEDED) == 0)
{
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
pcre_free_study(extra);
#endif
#ifdef COMPILE_PCRE16
#elif defined COMPILE_PCRE16
pcre16_free_study(extra);
#elif defined COMPILE_PCRE32
pcre32_free_study(extra);
#endif
extra = NULL;
}
@ -1511,12 +1538,15 @@ Argument: a pointer to the pcre[16]_extra block
Returns: nothing
*/
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN void
pcre_free_study(pcre_extra *extra)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void
pcre16_free_study(pcre16_extra *extra)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void
pcre32_free_study(pcre32_extra *extra)
#endif
{
if (extra == NULL)


@ -45,7 +45,9 @@ uses macros to change their names from _pcre_xxx to xxxx, thereby avoiding name
clashes with the library. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -56,6 +58,12 @@ the definition is next to the definition of the opcodes in pcre_internal.h. */
const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
/* Tables of horizontal and vertical whitespace characters, suitable for
adding to classes. */
const pcre_uint32 PRIV(hspace_list)[] = { HSPACE_LIST };
const pcre_uint32 PRIV(vspace_list)[] = { VSPACE_LIST };
/*************************************************
@ -66,9 +74,9 @@ const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
character. */
#if (defined SUPPORT_UTF && defined COMPILE_PCRE8) \
|| (defined PCRE_INCLUDED && defined SUPPORT_PCRE16)
|| (defined PCRE_INCLUDED && (defined SUPPORT_PCRE16 || defined SUPPORT_PCRE32))
/* These tables are also required by pcretest in 16 bit mode. */
/* These tables are also required by pcretest in 16- or 32-bit mode. */
const int PRIV(utf8_table1)[] =
{ 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff};
@ -90,13 +98,13 @@ const pcre_uint8 PRIV(utf8_table4)[] = {
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };
#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE16)*/
#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE[16|32])*/
#ifdef SUPPORT_UTF
/* Table to translate from particular type value to the general value. */
const int PRIV(ucp_gentype)[] = {
const pcre_uint32 PRIV(ucp_gentype)[] = {
ucp_C, ucp_C, ucp_C, ucp_C, ucp_C, /* Cc, Cf, Cn, Co, Cs */
ucp_L, ucp_L, ucp_L, ucp_L, ucp_L, /* Ll, Lu, Lm, Lo, Lt */
ucp_M, ucp_M, ucp_M, /* Mc, Me, Mn */
@ -107,6 +115,66 @@ const int PRIV(ucp_gentype)[] = {
ucp_Z, ucp_Z, ucp_Z /* Zl, Zp, Zs */
};
/* This table encodes the rules for finding the end of an extended grapheme
cluster. Every code point has a grapheme break property which is one of the
ucp_gbXX values defined in ucp.h. The 2-dimensional table is indexed by the
properties of two adjacent code points. The left property selects a word from
the table, and the right property selects a bit from that word like this:
ucp_gbtable[left-property] & (1 << right-property)
The value is non-zero if a grapheme break is NOT permitted between the relevant
two code points. The breaking rules are as follows:
1. Break at the start and end of text (pretty obviously).
2. Do not break between a CR and LF; otherwise, break before and after
controls.
3. Do not break Hangul syllable sequences, the rules for which are:
L may be followed by L, V, LV or LVT
LV or V may be followed by V or T
LVT or T may be followed by T
4. Do not break before extending characters.
The next two rules are only for extended grapheme clusters (but that's what we
are implementing).
5. Do not break before SpacingMarks.
6. Do not break after Prepend characters.
7. Otherwise, break everywhere.
*/
const pcre_uint32 PRIV(ucp_gbtable[]) = {
(1<<ucp_gbLF), /* 0 CR */
0, /* 1 LF */
0, /* 2 Control */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 3 Extend */
(1<<ucp_gbExtend)|(1<<ucp_gbPrepend)| /* 4 Prepend */
(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)|
(1<<ucp_gbV)|(1<<ucp_gbT)|(1<<ucp_gbLV)|
(1<<ucp_gbLVT)|(1<<ucp_gbOther),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 5 SpacingMark */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)| /* 6 L */
(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 7 V */
(1<<ucp_gbT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 8 T */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 9 LV */
(1<<ucp_gbT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 10 LVT */
(1<<ucp_gbRegionalIndicator), /* 11 RegionalIndicator */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark) /* 12 Other */
};
#ifdef SUPPORT_JIT
/* This table reverses PRIV(ucp_gentype). We can save the cost
of a memory load. */
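Editorial sketch, not part of the commit: the lookup described in the grapheme-break comment above, `ucp_gbtable[left-property] & (1 << right-property)`, can be tried in isolation. The `GB_*` enum, `gb_table` and `gb_no_break` are hypothetical names; the property order is assumed from the index comments in the table (0 CR through 12 Other), and the rows restate the table above rather than the full `ucp.h` definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Property order assumed from the "0 CR" .. "12 Other" comments in the table. */
enum gb { GB_CR, GB_LF, GB_CONTROL, GB_EXTEND, GB_PREPEND, GB_SPACINGMARK,
          GB_L, GB_V, GB_T, GB_LV, GB_LVT, GB_REGIONAL, GB_OTHER, GB_COUNT };

#define BIT(p) (UINT32_C(1) << (p))
#define EXT    (BIT(GB_EXTEND) | BIT(GB_SPACINGMARK))

/* A set bit means: no break between left (row) and right (bit). */
static const uint32_t gb_table[GB_COUNT] = {
  [GB_CR]          = BIT(GB_LF),
  [GB_LF]          = 0,
  [GB_CONTROL]     = 0,
  [GB_EXTEND]      = EXT,
  [GB_PREPEND]     = EXT | BIT(GB_PREPEND) | BIT(GB_L) | BIT(GB_V) |
                     BIT(GB_T) | BIT(GB_LV) | BIT(GB_LVT) | BIT(GB_OTHER),
  [GB_SPACINGMARK] = EXT,
  [GB_L]           = EXT | BIT(GB_L) | BIT(GB_V) | BIT(GB_LV) | BIT(GB_LVT),
  [GB_V]           = EXT | BIT(GB_V) | BIT(GB_T),
  [GB_T]           = EXT | BIT(GB_T),
  [GB_LV]          = EXT | BIT(GB_V) | BIT(GB_T),
  [GB_LVT]         = EXT | BIT(GB_T),
  [GB_REGIONAL]    = BIT(GB_REGIONAL),
  [GB_OTHER]       = EXT
};

/* Returns 1 if a grapheme break is NOT permitted between the two properties. */
static int gb_no_break(enum gb left, enum gb right)
{
  assert(left < GB_COUNT && right < GB_COUNT);
  return (gb_table[left] & BIT(right)) != 0;
}
```

With this table, `gb_no_break(GB_CR, GB_LF)` is 1 (rule 2), `gb_no_break(GB_L, GB_T)` is 0 (L may not be followed directly by T), and `gb_no_break(GB_CONTROL, GB_EXTEND)` is 0 (break after controls).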

File diff suppressed because it is too large


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
strings. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -90,6 +92,7 @@ PCRE_UTF8_ERR18 Overlong 5-byte sequence (won't ever occur)
PCRE_UTF8_ERR19 Overlong 6-byte sequence (won't ever occur)
PCRE_UTF8_ERR20 Isolated 0x80 byte (not within UTF-8 character)
PCRE_UTF8_ERR21 Byte with the illegal value 0xfe or 0xff
PCRE_UTF8_ERR22 Non-character
Arguments:
string points to the string
@ -114,7 +117,8 @@ if (length < 0)
for (p = string; length-- > 0; p++)
{
register int ab, c, d;
register pcre_uchar ab, c, d;
pcre_uint32 v = 0;
c = *p;
if (c < 128) continue; /* ASCII character */
@ -183,6 +187,7 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - 2;
return PCRE_UTF8_ERR14;
}
v = ((c & 0x0f) << 12) | ((d & 0x3f) << 6) | (*p & 0x3f);
break;
/* 4-byte character. Check 3rd and 4th bytes for 0x80. Then check first 2
@ -210,6 +215,7 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - 3;
return PCRE_UTF8_ERR13;
}
v = ((c & 0x07) << 18) | ((d & 0x3f) << 12) | ((p[-1] & 0x3f) << 6) | (*p & 0x3f);
break;
/* 5-byte and 6-byte characters are not allowed by RFC 3629, and will be
@ -284,11 +290,20 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - ab;
return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12;
}
/* Reject non-characters. The pointer p is currently at the last byte of the
character. */
if ((v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef))
{
*erroroffset = (int)(p - string) - ab;
return PCRE_UTF8_ERR22;
}
}
#else /* SUPPORT_UTF */
#else /* Not SUPPORT_UTF */
(void)(string); /* Keep picky compilers happy */
(void)(length);
(void)(erroroffset);
#endif
return PCRE_UTF8_ERR0; /* This indicates success */
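Editorial sketch, not part of the commit: the non-character rejection added above (PCRE_UTF8_ERR22) rests on two steps, reassembling the code point from a multi-byte sequence and then testing it against the non-character ranges. The helper names `sketch_decode3` and `sketch_is_noncharacter` are hypothetical; the bit arithmetic follows the 3-byte case and the condition shown in the hunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reassembles the code point of a well-formed 3-byte UTF-8 sequence at p. */
static uint32_t sketch_decode3(const unsigned char *p)
{
  assert(p != NULL);
  return ((uint32_t)(p[0] & 0x0f) << 12) |
         ((uint32_t)(p[1] & 0x3f) << 6)  |
          (uint32_t)(p[2] & 0x3f);
}

/* Non-characters: the last two code points of every plane (xxFFFE, xxFFFF)
   and the block U+FDD0 .. U+FDEF. */
static int sketch_is_noncharacter(uint32_t v)
{
  return (v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef);
}
```

For instance, the bytes `ef bf be` decode to U+FFFE, which the check rejects, whereas `e2 82 ac` (U+20AC) passes.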


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
string that identifies the PCRE version that is in use. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -77,12 +79,15 @@ I could find no way of detecting that a macro is defined as an empty string at
pre-processor time. This hack uses a standard trick for avoiding calling
the STRING macro with an empty argument when doing the test. */
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre_version(void)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre16_version(void)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre32_version(void)
#endif
{
return (XSTRING(Z PCRE_PRERELEASE)[1] == 0)?


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
class. It is used by both pcre_exec() and pcre_def_exec(). */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#include "pcre_internal.h"
@ -62,9 +64,9 @@ Returns: TRUE if character matches, else FALSE
*/
BOOL
PRIV(xclass)(int c, const pcre_uchar *data, BOOL utf)
PRIV(xclass)(pcre_uint32 c, const pcre_uchar *data, BOOL utf)
{
int t;
pcre_uchar t;
BOOL negated = (*data & XCL_NOT) != 0;
(void)utf;
@ -92,7 +94,7 @@ if ((*data++ & XCL_MAP) != 0) data += 32 / sizeof(pcre_uchar);
while ((t = *data++) != XCL_END)
{
int x, y;
pcre_uint32 x, y;
if (t == XCL_SINGLE)
{
#ifdef SUPPORT_UTF


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
functions. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
/* Ensure that the PCREPOSIX_EXP_xxx macros are set appropriately for
@ -155,11 +157,12 @@ static const int eint[] = {
REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */
REG_BADPAT, /* \N is not supported in a class */
REG_BADPAT, /* too many forward references */
REG_BADPAT, /* disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) */
REG_BADPAT, /* disallowed UTF-8/16/32 code point (>= 0xd800 && <= 0xdfff) */
REG_BADPAT, /* invalid UTF-16 string (should not occur) */
/* 75 */
REG_BADPAT, /* overlong MARK name */
REG_BADPAT /* character value in \u.... sequence is too large */
REG_BADPAT, /* character value in \u.... sequence is too large */
REG_BADPAT /* invalid UTF-32 string (should not occur) */
};
/* Table of texts corresponding to POSIX error codes */
@ -257,6 +260,7 @@ const char *errorptr;
int erroffset;
int errorcode;
int options = 0;
int re_nsub = 0;
if ((cflags & REG_ICASE) != 0) options |= PCRE_CASELESS;
if ((cflags & REG_NEWLINE) != 0) options |= PCRE_MULTILINE;
@ -280,7 +284,8 @@ if (preg->re_pcre == NULL)
}
(void)pcre_fullinfo((const pcre *)preg->re_pcre, NULL, PCRE_INFO_CAPTURECOUNT,
&(preg->re_nsub));
&re_nsub);
preg->re_nsub = (size_t)re_nsub;
return 0;
}
@ -312,7 +317,7 @@ int *ovector = NULL;
int small_ovector[POSIX_MALLOC_THRESHOLD * 3];
BOOL allocated_ovector = FALSE;
BOOL nosub =
(((const pcre *)preg->re_pcre)->options & PCRE_NO_AUTO_CAPTURE) != 0;
(REAL_PCRE_OPTIONS((const pcre *)preg->re_pcre) & PCRE_NO_AUTO_CAPTURE) != 0;
if ((eflags & REG_NOTBOL) != 0) options |= PCRE_NOTBOL;
if ((eflags & REG_NOTEOL) != 0) options |= PCRE_NOTEOL;


@ -93,6 +93,7 @@ RC=0
---------------------------- Test 13 -----------------------------
Here is the pattern again.
That time it was on a line by itself.
seventeen
This line contains pattern not on a line by itself.
RC=0
---------------------------- Test 14 -----------------------------
@ -370,11 +371,11 @@ RC=2
---------------------------- Test 34 -----------------------------
RC=2
---------------------------- Test 35 -----------------------------
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 36 -----------------------------
./testdata/grepinput3
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 37 -----------------------------
@ -643,6 +644,7 @@ testdata/grepinputv:fox jumps
testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair
testdata/grepinput3:triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
RC=0
---------------------------- Test 85 -----------------------------
./testdata/grepinput3:Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
@ -668,3 +670,38 @@ RC=0
---------------------------- Test 93 -----------------------------
The quick brown fx jumps over the lazy dog.
RC=0
---------------------------- Test 94 -----------------------------
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 95 -----------------------------
testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair
RC=0
---------------------------- Test 96 -----------------------------
./testdata/grepinput3
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 97 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 98 -----------------------------
./testdata/grepinputx
RC=0
---------------------------- Test 99 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 100 ------------------------------
./testdata/grepinput:zerothe.
./testdata/grepinput:zeroa
./testdata/grepinput:zerothe.
RC=0
---------------------------- Test 101 ------------------------------
./testdata/grepinput:.|zero|the|.
./testdata/grepinput:zero|a
./testdata/grepinput:.|zero|the|.
RC=0


@ -5262,4 +5262,45 @@ name were given. ---/
/((?>a?)*)*c/
aac
/(?>.*?a)(?<=ba)/
aba
/(?:.*?a)(?<=ba)/
aba
/.*?a(*PRUNE)b/
aab
/.*?a(*PRUNE)b/s
aab
/^a(*PRUNE)b/s
aab
/.*?a(*SKIP)b/
aab
/(?>.*?a)b/s
aab
/(?>.*?a)b/
aab
/(?>^a)b/s
aab
/(?>.*?)(?<=(abcd)|(wxyz))/
alphabetabcd
endingwxyz
/(?>.*)(?<=(abcd)|(wxyz))/
alphabetabcd
endingwxyz
"(?>.*)foo"
abcdfooxyz
"(?>.*?)foo"
abcdfooxyz
/-- End of testinput1 --/


@ -1026,4 +1026,312 @@
AA\P
AA\P\P
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
*These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
*These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
*Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
/ist/8i
ikt
/is+t/8i
iSs\x{17f}t
ikt
/is+?t/8i
ikt
/is?t/8i
ikt
/is{2}t/8i
iskt
/-- End of testinput10 --/


@ -3769,4 +3769,45 @@ assertion, and therefore fails the entire subroutine call. --/
/((?=a(*COMMIT)b)ab|ac){0}(?:(?1)|a(c))/
ac
/-- These are all run as real matches in test 1; here we are just checking the
settings of the anchored and startline bits. --/
/(?>.*?a)(?<=ba)/I
/(?:.*?a)(?<=ba)/I
/.*?a(*PRUNE)b/I
/.*?a(*PRUNE)b/sI
/^a(*PRUNE)b/sI
/.*?a(*SKIP)b/I
/(?>.*?a)b/sI
/(?>.*?a)b/I
/(?>^a)b/sI
/(?>.*?)(?<=(abcd)|(wxyz))/I
/(?>.*)(?<=(abcd)|(wxyz))/I
"(?>.*)foo"I
"(?>.*?)foo"I
/(?>^abc)/mI
/(?>.*abc)/mI
/(?:.*abc)/mI
/-- Check PCRE_STUDY_EXTRA_NEEDED --/
/.?/S-I
/.?/S!I
/-- End of testinput2 --/


@ -1,6 +1,5 @@
/-- This set of tests is for Unicode property support. It is compatible with
Perl >= 5.10, but not 5.8 because it tests some extra properties that are
not in the earlier release. --/
Perl >= 5.15. --/
/^\pC\pL\pM\pN\pP\pS\pZ</8
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
@ -407,6 +406,12 @@
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\X/8
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
\x{300}
/^\p{Han}+/8
\x{2e81}\x{3007}\x{2f804}\x{31a0}
** Failers
@ -666,6 +671,7 @@
\x{65c}
\x{65d}
\x{65e}
\x{65f}
\x{66a}
\x{6e9}
\x{6ef}
@ -677,7 +683,6 @@
\x{653}
\x{654}
\x{655}
\x{65f}
/^\p{Cyrillic}/8
\x{1d2b}
@ -815,4 +820,500 @@
Ⱥ
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
*These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
*These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
*Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
/-- Characters with more than one other case; test in classes --/
/[z\x{00b5}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{039c}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{03bc}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{00c5}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{00e5}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{212b}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{01c4}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c5}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c6}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c7}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c8}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c9}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01ca}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cb}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cc}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01f1}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f2}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f3}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{0345}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0399}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{03b9}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{1fbe}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0392}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{03b2}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{03d0}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{0395}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{03b5}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{03f5}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{0398}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03b8}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03d1}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03f4}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{039a}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03ba}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03f0}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03a0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03c0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03d6}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03a1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03c1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03f1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03a3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c2}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03a6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03d5}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c9}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{03a9}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{2126}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{1e60}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/[z\x{004b}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{006b}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{212a}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{0053}]+/8i
\x{0053}\x{0073}\x{017f}
/[z\x{0073}]+/8i
\x{0053}\x{0073}\x{017f}
/[z\x{017f}]+/8i
\x{0053}\x{0073}\x{017f}
/-- --/
/(ΣΆΜΟΣ) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(σάμος) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(ΣΆΜΟΣ) \1*/8i
ΣΆΜΟΣ\x20
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
/-- End of testinput6 --/


@ -89,7 +89,7 @@
/(\p{Yi}{0,3}+\277)*/
/\p{Zl}{2,3}+/8BZ
\xe2\x80\xa8\xe2\x80\xa8
\x{2028}\x{2028}\x{2028}
/\p{Zl}/8BZ
@ -195,15 +195,6 @@ of case for anything other than the ASCII letters. --/
\x{c0}
\x{e0}
/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
/^\X/8
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
*** Failers
\x{300}
/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
/^\p{Xan}/8
@ -622,4 +613,60 @@ of case for anything other than the ASCII letters. --/
AA\P
AA\P\P
/A\x{3a3}B/8iDZ
/\x{3a3}B/8iDZ
/[\x{3a3}]/8iBZ
/[^\x{3a3}]/8iBZ
/[\x{3a3}]+/8iBZ
/[^\x{3a3}]+/8iBZ
/a*\x{3a3}/8iBZ
/\x{3a3}+a/8iBZ
/\x{3a3}*\x{3c2}/8iBZ
/\x{3a3}{3}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}?/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}+./8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}++./8i+
** Failers
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}*\x{3c2}/8iBZ
/[^\x{3a3}]*\x{3c2}/8iBZ
/[^a]*\x{3c2}/8iBZ
/ist/8iBZ
ikt
/is+t/8i
iSs\x{17f}t
ikt
/is+?t/8i
ikt
/is?t/8i
ikt
/is{2}t/8i
iskt
/-- End of testinput7 --/


@ -8733,4 +8733,66 @@ No match
0: aac
1:
/(?>.*?a)(?<=ba)/
aba
0: ba
/(?:.*?a)(?<=ba)/
aba
0: aba
/.*?a(*PRUNE)b/
aab
0: ab
/.*?a(*PRUNE)b/s
aab
0: ab
/^a(*PRUNE)b/s
aab
No match
/.*?a(*SKIP)b/
aab
0: ab
/(?>.*?a)b/s
aab
0: ab
/(?>.*?a)b/
aab
0: ab
/(?>^a)b/s
aab
No match
/(?>.*?)(?<=(abcd)|(wxyz))/
alphabetabcd
0:
1: abcd
endingwxyz
0:
1: <unset>
2: wxyz
/(?>.*)(?<=(abcd)|(wxyz))/
alphabetabcd
0: alphabetabcd
1: abcd
endingwxyz
0: endingwxyz
1: <unset>
2: wxyz
"(?>.*)foo"
abcdfooxyz
No match
"(?>.*?)foo"
abcdfooxyz
0: foo
/-- End of testinput1 --/


@ -90,7 +90,7 @@ No match
9: **
10: *
\x{300}\x{301}\x{302}
No match
0: \x{300}\x{301}\x{302}
/\X?abc/8
abc
@ -100,7 +100,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}abc
\x{300}abc
0: abc
0: \x{300}abc
*** Failers
No match
@ -114,7 +114,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
No match
\x{300}abc
No match
0: \x{300}abc
/\X*abc/8
abc
@ -124,7 +124,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
\x{300}abc
0: abc
0: \x{300}abc
*** Failers
No match
@ -138,7 +138,7 @@ No match
*** Failers
No match
\x{300}abc
No match
0: \x{300}abc
/^\pL?=./8
A=b
@ -1133,7 +1133,7 @@ No match
*** Failers
0: *
\x{300}
No match
0: \x{300}
/^[\X]/8
X123
@ -2100,4 +2100,627 @@ Partial match: AA
AA\P\P
Partial match: AA
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
*These match up to the roman letters
0: *
0+ These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
*These match just the first codepoint (invalid sequence)
0: *
0+ These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
*Test extend and spacing mark
0: *
0+ Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
0: *
0+ Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
0: *
0+ There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/ist/8i
ikt
No match
/is+t/8i
iSs\x{17f}t
0: iSs\x{17f}t
ikt
No match
/is+?t/8i
ikt
No match
/is?t/8i
ikt
No match
/is{2}t/8i
iskt
No match
/-- End of testinput10 --/


@ -768,7 +768,7 @@ Max lookbehind = 3
/(?>.*)(?<=(abcd)|(xyz))/I
Capturing subpattern count = 2
No options
First char at start or follows newline
No first char
No need char
Max lookbehind = 4
alphabetabcd
@ -10110,7 +10110,7 @@ No set of starting bytes
"(?>.*/)foo"SI
Capturing subpattern count = 0
No options
First char at start or follows newline
No first char
Need char = 'o'
Subject length lower bound = 4
No set of starting bytes
@ -12361,4 +12361,124 @@ assertion, and therefore fails the entire subroutine call. --/
ac
0: ac
/-- These are all run as real matches in test 1; here we are just checking the
settings of the anchored and startline bits. --/
/(?>.*?a)(?<=ba)/I
Capturing subpattern count = 0
No options
No first char
Need char = 'a'
Max lookbehind = 2
/(?:.*?a)(?<=ba)/I
Capturing subpattern count = 0
No options
First char at start or follows newline
Need char = 'a'
Max lookbehind = 2
/.*?a(*PRUNE)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/.*?a(*PRUNE)b/sI
Capturing subpattern count = 0
Options: dotall
No first char
Need char = 'b'
/^a(*PRUNE)b/sI
Capturing subpattern count = 0
Options: anchored dotall
No first char
No need char
/.*?a(*SKIP)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/(?>.*?a)b/sI
Capturing subpattern count = 0
Options: dotall
No first char
Need char = 'b'
/(?>.*?a)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/(?>^a)b/sI
Capturing subpattern count = 0
Options: anchored dotall
No first char
No need char
/(?>.*?)(?<=(abcd)|(wxyz))/I
Capturing subpattern count = 2
No options
No first char
No need char
Max lookbehind = 4
/(?>.*)(?<=(abcd)|(wxyz))/I
Capturing subpattern count = 2
No options
No first char
No need char
Max lookbehind = 4
"(?>.*)foo"I
Capturing subpattern count = 0
No options
No first char
Need char = 'o'
"(?>.*?)foo"I
Capturing subpattern count = 0
No options
No first char
Need char = 'o'
/(?>^abc)/mI
Capturing subpattern count = 0
Options: multiline
First char at start or follows newline
Need char = 'c'
/(?>.*abc)/mI
Capturing subpattern count = 0
Options: multiline
No first char
Need char = 'c'
/(?:.*abc)/mI
Capturing subpattern count = 0
Options: multiline
First char at start or follows newline
Need char = 'c'
/-- Check PCRE_STUDY_EXTRA_NEEDED --/
/.?/S-I
Capturing subpattern count = 0
No options
No first char
No need char
Study returned NULL
/.?/S!I
Capturing subpattern count = 0
No options
No first char
No need char
Subject length lower bound = -1
No set of starting bytes
/-- End of testinput2 --/


@ -276,7 +276,7 @@ No need char
/[\xFF]/DZ
------------------------------------------------------------------
Bra
\xff
\x{ff}
Ket
End
------------------------------------------------------------------
@ -290,7 +290,7 @@ No need char
/[^\xFF]/DZ
------------------------------------------------------------------
Bra
[^\xff]
[^\x{ff}]
Ket
End
------------------------------------------------------------------
@ -786,7 +786,7 @@ No match
/[\H]/8BZ
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
[\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
Ket
End
------------------------------------------------------------------
@ -794,7 +794,7 @@ No match
/[\V]/8BZ
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}]
[\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}]
Ket
End
------------------------------------------------------------------
@ -1594,7 +1594,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
/[\H\x{d7ff}]+/8BZ
------------------------------------------------------------------
Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]+
[\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]+
Ket
End
------------------------------------------------------------------
@ -1634,7 +1634,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
/[\V\x{d7ff}]+/8BZ
------------------------------------------------------------------
Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]+
[\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]+
Ket
End
------------------------------------------------------------------


@ -1,6 +1,5 @@
/-- This set of tests is for Unicode property support. It is compatible with
Perl >= 5.10, but not 5.8 because it tests some extra properties that are
not in the earlier release. --/
Perl >= 5.15. --/
/^\pC\pL\pM\pN\pP\pS\pZ</8
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
@ -697,6 +696,16 @@ No match
0: A\x{300}\x{301}B\x{300}C
1: C
/^\X/8
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
\x{300}
0: \x{300}
/^\p{Han}+/8
\x{2e81}\x{3007}\x{2f804}\x{31a0}
0: \x{2e81}\x{3007}\x{2f804}
@ -1136,6 +1145,8 @@ No match
0: \x{65d}
\x{65e}
0: \x{65e}
\x{65f}
0: \x{65f}
\x{66a}
0: \x{66a}
\x{6e9}
@ -1158,8 +1169,6 @@ No match
No match
\x{655}
No match
\x{65f}
No match
/^\p{Cyrillic}/8
\x{1d2b}
@ -1373,4 +1382,756 @@ No match
0: \x{2c65}
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
*These match up to the roman letters
0: *
0+ These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
*These match just the first codepoint (invalid sequence)
0: *
0+ These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
*Test extend and spacing mark
0: *
0+ Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
0: *
0+ Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
0: *
0+ There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/-- Characters with more than one other case; test in classes --/
/[z\x{00b5}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{039c}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{03bc}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{00c5}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{00e5}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{212b}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{01c4}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c5}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c6}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c7}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c8}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c9}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01ca}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cb}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cc}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01f1}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f2}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f3}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{0345}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0399}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{03b9}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{1fbe}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0392}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03b2}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03d0}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{0395}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03b5}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03f5}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{0398}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03b8}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03d1}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03f4}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{039a}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03ba}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03f0}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03a0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03c0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03d6}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03a1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03c1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03f1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03a3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c2}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03a6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03d5}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c9}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{03a9}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{2126}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{1e60}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/[z\x{004b}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{006b}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{212a}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{0053}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{0073}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{017f}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/-- --/
/(ΣΆΜΟΣ) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(σάμος) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(ΣΆΜΟΣ) \1*/8i
ΣΆΜΟΣ\x20
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/-- End of testinput6 --/

View File

@ -124,7 +124,7 @@ No match
/[z-\x{100}]/8iDZ
------------------------------------------------------------------
Bra
[Z\x{39c}\x{178}z-\x{101}]
[Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
Ket
End
------------------------------------------------------------------
@ -162,7 +162,7 @@ No match
/[z-\x{100}]/8DZi
------------------------------------------------------------------
Bra
[Z\x{39c}\x{178}z-\x{101}]
[Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
Ket
End
------------------------------------------------------------------
@ -233,7 +233,7 @@ No need char
Ket
End
------------------------------------------------------------------
\xe2\x80\xa8\xe2\x80\xa8
0: \x{2028}\x{2028}
\x{2028}\x{2028}\x{2028}
0: \x{2028}\x{2028}\x{2028}
@ -423,20 +423,6 @@ of case for anything other than the ASCII letters. --/
\x{e0}
0: \x{e0}
/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
/^\X/8
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
*** Failers
0: *
\x{300}
No match
/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
/^\p{Xan}/8
@ -1194,11 +1180,13 @@ No match
/^S(\X*)e(\X*)$/8
Stéréo
No match
0: Ste\x{301}re\x{301}o
1: te\x{301}r
2: \x{301}o
/^\X/8
́réo
No match
0: \x{301}
/^a\X41z/<JS>
aX41z
@ -1313,4 +1301,173 @@ Partial match: AA
AA\P\P
Partial match: AA
/A\x{3a3}B/8iDZ
------------------------------------------------------------------
Bra
/i A
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf
First char = 'A' (caseless)
Need char = 'B' (caseless)
/\x{3a3}B/8iDZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf
No first char
Need char = 'B' (caseless)
/[\x{3a3}]/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 +
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3 +
Ket
End
------------------------------------------------------------------
/a*\x{3a3}/8iBZ
------------------------------------------------------------------
Bra
/i a*+
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/\x{3a3}+a/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 ++
/i a
Ket
End
------------------------------------------------------------------
/\x{3a3}*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 *
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/\x{3a3}{3}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}
0+ \x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}\x{3a3}
0+ \x{3c3}\x{3c2}
/\x{3a3}{2,4}?/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}
0+ \x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}+./8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0+
/\x{3a3}++./8i+
** Failers
No match
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
No match
/\x{3a3}*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 *
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3 *+
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^a]*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
/i [^a]*
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/ist/8iBZ
------------------------------------------------------------------
Bra
/i i
clist 0053 0073 017f
/i t
Ket
End
------------------------------------------------------------------
ikt
No match
/is+t/8i
iSs\x{17f}t
0: iSs\x{17f}t
ikt
No match
/is+?t/8i
ikt
No match
/is?t/8i
ikt
No match
/is{2}t/8i
iskt
No match
/-- End of testinput7 --/
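
The `clist 03a3 03c2 03c3` and `clist 0053 0073 017f` lines in the dumps above show a caseless character being compiled into a list of all code points that share its case set. The following is a minimal sketch of that idea, not PCRE's actual implementation: the table layout, the `SET_END` terminator and the function name `caseless_equal` are assumptions made for illustration, and only three sets taken from the test data are included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical terminator; it lies outside the Unicode range. */
#define SET_END 0xFFFFFFFFu

/* Illustrative case sets taken from the test data; each ends with SET_END. */
static const uint32_t caseless_sets[] = {
  0x0053, 0x0073, 0x017f, SET_END,   /* S, s, long s */
  0x03a3, 0x03c2, 0x03c3, SET_END,   /* Sigma, final sigma, sigma */
  0x004b, 0x006b, 0x212a, SET_END    /* K, k, Kelvin sign */
};

static const size_t caseless_sets_len =
  sizeof(caseless_sets) / sizeof(caseless_sets[0]);

/* Returns 1 when a and b are the same code point or belong to the same
   case set in the table above, otherwise 0. */
static int caseless_equal(uint32_t a, uint32_t b)
{
  size_t i = 0;

  assert(a <= 0x10FFFFu && b <= 0x10FFFFu);
  if (a == b) return 1;

  while (i < caseless_sets_len) {
    int has_a = 0, has_b = 0;
    /* Scan one set up to its terminator. */
    for (; caseless_sets[i] != SET_END; i++) {
      has_a |= (caseless_sets[i] == a);
      has_b |= (caseless_sets[i] == b);
    }
    if (has_a && has_b) return 1;
    i++;  /* skip the terminator */
  }
  return 0;
}
```

With this table, `caseless_equal(0x3a3, 0x3c2)` and `caseless_equal(0x73, 0x17f)` hold, while `caseless_equal(0x6b, 0x73)` does not, which mirrors the `ikt` subject failing to match `/ist/8i`.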

View File

@ -7,7 +7,11 @@
/* This file contains definitions of the property values that are returned by
the UCD access macros. New values that are added for new releases of Unicode
should always be at the end of each enum, for backwards compatibility. */
should always be at the end of each enum, for backwards compatibility.
IMPORTANT: Note also that the specific numeric values of the enums have to be
the same as the values that are generated by the maint/MultiStage2.py script,
where the equivalent property descriptive names are listed in vectors. */
/* These are the general character categories. */
@ -21,7 +25,7 @@ enum {
ucp_Z /* Separator */
};
/* These are the particular character types. */
/* These are the particular character categories. */
enum {
ucp_Cc, /* Control */
@ -56,6 +60,26 @@ enum {
ucp_Zs /* Space separator */
};
/* These are grapheme break properties. Note that the code for processing them
assumes that the values are less than 16. If more values are added that take
the number to 16 or more, the code will have to be rewritten. */
enum {
ucp_gbCR, /* 0 */
ucp_gbLF, /* 1 */
ucp_gbControl, /* 2 */
ucp_gbExtend, /* 3 */
ucp_gbPrepend, /* 4 */
ucp_gbSpacingMark, /* 5 */
ucp_gbL, /* 6 Hangul syllable type L */
ucp_gbV, /* 7 Hangul syllable type V */
ucp_gbT, /* 8 Hangul syllable type T */
ucp_gbLV, /* 9 Hangul syllable type LV */
ucp_gbLVT, /* 10 Hangul syllable type LVT */
ucp_gbRegionalIndicator, /* 11 */
ucp_gbOther /* 12 */
};
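
The comment above says the processing code assumes the grapheme break values are less than 16. One plausible reading, sketched below, is that each value is used as a bit position in a small per-property mask. This is an illustration under that assumption, not the code from this commit: the `GB_*` names, the `no_break` table and `is_break` are hypothetical, and the rules encoded are the extended grapheme cluster rules of UAX #29 as reflected in the testinput6 cases (CR followed by LF, Hangul L/V/T sequences, Extend and SpacingMark attaching to a preceding non-control character).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local copy of the property order shown in the enum above. */
enum {
  GB_CR, GB_LF, GB_CONTROL, GB_EXTEND, GB_PREPEND, GB_SPACINGMARK,
  GB_L, GB_V, GB_T, GB_LV, GB_LVT, GB_REGIONAL, GB_OTHER,
  GB_COUNT
};

/* Every property fits in one bit of a 16-bit mask because GB_COUNT < 16. */
#define BIT(p) ((uint16_t)(1u << (p)))
#define EXT    (BIT(GB_EXTEND) | BIT(GB_SPACINGMARK))
/* All properties except CR, LF and Control. */
#define NONCTL ((uint16_t)(((1u << GB_COUNT) - 1u) & ~7u))

/* no_break[left] has bit `right` set when there is NO cluster boundary
   between a character with property `left` and one with `right`. */
static const uint16_t no_break[GB_COUNT] = {
  [GB_CR]          = BIT(GB_LF),
  [GB_LF]          = 0,
  [GB_CONTROL]     = 0,
  [GB_EXTEND]      = EXT,
  [GB_PREPEND]     = NONCTL,
  [GB_SPACINGMARK] = EXT,
  [GB_L]           = EXT | BIT(GB_L) | BIT(GB_V) | BIT(GB_LV) | BIT(GB_LVT),
  [GB_V]           = EXT | BIT(GB_V) | BIT(GB_T),
  [GB_T]           = EXT | BIT(GB_T),
  [GB_LV]          = EXT | BIT(GB_V) | BIT(GB_T),
  [GB_LVT]         = EXT | BIT(GB_T),
  [GB_REGIONAL]    = EXT | BIT(GB_REGIONAL),
  [GB_OTHER]       = EXT
};

/* Returns 1 when a grapheme cluster boundary falls between left and right. */
static int is_break(int left, int right)
{
  assert(left >= 0 && left < GB_COUNT && right >= 0 && right < GB_COUNT);
  return (no_break[left] & BIT(right)) == 0;
}
```

For example, `is_break(GB_T, GB_V)` is true, matching the `T, V` test where `\X` consumes only the first character, while `is_break(GB_L, GB_LV)` and `is_break(GB_LV, GB_EXTEND)` are false, matching the `L, LV, extend` test where all three characters form one cluster.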
/* These are the script identifications. */
enum {

View File

@ -244,12 +244,19 @@ PHPAPI pcre_cache_entry* pcre_get_compiled_regex_cache(char *regex, int regex_le
int count = 0;
unsigned const char *tables = NULL;
#if HAVE_SETLOCALE
char *locale = setlocale(LC_CTYPE, NULL);
char *locale;
#endif
pcre_cache_entry *pce;
pcre_cache_entry new_entry;
char *tmp = NULL;
#if HAVE_SETLOCALE
# if defined(PHP_WIN32) && defined(ZTS)
_configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
# endif
locale = setlocale(LC_CTYPE, NULL);
#endif
/* Try to lookup the cached regex entry, and if successful, just pass
back the compiled pattern, otherwise go on and compile it. */
if (zend_hash_find(&PCRE_G(pcre_cache), regex, regex_len+1, (void **)&pce) == SUCCESS) {
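
The hunk above moves the `setlocale(LC_CTYPE, NULL)` query so that it runs after the per-thread locale setup. As a minimal, portable sketch of that query pattern: passing `NULL` as the second argument returns the name of the current locale without changing it. The helper names `current_ctype_locale` and `is_c_locale` are hypothetical, and comparing against `"C"` is an assumption about how the value might be used, since the hunk is cut off before `locale` is read.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Queries the active LC_CTYPE locale name; a NULL second argument means
   "report the current setting, do not change it". */
static const char *current_ctype_locale(void)
{
  return setlocale(LC_CTYPE, NULL);
}

/* Returns 1 when the given locale name is the default "C" locale. */
static int is_c_locale(const char *name)
{
  assert(name != NULL);
  return strcmp(name, "C") == 0;
}
```

A caller could use `is_c_locale(current_ctype_locale())` to decide whether locale-specific character tables are worth building at all.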

View File

@ -28,6 +28,10 @@
# include "config.h"
#endif
#ifdef __APPLE__
#define __APPLE_USE_RFC_3542
#endif
#if HAVE_SOCKETS
#include <php.h>

View File

@ -9,6 +9,9 @@ if (!defined('IPPROTO_IPV6')) {
die('skip IPv6 not available.');
}
$s = socket_create(AF_INET6, SOCK_DGRAM, SOL_UDP);
if ($s === false) {
die("skip unable to create socket");
}
$br = socket_bind($s, '::', 3000);
/* On Linux, there is no route ff00::/8 by default on lo, which makes it
* troublesome to send multicast traffic from lo, which we must since

View File

@ -5,6 +5,11 @@ Test if socket_set_option() returns 'unable to set socket option' failure for in
if (!extension_loaded('sockets')) {
die('SKIP sockets extension not available.');
}
if (PHP_OS == 'Darwin') {
die('skip Not for OSX');
}
$filename = dirname(__FILE__) . '/006_root_check.tmp';
$fp = fopen($filename, 'w');
fclose($fp);

View File

@ -62,8 +62,8 @@ int(16)
int(24)
-- Iteration 3 --
1234000 0 120
int(25)
1234000 3875820019684212736 120
int(34)
-- Iteration 4 --
#1 0 $0 10

View File

@ -58,7 +58,7 @@ string(16) "1234567 342391 0"
string(24) "12345678900 u 1234 12345"
-- Iteration 3 --
string(25) " 1234000 0 120"
string(34) " 1234000 3875820019684212736 120"
-- Iteration 4 --
string(10) "#1 0 $0 10"