Merge branch 'PHP-5.5' of git.php.net:php-src into PHP-5.5

Pierre Joye 2013-03-04 14:06:09 +01:00
commit e9a2642c89
53 changed files with 13054 additions and 7633 deletions

15
NEWS

@ -10,6 +10,9 @@ PHP NEWS
. Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault). . Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
(Gustavo) (Gustavo)
- PCRE:
. Merged PCRE 8.32. (Anatol)
21 Feb 2013, PHP 5.5.0 Alpha 5 21 Feb 2013, PHP 5.5.0 Alpha 5
- Core: - Core:
@ -60,6 +63,18 @@ PHP NEWS
- Filter: - Filter:
. Implemented FR #49180 - added MAC address validation. (Martin) . Implemented FR #49180 - added MAC address validation. (Martin)
- Phar:
. Fixed timestamp update on Phar contents modification. (Dmitry)
- SPL:
. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
(patch by kriss@krizalys.com, Laruence)
. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
(Nikita Popov)
. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
(Mike Willbanks)
- SNMP: - SNMP:
. Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin) . Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)

344
NEWS-5.5 Normal file

@ -0,0 +1,344 @@
PHP NEWS
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
?? ??? 201?, PHP 5.5.0 Beta 1
- Core:
. Fixed bug #49348 (Uninitialized ++$foo->bar; does not cause a notice).
(Stas)
- Sockets:
. Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
(Gustavo)
- PCRE:
. Merged PCRE 8.32. (Anatol)
21 Feb 2013, PHP 5.5.0 Alpha 5
- Core:
. Implemented FR #64175 (Added HTTP codes as of RFC 6585). (Jonh Wendell)
. Fixed bug #64135 (Exceptions from set_error_handler are not always
propagated). (Laruence)
. Fixed bug #63830 (Segfault on undefined function call in nested generator).
(Nikita Popov)
. Fixed bug #60833 (self, parent, static behave inconsistently
case-sensitive). (Stas, mario at include-once dot org)
. Implemented FR #60524 (specify temp dir by php.ini). (ALeX Kazik).
. Fixed bug #64142 (dval to lval different behavior on ppc64). (Remi)
. Added ARMv7/v8 versions of various Zend arithmetic functions that are
implemented using inline assembler (Ard Biesheuvel)
. Fix undefined behavior when converting double variables to integers.
The double is now always rounded towards zero, the remainder of its division
by 2^32 or 2^64 (depending on sizeof(long)) is calculated and it's made
signed assuming a two's complement representation. (Gustavo)
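The double-to-integer rule described in the last entry above can be sketched as follows. This is a minimal Python illustration, not the Zend Engine C code: the function name `dval_to_lval` and the `bits` parameter are hypothetical, and returning 0 for NaN and infinities is an assumption that the entry does not state.

```python
import math


def dval_to_lval(d: float, bits: int = 64) -> int:
    """Sketch of the conversion rule: truncate, wrap modulo 2**bits, make signed.

    `bits` stands in for 8 * sizeof(long); both names are illustrative only.
    """
    if math.isnan(d) or math.isinf(d):
        return 0  # assumption: the NEWS entry does not cover non-finite input

    truncated = math.trunc(d)      # round towards zero
    modulus = 1 << bits            # 2^32 or 2^64
    wrapped = truncated % modulus  # remainder in the range [0, modulus)

    # Reinterpret the unsigned remainder as a two's complement signed value.
    if wrapped >= modulus >> 1:
        wrapped -= modulus
    return wrapped


if __name__ == "__main__":
    for value in (3.9, -3.9, float(2**63), float(2**64) + 4096.0):
        print(value, dval_to_lval(value))
```

Python integers are unbounded, so the wrap-around has to be spelled out explicitly here; in C the same effect would come from working on an unsigned type of the right width.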
- CLI server:
. Fixed bug #64128 (built-in web server is broken on ppc64). (Remi)
- cURL:
. Implemented FR #46439 - added CURLFile for safer file uploads.
(Stas)
- Intl:
. Cherry-picked UConverter wrapper, which had accidentally been committed only
to master.
- mysqli
. Added mysqli_begin_transaction()/mysqli::begin_transaction(). Implemented
all options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
and ROLLBACK through options to mysqli_commit()/mysqli_rollback() and their
respective OO counterparts. They work in libmysql and mysqlnd mode. (Andrey)
. Added mysqli_savepoint(), mysqli_release_savepoint(). (Andrey)
- mysqlnd
. Add new begin_transaction() call to the connection object. Implemented all
options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
and ROLLBACK. (Andrey)
. Added mysqlnd_savepoint(), mysqlnd_release_savepoint(). (Andrey)
- Sockets:
. Added recvmsg() and sendmsg() wrappers. (Gustavo)
See https://wiki.php.net/rfc/sendrecvmsg
- Filter:
. Implemented FR #49180 - added MAC address validation. (Martin)
- Phar:
. Fixed timestamp update on Phar contents modification. (Dmitry)
- SPL:
. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
(patch by kriss@krizalys.com, Laruence)
. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
(Nikita Popov)
. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
(Mike Willbanks)
- SNMP:
. Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)
24 Jan 2013, PHP 5.5.0 Alpha 4
- Core:
. Fixed bug #63980 (object members get trimmed by zero bytes). (Laruence)
. Implemented RFC for Class Name Resolution As Scalar Via "class" Keyword.
(Ralph Schindler, Nikita Popov, Lars)
- DateTime
. Added DateTimeImmutable - a variant of DateTime that only returns the
modified state instead of changing itself. (Derick)
- FPM:
. Fixed bug #63999 (php with fpm fails to build on Solaris 10 or 11). (Adam)
- pgsql:
. Bug #46408: Locale number format settings can cause pg_query_params to
break with numerics. (asmecher, Lars)
- dba:
. Bug #62489: dba_insert not working as expected.
(marc-bennewitz at arcor dot de, Lars)
- Reflection:
. Fixed bug #64007 (There is an ability to create instance of Generator by
hand). (Laruence)
10 Jan 2013, PHP 5.5.0 Alpha 3
- General improvements:
. Fixed bug #63874 (Segfault if php_strip_whitespace has heredoc). (Pierrick)
. Fixed bug #63822 (Crash when using closures with ArrayAccess).
(Nikita Popov)
. Add Generator::throw() method. (Nikita Popov)
. Bug #23955: allow specifying Max-Age attribute in setcookie() (narfbg, Lars)
. Bug #52126: timestamp for mail.log (Martin Jansen, Lars)
- mysqlnd
. Fixed return value of mysqli_stmt_affected_rows() in the time after
prepare() and before execute(). (Andrey)
- cURL:
. Added new functions curl_escape, curl_multi_setopt, curl_multi_strerror,
curl_pause, curl_reset, curl_share_close, curl_share_init,
curl_share_setopt, curl_strerror and curl_unescape. (Pierrick)
. Added new curl options CURLOPT_TELNETOPTIONS, CURLOPT_GSSAPI_DELEGATION,
CURLOPT_ACCEPTTIMEOUT_MS, CURLOPT_SSL_OPTIONS, CURLOPT_TCP_KEEPALIVE,
CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL. (Pierrick)
18 Dec 2012, PHP 5.5.0 Alpha 2
- General improvements:
. Added systemtap support by enabling systemtap compatible dtrace probes on
linux. (David Soria Parra)
. Added support for using empty() on the result of function calls and
other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
(Nikita Popov)
. Optimized access to temporary and compiled VM variables. 8% less memory
reads. (Dmitry)
. The VM stacks for passing function arguments and syntactically nested calls
were merged into a single stack. The stack size needed for op_array
execution is calculated at compile time and preallocated at once. As a result
all the stack push operations don't require checks for stack overflow
any more. (Dmitry)
- MySQL
. This extension is now deprecated, and deprecation warnings will be generated
when connections are established to databases via mysql_connect(),
mysql_pconnect(), or through implicit connection: use MySQLi or PDO_MySQL
instead (https://wiki.php.net/rfc/mysql_deprecation). (Adam)
- Fileinfo:
. Fixed bug #63590 (Different results in TS and NTS under Windows).
(Anatoliy)
- Apache2 Handler SAPI:
. Enabled Apache 2.4 configure option for Windows (Pierre, Anatoliy)
13 Nov 2012, PHP 5.5.0 Alpha 1
- General improvements:
. Added generators and coroutines (https://wiki.php.net/rfc/generators).
(Nikita Popov)
. Added "finally" keyword (https://wiki.php.net/rfc/finally). (Laruence)
. Add simplified password hashing API
(https://wiki.php.net/rfc/password_hash). (Anthony Ferrara)
. Added support for list in foreach (https://wiki.php.net/rfc/foreachlist).
(Laruence)
. Added support for using empty() on the result of function calls and
other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
(Nikita Popov)
. Added support for constant array/string dereferencing. (Laruence)
. Improve set_exception_handler while doing reset.(Laruence)
. Remove php_logo_guid(), php_egg_logo_guid(), php_real_logo_guid(),
zend_logo_guid(). (Andrew Faulds)
. Drop Windows XP and 2003 support. (Pierre)
- Calendar:
. Fixed bug #54254 (cal_from_jd returns month = 6 when there is only one Adar)
(Stas, Eitan Mosenkis)
- Core:
. Added boolval(). (Jille Timmermans)
. Added "Z" option to pack/unpack. (Gustavo)
. Implemented FR #60738 (Allow 'set_error_handler' to handle NULL).
(Laruence, Nikita Popov)
. Added optional second argument for assert() to specify custom message. Patch
by Lonny Kapelushnik (lonny@lonnylot.com). (Lars)
. Fixed bug #18556 (Engine uses locale rules to handle class names). (Stas)
. Fixed bug #61681 (Malformed grammar). (Nikita Popov, Etienne, Laruence)
. Fixed bug #61038 (unpack("a5", "str\0\0") does not work as expected).
(srgoogleguy, Gustavo)
. Return previous handler when passing NULL to set_error_handler and
set_exception_handler. (Nikita Popov)
- cURL:
. Added support for CURLOPT_FTP_RESPONSE_TIMEOUT, CURLOPT_APPEND,
CURLOPT_DIRLISTONLY, CURLOPT_NEW_DIRECTORY_PERMS, CURLOPT_NEW_FILE_PERMS,
CURLOPT_NETRC_FILE, CURLOPT_PREQUOTE, CURLOPT_KRBLEVEL, CURLOPT_MAXFILESIZE,
CURLOPT_FTP_ACCOUNT, CURLOPT_COOKIELIST, CURLOPT_IGNORE_CONTENT_LENGTH,
CURLOPT_CONNECT_ONLY, CURLOPT_LOCALPORT, CURLOPT_LOCALPORTRANGE,
CURLOPT_FTP_ALTERNATIVE_TO_USER, CURLOPT_SSL_SESSIONID_CACHE,
CURLOPT_FTP_SSL_CCC, CURLOPT_HTTP_CONTENT_DECODING,
CURLOPT_HTTP_TRANSFER_DECODING, CURLOPT_PROXY_TRANSFER_MODE,
CURLOPT_ADDRESS_SCOPE, CURLOPT_CRLFILE, CURLOPT_ISSUERCERT,
CURLOPT_USERNAME, CURLOPT_PASSWORD, CURLOPT_PROXYUSERNAME,
CURLOPT_PROXYPASSWORD, CURLOPT_NOPROXY, CURLOPT_SOCKS5_GSSAPI_NEC,
CURLOPT_SOCKS5_GSSAPI_SERVICE, CURLOPT_TFTP_BLKSIZE,
CURLOPT_SSH_KNOWNHOSTS, CURLOPT_FTP_USE_PRET, CURLOPT_MAIL_FROM,
CURLOPT_MAIL_RCPT, CURLOPT_RTSP_CLIENT_CSEQ, CURLOPT_RTSP_SERVER_CSEQ,
CURLOPT_RTSP_SESSION_ID, CURLOPT_RTSP_STREAM_URI, CURLOPT_RTSP_TRANSPORT,
CURLOPT_RTSP_REQUEST, CURLOPT_RESOLVE, CURLOPT_ACCEPT_ENCODING,
CURLOPT_TRANSFER_ENCODING, CURLOPT_DNS_SERVERS and CURLOPT_USE_SSL.
(Pierrick)
. Fixed bug #55635 (CURLOPT_BINARYTRANSFER no longer used. The constant
still exists for backward compatibility but is doing nothing). (Pierrick)
. Fixed bug #54995 (Missing CURLINFO_RESPONSE_CODE support). (Pierrick)
- Datetime
. Fixed bug #61642 (modify("+5 weekdays") returns Sunday).
(Dmitri Iouchtchenko)
- Hash
. Added support for PBKDF2 via hash_pbkdf2(). (Anthony Ferrara)
- Intl
. The intl extension now requires ICU 4.0+.
. Added intl.use_exceptions INI directive, which controls what happens when
global errors are set together with intl.error_level. (Gustavo)
. MessageFormatter::format() and related functions now accept named
arguments and mixed numeric/named arguments in ICU 4.8+. (Gustavo)
. MessageFormatter::format() and related functions now don't error out when
an insufficient argument count is provided. Instead, the placeholders will
remain unsubstituted. (Gustavo)
. MessageFormatter::parse() and MessageFormatter::format() (and their static
equivalents) don't throw away better than second precision in the arguments.
(Gustavo)
. IntlDateFormatter::__construct and datefmt_create() now accept for the
$timezone argument time zone identifiers, IntlTimeZone objects, DateTimeZone
objects and NULL. (Gustavo)
. IntlDateFormatter::__construct and datefmt_create() no longer accept invalid
timezone identifiers or empty strings. (Gustavo)
. The default time zone used in IntlDateFormatter::__construct and
datefmt_create() (when the corresponding argument is not passed or NULL is
passed) is now the one given by date_default_timezone_get(), not the
default ICU time zone. (Gustavo)
. The time zone passed to the IntlDateFormatter is ignored if it is NULL and
if the calendar passed is an IntlCalendar object -- in this case, the
IntlCalendar's time zone will be used instead. Otherwise, the time zone
specified in the $timezone argument is used instead. This does not affect
old code, as IntlCalendar was introduced in this version. (Gustavo)
. IntlDateFormatter::__construct and datefmt_create() now accept for the
$calendar argument also IntlCalendar objects. (Gustavo)
. IntlDateFormatter::getCalendar() and datefmt_get_calendar() return false
if the IntlDateFormatter was set up with an IntlCalendar instead of the
constants IntlDateFormatter::GREGORIAN/TRADITIONAL. IntlCalendar did not
exist before this version. (Gustavo)
. IntlDateFormatter::setCalendar() and datefmt_set_calendar() now also accept
an IntlCalendar object, in which case its time zone is taken. Passing a
constant is still allowed, and still keeps the time zone. (Gustavo)
. IntlDateFormatter::setTimeZoneID() and datefmt_set_timezone_id() are
deprecated. Use IntlDateFormatter::setTimeZone() or datefmt_set_timezone()
instead. (Gustavo)
. IntlDateFormatter::format() and datefmt_format() now also accept an
IntlCalendar object for formatting. (Gustavo)
. Added the classes: IntlCalendar, IntlGregorianCalendar, IntlTimeZone,
IntlBreakIterator, IntlRuleBasedBreakIterator and
IntlCodePointBreakIterator. (Gustavo)
. Added the functions: intlcal_get_keyword_values_for_locale(),
intlcal_get_now(), intlcal_get_available_locales(), intlcal_get(),
intlcal_get_time(), intlcal_set_time(), intlcal_add(),
intlcal_set_time_zone(), intlcal_after(), intlcal_before(), intlcal_set(),
intlcal_roll(), intlcal_clear(), intlcal_field_difference(),
intlcal_get_actual_maximum(), intlcal_get_actual_minimum(),
intlcal_get_day_of_week_type(), intlcal_get_first_day_of_week(),
intlcal_get_greatest_minimum(), intlcal_get_least_maximum(),
intlcal_get_locale(), intlcal_get_maximum(),
intlcal_get_minimal_days_in_first_week(), intlcal_get_minimum(),
intlcal_get_time_zone(), intlcal_get_type(),
intlcal_get_weekend_transition(), intlcal_in_daylight_time(),
intlcal_is_equivalent_to(), intlcal_is_lenient(), intlcal_is_set(),
intlcal_is_weekend(), intlcal_set_first_day_of_week(),
intlcal_set_lenient(), intlcal_equals(),
intlcal_get_repeated_wall_time_option(),
intlcal_get_skipped_wall_time_option(),
intlcal_set_repeated_wall_time_option(),
intlcal_set_skipped_wall_time_option(), intlcal_from_date_time(),
intlcal_to_date_time(), intlcal_get_error_code(),
intlcal_get_error_message(), intlgregcal_create_instance(),
intlgregcal_set_gregorian_change(), intlgregcal_get_gregorian_change() and
intlgregcal_is_leap_year(). (Gustavo)
. Added the functions: intltz_create_time_zone(), intltz_create_default(),
intltz_get_id(), intltz_get_gmt(), intltz_get_unknown(),
intltz_create_enumeration(), intltz_count_equivalent_ids(),
intltz_create_time_zone_id_enumeration(), intltz_get_canonical_id(),
intltz_get_region(), intltz_get_tz_data_version(),
intltz_get_equivalent_id(), intltz_use_daylight_time(), intltz_get_offset(),
intltz_get_raw_offset(), intltz_has_same_rules(), intltz_get_display_name(),
intltz_get_dst_savings(), intltz_from_date_time_zone(),
intltz_to_date_time_zone(), intltz_get_error_code(),
intltz_get_error_message(). (Gustavo)
. Added the methods: IntlDateFormatter::formatObject(),
IntlDateFormatter::getCalendarObject(), IntlDateFormatter::getTimeZone(),
IntlDateFormatter::setTimeZone(). (Gustavo)
. Added the functions: datefmt_format_object(), datefmt_get_calendar_object(),
datefmt_get_timezone(), datefmt_set_timezone(),
datefmt_get_calendar_object(), intlcal_create_instance(). (Gustavo)
- MCrypt
. mcrypt_ecb(), mcrypt_cbc(), mcrypt_cfb() and mcrypt_ofb() now throw
E_DEPRECATED. (GoogleGuy)
- MySQLi
. Dropped support for LOAD DATA LOCAL INFILE handlers when using libmysql.
Known for stability problems. (Andrey)
. Added support for SHA256 authentication available with MySQL 5.6.6+.
(Andrey)
- PCRE:
. Deprecated the /e modifier
(https://wiki.php.net/rfc/remove_preg_replace_eval_modifier). (Nikita Popov)
. Fixed bug #63284 (Upgrade PCRE to 8.31). (Anatoliy)
- pgsql
. Added pg_escape_literal() and pg_escape_identifier() (Yasuo)
- SPL
. Fix bug #60560 (SplFixedArray un-/serialize, getSize(), count() return 0,
keys are strings). (Adam)
- Tokenizer:
. Fixed bug #60097 (token_get_all fails to lex nested heredoc). (Nikita Popov)
- Zip:
. Upgraded libzip to 0.10.1 (Anatoliy)
- Fileinfo:
. Fixed bug #63248 (Load multiple magic files from a directory under Windows).
(Anatoliy)
- General improvements:
. Implemented FR #46487 (Dereferencing process-handles no longer waits on
those processes). (Jille Timmermans)
<<< NOTE: Insert NEWS from last stable release here prior to actual release! >>>

File diff suppressed because it is too large


@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
 $string = '1 2 3 4 5';
 var_dump(split($pattern, $string, 0));
 var_dump(split($pattern, $string, -10));
-var_dump(split($pattern, $string, 10E20));
 echo "Done";
@ -35,9 +34,4 @@ array(1) {
 [0]=>
 string(9) "1 2 3 4 5"
 }
-Error: 8192 - Function split() is deprecated, %s(18)
-array(1) {
-[0]=>
-string(9) "1 2 3 4 5"
-}
 Done


@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
 $string = '1 2 3 4 5';
 var_dump(spliti($pattern, $string, 0));
 var_dump(spliti($pattern, $string, -10));
-var_dump(spliti($pattern, $string, 10E20));
 echo "Done";
@ -35,9 +34,4 @@ array(1) {
 [0]=>
 string(9) "1 2 3 4 5"
 }
-Error: 8192 - Function spliti() is deprecated, %s(18)
-array(1) {
-[0]=>
-string(9) "1 2 3 4 5"
-}
 Done


@ -10,3 +10,4 @@ AC_DEFINE('HAVE_BUNDLED_PCRE', 1, 'Using bundled PCRE library');
AC_DEFINE('HAVE_PCRE', 1, 'Have PCRE library'); AC_DEFINE('HAVE_PCRE', 1, 'Have PCRE library');
PHP_PCRE="yes"; PHP_PCRE="yes";
PHP_INSTALL_HEADERS("ext/pcre", "php_pcre.h pcrelib/"); PHP_INSTALL_HEADERS("ext/pcre", "php_pcre.h pcrelib/");
ADD_FLAG("CFLAGS_PCRE", " /D HAVE_CONFIG_H");


@ -59,7 +59,8 @@ PHP_ARG_WITH(pcre-regex,,
 pcrelib/pcre_ord2utf8.c pcrelib/pcre_refcount.c pcrelib/pcre_study.c \
 pcrelib/pcre_tables.c pcrelib/pcre_valid_utf8.c \
 pcrelib/pcre_version.c pcrelib/pcre_xclass.c"
-PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,-I@ext_srcdir@/pcrelib)
+PHP_PCRE_CFLAGS="-DHAVE_CONFIG_H -I@ext_srcdir@/pcrelib"
+PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,$PHP_PCRE_CFLAGS)
 PHP_ADD_BUILD_DIR($ext_builddir/pcrelib)
 PHP_INSTALL_HEADERS([ext/pcre], [php_pcre.h pcrelib/])
 AC_DEFINE(HAVE_BUNDLED_PCRE, 1, [ ])


@ -1,6 +1,170 @@
ChangeLog for PCRE ChangeLog for PCRE
------------------ ------------------
Version 8.32 30-November-2012
-----------------------------
1. Improved JIT compiler optimizations for first character search and single
character iterators.
2. Supporting IBM XL C compilers for PPC architectures in the JIT compiler.
Patch by Daniel Richard G.
3. Single character iterator optimizations in the JIT compiler.
4. Improved JIT compiler optimizations for character ranges.
5. Rename the "leave" variable names to "quit" to improve WinCE compatibility.
Reported by Giuseppe D'Angelo.
6. The PCRE_STARTLINE bit, indicating that a match can occur only at the start
of a line, was being set incorrectly in cases where .* appeared inside
atomic brackets at the start of a pattern, or where there was a subsequent
*PRUNE or *SKIP.
7. Improved instruction cache flush for POWER/PowerPC.
Patch by Daniel Richard G.
8. Fixed a number of issues in pcregrep, making it more compatible with GNU
grep:
(a) There is now no limit to the number of patterns to be matched.
(b) An error is given if a pattern is too long.
(c) Multiple uses of --exclude, --exclude-dir, --include, and --include-dir
are now supported.
(d) --exclude-from and --include-from (multiple use) have been added.
(e) Exclusions and inclusions now apply to all files and directories, not
just to those obtained from scanning a directory recursively.
(f) Multiple uses of -f and --file-list are now supported.
(g) In a Windows environment, the default for -d has been changed from
"read" (the GNU grep default) to "skip", because otherwise the presence
of a directory in the file list provokes an error.
(h) The documentation has been revised and clarified in places.
9. Improve the matching speed of capturing brackets.
10. Changed the meaning of \X so that it now matches a Unicode extended
grapheme cluster.
11. Patch by Daniel Richard G to the autoconf files to add a macro for sorting
out POSIX threads when JIT support is configured.
12. Added support for PCRE_STUDY_EXTRA_NEEDED.
13. In the POSIX wrapper regcomp() function, setting re_nsub field in the preg
structure could go wrong in environments where size_t is not the same size
as int.
14. Applied user-supplied patch to pcrecpp.cc to allow PCRE_NO_UTF8_CHECK to be
set.
15. The EBCDIC support had decayed; later updates to the code had included
explicit references to (e.g.) \x0a instead of CHAR_LF. There has been a
general tidy up of EBCDIC-related issues, and the documentation was also
not quite right. There is now a test that can be run on ASCII systems to
check some of the EBCDIC-related things (but it is not a full test).
16. The new PCRE_STUDY_EXTRA_NEEDED option is now used by pcregrep, resulting
in a small tidy to the code.
17. Fix JIT tests when UTF is disabled and both 8 and 16 bit mode are enabled.
18. If the --only-matching (-o) option in pcregrep is specified multiple
times, each one causes appropriate output. For example, -o1 -o2 outputs the
substrings matched by the 1st and 2nd capturing parentheses. A separating
string can be specified by --om-separator (default empty).
19. Improving the first n character searches.
20. Turn case lists for horizontal and vertical white space into macros so that
they are defined only once.
21. This set of changes together give more compatible Unicode case-folding
behaviour for characters that have more than one other case when UCP
support is available.
(a) The Unicode property table now has offsets into a new table of sets of
three or more characters that are case-equivalent. The MultiStage2.py
script that generates these tables (the pcre_ucd.c file) now scans
CaseFolding.txt instead of UnicodeData.txt for character case
information.
(b) The code for adding characters or ranges of characters to a character
class has been abstracted into a generalized function that also handles
case-independence. In UTF-mode with UCP support, this uses the new data
to handle characters with more than one other case.
(c) A bug that is fixed as a result of (b) is that codepoints less than 256
whose other case is greater than 256 are now correctly matched
caselessly. Previously, the high codepoint matched the low one, but not
vice versa.
(d) The processing of \h, \H, \v, and \V in character classes now makes use
of the new class addition function, using character lists defined as
macros alongside the case definitions of 20 above.
(e) Caseless back references now work with characters that have more than
one other case.
(f) General caseless matching of characters with more than one other case
is supported.
22. Unicode character properties were updated from Unicode 6.2.0
23. Improved CMake support under Windows. Patch by Daniel Richard G.
24. Add support for 32-bit character strings, and UTF-32
25. Major JIT compiler update (code refactoring and bugfixing).
Experimental Sparc 32 support is added.
26. Applied a modified version of Daniel Richard G's patch to create
pcre.h.generic and config.h.generic by "make" instead of in the
PrepareRelease script.
27. Added a definition for CHAR_NULL (helpful for the z/OS port), and use it in
pcre_compile.c when checking for a zero character.
28. Introducing a native interface for JIT. Through this interface, the compiled
machine code can be directly executed. The purpose of this interface is to
provide fast pattern matching, so several sanity checks are not performed.
However, feature tests are still performed. The new interface provides
1.4x speedup compared to the old one.
29. If pcre_exec() or pcre_dfa_exec() was called with a negative value for
the subject string length, the error given was PCRE_ERROR_BADOFFSET, which
was confusing. There is now a new error PCRE_ERROR_BADLENGTH for this case.
30. In 8-bit UTF-8 mode, pcretest failed to give an error for data codepoints
greater than 0x7fffffff (which cannot be represented in UTF-8, even under
the "old" RFC 2279). Instead, it ended up passing a negative length to
pcre_exec().
31. Add support for GCC's visibility feature to hide internal functions.
32. Running "pcretest -C pcre8" or "pcretest -C pcre16" gave a spurious error
"unknown -C option" after outputting 0 or 1.
33. There is now support for generating a code coverage report for the test
suite in environments where gcc is the compiler and lcov is installed. This
is mainly for the benefit of the developers.
34. If PCRE is built with --enable-valgrind, certain memory regions are marked
unaddressable using valgrind annotations, allowing valgrind to detect
invalid memory accesses. This is mainly for the benefit of the developers.
25. (*UTF) can now be used to start a pattern in any of the three libraries.
26. Give configure error if --enable-cpp but no C++ compiler found.
Version 8.31 06-July-2012 Version 8.31 06-July-2012
------------------------- -------------------------


@ -49,16 +49,17 @@ complexity in Perl regular expressions, I couldn't do this. In any case, a
 first pass through the pattern is helpful for other reasons.
-Support for 16-bit data strings
--------------------------------
+Support for 16-bit and 32-bit data strings
+-------------------------------------------
-From release 8.30, PCRE supports 16-bit as well as 8-bit data strings, by being
-compilable in either 8-bit or 16-bit modes, or both. Thus, two different
-libraries can be created. In the description that follows, the word "short" is
+From release 8.30, PCRE supports 16-bit as well as 8-bit data strings; and from
+release 8.32, PCRE supports 32-bit data strings. The library can be compiled
+in any combination of 8-bit, 16-bit or 32-bit modes, creating different
+libraries. In the description that follows, the word "short" is
 used for a 16-bit data quantity, and the word "unit" is used for a quantity
-that is a byte in 8-bit mode and a short in 16-bit mode. However, so as not to
-over-complicate the text, the names of PCRE functions are given in 8-bit form
-only.
+that is a byte in 8-bit mode, a short in 16-bit mode and a 32-bit unsigned
+integer in 32-bit mode. However, so as not to over-complicate the text, the
+names of PCRE functions are given in 8-bit form only.
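The notion of a "unit" in this hunk can be illustrated by counting how many units one character occupies in each encoding. This is a small Python sketch and not PCRE code; the helper name `code_units` is hypothetical.

```python
def code_units(ch: str) -> dict:
    """Return how many code units a single character needs per encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # units are bytes
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # units are 16-bit shorts
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # units are 32-bit integers
    }


if __name__ == "__main__":
    # ASCII letter, accented Latin letter, euro sign, emoji outside the BMP.
    for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}", code_units(ch))
```

The UTF-32 count stays at one for every character, which matches the statements added elsewhere in this file that UTF-32 characters are always exactly one unit long.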
 Computing the memory requirement: how it was
@ -138,9 +139,10 @@ Format of compiled patterns
 ---------------------------
 The compiled form of a pattern is a vector of units (bytes in 8-bit mode, or
-shorts in 16-bit mode), containing items of variable length. The first unit in
-an item contains an opcode, and the length of the item is either implicit in
-the opcode or contained in the data that follows it.
+shorts in 16-bit mode, 32-bit unsigned integers in 32-bit mode), containing
+items of variable length. The first unit in an item contains an opcode, and
+the length of the item is either implicit in the opcode or contained in the
+data that follows it.
 In many cases listed below, LINK_SIZE data values are specified for offsets
 within the compiled pattern. LINK_SIZE always specifies a number of bytes. The
@ -207,7 +209,8 @@ Matching literal characters
 The OP_CHAR opcode is followed by a single character that is to be matched
 casefully. For caseless matching, OP_CHARI is used. In UTF-8 or UTF-16 modes,
-the character may be more than one unit long.
+the character may be more than one unit long. In UTF-32 mode, characters
+are always exactly one unit long.
 Repeating single characters
@ -228,7 +231,8 @@ following opcodes, which come in caseful and caseless versions:
 OP_POSQUERY OP_POSQUERYI
 Each opcode is followed by the character that is to be repeated. In ASCII mode,
-these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable.
+these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable; in
+UTF-32 mode these are one-unit items.
 Those with "MIN" in their names are the minimizing versions. Those with "POS"
 in their names are possessive versions. Other repeats make use of these
 opcodes:
@ -299,7 +303,7 @@ bit map containing a 1 bit for every character that is acceptable. The bits are
 counted from the least significant end of each unit. In caseless mode, bits for
 both cases are set.
-The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16 mode,
+The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16/32 mode,
 subject characters with values greater than 255 can be handled correctly. For
 OP_CLASS they do not match, whereas for OP_NCLASS they do.
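The bitmap lookup and the OP_CLASS / OP_NCLASS distinction described in this hunk can be sketched as below. This is an illustrative Python model rather than PCRE's implementation: the function names are hypothetical, and the 32-byte map (256 bits stored in 8-bit units) is an assumption made for the sketch.

```python
def build_class_bitmap(chars: str) -> bytearray:
    """Set one bit per accepted character, counted from the least significant end."""
    bitmap = bytearray(32)  # assumption: 256 bits held in 8-bit units
    for ch in chars:
        cp = ord(ch)
        if cp > 255:
            raise ValueError("the bitmap only covers code points 0..255")
        bitmap[cp >> 3] |= 1 << (cp & 7)
    return bitmap


def invert_bitmap(bitmap: bytearray) -> bytearray:
    """Bitmap for the negated class, for example [^abc] derived from [abc]."""
    return bytearray(~unit & 0xFF for unit in bitmap)


def class_matches(bitmap: bytearray, ch: str, is_nclass: bool) -> bool:
    """Model of the lookup: OP_NCLASS accepts code points above 255, OP_CLASS rejects them."""
    cp = ord(ch)
    if cp > 255:
        return is_nclass
    return bool(bitmap[cp >> 3] & (1 << (cp & 7)))


if __name__ == "__main__":
    positive = build_class_bitmap("abc")  # models [abc]
    negative = invert_bitmap(positive)    # models [^abc]
    for ch in ("a", "d", "\u0394"):
        print(repr(ch),
              class_matches(positive, ch, is_nclass=False),
              class_matches(negative, ch, is_nclass=True))
```

Keeping two opcodes lets the map stay at a fixed size while still giving a defined answer for characters it cannot represent.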
@ -412,7 +416,8 @@ OP_ASSERTBACK and OP_ASSERTBACK_NOT, and the first opcode inside the assertion
 is OP_REVERSE, followed by a two byte (one short) count of the number of
 characters to move back the pointer in the subject string. In ASCII mode, the
 count is a number of units, but in UTF-8/16 mode each character may occupy more
-than one unit. A separate count is present in each alternative of a lookbehind
+than one unit; in UTF-32 mode each character occupies exactly one unit.
+A separate count is present in each alternative of a lookbehind
 assertion, allowing them to have different fixed lengths.


@ -1,6 +1,46 @@
News about PCRE releases News about PCRE releases
------------------------ ------------------------
Release 8.32 30-November-2012
-----------------------------
This release fixes a number of bugs, but also has some new features. These are
the highlights:
. There is now support for 32-bit character strings and UTF-32. Like the
16-bit support, this is done by compiling a separate 32-bit library.
. \X now matches a Unicode extended grapheme cluster.
. Case-independent matching of Unicode characters that have more than one
"other case" now makes all three (or more) characters equivalent. This
applies, for example, to Greek Sigma, which has two lowercase versions.
. Unicode character properties are updated to Unicode 6.2.0.
. The EBCDIC support, which had decayed, has had a spring clean.
. A number of JIT optimizations have been added, which give faster JIT
execution speed. In addition, a new direct interface to JIT execution is
available. This bypasses some of the sanity checks of pcre_exec() to give a
noticeable speed-up.
. A number of issues in pcregrep have been fixed, making it more compatible
with GNU grep. In particular, --exclude and --include (and variants) apply
to all files now, not just those obtained from scanning a directory
recursively. In Windows environments, the default action for directories is
now "skip" instead of "read" (which provokes an error).
. If the --only-matching (-o) option in pcregrep is specified multiple
times, each one causes appropriate output. For example, -o1 -o2 outputs the
substrings matched by the 1st and 2nd capturing parentheses. A separating
string can be specified by --om-separator (default empty).
. When PCRE is built via Autotools using a version of gcc that has the
"visibility" feature, it is used to hide internal library functions that are
not part of the public API.
Release 8.31 06-July-2012 Release 8.31 06-July-2012
------------------------- -------------------------
@ -9,7 +49,7 @@ This is mainly a bug-fixing release, with a small number of developments:
. The JIT compiler now supports partial matching and the (*MARK) and . The JIT compiler now supports partial matching and the (*MARK) and
(*COMMIT) verbs. (*COMMIT) verbs.
. PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehing in a . PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehind in a
pattern. pattern.
. There should be a performance improvement when using the heap instead of the . There should be a performance improvement when using the heap instead of the

@ -35,9 +35,10 @@ The contents of this README file are:
The PCRE APIs The PCRE APIs
------------- -------------
PCRE is written in C, and it has its own API. There are two sets of functions, PCRE is written in C, and it has its own API. There are three sets of functions,
one for the 8-bit library, which processes strings of bytes, and one for the one for the 8-bit library, which processes strings of bytes, one for the
16-bit library, which processes strings of 16-bit values. The distribution also 16-bit library, which processes strings of 16-bit values, and one for the 32-bit
library, which processes strings of 32-bit values. The distribution also
includes a set of C++ wrapper functions (see the pcrecpp man page for details), includes a set of C++ wrapper functions (see the pcrecpp man page for details),
courtesy of Google Inc., which can be used to call the 8-bit PCRE library from courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
C++. C++.
@ -183,8 +184,10 @@ library. They are also documented in the pcrebuild man page.
(See also "Shared libraries on Unix-like systems" below.) (See also "Shared libraries on Unix-like systems" below.)
. By default, only the 8-bit library is built. If you add --enable-pcre16 to . By default, only the 8-bit library is built. If you add --enable-pcre16 to
the "configure" command, the 16-bit library is also built. If you want only the "configure" command, the 16-bit library is also built. If you add
the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8". --enable-pcre32 to the "configure" command, the 32-bit library is also built.
If you want only the 16-bit or 32-bit library, use --disable-pcre8 to disable
building the 8-bit library.
. If you are building the 8-bit library and want to suppress the building of . If you are building the 8-bit library and want to suppress the building of
the C++ wrapper library, you can add --disable-cpp to the "configure" the C++ wrapper library, you can add --disable-cpp to the "configure"
@ -203,23 +206,24 @@ library. They are also documented in the pcrebuild man page.
. If you want to make use of the support for UTF-8 Unicode character strings in . If you want to make use of the support for UTF-8 Unicode character strings in
the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library, the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
you must add --enable-utf to the "configure" command. Without it, the code or UTF-32 Unicode character strings in the 32-bit library, you must add
for handling UTF-8 and UTF-16 is not included in the relevant library. Even --enable-utf to the "configure" command. Without it, the code for handling
UTF-8, UTF-16 and UTF-32 is not included in the relevant library. Even
when --enable-utf is included, the use of a UTF encoding still has to be when --enable-utf is included, the use of a UTF encoding still has to be
enabled by an option at run time. When PCRE is compiled with this option, its enabled by an option at run time. When PCRE is compiled with this option, its
input can only either be ASCII or UTF-8/16, even when running on EBCDIC input can only either be ASCII or UTF-8/16/32, even when running on EBCDIC
platforms. It is not possible to use both --enable-utf and --enable-ebcdic at platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
the same time. the same time.
. There are no separate options for enabling UTF-8 and UTF-16 independently . There are no separate options for enabling UTF-8, UTF-16 and UTF-32
because that would allow ridiculous settings such as requesting UTF-16 independently because that would allow ridiculous settings such as requesting
support while building only the 8-bit library. However, the option UTF-16 support while building only the 8-bit library. However, the option
--enable-utf8 is retained for backwards compatibility with earlier releases --enable-utf8 is retained for backwards compatibility with earlier releases
that did not support 16-bit character strings. It is synonymous with that did not support 16-bit or 32-bit character strings. It is synonymous with
--enable-utf. It is not possible to configure one library with UTF support --enable-utf. It is not possible to configure one library with UTF support
and the other without in the same configuration. and the other without in the same configuration.
. If, in addition to support for UTF-8/16 character strings, you want to . If, in addition to support for UTF-8/16/32 character strings, you want to
include support for the \P, \p, and \X sequences that recognize Unicode include support for the \P, \p, and \X sequences that recognize Unicode
character properties, you must add --enable-unicode-properties to the character properties, you must add --enable-unicode-properties to the
"configure" command. This adds about 30K to the size of the library (in the "configure" command. This adds about 30K to the size of the library (in the
@ -281,7 +285,8 @@ library. They are also documented in the pcrebuild man page.
library, PCRE then uses three bytes instead of two for offsets to different library, PCRE then uses three bytes instead of two for offsets to different
parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
the same as --with-link-size=4, which (in both libraries) uses four-byte the same as --with-link-size=4, which (in both libraries) uses four-byte
offsets. Increasing the internal link size reduces performance. offsets. Increasing the internal link size reduces performance. In the 32-bit
library, the only supported link size is 4.
. You can build PCRE so that its internal match() function that is called from . You can build PCRE so that its internal match() function that is called from
pcre_exec() does not call itself recursively. Instead, it uses memory blocks pcre_exec() does not call itself recursively. Instead, it uses memory blocks
@ -310,13 +315,34 @@ library. They are also documented in the pcrebuild man page.
pcre_chartables.c.dist. See "Character tables" below for further information. pcre_chartables.c.dist. See "Character tables" below for further information.
. It is possible to compile PCRE for use on systems that use EBCDIC as their . It is possible to compile PCRE for use on systems that use EBCDIC as their
character code (as opposed to ASCII) by specifying character code (as opposed to ASCII/Unicode) by specifying
--enable-ebcdic --enable-ebcdic
This automatically implies --enable-rebuild-chartables (see above). However, This automatically implies --enable-rebuild-chartables (see above). However,
when PCRE is built this way, it always operates in EBCDIC. It cannot support when PCRE is built this way, it always operates in EBCDIC. It cannot support
both EBCDIC and UTF-8/16. both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
which specifies that the code value for the EBCDIC NL character is 0x25
instead of the default 0x15.
. In environments where valgrind is installed, if you specify
--enable-valgrind
PCRE will use valgrind annotations to mark certain memory regions as
unaddressable. This allows it to detect invalid memory accesses, and is
mostly useful for debugging PCRE itself.
. In environments where the gcc compiler is used and lcov version 1.6 or above
is installed, if you specify
--enable-coverage
the build process implements a code coverage report for the test suite. The
report is generated by running "make coverage". If ccache is installed on
your system, it must be disabled when building PCRE for coverage reporting.
You can do this by setting the environment variable CCACHE_DISABLE=1 before
running "make" to build PCRE.
. The pcregrep program currently supports only 8-bit data files, and so . The pcregrep program currently supports only 8-bit data files, and so
requires the 8-bit PCRE library. It is possible to compile pcregrep to use requires the 8-bit PCRE library. It is possible to compile pcregrep to use
@ -366,6 +392,7 @@ The "configure" script builds the following files for the basic C library:
that were set for "configure" that were set for "configure"
. libpcre.pc ) data for the pkg-config command . libpcre.pc ) data for the pkg-config command
. libpcre16.pc ) . libpcre16.pc )
. libpcre32.pc )
. libpcreposix.pc ) . libpcreposix.pc )
. libtool script that builds shared and/or static libraries . libtool script that builds shared and/or static libraries
@ -385,8 +412,8 @@ The "configure" script also creates config.status, which is an executable
script that can be run to recreate the configuration, and config.log, which script that can be run to recreate the configuration, and config.log, which
contains compiler output from tests that "configure" runs. contains compiler output from tests that "configure" runs.
Once "configure" has run, you can run "make". This builds either or both of the Once "configure" has run, you can run "make". This builds the libraries
libraries libpcre and libpcre16, and a test program called pcretest. If you libpcre, libpcre16 and/or libpcre32, and a test program called pcretest. If you
enabled JIT support with --enable-jit, a test program called pcre_jit_test is enabled JIT support with --enable-jit, a test program called pcre_jit_test is
built as well. built as well.
@ -410,12 +437,14 @@ system. The following are installed (file names are all relative to the
Libraries (lib): Libraries (lib):
libpcre16 (if 16-bit support is enabled) libpcre16 (if 16-bit support is enabled)
libpcre32 (if 32-bit support is enabled)
libpcre (if 8-bit support is enabled) libpcre (if 8-bit support is enabled)
libpcreposix (if 8-bit support is enabled) libpcreposix (if 8-bit support is enabled)
libpcrecpp (if 8-bit and C++ support is enabled) libpcrecpp (if 8-bit and C++ support is enabled)
Configuration information (lib/pkgconfig): Configuration information (lib/pkgconfig):
libpcre16.pc libpcre16.pc
libpcre32.pc
libpcre.pc libpcre.pc
libpcreposix.pc libpcreposix.pc
libpcrecpp.pc (if C++ support is enabled) libpcrecpp.pc (if C++ support is enabled)
@ -596,7 +625,7 @@ The RunTest script runs the pcretest test program (which is documented in its
own man page) on each of the relevant testinput files in the testdata own man page) on each of the relevant testinput files in the testdata
directory, and compares the output with the contents of the corresponding directory, and compares the output with the contents of the corresponding
testoutput files. Some tests are relevant only when certain build-time options testoutput files. Some tests are relevant only when certain build-time options
were selected. For example, the tests for UTF-8/16 support are run only if were selected. For example, the tests for UTF-8/16/32 support are run only if
--enable-utf was used. RunTest outputs a comment when it skips a test. --enable-utf was used. RunTest outputs a comment when it skips a test.
Many of the tests that are not skipped are run up to three times. The second Many of the tests that are not skipped are run up to three times. The second
@ -605,9 +634,9 @@ tests that are marked "never study" (see the pcretest program for how this is
done). If JIT support is available, the non-DFA tests are run a third time, done). If JIT support is available, the non-DFA tests are run a third time,
this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option. this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
When both 8-bit and 16-bit support is enabled, the entire set of tests is run The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
twice, once for each library. If you want to run just one set of tests, call libraries that are enabled. If you want to run just one set of tests, call
RunTest with either the -8 or -16 option. RunTest with either the -8, -16 or -32 option.
RunTest uses a file called testtry to hold the main output from pcretest. RunTest uses a file called testtry to hold the main output from pcretest.
Other files whose names begin with "test" are used as working files in some Other files whose names begin with "test" are used as working files in some
@ -658,13 +687,13 @@ RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
Windows versions of test 2. More info on using RunTest.bat is included in the Windows versions of test 2. More info on using RunTest.bat is included in the
document entitled NON-UNIX-USE.] document entitled NON-UNIX-USE.]
The fourth and fifth tests check the UTF-8/16 support and error handling and The fourth and fifth tests check the UTF-8/16/32 support and error handling and
internal UTF features of PCRE that are not relevant to Perl, respectively. The internal UTF features of PCRE that are not relevant to Perl, respectively. The
sixth and seventh tests do the same for Unicode character properties support. sixth and seventh tests do the same for Unicode character properties support.
The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
matching function, in non-UTF-8/16 mode, UTF-8/16 mode, and UTF-8/16 mode with matching function, in non-UTF-8/16/32 mode, UTF-8/16/32 mode, and UTF-8/16/32
Unicode property support, respectively. mode with Unicode property support, respectively.
The eleventh test checks some internal offsets and code size features; it is The eleventh test checks some internal offsets and code size features; it is
run only when the default "link size" of 2 is set (in other cases the sizes run only when the default "link size" of 2 is set (in other cases the sizes
@ -675,16 +704,21 @@ test is run only when JIT support is not available. They test some JIT-specific
features such as information output from pcretest about JIT compilation. features such as information output from pcretest about JIT compilation.
The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode. the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit mode.
These are tests that generate different output in the two modes. They are for These are tests that generate different output in the two modes. They are for
general cases, UTF-8/16 support, and Unicode property support, respectively. general cases, UTF-8/16/32 support, and Unicode property support, respectively.
The twentieth test is run only in 16-bit mode. It tests some specific 16-bit The twentieth test is run only in 16/32-bit mode. It tests some specific
features of the DFA matching engine. 16/32-bit features of the DFA matching engine.
The twenty-first and twenty-second tests are run only in 16-bit mode, when the The twenty-first and twenty-second tests are run only in 16/32-bit mode, when the
link size is set to 2. They test reloading pre-compiled patterns. link size is set to 2 for the 16-bit library. They test reloading pre-compiled patterns.
The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are for
general cases, and UTF-16 support, respectively.
The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are for
general cases, and UTF-32 support, respectively.
Character tables Character tables
---------------- ----------------
@ -744,8 +778,8 @@ File manifest
------------- -------------
The distribution should contain the files listed below. Where a file name is The distribution should contain the files listed below. Where a file name is
given as pcre[16]_xxx it means that there are two files, one with the name given as pcre[16|32]_xxx it means that there are three files, one with the name
pcre_xxx and the other with the name pcre16_xxx. pcre_xxx, one with the name pcre16_xxx, and a third with the name pcre32_xxx.
(A) Source files of the PCRE library functions and their headers: (A) Source files of the PCRE library functions and their headers:
@ -756,33 +790,35 @@ pcre_xxx and the other with the name pcre16_xxx.
coding; used, unless --enable-rebuild-chartables is coding; used, unless --enable-rebuild-chartables is
specified, by copying to pcre[16]_chartables.c specified, by copying to pcre[16]_chartables.c
pcreposix.c ) pcreposix.c )
pcre[16]_byte_order.c ) pcre[16|32]_byte_order.c )
pcre[16]_compile.c ) pcre[16|32]_compile.c )
pcre[16]_config.c ) pcre[16|32]_config.c )
pcre[16]_dfa_exec.c ) pcre[16|32]_dfa_exec.c )
pcre[16]_exec.c ) pcre[16|32]_exec.c )
pcre[16]_fullinfo.c ) pcre[16|32]_fullinfo.c )
pcre[16]_get.c ) sources for the functions in the library, pcre[16|32]_get.c ) sources for the functions in the library,
pcre[16]_globals.c ) and some internal functions that they use pcre[16|32]_globals.c ) and some internal functions that they use
pcre[16]_jit_compile.c ) pcre[16|32]_jit_compile.c )
pcre[16]_maketables.c ) pcre[16|32]_maketables.c )
pcre[16]_newline.c ) pcre[16|32]_newline.c )
pcre[16]_refcount.c ) pcre[16|32]_refcount.c )
pcre[16]_string_utils.c ) pcre[16|32]_string_utils.c )
pcre[16]_study.c ) pcre[16|32]_study.c )
pcre[16]_tables.c ) pcre[16|32]_tables.c )
pcre[16]_ucd.c ) pcre[16|32]_ucd.c )
pcre[16]_version.c ) pcre[16|32]_version.c )
pcre[16]_xclass.c ) pcre[16|32]_xclass.c )
pcre_ord2utf8.c ) pcre_ord2utf8.c )
pcre_valid_utf8.c ) pcre_valid_utf8.c )
pcre16_ord2utf16.c ) pcre16_ord2utf16.c )
pcre16_utf16_utils.c ) pcre16_utf16_utils.c )
pcre16_valid_utf16.c ) pcre16_valid_utf16.c )
pcre32_utf32_utils.c )
pcre32_valid_utf32.c )
pcre[16]_printint.c ) debugging function that is used by pcretest, pcre[16|32]_printint.c ) debugging function that is used by pcretest,
) and can also be #included in pcre_compile() ) and can also be #included in pcre_compile()
pcre.h.in template for pcre.h when built by "configure" pcre.h.in template for pcre.h when built by "configure"
pcreposix.h header for the external POSIX wrapper API pcreposix.h header for the external POSIX wrapper API
@ -847,6 +883,7 @@ pcre_xxx and the other with the name pcre16_xxx.
doc/perltest.txt plain text documentation of Perl test program doc/perltest.txt plain text documentation of Perl test program
install-sh a shell script for installing files install-sh a shell script for installing files
libpcre16.pc.in template for libpcre16.pc for pkg-config libpcre16.pc.in template for libpcre16.pc for pkg-config
libpcre32.pc.in template for libpcre32.pc for pkg-config
libpcre.pc.in template for libpcre.pc for pkg-config libpcre.pc.in template for libpcre.pc for pkg-config
libpcreposix.pc.in template for libpcreposix.pc for pkg-config libpcreposix.pc.in template for libpcreposix.pc for pkg-config
libpcrecpp.pc.in template for libpcrecpp.pc for pkg-config libpcrecpp.pc.in template for libpcrecpp.pc for pkg-config
@ -895,4 +932,4 @@ pcre_xxx and the other with the name pcre16_xxx.
Philip Hazel Philip Hazel
Email local part: ph10 Email local part: ph10
Email domain: cam.ac.uk Email domain: cam.ac.uk
Last updated: 18 June 2012 Last updated: 27 October 2012

@ -31,16 +31,17 @@
/* config.h.in. Generated from configure.ac by autoheader. */ /* config.h.in. Generated from configure.ac by autoheader. */
/* On Unix-like systems config.h.in is converted by "configure" into config.h. /* PCRE is written in Standard C, but there are a few non-standard things it
Some other environments also support the use of "configure". PCRE is written in can cope with, allowing it to run on SunOS4 and other "close to standard"
Standard C, but there are a few non-standard things it can cope with, allowing systems.
it to run on SunOS4 and other "close to standard" systems.
If you are going to build PCRE "by hand" on a system without "configure" you In environments that support the facilities, config.h.in is converted by
should copy the distributed config.h.generic to config.h, and then set up the "configure", or config-cmake.h.in is converted by CMake, into config.h. If you
macro definitions the way you need them. You must then add -DHAVE_CONFIG_H to are going to build PCRE "by hand" without using "configure" or CMake, you
all of your compile commands, so that config.h is included at the start of should copy the distributed config.h.generic to config.h, and then edit the
every source. macro definitions to be the way you need them. You must then add
-DHAVE_CONFIG_H to all of your compile commands, so that config.h is included
at the start of every source.
Alternatively, you can avoid editing by using -D on the compiler command line Alternatively, you can avoid editing by using -D on the compiler command line
to set the macro values. In this case, you do not have to set -DHAVE_CONFIG_H. to set the macro values. In this case, you do not have to set -DHAVE_CONFIG_H.
@ -50,19 +51,27 @@ HAVE_BCOPY is set to 1. If your system has neither bcopy() nor memmove(), set
them both to 0; an emulation function will be used. */ them both to 0; an emulation function will be used. */
/* By default, the \R escape sequence matches any Unicode line ending /* By default, the \R escape sequence matches any Unicode line ending
character or sequence of characters. If BSR_ANYCRLF is defined, this is character or sequence of characters. If BSR_ANYCRLF is defined (to any
changed so that backslash-R matches only CR, LF, or CRLF. The build-time value), this is changed so that backslash-R matches only CR, LF, or CRLF.
default can be overridden by the user of PCRE at runtime. On systems that The build-time default can be overridden by the user of PCRE at runtime. */
support it, "configure" can be used to override the default. */ #undef BSR_ANYCRLF
/* #undef BSR_ANYCRLF */
/* If you are compiling for a system that uses EBCDIC instead of ASCII /* If you are compiling for a system that uses EBCDIC instead of ASCII
character codes, define this macro as 1. On systems that can use character codes, define this macro to any value. You must also edit the
"configure", this can be done via --enable-ebcdic. PCRE will then assume NEWLINE macro below to set a suitable EBCDIC newline, commonly 21 (0x15).
that all input strings are in EBCDIC. If you do not define this macro, PCRE On systems that can use "configure" or CMake to set EBCDIC, NEWLINE is
will assume input strings are ASCII or UTF-8 Unicode. It is not possible to automatically adjusted. When EBCDIC is set, PCRE assumes that all input
build a version of PCRE that supports both EBCDIC and UTF-8. */ strings are in EBCDIC. If you do not define this macro, PCRE will assume
/* #undef EBCDIC */ input strings are ASCII or UTF-8/16/32 Unicode. It is not possible to build
a version of PCRE that supports both EBCDIC and UTF-8/16/32. */
#undef EBCDIC
/* In an EBCDIC environment, define this macro to any value to arrange for the
NL character to be 0x25 instead of the default 0x15. NL plays the role that
LF does in an ASCII/Unicode environment. The value must also be set in the
NEWLINE macro below. On systems that can use "configure" or CMake to set
EBCDIC_NL25, the adjustment of NEWLINE is automatic. */
#undef EBCDIC_NL25
/* Define to 1 if you have the `bcopy' function. */ /* Define to 1 if you have the `bcopy' function. */
#ifndef HAVE_BCOPY #ifndef HAVE_BCOPY
@ -87,6 +96,12 @@ them both to 0; an emulation function will be used. */
#define HAVE_DLFCN_H 1 #define HAVE_DLFCN_H 1
#endif #endif
/* Define to 1 if you have the <editline/readline.h> header file. */
/*#undef HAVE_EDITLINE_READLINE_H*/
/* Define to 1 if you have the <edit/readline/readline.h> header file. */
/* #undef HAVE_EDIT_READLINE_READLINE_H */
/* Define to 1 if you have the <inttypes.h> header file. */ /* Define to 1 if you have the <inttypes.h> header file. */
#ifndef HAVE_INTTYPES_H #ifndef HAVE_INTTYPES_H
#define HAVE_INTTYPES_H 1 #define HAVE_INTTYPES_H 1
@ -112,6 +127,11 @@ them both to 0; an emulation function will be used. */
#define HAVE_MEMORY_H 1 #define HAVE_MEMORY_H 1
#endif #endif
/* Define if you have POSIX threads libraries and header files. */
#undef HAVE_PTHREAD
/* Have PTHREAD_PRIO_INHERIT. */
#undef HAVE_PTHREAD_PRIO_INHERIT
/* Define to 1 if you have the <readline/history.h> header file. */ /* Define to 1 if you have the <readline/history.h> header file. */
#ifndef HAVE_READLINE_HISTORY_H #ifndef HAVE_READLINE_HISTORY_H
#define HAVE_READLINE_HISTORY_H 1 #define HAVE_READLINE_HISTORY_H 1
@ -186,6 +206,10 @@ them both to 0; an emulation function will be used. */
#define HAVE_UNSIGNED_LONG_LONG 1 #define HAVE_UNSIGNED_LONG_LONG 1
#endif #endif
/* Define to 1 or 0, depending whether the compiler supports simple visibility
declarations. */
/* #undef HAVE_VISIBILITY */
/* Define to 1 if you have the <windows.h> header file. */ /* Define to 1 if you have the <windows.h> header file. */
/* #undef HAVE_WINDOWS_H */ /* #undef HAVE_WINDOWS_H */
@ -254,22 +278,28 @@ them both to 0; an emulation function will be used. */
#define MAX_NAME_SIZE 32 #define MAX_NAME_SIZE 32
#endif #endif
/* The value of NEWLINE determines the newline character sequence. On systems /* The value of NEWLINE determines the default newline character sequence.
that support it, "configure" can be used to override the default, which is PCRE client programs can override this by selecting other values at run
10. The possible values are 10 (LF), 13 (CR), 3338 (CRLF), -1 (ANY), or -2 time. In ASCII environments, the value can be 10 (LF), 13 (CR), or 3338
(ANYCRLF). */ (CRLF); in EBCDIC environments the value can be 21 or 37 (LF), 13 (CR), or
3349 or 3365 (CRLF) because there are two alternative codepoints (0x15 and
0x25) that are used as the NL line terminator that is equivalent to ASCII
LF. In both ASCII and EBCDIC environments the value can also be -1 (ANY),
or -2 (ANYCRLF). */
#ifndef NEWLINE #ifndef NEWLINE
#define NEWLINE 10 #define NEWLINE 10
#endif #endif
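
The numeric values listed for NEWLINE are consistent with packing a two-character sequence as the first character code times 256 plus the second. The sketch below illustrates that arithmetic only; the helper name `newline_value` is hypothetical and is not part of PCRE.

```c
#include <assert.h>

/* Hypothetical helper, not part of PCRE: pack a one- or two-character
   newline sequence into a single integer. Pass second == 0 for a
   single-character newline. */
static int newline_value(int first, int second)
{
    assert(first > 0 && first < 256);
    assert(second >= 0 && second < 256);

    /* Two characters: first goes in the high byte, second in the low byte. */
    return second == 0 ? first : (first << 8) | second;
}
```

With CR = 13 and LF = 10 this yields 3338 for CRLF; with the EBCDIC NL values 21 (0x15) and 37 (0x25) it yields 3349 and 3365, matching the values in the comment.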
/* Define to 1 if your C compiler doesn't accept -c and -o together. */
/* #undef NO_MINUS_C_MINUS_O */
/* PCRE uses recursive function calls to handle backtracking while matching. /* PCRE uses recursive function calls to handle backtracking while matching.
This can sometimes be a problem on systems that have stacks of limited This can sometimes be a problem on systems that have stacks of limited
size. Define NO_RECURSE to get a version that doesn't use recursion in the size. Define NO_RECURSE to any value to get a version that doesn't use
match() function; instead it creates its own stack by steam using recursion in the match() function; instead it creates its own stack by
pcre_recurse_malloc() to obtain memory from the heap. For more detail, see steam using pcre_recurse_malloc() to obtain memory from the heap. For more
the comments and other stuff just above the match() function. On systems detail, see the comments and other stuff just above the match() function.
that support it, "configure" can be used to set this in the Makefile (use */
--disable-stack-for-recursion). */
/* #undef NO_RECURSE */ /* #undef NO_RECURSE */
/* Name of package */ /* Name of package */
@ -282,7 +312,7 @@ them both to 0; an emulation function will be used. */
#define PACKAGE_NAME "PCRE" #define PACKAGE_NAME "PCRE"
/* Define to the full name and version of this package. */ /* Define to the full name and version of this package. */
#define PACKAGE_STRING "PCRE 8.31" #define PACKAGE_STRING "PCRE 8.32"
/* Define to the one symbol short name of this package. */ /* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME "pcre" #define PACKAGE_TARNAME "pcre"
@ -291,21 +321,46 @@ them both to 0; an emulation function will be used. */
#define PACKAGE_URL "" #define PACKAGE_URL ""
/* Define to the version of this package. */ /* Define to the version of this package. */
#define PACKAGE_VERSION "8.31" #define PACKAGE_VERSION "8.32"
/* to make a symbol visible */
/* #undef PCRECPP_EXP_DECL */
/* to make a symbol visible */
/* #undef PCRECPP_EXP_DEFN */
/* The value of PCREGREP_BUFSIZE determines the size of buffer used by
pcregrep to hold parts of the file it is searching. This is also the
minimum value. The actual amount of memory used by pcregrep is three times
this number, because it allows for the buffering of "before" and "after"
lines. */
/* #undef PCREGREP_BUFSIZE */
/* to make a symbol visible */
/* #undef PCREPOSIX_EXP_DECL */
/* to make a symbol visible */
/* #undef PCREPOSIX_EXP_DEFN */
/* to make a symbol visible */
/* #undef PCRE_EXP_DATA_DEFN */
/* to make a symbol visible */
/* #undef PCRE_EXP_DECL */
/* If you are compiling for a system other than a Unix-like system or /* If you are compiling for a system other than a Unix-like system or
Win32, and it needs some magic to be inserted before the definition Win32, and it needs some magic to be inserted before the definition
of a function that is exported by the library, define this macro to of a function that is exported by the library, define this macro to
contain the relevant magic. If you do not define this macro, it contain the relevant magic. If you do not define this macro, a suitable
defaults to "extern" for a C compiler and "extern C" for a C++ __declspec value is used for Windows systems; in other environments
compiler on non-Win32 systems. This macro appears at the start of "extern" is used for a C compiler and "extern C" for a C++ compiler.
every exported function that is part of the external API. It does This macro appears at the start of every exported function that is part
not appear on functions that are "external" in the C sense, but of the external API. It does not appear on functions that are "external"
which are internal to the library. */ in the C sense, but which are internal to the library. */
/* #undef PCRE_EXP_DEFN */ /* #undef PCRE_EXP_DEFN */
/* Define if linking statically (TODO: make nice with Libtool) */ /* Define to any value if linking statically (TODO: make nice with Libtool) */
/* #undef PCRE_STATIC */ /* #undef PCRE_STATIC */
/* When calling PCRE via the POSIX interface, additional working storage is /* When calling PCRE via the POSIX interface, additional working storage is
@ -314,40 +369,68 @@ them both to 0; an emulation function will be used. */
only two. If the number of expected substrings is small, the wrapper only two. If the number of expected substrings is small, the wrapper
function uses space on the stack, because this is faster than using function uses space on the stack, because this is faster than using
malloc() for each call. The threshold above which the stack is no longer malloc() for each call. The threshold above which the stack is no longer
used is defined by POSIX_MALLOC_THRESHOLD. On systems that support it, used is defined by POSIX_MALLOC_THRESHOLD. */
"configure" can be used to override this default. */
#ifndef POSIX_MALLOC_THRESHOLD #ifndef POSIX_MALLOC_THRESHOLD
#define POSIX_MALLOC_THRESHOLD 10 #define POSIX_MALLOC_THRESHOLD 10
#endif #endif
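
The comment above describes a stack-or-heap choice governed by POSIX_MALLOC_THRESHOLD. A minimal sketch of that pattern follows. It is not the wrapper's actual code: the function name `demo_offsets_used_heap`, the factor of 3 ints per substring, and the memset stand-in for the match are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#ifndef POSIX_MALLOC_THRESHOLD
#define POSIX_MALLOC_THRESHOLD 10
#endif

/* Hypothetical helper, not part of PCRE. Prepares working storage for
   nmatch substrings and returns 1 if the heap was used, 0 if the stack
   buffer was enough, or -1 if malloc() failed. */
int demo_offsets_used_heap(size_t nmatch)
{
  int small_ovector[POSIX_MALLOC_THRESHOLD * 3];  /* stack storage */
  int *ovector = small_ovector;
  int used_heap = 0;

  /* Above the threshold the stack buffer is too small; fall back to malloc(). */
  if (nmatch > (size_t)POSIX_MALLOC_THRESHOLD) {
    ovector = (int *)malloc(sizeof(int) * nmatch * 3);
    if (ovector == NULL) return -1;
    used_heap = 1;
  }

  /* Stand-in for the real matching call, which would fill the vector. */
  memset(ovector, 0, sizeof(int) * nmatch * 3);

  if (used_heap) free(ovector);
  return used_heap;
}
```

With the default threshold of 10, a call with nmatch of 10 or less stays on the stack, and a call with 11 or more takes the malloc() path.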
/* Define to necessary symbol if this constant uses a non-standard name on
your system. */
/* #undef PTHREAD_CREATE_JOINABLE */
/* Define to 1 if you have the ANSI C header files. */ /* Define to 1 if you have the ANSI C header files. */
#ifndef STDC_HEADERS #ifndef STDC_HEADERS
#define STDC_HEADERS 1 #define STDC_HEADERS 1
#endif #endif
/* Define to allow pcregrep to be linked with libbz2, so that it is able to /* Define to allow pcretest and pcregrep to be linked with gcov, so that they
handle .bz2 files. */ are able to generate code coverage reports. */
#undef SUPPORT_GCOV
/* Define to any value to enable support for Just-In-Time compiling. */
#undef SUPPORT_JIT
/* Define to any value to allow pcregrep to be linked with libbz2, so that it
is able to handle .bz2 files. */
/* #undef SUPPORT_LIBBZ2 */ /* #undef SUPPORT_LIBBZ2 */
/* Define to allow pcretest to be linked with libreadline. */ /* Define to any value to allow pcretest to be linked with libedit. */
#undef SUPPORT_LIBEDIT
/* Define to any value to allow pcretest to be linked with libreadline. */
/* #undef SUPPORT_LIBREADLINE */ /* #undef SUPPORT_LIBREADLINE */
/* Define to allow pcregrep to be linked with libz, so that it is able to /* Define to any value to allow pcregrep to be linked with libz, so that it is
handle .gz files. */ able to handle .gz files. */
/* #undef SUPPORT_LIBZ */ /* #undef SUPPORT_LIBZ */
/* Define to any value to enable the 16 bit PCRE library. */
/* #undef SUPPORT_PCRE16 */
/* Define to any value to enable the 32 bit PCRE library. */
/* #undef SUPPORT_PCRE32 */
/* Define to any value to enable the 8 bit PCRE library. */
/* #undef SUPPORT_PCRE8 */
/* Define to any value to enable JIT support in pcregrep. */
/* #undef SUPPORT_PCREGREP_JIT */
/* Define to enable support for Unicode properties */ /* Define to enable support for Unicode properties */
/* #undef SUPPORT_UCP */ /* #undef SUPPORT_UCP */
/* Define to enable support for the UTF-8 Unicode encoding. This will work /* Define to any value to enable support for the UTF-8/16/32 Unicode encoding.
even in an EBCDIC environment, but it is incompatible with the EBCDIC This will work even in an EBCDIC environment, but it is incompatible with
macro. That is, PCRE can support *either* EBCDIC code *or* ASCII/UTF-8, but the EBCDIC macro. That is, PCRE can support *either* EBCDIC code *or*
not both at once. */ ASCII/UTF-8/16/32, but not both at once. */
/* #undef SUPPORT_UTF8 */ /* #undef SUPPORT_UTF8 */
/* Valgrind support to find invalid memory reads. */
/* #undef SUPPORT_VALGRIND */
/* Version number of package */ /* Version number of package */
#ifndef VERSION #ifndef VERSION
#define VERSION "8.31" #define VERSION "8.32"
#endif #endif
/* Define to empty if `const' does not conform to ANSI C. */ /* Define to empty if `const' does not conform to ANSI C. */


@ -43,7 +43,9 @@ character tables for PCRE. The tables are built according to the current
locale. Now that pcre_maketables is a function visible to the outside world, we locale. Now that pcre_maketables is a function visible to the outside world, we
make use of its code from here in order to be consistent. */ make use of its code from here in order to be consistent. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include <ctype.h> #include <ctype.h>
#include <stdio.h> #include <stdio.h>
@ -106,11 +108,24 @@ fprintf(f,
"library and dead code stripping is activated. This leads to link errors.\n" "library and dead code stripping is activated. This leads to link errors.\n"
"Pulling in the header ensures that the array gets flagged as \"someone\n" "Pulling in the header ensures that the array gets flagged as \"someone\n"
"outside this compilation unit might reference this\" and so it will always\n" "outside this compilation unit might reference this\" and so it will always\n"
"be supplied to the linker. */\n\n" "be supplied to the linker. */\n\n");
/* Force config.h in z/OS */
#if defined NATIVE_ZOS
fprintf(f,
"/* For z/OS, config.h is forced */\n"
"#ifndef HAVE_CONFIG_H\n"
"#define HAVE_CONFIG_H 1\n"
"#endif\n\n");
#endif
fprintf(f,
"#ifdef HAVE_CONFIG_H\n" "#ifdef HAVE_CONFIG_H\n"
"#include \"config.h\"\n" "#include \"config.h\"\n"
"#endif\n\n" "#endif\n\n"
"#include \"pcre_internal.h\"\n\n"); "#include \"pcre_internal.h\"\n\n");
fprintf(f, fprintf(f,
"const pcre_uint8 PRIV(default_tables)[] = {\n\n" "const pcre_uint8 PRIV(default_tables)[] = {\n\n"
"/* This table is a lower casing table. */\n\n"); "/* This table is a lower casing table. */\n\n");

File diff suppressed because it is too large


@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* The current PCRE version information. */ /* The current PCRE version information. */
#define PCRE_MAJOR 8 #define PCRE_MAJOR 8
#define PCRE_MINOR 31 #define PCRE_MINOR 32
#define PCRE_PRERELEASE #define PCRE_PRERELEASE
#define PCRE_DATE 2012-07-06 #define PCRE_DATE 2012-11-30
/* When an application links to a PCRE DLL in Windows, the symbols that are /* When an application links to a PCRE DLL in Windows, the symbols that are
imported have to be identified as such. When building PCRE, the appropriate imported have to be identified as such. When building PCRE, the appropriate
@ -95,54 +95,70 @@ it is needed here for malloc. */
extern "C" { extern "C" {
#endif #endif
/* Options. Some are compile-time only, some are run-time only, and some are /* Public options. Some are compile-time only, some are run-time only, and some
both, so we keep them all distinct. However, almost all the bits in the options are both, so we keep them all distinct. However, almost all the bits in the
word are now used. In the long run, we may have to re-use some of the options word are now used. In the long run, we may have to re-use some of the
compile-time only bits for runtime options, or vice versa. In the comments compile-time only bits for runtime options, or vice versa. Any of the
below, "compile", "exec", and "DFA exec" mean that the option is permitted to
be set for those functions; "used in" means that an option may be set only for
compile, but is subsequently referenced in exec and/or DFA exec. Any of the
compile-time options may be inspected during studying (and therefore JIT compile-time options may be inspected during studying (and therefore JIT
compiling). */ compiling).
#define PCRE_CASELESS 0x00000001 /* Compile */ Some options for pcre_compile() change its behaviour but do not affect the
#define PCRE_MULTILINE 0x00000002 /* Compile */ behaviour of the execution functions. Other options are passed through to the
#define PCRE_DOTALL 0x00000004 /* Compile */ execution functions and affect their behaviour, with or without affecting the
#define PCRE_EXTENDED 0x00000008 /* Compile */ behaviour of pcre_compile().
#define PCRE_ANCHORED 0x00000010 /* Compile, exec, DFA exec */
#define PCRE_DOLLAR_ENDONLY 0x00000020 /* Compile, used in exec, DFA exec */ Options that can be passed to pcre_compile() are tagged Cx below, with these
#define PCRE_EXTRA 0x00000040 /* Compile */ variants:
#define PCRE_NOTBOL 0x00000080 /* Exec, DFA exec */
#define PCRE_NOTEOL 0x00000100 /* Exec, DFA exec */ C1 Affects compile only
#define PCRE_UNGREEDY 0x00000200 /* Compile */ C2 Does not affect compile; affects exec, dfa_exec
#define PCRE_NOTEMPTY 0x00000400 /* Exec, DFA exec */ C3 Affects compile, exec, dfa_exec
/* The next two are also used in exec and DFA exec */ C4 Affects compile, exec, dfa_exec, study
#define PCRE_UTF8 0x00000800 /* Compile (same as PCRE_UTF16) */ C5 Affects compile, exec, study
#define PCRE_UTF16 0x00000800 /* Compile (same as PCRE_UTF8) */
#define PCRE_NO_AUTO_CAPTURE 0x00001000 /* Compile */ Options that can be set for pcre_exec() and/or pcre_dfa_exec() are flagged with
/* The next two are also used in exec and DFA exec */ E and D, respectively. They take precedence over C3, C4, and C5 settings passed
#define PCRE_NO_UTF8_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF16_CHECK) */ from pcre_compile(). Those that are compatible with JIT execution are flagged
#define PCRE_NO_UTF16_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF8_CHECK) */ with J. */
#define PCRE_AUTO_CALLOUT 0x00004000 /* Compile */
#define PCRE_PARTIAL_SOFT 0x00008000 /* Exec, DFA exec */ #define PCRE_CASELESS 0x00000001 /* C1 */
#define PCRE_PARTIAL 0x00008000 /* Backwards compatible synonym */ #define PCRE_MULTILINE 0x00000002 /* C1 */
#define PCRE_DFA_SHORTEST 0x00010000 /* DFA exec */ #define PCRE_DOTALL 0x00000004 /* C1 */
#define PCRE_DFA_RESTART 0x00020000 /* DFA exec */ #define PCRE_EXTENDED 0x00000008 /* C1 */
#define PCRE_FIRSTLINE 0x00040000 /* Compile, used in exec, DFA exec */ #define PCRE_ANCHORED 0x00000010 /* C4 E D */
#define PCRE_DUPNAMES 0x00080000 /* Compile */ #define PCRE_DOLLAR_ENDONLY 0x00000020 /* C2 */
#define PCRE_NEWLINE_CR 0x00100000 /* Compile, exec, DFA exec */ #define PCRE_EXTRA 0x00000040 /* C1 */
#define PCRE_NEWLINE_LF 0x00200000 /* Compile, exec, DFA exec */ #define PCRE_NOTBOL 0x00000080 /* E D J */
#define PCRE_NEWLINE_CRLF 0x00300000 /* Compile, exec, DFA exec */ #define PCRE_NOTEOL 0x00000100 /* E D J */
#define PCRE_NEWLINE_ANY 0x00400000 /* Compile, exec, DFA exec */ #define PCRE_UNGREEDY 0x00000200 /* C1 */
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* Compile, exec, DFA exec */ #define PCRE_NOTEMPTY 0x00000400 /* E D J */
#define PCRE_BSR_ANYCRLF 0x00800000 /* Compile, exec, DFA exec */ #define PCRE_UTF8 0x00000800 /* C4 ) */
#define PCRE_BSR_UNICODE 0x01000000 /* Compile, exec, DFA exec */ #define PCRE_UTF16 0x00000800 /* C4 ) Synonyms */
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* Compile, used in exec */ #define PCRE_UTF32 0x00000800 /* C4 ) */
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* Compile, exec, DFA exec */ #define PCRE_NO_AUTO_CAPTURE 0x00001000 /* C1 */
#define PCRE_NO_START_OPTIMISE 0x04000000 /* Synonym */ #define PCRE_NO_UTF8_CHECK 0x00002000 /* C1 E D J ) */
#define PCRE_PARTIAL_HARD 0x08000000 /* Exec, DFA exec */ #define PCRE_NO_UTF16_CHECK 0x00002000 /* C1 E D J ) Synonyms */
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* Exec, DFA exec */ #define PCRE_NO_UTF32_CHECK 0x00002000 /* C1 E D J ) */
#define PCRE_UCP 0x20000000 /* Compile, used in exec, DFA exec */ #define PCRE_AUTO_CALLOUT 0x00004000 /* C1 */
#define PCRE_PARTIAL_SOFT 0x00008000 /* E D J ) Synonyms */
#define PCRE_PARTIAL 0x00008000 /* E D J ) */
#define PCRE_DFA_SHORTEST 0x00010000 /* D */
#define PCRE_DFA_RESTART 0x00020000 /* D */
#define PCRE_FIRSTLINE 0x00040000 /* C3 */
#define PCRE_DUPNAMES 0x00080000 /* C1 */
#define PCRE_NEWLINE_CR 0x00100000 /* C3 E D */
#define PCRE_NEWLINE_LF 0x00200000 /* C3 E D */
#define PCRE_NEWLINE_CRLF 0x00300000 /* C3 E D */
#define PCRE_NEWLINE_ANY 0x00400000 /* C3 E D */
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* C3 E D */
#define PCRE_BSR_ANYCRLF 0x00800000 /* C3 E D */
#define PCRE_BSR_UNICODE 0x01000000 /* C3 E D */
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* C5 */
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* C2 E D ) Synonyms */
#define PCRE_NO_START_OPTIMISE 0x04000000 /* C2 E D ) */
#define PCRE_PARTIAL_HARD 0x08000000 /* E D J */
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* E D J */
#define PCRE_UCP 0x20000000 /* C3 */
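
Two details of the table above are easy to miss: the PCRE_NEWLINE_* values form a small multi-bit field rather than independent flags (CRLF has the same bits as CR and LF combined), and the new comment says that E/D options given at exec time take precedence over the C3 setting passed from pcre_compile(). The sketch below illustrates both. The option values are copied from the table; the mask `DEMO_NEWLINE_MASK` and both function names are assumptions introduced here, and the precedence rule is one plausible reading of the comment, not a copy of the library's code.

```c
/* Values copied from the option table in this header. */
#define PCRE_NEWLINE_CR      0x00100000
#define PCRE_NEWLINE_LF      0x00200000
#define PCRE_NEWLINE_CRLF    0x00300000
#define PCRE_NEWLINE_ANY     0x00400000
#define PCRE_NEWLINE_ANYCRLF 0x00500000

/* Assumption: a mask covering the three bits used by the values above. */
#define DEMO_NEWLINE_MASK    0x00700000

/* Hypothetical helper: compare the whole newline field, because testing a
   single bit would wrongly report CR for an options word that holds CRLF. */
int demo_newline_is(unsigned long options, unsigned long value)
{
  return (options & DEMO_NEWLINE_MASK) == value;
}

/* Hypothetical helper: an exec-time newline setting, if present, overrides
   the one stored at compile time; otherwise the compile-time one is used. */
unsigned long demo_effective_newline(unsigned long compile_options,
                                     unsigned long exec_options)
{
  unsigned long exec_nl = exec_options & DEMO_NEWLINE_MASK;
  return exec_nl != 0 ? exec_nl : (compile_options & DEMO_NEWLINE_MASK);
}
```

For example, a pattern compiled with PCRE_NEWLINE_LF and executed with PCRE_NEWLINE_CRLF would resolve to CRLF under this reading, while executing it with no newline option keeps LF.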
/* Exec-time and get/set-time error codes */ /* Exec-time and get/set-time error codes */
@ -156,8 +172,9 @@ compiling). */
#define PCRE_ERROR_NOSUBSTRING (-7) #define PCRE_ERROR_NOSUBSTRING (-7)
#define PCRE_ERROR_MATCHLIMIT (-8) #define PCRE_ERROR_MATCHLIMIT (-8)
#define PCRE_ERROR_CALLOUT (-9) /* Never used by PCRE itself */ #define PCRE_ERROR_CALLOUT (-9) /* Never used by PCRE itself */
#define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16 */ #define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16 */ #define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF32 (-10) /* Same for 8/16/32 */
#define PCRE_ERROR_BADUTF8_OFFSET (-11) /* Same for 8/16 */ #define PCRE_ERROR_BADUTF8_OFFSET (-11) /* Same for 8/16 */
#define PCRE_ERROR_BADUTF16_OFFSET (-11) /* Same for 8/16 */ #define PCRE_ERROR_BADUTF16_OFFSET (-11) /* Same for 8/16 */
#define PCRE_ERROR_PARTIAL (-12) #define PCRE_ERROR_PARTIAL (-12)
@ -180,6 +197,8 @@ compiling). */
#define PCRE_ERROR_BADMODE (-28) #define PCRE_ERROR_BADMODE (-28)
#define PCRE_ERROR_BADENDIANNESS (-29) #define PCRE_ERROR_BADENDIANNESS (-29)
#define PCRE_ERROR_DFA_BADRESTART (-30) #define PCRE_ERROR_DFA_BADRESTART (-30)
#define PCRE_ERROR_JIT_BADOPTION (-31)
#define PCRE_ERROR_BADLENGTH (-32)
/* Specific error codes for UTF-8 validity checks */ /* Specific error codes for UTF-8 validity checks */
@ -205,6 +224,7 @@ compiling). */
#define PCRE_UTF8_ERR19 19 #define PCRE_UTF8_ERR19 19
#define PCRE_UTF8_ERR20 20 #define PCRE_UTF8_ERR20 20
#define PCRE_UTF8_ERR21 21 #define PCRE_UTF8_ERR21 21
#define PCRE_UTF8_ERR22 22
/* Specific error codes for UTF-16 validity checks */ /* Specific error codes for UTF-16 validity checks */
@ -214,6 +234,13 @@ compiling). */
#define PCRE_UTF16_ERR3 3 #define PCRE_UTF16_ERR3 3
#define PCRE_UTF16_ERR4 4 #define PCRE_UTF16_ERR4 4
/* Specific error codes for UTF-32 validity checks */
#define PCRE_UTF32_ERR0 0
#define PCRE_UTF32_ERR1 1
#define PCRE_UTF32_ERR2 2
#define PCRE_UTF32_ERR3 3
/* Request types for pcre_fullinfo() */ /* Request types for pcre_fullinfo() */
#define PCRE_INFO_OPTIONS 0 #define PCRE_INFO_OPTIONS 0
@ -236,6 +263,10 @@ compiling). */
#define PCRE_INFO_JIT 16 #define PCRE_INFO_JIT 16
#define PCRE_INFO_JITSIZE 17 #define PCRE_INFO_JITSIZE 17
#define PCRE_INFO_MAXLOOKBEHIND 18 #define PCRE_INFO_MAXLOOKBEHIND 18
#define PCRE_INFO_FIRSTCHARACTER 19
#define PCRE_INFO_FIRSTCHARACTERFLAGS 20
#define PCRE_INFO_REQUIREDCHAR 21
#define PCRE_INFO_REQUIREDCHARFLAGS 22
/* Request types for pcre_config(). Do not re-arrange, in order to remain /* Request types for pcre_config(). Do not re-arrange, in order to remain
compatible. */ compatible. */
@ -252,6 +283,7 @@ compatible. */
#define PCRE_CONFIG_JIT 9 #define PCRE_CONFIG_JIT 9
#define PCRE_CONFIG_UTF16 10 #define PCRE_CONFIG_UTF16 10
#define PCRE_CONFIG_JITTARGET 11 #define PCRE_CONFIG_JITTARGET 11
#define PCRE_CONFIG_UTF32 12
/* Request types for pcre_study(). Do not re-arrange, in order to remain /* Request types for pcre_study(). Do not re-arrange, in order to remain
compatible. */ compatible. */
@ -259,8 +291,9 @@ compatible. */
#define PCRE_STUDY_JIT_COMPILE 0x0001 #define PCRE_STUDY_JIT_COMPILE 0x0001
#define PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE 0x0002 #define PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE 0x0002
#define PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE 0x0004 #define PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE 0x0004
#define PCRE_STUDY_EXTRA_NEEDED 0x0008
/* Bit flags for the pcre[16]_extra structure. Do not re-arrange or redefine /* Bit flags for the pcre[16|32]_extra structure. Do not re-arrange or redefine
these bits, just add new ones on the end, in order to remain compatible. */ these bits, just add new ones on the end, in order to remain compatible. */
#define PCRE_EXTRA_STUDY_DATA 0x0001 #define PCRE_EXTRA_STUDY_DATA 0x0001
@ -279,12 +312,18 @@ typedef struct real_pcre pcre;
struct real_pcre16; /* declaration; the definition is private */ struct real_pcre16; /* declaration; the definition is private */
typedef struct real_pcre16 pcre16; typedef struct real_pcre16 pcre16;
struct real_pcre32; /* declaration; the definition is private */
typedef struct real_pcre32 pcre32;
struct real_pcre_jit_stack; /* declaration; the definition is private */ struct real_pcre_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre_jit_stack pcre_jit_stack; typedef struct real_pcre_jit_stack pcre_jit_stack;
struct real_pcre16_jit_stack; /* declaration; the definition is private */ struct real_pcre16_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre16_jit_stack pcre16_jit_stack; typedef struct real_pcre16_jit_stack pcre16_jit_stack;
struct real_pcre32_jit_stack; /* declaration; the definition is private */
typedef struct real_pcre32_jit_stack pcre32_jit_stack;
/* If PCRE is compiled with 16 bit character support, PCRE_UCHAR16 must contain /* If PCRE is compiled with 16 bit character support, PCRE_UCHAR16 must contain
a 16 bit wide signed data type. Otherwise it can be a dummy data type since a 16 bit wide signed data type. Otherwise it can be a dummy data type since
pcre16 functions are not implemented. There is a check for this in pcre_internal.h. */ pcre16 functions are not implemented. There is a check for this in pcre_internal.h. */
@ -296,6 +335,17 @@ pcre16 functions are not implemented. There is a check for this in pcre_internal
#define PCRE_SPTR16 const PCRE_UCHAR16 * #define PCRE_SPTR16 const PCRE_UCHAR16 *
#endif #endif
/* If PCRE is compiled with 32 bit character support, PCRE_UCHAR32 must contain
a 32 bit wide signed data type. Otherwise it can be a dummy data type since
pcre32 functions are not implemented. There is a check for this in pcre_internal.h. */
#ifndef PCRE_UCHAR32
#define PCRE_UCHAR32 unsigned int
#endif
#ifndef PCRE_SPTR32
#define PCRE_SPTR32 const PCRE_UCHAR32 *
#endif
/* When PCRE is compiled as a C++ library, the subject pointer type can be /* When PCRE is compiled as a C++ library, the subject pointer type can be
replaced with a custom type. For conventional use, the public interface is a replaced with a custom type. For conventional use, the public interface is a
const char *. */ const char *. */
@ -332,6 +382,19 @@ typedef struct pcre16_extra {
void *executable_jit; /* Contains a pointer to a compiled jit code */ void *executable_jit; /* Contains a pointer to a compiled jit code */
} pcre16_extra; } pcre16_extra;
/* Same structure as above, but with 32 bit char pointers. */
typedef struct pcre32_extra {
unsigned long int flags; /* Bits for which fields are set */
void *study_data; /* Opaque data from pcre_study() */
unsigned long int match_limit; /* Maximum number of calls to match() */
void *callout_data; /* Data passed back in callouts */
const unsigned char *tables; /* Pointer to character tables */
unsigned long int match_limit_recursion; /* Max recursive calls to match() */
PCRE_UCHAR32 **mark; /* For passing back a mark pointer */
void *executable_jit; /* Contains a pointer to a compiled jit code */
} pcre32_extra;
/* The structure for passing out data via the pcre_callout_function. We use a /* The structure for passing out data via the pcre_callout_function. We use a
structure so that new fields can be added on the end in future versions, structure so that new fields can be added on the end in future versions,
without changing the API of the function, thereby allowing old clients to work without changing the API of the function, thereby allowing old clients to work
@ -379,6 +442,28 @@ typedef struct pcre16_callout_block {
/* ------------------------------------------------------------------ */ /* ------------------------------------------------------------------ */
} pcre16_callout_block; } pcre16_callout_block;
/* Same structure as above, but with 32 bit char pointers. */
typedef struct pcre32_callout_block {
int version; /* Identifies version of block */
/* ------------------------ Version 0 ------------------------------- */
int callout_number; /* Number compiled into pattern */
int *offset_vector; /* The offset vector */
PCRE_SPTR32 subject; /* The subject being matched */
int subject_length; /* The length of the subject */
int start_match; /* Offset to start of this match attempt */
int current_position; /* Where we currently are in the subject */
int capture_top; /* Max current capture */
int capture_last; /* Most recently closed capture */
void *callout_data; /* Data passed in with the call */
/* ------------------- Added for Version 1 -------------------------- */
int pattern_position; /* Offset to next item in the pattern */
int next_item_length; /* Length of next item in the pattern */
/* ------------------- Added for Version 2 -------------------------- */
const PCRE_UCHAR32 *mark; /* Pointer to current mark or NULL */
/* ------------------------------------------------------------------ */
} pcre32_callout_block;
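
The callout blocks carry a version field so that fields can be appended without breaking older clients. A callout function would therefore check the version before touching a field added later. The sketch below shows that check on a trimmed, hypothetical copy of the structure: `demo_callout_block` and `demo_first_mark_unit` are illustration names, and only one field per version is kept.

```c
#include <stddef.h>

/* Trimmed, hypothetical stand-in for pcre32_callout_block. */
typedef struct demo_callout_block {
  int version;               /* Identifies version of block */
  int callout_number;        /* Present since version 0 */
  int pattern_position;      /* Added for version 1 */
  const unsigned int *mark;  /* Added for version 2 */
} demo_callout_block;

/* Returns the first code unit of the current mark, or -1 when the block is
   too old to carry a mark field or when no mark is set. */
long demo_first_mark_unit(const demo_callout_block *cb)
{
  /* The version test comes first: in a block older than version 2 the
     mark field must not be read at all. */
  if (cb->version < 2 || cb->mark == NULL) return -1;
  return (long)cb->mark[0];
}
```

The same guard applies to pattern_position and next_item_length, which the header marks as added for version 1.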
/* Indirection for store get and free functions. These can be set to /* Indirection for store get and free functions. These can be set to
alternative malloc/free functions if required. Special ones are used in the alternative malloc/free functions if required. Special ones are used in the
non-recursive case for "frames". There is also an optional callout function non-recursive case for "frames". There is also an optional callout function
@ -397,6 +482,12 @@ PCRE_EXP_DECL void (*pcre16_free)(void *);
PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t); PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre16_stack_free)(void *); PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *); PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *);
PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_free)(void *);
PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_stack_free)(void *);
PCRE_EXP_DECL int (*pcre32_callout)(pcre32_callout_block *);
#else /* VPCOMPAT */ #else /* VPCOMPAT */
PCRE_EXP_DECL void *pcre_malloc(size_t); PCRE_EXP_DECL void *pcre_malloc(size_t);
PCRE_EXP_DECL void pcre_free(void *); PCRE_EXP_DECL void pcre_free(void *);
@ -409,12 +500,19 @@ PCRE_EXP_DECL void pcre16_free(void *);
PCRE_EXP_DECL void *pcre16_stack_malloc(size_t); PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
PCRE_EXP_DECL void pcre16_stack_free(void *); PCRE_EXP_DECL void pcre16_stack_free(void *);
PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *); PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *);
PCRE_EXP_DECL void *pcre32_malloc(size_t);
PCRE_EXP_DECL void pcre32_free(void *);
PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
PCRE_EXP_DECL void pcre32_stack_free(void *);
PCRE_EXP_DECL int pcre32_callout(pcre32_callout_block *);
#endif /* VPCOMPAT */ #endif /* VPCOMPAT */
/* User defined callback which provides a stack just before the match starts. */ /* User defined callback which provides a stack just before the match starts. */
typedef pcre_jit_stack *(*pcre_jit_callback)(void *); typedef pcre_jit_stack *(*pcre_jit_callback)(void *);
typedef pcre16_jit_stack *(*pcre16_jit_callback)(void *); typedef pcre16_jit_stack *(*pcre16_jit_callback)(void *);
typedef pcre32_jit_stack *(*pcre32_jit_callback)(void *);
/* Exported PCRE functions */ /* Exported PCRE functions */
@ -422,83 +520,131 @@ PCRE_EXP_DECL pcre *pcre_compile(const char *, int, const char **, int *,
const unsigned char *); const unsigned char *);
PCRE_EXP_DECL pcre16 *pcre16_compile(PCRE_SPTR16, int, const char **, int *, PCRE_EXP_DECL pcre16 *pcre16_compile(PCRE_SPTR16, int, const char **, int *,
const unsigned char *); const unsigned char *);
PCRE_EXP_DECL pcre32 *pcre32_compile(PCRE_SPTR32, int, const char **, int *,
const unsigned char *);
PCRE_EXP_DECL pcre *pcre_compile2(const char *, int, int *, const char **, PCRE_EXP_DECL pcre *pcre_compile2(const char *, int, int *, const char **,
int *, const unsigned char *); int *, const unsigned char *);
PCRE_EXP_DECL pcre16 *pcre16_compile2(PCRE_SPTR16, int, int *, const char **, PCRE_EXP_DECL pcre16 *pcre16_compile2(PCRE_SPTR16, int, int *, const char **,
int *, const unsigned char *); int *, const unsigned char *);
PCRE_EXP_DECL pcre32 *pcre32_compile2(PCRE_SPTR32, int, int *, const char **,
int *, const unsigned char *);
PCRE_EXP_DECL int pcre_config(int, void *); PCRE_EXP_DECL int pcre_config(int, void *);
PCRE_EXP_DECL int pcre16_config(int, void *); PCRE_EXP_DECL int pcre16_config(int, void *);
PCRE_EXP_DECL int pcre32_config(int, void *);
PCRE_EXP_DECL int pcre_copy_named_substring(const pcre *, const char *, PCRE_EXP_DECL int pcre_copy_named_substring(const pcre *, const char *,
int *, int, const char *, char *, int); int *, int, const char *, char *, int);
PCRE_EXP_DECL int pcre16_copy_named_substring(const pcre16 *, PCRE_SPTR16, PCRE_EXP_DECL int pcre16_copy_named_substring(const pcre16 *, PCRE_SPTR16,
int *, int, PCRE_SPTR16, PCRE_UCHAR16 *, int); int *, int, PCRE_SPTR16, PCRE_UCHAR16 *, int);
PCRE_EXP_DECL int pcre32_copy_named_substring(const pcre32 *, PCRE_SPTR32,
int *, int, PCRE_SPTR32, PCRE_UCHAR32 *, int);
PCRE_EXP_DECL int pcre_copy_substring(const char *, int *, int, int, PCRE_EXP_DECL int pcre_copy_substring(const char *, int *, int, int,
char *, int); char *, int);
PCRE_EXP_DECL int pcre16_copy_substring(PCRE_SPTR16, int *, int, int, PCRE_EXP_DECL int pcre16_copy_substring(PCRE_SPTR16, int *, int, int,
PCRE_UCHAR16 *, int); PCRE_UCHAR16 *, int);
PCRE_EXP_DECL int pcre32_copy_substring(PCRE_SPTR32, int *, int, int,
PCRE_UCHAR32 *, int);
PCRE_EXP_DECL int pcre_dfa_exec(const pcre *, const pcre_extra *, PCRE_EXP_DECL int pcre_dfa_exec(const pcre *, const pcre_extra *,
const char *, int, int, int, int *, int , int *, int); const char *, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre16_dfa_exec(const pcre16 *, const pcre16_extra *, PCRE_EXP_DECL int pcre16_dfa_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int , int *, int); PCRE_SPTR16, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre32_dfa_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int , int *, int);
PCRE_EXP_DECL int pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR, PCRE_EXP_DECL int pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR,
int, int, int, int *, int); int, int, int, int *, int);
PCRE_EXP_DECL int pcre16_exec(const pcre16 *, const pcre16_extra *, PCRE_EXP_DECL int pcre16_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int); PCRE_SPTR16, int, int, int, int *, int);
PCRE_EXP_DECL int pcre32_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int);
PCRE_EXP_DECL int pcre_jit_exec(const pcre *, const pcre_extra *,
PCRE_SPTR, int, int, int, int *, int,
pcre_jit_stack *);
PCRE_EXP_DECL int pcre16_jit_exec(const pcre16 *, const pcre16_extra *,
PCRE_SPTR16, int, int, int, int *, int,
pcre16_jit_stack *);
PCRE_EXP_DECL int pcre32_jit_exec(const pcre32 *, const pcre32_extra *,
PCRE_SPTR32, int, int, int, int *, int,
pcre32_jit_stack *);
PCRE_EXP_DECL void pcre_free_substring(const char *); PCRE_EXP_DECL void pcre_free_substring(const char *);
PCRE_EXP_DECL void pcre16_free_substring(PCRE_SPTR16); PCRE_EXP_DECL void pcre16_free_substring(PCRE_SPTR16);
PCRE_EXP_DECL void pcre32_free_substring(PCRE_SPTR32);
PCRE_EXP_DECL void pcre_free_substring_list(const char **); PCRE_EXP_DECL void pcre_free_substring_list(const char **);
PCRE_EXP_DECL void pcre16_free_substring_list(PCRE_SPTR16 *); PCRE_EXP_DECL void pcre16_free_substring_list(PCRE_SPTR16 *);
PCRE_EXP_DECL void pcre32_free_substring_list(PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_fullinfo(const pcre *, const pcre_extra *, int, PCRE_EXP_DECL int pcre_fullinfo(const pcre *, const pcre_extra *, int,
void *); void *);
PCRE_EXP_DECL int pcre16_fullinfo(const pcre16 *, const pcre16_extra *, int, PCRE_EXP_DECL int pcre16_fullinfo(const pcre16 *, const pcre16_extra *, int,
void *); void *);
PCRE_EXP_DECL int pcre32_fullinfo(const pcre32 *, const pcre32_extra *, int,
void *);
PCRE_EXP_DECL int pcre_get_named_substring(const pcre *, const char *, PCRE_EXP_DECL int pcre_get_named_substring(const pcre *, const char *,
int *, int, const char *, const char **); int *, int, const char *, const char **);
PCRE_EXP_DECL int pcre16_get_named_substring(const pcre16 *, PCRE_SPTR16, PCRE_EXP_DECL int pcre16_get_named_substring(const pcre16 *, PCRE_SPTR16,
int *, int, PCRE_SPTR16, PCRE_SPTR16 *); int *, int, PCRE_SPTR16, PCRE_SPTR16 *);
PCRE_EXP_DECL int pcre32_get_named_substring(const pcre32 *, PCRE_SPTR32,
int *, int, PCRE_SPTR32, PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_get_stringnumber(const pcre *, const char *); PCRE_EXP_DECL int pcre_get_stringnumber(const pcre *, const char *);
PCRE_EXP_DECL int pcre16_get_stringnumber(const pcre16 *, PCRE_SPTR16); PCRE_EXP_DECL int pcre16_get_stringnumber(const pcre16 *, PCRE_SPTR16);
PCRE_EXP_DECL int pcre32_get_stringnumber(const pcre32 *, PCRE_SPTR32);
PCRE_EXP_DECL int pcre_get_stringtable_entries(const pcre *, const char *, PCRE_EXP_DECL int pcre_get_stringtable_entries(const pcre *, const char *,
char **, char **); char **, char **);
PCRE_EXP_DECL int pcre16_get_stringtable_entries(const pcre16 *, PCRE_SPTR16, PCRE_EXP_DECL int pcre16_get_stringtable_entries(const pcre16 *, PCRE_SPTR16,
PCRE_UCHAR16 **, PCRE_UCHAR16 **); PCRE_UCHAR16 **, PCRE_UCHAR16 **);
PCRE_EXP_DECL int pcre32_get_stringtable_entries(const pcre32 *, PCRE_SPTR32,
PCRE_UCHAR32 **, PCRE_UCHAR32 **);
PCRE_EXP_DECL int pcre_get_substring(const char *, int *, int, int, PCRE_EXP_DECL int pcre_get_substring(const char *, int *, int, int,
const char **); const char **);
PCRE_EXP_DECL int pcre16_get_substring(PCRE_SPTR16, int *, int, int, PCRE_EXP_DECL int pcre16_get_substring(PCRE_SPTR16, int *, int, int,
PCRE_SPTR16 *); PCRE_SPTR16 *);
PCRE_EXP_DECL int pcre32_get_substring(PCRE_SPTR32, int *, int, int,
PCRE_SPTR32 *);
PCRE_EXP_DECL int pcre_get_substring_list(const char *, int *, int, PCRE_EXP_DECL int pcre_get_substring_list(const char *, int *, int,
const char ***); const char ***);
PCRE_EXP_DECL int pcre16_get_substring_list(PCRE_SPTR16, int *, int, PCRE_EXP_DECL int pcre16_get_substring_list(PCRE_SPTR16, int *, int,
PCRE_SPTR16 **); PCRE_SPTR16 **);
PCRE_EXP_DECL int pcre32_get_substring_list(PCRE_SPTR32, int *, int,
PCRE_SPTR32 **);
PCRE_EXP_DECL const unsigned char *pcre_maketables(void); PCRE_EXP_DECL const unsigned char *pcre_maketables(void);
PCRE_EXP_DECL const unsigned char *pcre16_maketables(void); PCRE_EXP_DECL const unsigned char *pcre16_maketables(void);
PCRE_EXP_DECL const unsigned char *pcre32_maketables(void);
PCRE_EXP_DECL int pcre_refcount(pcre *, int); PCRE_EXP_DECL int pcre_refcount(pcre *, int);
PCRE_EXP_DECL int pcre16_refcount(pcre16 *, int); PCRE_EXP_DECL int pcre16_refcount(pcre16 *, int);
PCRE_EXP_DECL int pcre32_refcount(pcre32 *, int);
PCRE_EXP_DECL pcre_extra *pcre_study(const pcre *, int, const char **); PCRE_EXP_DECL pcre_extra *pcre_study(const pcre *, int, const char **);
PCRE_EXP_DECL pcre16_extra *pcre16_study(const pcre16 *, int, const char **); PCRE_EXP_DECL pcre16_extra *pcre16_study(const pcre16 *, int, const char **);
PCRE_EXP_DECL pcre32_extra *pcre32_study(const pcre32 *, int, const char **);
PCRE_EXP_DECL void pcre_free_study(pcre_extra *); PCRE_EXP_DECL void pcre_free_study(pcre_extra *);
PCRE_EXP_DECL void pcre16_free_study(pcre16_extra *); PCRE_EXP_DECL void pcre16_free_study(pcre16_extra *);
PCRE_EXP_DECL void pcre32_free_study(pcre32_extra *);
PCRE_EXP_DECL const char *pcre_version(void); PCRE_EXP_DECL const char *pcre_version(void);
PCRE_EXP_DECL const char *pcre16_version(void); PCRE_EXP_DECL const char *pcre16_version(void);
PCRE_EXP_DECL const char *pcre32_version(void);
/* Utility functions for byte order swaps. */ /* Utility functions for byte order swaps. */
PCRE_EXP_DECL int pcre_pattern_to_host_byte_order(pcre *, pcre_extra *, PCRE_EXP_DECL int pcre_pattern_to_host_byte_order(pcre *, pcre_extra *,
const unsigned char *); const unsigned char *);
PCRE_EXP_DECL int pcre16_pattern_to_host_byte_order(pcre16 *, pcre16_extra *, PCRE_EXP_DECL int pcre16_pattern_to_host_byte_order(pcre16 *, pcre16_extra *,
const unsigned char *); const unsigned char *);
PCRE_EXP_DECL int pcre32_pattern_to_host_byte_order(pcre32 *, pcre32_extra *,
const unsigned char *);
PCRE_EXP_DECL int pcre16_utf16_to_host_byte_order(PCRE_UCHAR16 *, PCRE_EXP_DECL int pcre16_utf16_to_host_byte_order(PCRE_UCHAR16 *,
PCRE_SPTR16, int, int *, int); PCRE_SPTR16, int, int *, int);
PCRE_EXP_DECL int pcre32_utf32_to_host_byte_order(PCRE_UCHAR32 *,
PCRE_SPTR32, int, int *, int);
/* JIT compiler related functions. */ /* JIT compiler related functions. */
PCRE_EXP_DECL pcre_jit_stack *pcre_jit_stack_alloc(int, int); PCRE_EXP_DECL pcre_jit_stack *pcre_jit_stack_alloc(int, int);
PCRE_EXP_DECL pcre16_jit_stack *pcre16_jit_stack_alloc(int, int); PCRE_EXP_DECL pcre16_jit_stack *pcre16_jit_stack_alloc(int, int);
PCRE_EXP_DECL pcre32_jit_stack *pcre32_jit_stack_alloc(int, int);
PCRE_EXP_DECL void pcre_jit_stack_free(pcre_jit_stack *); PCRE_EXP_DECL void pcre_jit_stack_free(pcre_jit_stack *);
PCRE_EXP_DECL void pcre16_jit_stack_free(pcre16_jit_stack *); PCRE_EXP_DECL void pcre16_jit_stack_free(pcre16_jit_stack *);
PCRE_EXP_DECL void pcre32_jit_stack_free(pcre32_jit_stack *);
PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *, PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *,
pcre_jit_callback, void *); pcre_jit_callback, void *);
PCRE_EXP_DECL void pcre16_assign_jit_stack(pcre16_extra *, PCRE_EXP_DECL void pcre16_assign_jit_stack(pcre16_extra *,
pcre16_jit_callback, void *); pcre16_jit_callback, void *);
PCRE_EXP_DECL void pcre32_assign_jit_stack(pcre32_extra *,
pcre32_jit_callback, void *);
#ifdef __cplusplus #ifdef __cplusplus
} /* extern "C" */ } /* extern "C" */


@ -20,11 +20,13 @@ and dead code stripping is activated. This leads to link errors. Pulling in the
header ensures that the array gets flagged as "someone outside this compilation header ensures that the array gets flagged as "someone outside this compilation
unit might reference this" and so it will always be supplied to the linker. */ unit might reference this" and so it will always be supplied to the linker. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
const unsigned char _pcre_default_tables[] = { const pcre_uint8 PRIV(default_tables)[] = {
/* This table is a lower casing table. */ /* This table is a lower casing table. */

File diff suppressed because it is too large


@ -41,7 +41,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* This module contains the external function pcre_config(). */ /* This module contains the external function pcre_config(). */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
/* Keep the original link size. */ /* Keep the original link size. */
static int real_link_size = LINK_SIZE; static int real_link_size = LINK_SIZE;
@ -63,18 +65,21 @@ Arguments:
Returns: 0 if data returned, negative on error Returns: 0 if data returned, negative on error
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_config(int what, void *where) pcre_config(int what, void *where)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_config(int what, void *where) pcre16_config(int what, void *where)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_config(int what, void *where)
#endif #endif
{ {
switch (what) switch (what)
{ {
case PCRE_CONFIG_UTF8: case PCRE_CONFIG_UTF8:
#if defined COMPILE_PCRE16 #if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
*((int *)where) = 0; *((int *)where) = 0;
return PCRE_ERROR_BADOPTION; return PCRE_ERROR_BADOPTION;
#else #else
@ -87,7 +92,20 @@ switch (what)
#endif #endif
case PCRE_CONFIG_UTF16: case PCRE_CONFIG_UTF16:
#if defined COMPILE_PCRE8 #if defined COMPILE_PCRE8 || defined COMPILE_PCRE32
*((int *)where) = 0;
return PCRE_ERROR_BADOPTION;
#else
#if defined SUPPORT_UTF
*((int *)where) = 1;
#else
*((int *)where) = 0;
#endif
break;
#endif
case PCRE_CONFIG_UTF32:
#if defined COMPILE_PCRE8 || defined COMPILE_PCRE16
*((int *)where) = 0; *((int *)where) = 0;
return PCRE_ERROR_BADOPTION; return PCRE_ERROR_BADOPTION;
#else #else

File diff suppressed because it is too large


@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
information about a compiled pattern. */ information about a compiled pattern. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -63,14 +65,18 @@ Arguments:
Returns: 0 if data returned, negative on error Returns: 0 if data returned, negative on error
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_fullinfo(const pcre *argument_re, const pcre_extra *extra_data, pcre_fullinfo(const pcre *argument_re, const pcre_extra *extra_data,
int what, void *where) int what, void *where)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_fullinfo(const pcre16 *argument_re, const pcre16_extra *extra_data, pcre16_fullinfo(const pcre16 *argument_re, const pcre16_extra *extra_data,
int what, void *where) int what, void *where)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_fullinfo(const pcre32 *argument_re, const pcre32_extra *extra_data,
int what, void *where)
#endif #endif
{ {
const REAL_PCRE *re = (const REAL_PCRE *)argument_re; const REAL_PCRE *re = (const REAL_PCRE *)argument_re;
@ -130,10 +136,21 @@ switch (what)
case PCRE_INFO_FIRSTBYTE: case PCRE_INFO_FIRSTBYTE:
*((int *)where) = *((int *)where) =
((re->flags & PCRE_FIRSTSET) != 0)? re->first_char : ((re->flags & PCRE_FIRSTSET) != 0)? (int)re->first_char :
((re->flags & PCRE_STARTLINE) != 0)? -1 : -2; ((re->flags & PCRE_STARTLINE) != 0)? -1 : -2;
break; break;
case PCRE_INFO_FIRSTCHARACTER:
*((pcre_uint32 *)where) =
(re->flags & PCRE_FIRSTSET) != 0 ? re->first_char : 0;
break;
case PCRE_INFO_FIRSTCHARACTERFLAGS:
*((int *)where) =
((re->flags & PCRE_FIRSTSET) != 0) ? 1 :
((re->flags & PCRE_STARTLINE) != 0) ? 2 : 0;
break;
/* Make sure we pass back the pointer to the bit vector in the external /* Make sure we pass back the pointer to the bit vector in the external
block, not the internal copy (with flipped integer fields). */ block, not the internal copy (with flipped integer fields). */
@ -157,9 +174,19 @@ switch (what)
case PCRE_INFO_LASTLITERAL: case PCRE_INFO_LASTLITERAL:
*((int *)where) = *((int *)where) =
((re->flags & PCRE_REQCHSET) != 0)? re->req_char : -1; ((re->flags & PCRE_REQCHSET) != 0)? (int)re->req_char : -1;
break; break;
case PCRE_INFO_REQUIREDCHAR:
*((pcre_uint32 *)where) =
((re->flags & PCRE_REQCHSET) != 0) ? re->req_char : 0;
break;
case PCRE_INFO_REQUIREDCHARFLAGS:
*((int *)where) =
((re->flags & PCRE_REQCHSET) != 0);
break;
case PCRE_INFO_NAMEENTRYSIZE: case PCRE_INFO_NAMEENTRYSIZE:
*((int *)where) = re->name_entry_size; *((int *)where) = re->name_entry_size;
break; break;
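
The two new info requests above split what PCRE_INFO_FIRSTBYTE packs into a single int: one request reports the state, the other the full 32-bit character value. The following is a minimal sketch of that split, not the library code. The `demo_*` names, the `demo_re` struct and the flag bit values are hypothetical stand-ins; the real definitions live in pcre_internal.h and are not visible in this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define DEMO_FIRSTSET  0x0010u
#define DEMO_STARTLINE 0x0100u

/* Hypothetical, reduced stand-in for the compiled pattern block. */
struct demo_re {
  uint32_t flags;
  uint32_t first_char;
};

/* Legacy shape: value and state share one int
   (-1 = anchored at start of line, -2 = nothing known). */
static int demo_firstbyte(const struct demo_re *re)
{
  assert(re != NULL);
  return (re->flags & DEMO_FIRSTSET) != 0 ? (int)re->first_char :
         (re->flags & DEMO_STARTLINE) != 0 ? -1 : -2;
}

/* New shape, state only: 1 = first character set, 2 = start of line, 0 = none. */
static int demo_firstcharacterflags(const struct demo_re *re)
{
  assert(re != NULL);
  return (re->flags & DEMO_FIRSTSET) != 0 ? 1 :
         (re->flags & DEMO_STARTLINE) != 0 ? 2 : 0;
}

/* New shape, value only: the full 32-bit character, or 0 when none is set. */
static uint32_t demo_firstcharacter(const struct demo_re *re)
{
  assert(re != NULL);
  return (re->flags & DEMO_FIRSTSET) != 0 ? re->first_char : 0;
}
```

A caller would presumably query the flags first and read the character only when the flags value is 1, since a character value of 0 alone cannot distinguish "no first character" from a literal NUL.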


@ -43,7 +43,9 @@ from the subject string after a regex match has succeeded. The original idea
for these functions came from Scott Wimer. */ for these functions came from Scott Wimer. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -63,12 +65,15 @@ Returns: the number of the named parentheses, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found (PCRE_ERROR_NOSUBSTRING) if not found
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringnumber(const pcre *code, const char *stringname) pcre_get_stringnumber(const pcre *code, const char *stringname)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringnumber(const pcre16 *code, PCRE_SPTR16 stringname) pcre16_get_stringnumber(const pcre16 *code, PCRE_SPTR16 stringname)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringnumber(const pcre32 *code, PCRE_SPTR32 stringname)
#endif #endif
{ {
int rc; int rc;
@ -96,6 +101,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0) if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc; return rc;
#endif #endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
bot = 0; bot = 0;
while (top > bot) while (top > bot)
@ -130,14 +145,18 @@ Returns: the length of each entry, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found (PCRE_ERROR_NOSUBSTRING) if not found
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringtable_entries(const pcre *code, const char *stringname, pcre_get_stringtable_entries(const pcre *code, const char *stringname,
char **firstptr, char **lastptr) char **firstptr, char **lastptr)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringtable_entries(const pcre16 *code, PCRE_SPTR16 stringname, pcre16_get_stringtable_entries(const pcre16 *code, PCRE_SPTR16 stringname,
PCRE_UCHAR16 **firstptr, PCRE_UCHAR16 **lastptr) PCRE_UCHAR16 **firstptr, PCRE_UCHAR16 **lastptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringtable_entries(const pcre32 *code, PCRE_SPTR32 stringname,
PCRE_UCHAR32 **firstptr, PCRE_UCHAR32 **lastptr)
#endif #endif
{ {
int rc; int rc;
@ -165,6 +184,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0) if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc; return rc;
#endif #endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
lastentry = nametable + entrysize * (top - 1); lastentry = nametable + entrysize * (top - 1);
bot = 0; bot = 0;
@ -190,12 +219,15 @@ while (top > bot)
(pcre_uchar *)(last + entrysize + IMM2_SIZE)) != 0) break; (pcre_uchar *)(last + entrysize + IMM2_SIZE)) != 0) break;
last += entrysize; last += entrysize;
} }
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
*firstptr = (char *)first; *firstptr = (char *)first;
*lastptr = (char *)last; *lastptr = (char *)last;
#else #elif defined COMPILE_PCRE16
*firstptr = (PCRE_UCHAR16 *)first; *firstptr = (PCRE_UCHAR16 *)first;
*lastptr = (PCRE_UCHAR16 *)last; *lastptr = (PCRE_UCHAR16 *)last;
#elif defined COMPILE_PCRE32
*firstptr = (PCRE_UCHAR32 *)first;
*lastptr = (PCRE_UCHAR32 *)last;
#endif #endif
return entrysize; return entrysize;
} }
@ -224,31 +256,40 @@ Returns: the number of the first that is set,
or a negative number on error or a negative number on error
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
static int static int
get_first_set(const pcre *code, const char *stringname, int *ovector) get_first_set(const pcre *code, const char *stringname, int *ovector)
#else #elif defined COMPILE_PCRE16
static int static int
get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector) get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector)
#elif defined COMPILE_PCRE32
static int
get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector)
#endif #endif
{ {
const REAL_PCRE *re = (const REAL_PCRE *)code; const REAL_PCRE *re = (const REAL_PCRE *)code;
int entrysize; int entrysize;
pcre_uchar *entry; pcre_uchar *entry;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
char *first, *last; char *first, *last;
#else #elif defined COMPILE_PCRE16
PCRE_UCHAR16 *first, *last; PCRE_UCHAR16 *first, *last;
#elif defined COMPILE_PCRE32
PCRE_UCHAR32 *first, *last;
#endif #endif
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0) if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre_get_stringnumber(code, stringname); return pcre_get_stringnumber(code, stringname);
entrysize = pcre_get_stringtable_entries(code, stringname, &first, &last); entrysize = pcre_get_stringtable_entries(code, stringname, &first, &last);
#else #elif defined COMPILE_PCRE16
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0) if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre16_get_stringnumber(code, stringname); return pcre16_get_stringnumber(code, stringname);
entrysize = pcre16_get_stringtable_entries(code, stringname, &first, &last); entrysize = pcre16_get_stringtable_entries(code, stringname, &first, &last);
#elif defined COMPILE_PCRE32
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre32_get_stringnumber(code, stringname);
entrysize = pcre32_get_stringtable_entries(code, stringname, &first, &last);
#endif #endif
if (entrysize <= 0) return entrysize; if (entrysize <= 0) return entrysize;
for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize) for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
@ -289,14 +330,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_substring(const char *subject, int *ovector, int stringcount, pcre_copy_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, char *buffer, int size) int stringnumber, char *buffer, int size)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_substring(PCRE_SPTR16 subject, int *ovector, int stringcount, pcre16_copy_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR16 *buffer, int size) int stringnumber, PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR32 *buffer, int size)
#endif #endif
{ {
int yield; int yield;
@ -340,24 +385,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_named_substring(const pcre *code, const char *subject, pcre_copy_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname, int *ovector, int stringcount, const char *stringname,
char *buffer, int size) char *buffer, int size)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_named_substring(const pcre16 *code, PCRE_SPTR16 subject, pcre16_copy_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname, int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_UCHAR16 *buffer, int size) PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_UCHAR32 *buffer, int size)
#endif #endif
{ {
int n = get_first_set(code, stringname, ovector); int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n; if (n <= 0) return n;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size); return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
#else #elif defined COMPILE_PCRE16
return pcre16_copy_substring(subject, ovector, stringcount, n, buffer, size); return pcre16_copy_substring(subject, ovector, stringcount, n, buffer, size);
#elif defined COMPILE_PCRE32
return pcre32_copy_substring(subject, ovector, stringcount, n, buffer, size);
#endif #endif
} }
@ -384,14 +436,18 @@ Returns: if successful: 0
PCRE_ERROR_NOMEMORY (-6) failed to get store PCRE_ERROR_NOMEMORY (-6) failed to get store
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring_list(const char *subject, int *ovector, int stringcount, pcre_get_substring_list(const char *subject, int *ovector, int stringcount,
const char ***listptr) const char ***listptr)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring_list(PCRE_SPTR16 subject, int *ovector, int stringcount, pcre16_get_substring_list(PCRE_SPTR16 subject, int *ovector, int stringcount,
PCRE_SPTR16 **listptr) PCRE_SPTR16 **listptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring_list(PCRE_SPTR32 subject, int *ovector, int stringcount,
PCRE_SPTR32 **listptr)
#endif #endif
{ {
int i; int i;
@ -406,10 +462,12 @@ for (i = 0; i < double_count; i += 2)
stringlist = (pcre_uchar **)(PUBL(malloc))(size); stringlist = (pcre_uchar **)(PUBL(malloc))(size);
if (stringlist == NULL) return PCRE_ERROR_NOMEMORY; if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
*listptr = (const char **)stringlist; *listptr = (const char **)stringlist;
#else #elif defined COMPILE_PCRE16
*listptr = (PCRE_SPTR16 *)stringlist; *listptr = (PCRE_SPTR16 *)stringlist;
#elif defined COMPILE_PCRE32
*listptr = (PCRE_SPTR32 *)stringlist;
#endif #endif
p = (pcre_uchar *)(stringlist + stringcount + 1); p = (pcre_uchar *)(stringlist + stringcount + 1);
@ -440,12 +498,15 @@ Argument: the result of a previous pcre_get_substring_list()
Returns: nothing Returns: nothing
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring_list(const char **pointer) pcre_free_substring_list(const char **pointer)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring_list(PCRE_SPTR16 *pointer) pcre16_free_substring_list(PCRE_SPTR16 *pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring_list(PCRE_SPTR32 *pointer)
#endif #endif
{ {
(PUBL(free))((void *)pointer); (PUBL(free))((void *)pointer);
@ -478,14 +539,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) substring not present PCRE_ERROR_NOSUBSTRING (-7) substring not present
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring(const char *subject, int *ovector, int stringcount, pcre_get_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, const char **stringptr) int stringnumber, const char **stringptr)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring(PCRE_SPTR16 subject, int *ovector, int stringcount, pcre16_get_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR16 *stringptr) int stringnumber, PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR32 *stringptr)
#endif #endif
{ {
int yield; int yield;
@ -498,10 +563,12 @@ substring = (pcre_uchar *)(PUBL(malloc))(IN_UCHARS(yield + 1));
if (substring == NULL) return PCRE_ERROR_NOMEMORY; if (substring == NULL) return PCRE_ERROR_NOMEMORY;
memcpy(substring, subject + ovector[stringnumber], IN_UCHARS(yield)); memcpy(substring, subject + ovector[stringnumber], IN_UCHARS(yield));
substring[yield] = 0; substring[yield] = 0;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
*stringptr = (const char *)substring; *stringptr = (const char *)substring;
#else #elif defined COMPILE_PCRE16
*stringptr = (PCRE_SPTR16)substring; *stringptr = (PCRE_SPTR16)substring;
#elif defined COMPILE_PCRE32
*stringptr = (PCRE_SPTR32)substring;
#endif #endif
return yield; return yield;
} }
@ -535,24 +602,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_named_substring(const pcre *code, const char *subject, pcre_get_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname, int *ovector, int stringcount, const char *stringname,
const char **stringptr) const char **stringptr)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_named_substring(const pcre16 *code, PCRE_SPTR16 subject, pcre16_get_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname, int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_SPTR16 *stringptr) PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_SPTR32 *stringptr)
#endif #endif
{ {
int n = get_first_set(code, stringname, ovector); int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n; if (n <= 0) return n;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
return pcre_get_substring(subject, ovector, stringcount, n, stringptr); return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
#else #elif defined COMPILE_PCRE16
return pcre16_get_substring(subject, ovector, stringcount, n, stringptr); return pcre16_get_substring(subject, ovector, stringcount, n, stringptr);
#elif defined COMPILE_PCRE32
return pcre32_get_substring(subject, ovector, stringcount, n, stringptr);
#endif #endif
} }
@ -571,12 +645,15 @@ Argument: the result of a previous pcre_get_substring()
Returns: nothing Returns: nothing
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring(const char *pointer) pcre_free_substring(const char *pointer)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring(PCRE_SPTR16 pointer) pcre16_free_substring(PCRE_SPTR16 pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring(PCRE_SPTR32 pointer)
#endif #endif
{ {
(PUBL(free))((void *)pointer); (PUBL(free))((void *)pointer);
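
The `*_get_stringnumber` variants above all fetch the name count, entry size and name table through `*_fullinfo` and then binary-search the table. As a rough illustration of that search, here is a self-contained sketch for the 8-bit case only. It assumes the documented 8-bit entry layout (a 2-byte group number, most significant byte first, followed by a zero-terminated name, with entries sorted by name); `demo_get_stringnumber` is a hypothetical name, and `strcmp` stands in for the library's internal comparison helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Binary search over a name table with `count` entries of `entrysize` bytes.
   Returns the group number, or -7 (PCRE_ERROR_NOSUBSTRING) if not found. */
static int demo_get_stringnumber(const unsigned char *nametable, int count,
                                 int entrysize, const char *name)
{
  int bot = 0;
  int top = count;

  assert(nametable != NULL && name != NULL);
  while (top > bot)
    {
    int mid = (top + bot) / 2;
    const unsigned char *entry = nametable + entrysize * mid;
    /* The name starts after the 2-byte group number. */
    int c = strcmp(name, (const char *)(entry + 2));
    if (c == 0) return (entry[0] << 8) | entry[1];
    if (c > 0) bot = mid + 1; else top = mid;
    }
  return -7;
}
```

The 16-bit and 32-bit builds follow the same shape with wider code units, which is why the diff only needs to add a third `#elif` branch per function rather than new logic.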


@ -52,7 +52,9 @@ a local function is used.
Also, when compiling for Virtual Pascal, things are done differently, and Also, when compiling for Virtual Pascal, things are done differently, and
global variables are not used. */ global variables are not used. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"

File diff suppressed because it is too large


@ -45,7 +45,9 @@ compilation of dftables.c, in which case the macro DFTABLES is defined. */
#ifndef DFTABLES #ifndef DFTABLES
# ifdef HAVE_CONFIG_H
# include "config.h" # include "config.h"
# endif
# include "pcre_internal.h" # include "pcre_internal.h"
#endif #endif
@ -64,12 +66,15 @@ Arguments: none
Returns: pointer to the contiguous block of data Returns: pointer to the contiguous block of data
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
const unsigned char * const unsigned char *
pcre_maketables(void) pcre_maketables(void)
#else #elif defined COMPILE_PCRE16
const unsigned char * const unsigned char *
pcre16_maketables(void) pcre16_maketables(void)
#elif defined COMPILE_PCRE32
const unsigned char *
pcre32_maketables(void)
#endif #endif
{ {
unsigned char *yield, *p; unsigned char *yield, *p;
@ -125,7 +130,7 @@ within regexes. */
for (i = 0; i < 256; i++) for (i = 0; i < 256; i++)
{ {
int x = 0; int x = 0;
if (i != 0x0b && isspace(i)) x += ctype_space; if (i != CHAR_VT && isspace(i)) x += ctype_space;
if (isalpha(i)) x += ctype_letter; if (isalpha(i)) x += ctype_letter;
if (isdigit(i)) x += ctype_digit; if (isdigit(i)) x += ctype_digit;
if (isxdigit(i)) x += ctype_xdigit; if (isxdigit(i)) x += ctype_xdigit;
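
The loop in this hunk builds a per-character type table from the C library's `<ctype.h>` classification, excluding VT from the "space" class. A minimal standalone sketch of the same idea follows. The `DEMO_CTYPE_*` bit values and the function name are assumptions made for illustration, and `'\v'` is used in place of `CHAR_VT`, which is only equivalent on ASCII-based systems.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical bit assignments for this sketch. */
#define DEMO_CTYPE_SPACE  0x01
#define DEMO_CTYPE_LETTER 0x02
#define DEMO_CTYPE_DIGIT  0x04
#define DEMO_CTYPE_XDIGIT 0x08

/* Fill a 256-entry table with type bits derived from the current locale. */
static void demo_build_ctypes(unsigned char *table)
{
  int i;

  assert(table != NULL);
  for (i = 0; i < 256; i++)
    {
    int x = 0;
    if (i != '\v' && isspace(i)) x += DEMO_CTYPE_SPACE;  /* VT is excluded */
    if (isalpha(i)) x += DEMO_CTYPE_LETTER;
    if (isdigit(i)) x += DEMO_CTYPE_DIGIT;
    if (isxdigit(i)) x += DEMO_CTYPE_XDIGIT;
    table[i] = (unsigned char)x;
    }
}
```

Because the classification comes from the active locale, a table built this way reflects whatever `setlocale` selected at the time of the call.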


@ -47,7 +47,9 @@ and NLTYPE_ANY. The full list of Unicode newline characters is taken from
http://unicode.org/unicode/reports/tr18/. */ http://unicode.org/unicode/reports/tr18/. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -74,7 +76,7 @@ BOOL
PRIV(is_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR endptr, int *lenptr, PRIV(is_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR endptr, int *lenptr,
BOOL utf) BOOL utf)
{ {
int c; pcre_uint32 c;
(void)utf; (void)utf;
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
if (utf) if (utf)
@ -85,11 +87,13 @@ else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
c = *ptr; c = *ptr;
/* Note that this function is called only for ANY or ANYCRLF. */
if (type == NLTYPE_ANYCRLF) switch(c) if (type == NLTYPE_ANYCRLF) switch(c)
{ {
case 0x000a: *lenptr = 1; return TRUE; /* LF */ case CHAR_LF: *lenptr = 1; return TRUE;
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1; case CHAR_CR: *lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE; /* CR */ return TRUE;
default: return FALSE; default: return FALSE;
} }
@ -97,20 +101,29 @@ if (type == NLTYPE_ANYCRLF) switch(c)
else switch(c) else switch(c)
{ {
case 0x000a: /* LF */ #ifdef EBCDIC
case 0x000b: /* VT */ case CHAR_NEL:
case 0x000c: *lenptr = 1; return TRUE; /* FF */ #endif
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1; case CHAR_LF:
return TRUE; /* CR */ case CHAR_VT:
case CHAR_FF: *lenptr = 1; return TRUE;
case CHAR_CR:
*lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE;
#ifndef EBCDIC
#ifdef COMPILE_PCRE8 #ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */ case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */ case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */ case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else #else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case 0x0085: /* NEL */ case CHAR_NEL:
case 0x2028: /* LS */ case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */ case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */ #endif /* COMPILE_PCRE8 */
#endif /* Not EBCDIC */
default: return FALSE; default: return FALSE;
} }
} }
@ -138,7 +151,7 @@ BOOL
PRIV(was_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR startptr, int *lenptr, PRIV(was_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR startptr, int *lenptr,
BOOL utf) BOOL utf)
{ {
int c; pcre_uint32 c;
(void)utf; (void)utf;
ptr--; ptr--;
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
@ -151,30 +164,45 @@ else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
c = *ptr; c = *ptr;
/* Note that this function is called only for ANY or ANYCRLF. */
if (type == NLTYPE_ANYCRLF) switch(c) if (type == NLTYPE_ANYCRLF) switch(c)
{ {
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1; case CHAR_LF:
return TRUE; /* LF */ *lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
case 0x000d: *lenptr = 1; return TRUE; /* CR */ return TRUE;
case CHAR_CR: *lenptr = 1; return TRUE;
default: return FALSE; default: return FALSE;
} }
/* NLTYPE_ANY */
else switch(c) else switch(c)
{ {
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1; case CHAR_LF:
return TRUE; /* LF */ *lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
case 0x000b: /* VT */ return TRUE;
case 0x000c: /* FF */
case 0x000d: *lenptr = 1; return TRUE; /* CR */ #ifdef EBCDIC
case CHAR_NEL:
#endif
case CHAR_VT:
case CHAR_FF:
case CHAR_CR: *lenptr = 1; return TRUE;
#ifndef EBCDIC
#ifdef COMPILE_PCRE8 #ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */ case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */ case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */ case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else #else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case 0x0085: /* NEL */ case CHAR_NEL:
case 0x2028: /* LS */ case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */ case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */ #endif /* COMPILE_PCRE8 */
#endif /* NotEBCDIC */
default: return FALSE; default: return FALSE;
} }
} }
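
The ANYCRLF branch of `PRIV(is_newline)` above treats CR, LF and the CRLF pair as newlines and reports the matched length through `lenptr`. The sketch below reproduces just that branch for plain 8-bit, non-UTF input on an ASCII system. The function name is hypothetical, and the character literals `'\n'` and `'\r'` stand in for `CHAR_LF` and `CHAR_CR`.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 and stores the newline length in *lenptr when ptr points at
   CR, LF or CRLF; returns 0 otherwise. endptr is one past the last byte. */
static int demo_is_newline_anycrlf(const unsigned char *ptr,
                                   const unsigned char *endptr, int *lenptr)
{
  assert(ptr != NULL && endptr != NULL && lenptr != NULL && ptr < endptr);
  switch (*ptr)
    {
    case '\n':
      *lenptr = 1;
      return 1;
    case '\r':
      /* Only look at ptr[1] when at least one more byte is available. */
      *lenptr = (ptr < endptr - 1 && ptr[1] == '\n') ? 2 : 1;
      return 1;
    default:
      return 0;
    }
}
```

The bounds check before reading `ptr[1]` is the part worth noting: a lone CR at the very end of the subject must still count as a one-character newline.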


@ -41,17 +41,20 @@ POSSIBILITY OF SUCH DAMAGE.
/* This file contains a private PCRE function that converts an ordinal /* This file contains a private PCRE function that converts an ordinal
character value into a UTF8 string. */ character value into a UTF8 string. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#define COMPILE_PCRE8
#include "pcre_internal.h" #include "pcre_internal.h"
/************************************************* /*************************************************
* Convert character value to UTF-8 * * Convert character value to UTF-8 *
*************************************************/ *************************************************/
/* This function takes an integer value in the range 0 - 0x10ffff /* This function takes an integer value in the range 0 - 0x10ffff
and encodes it as a UTF-8 character in 1 to 6 pcre_uchars. and encodes it as a UTF-8 character in 1 to 4 pcre_uchars.
Arguments: Arguments:
cvalue the character value cvalue the character value
@ -60,6 +63,7 @@ Arguments:
Returns: number of characters placed in the buffer Returns: number of characters placed in the buffer
*/ */
unsigned
int int
PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer) PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
{ {
@ -67,11 +71,6 @@ PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
register int i, j; register int i, j;
/* Checking invalid cvalue character, encoded as invalid UTF-16 character.
Should never happen in practice. */
if ((cvalue & 0xf800) == 0xd800 || cvalue >= 0x110000)
cvalue = 0xfffe;
for (i = 0; i < PRIV(utf8_table1_size); i++) for (i = 0; i < PRIV(utf8_table1_size); i++)
if ((int)cvalue <= PRIV(utf8_table1)[i]) break; if ((int)cvalue <= PRIV(utf8_table1)[i]) break;
buffer += i; buffer += i;
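
The corrected comment in this file states that a value in the range 0 to 0x10ffff encodes as 1 to 4 UTF-8 code units. The sketch below illustrates why four is the upper bound. It uses explicit range checks rather than the table-driven loop shown in the diff, so it is a different technique with the same result, and `demo_ord2utf8` is a hypothetical name. Like the new code in the diff, it does not reject surrogate values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode cvalue (0..0x10ffff) as UTF-8 into buffer; returns bytes written. */
static unsigned int demo_ord2utf8(uint32_t cvalue, unsigned char *buffer)
{
  assert(buffer != NULL && cvalue <= 0x10ffffu);
  if (cvalue < 0x80u)
    {
    buffer[0] = (unsigned char)cvalue;
    return 1;
    }
  if (cvalue < 0x800u)
    {
    buffer[0] = (unsigned char)(0xc0u | (cvalue >> 6));
    buffer[1] = (unsigned char)(0x80u | (cvalue & 0x3fu));
    return 2;
    }
  if (cvalue < 0x10000u)
    {
    buffer[0] = (unsigned char)(0xe0u | (cvalue >> 12));
    buffer[1] = (unsigned char)(0x80u | ((cvalue >> 6) & 0x3fu));
    buffer[2] = (unsigned char)(0x80u | (cvalue & 0x3fu));
    return 3;
    }
  buffer[0] = (unsigned char)(0xf0u | (cvalue >> 18));
  buffer[1] = (unsigned char)(0x80u | ((cvalue >> 12) & 0x3fu));
  buffer[2] = (unsigned char)(0x80u | ((cvalue >> 6) & 0x3fu));
  buffer[3] = (unsigned char)(0x80u | (cvalue & 0x3fu));
  return 4;
}
```

A 4-byte sequence carries 21 payload bits, which covers everything up to 0x10ffff, so the 5- and 6-byte forms mentioned by the old comment are never needed for this range.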


@ -44,7 +44,9 @@ pattern data block. This might be helpful in applications where the block is
shared by different users. */ shared by different users. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -66,12 +68,15 @@ Returns: the (possibly updated) count value (a non-negative number), or
a negative error number a negative error number
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_refcount(pcre *argument_re, int adjust) pcre_refcount(pcre *argument_re, int adjust)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_refcount(pcre16 *argument_re, int adjust) pcre16_refcount(pcre16 *argument_re, int adjust)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_refcount(pcre32 *argument_re, int adjust)
#endif #endif
{ {
REAL_PCRE *re = (REAL_PCRE *)argument_re; REAL_PCRE *re = (REAL_PCRE *)argument_re;

@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
supporting functions. */ supporting functions. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -96,7 +98,7 @@ for (;;)
{ {
int d, min; int d, min;
pcre_uchar *cs, *ce; pcre_uchar *cs, *ce;
register int op = *cc; register pcre_uchar op = *cc;
switch (op) switch (op)
{ {
@ -321,15 +323,19 @@ for (;;)
/* Check a class for variable quantification */ /* Check a class for variable quantification */
#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
case OP_XCLASS:
cc += GET(cc, 1) - PRIV(OP_lengths)[OP_CLASS];
/* Fall through */
#endif
case OP_CLASS: case OP_CLASS:
case OP_NCLASS: case OP_NCLASS:
#if defined SUPPORT_UTF || defined COMPILE_PCRE16 || defined COMPILE_PCRE32
case OP_XCLASS:
/* The original code caused an unsigned overflow in 64 bit systems,
so now we use a conditional statement. */
if (op == OP_XCLASS)
cc += GET(cc, 1);
else
cc += PRIV(OP_lengths)[OP_CLASS];
#else
cc += PRIV(OP_lengths)[OP_CLASS]; cc += PRIV(OP_lengths)[OP_CLASS];
#endif
switch (*cc) switch (*cc)
{ {
@ -536,7 +542,7 @@ Arguments:
p points to the character p points to the character
caseless the caseless flag caseless the caseless flag
cd the block with char table pointers cd the block with char table pointers
utf TRUE for UTF-8 / UTF-16 mode utf TRUE for UTF-8 / UTF-16 / UTF-32 mode
Returns: pointer after the character Returns: pointer after the character
*/ */
@ -545,7 +551,7 @@ static const pcre_uchar *
set_table_bit(pcre_uint8 *start_bits, const pcre_uchar *p, BOOL caseless, set_table_bit(pcre_uint8 *start_bits, const pcre_uchar *p, BOOL caseless,
compile_data *cd, BOOL utf) compile_data *cd, BOOL utf)
{ {
unsigned int c = *p; pcre_uint32 c = *p;
#ifdef COMPILE_PCRE8 #ifdef COMPILE_PCRE8
SET_BIT(c); SET_BIT(c);
@ -562,18 +568,20 @@ if (utf && c > 127)
(void)PRIV(ord2utf)(c, buff); (void)PRIV(ord2utf)(c, buff);
SET_BIT(buff[0]); SET_BIT(buff[0]);
} }
#endif #endif /* Not SUPPORT_UCP */
return p; return p;
} }
#endif #else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */
/* Not UTF-8 mode, or character is less than 127. */ /* Not UTF-8 mode, or character is less than 127. */
if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]); if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1; return p + 1;
#endif #endif /* COMPILE_PCRE8 */
#ifdef COMPILE_PCRE16 #if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
if (c > 0xff) if (c > 0xff)
{ {
c = 0xff; c = 0xff;
@ -593,10 +601,12 @@ if (utf && c > 127)
c = 0xff; c = 0xff;
SET_BIT(c); SET_BIT(c);
} }
#endif #endif /* SUPPORT_UCP */
return p; return p;
} }
#endif #else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */
if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]); if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1; return p + 1;
@ -626,10 +636,10 @@ Returns: nothing
*/ */
static void static void
set_type_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit, set_type_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd) compile_data *cd)
{ {
register int c; register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= cd->cbits[c+cbit_type]; for (c = 0; c < table_limit; c++) start_bits[c] |= cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit == 32) return; if (table_limit == 32) return;
@ -668,10 +678,10 @@ Returns: nothing
*/ */
static void static void
set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit, set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd) compile_data *cd)
{ {
register int c; register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= ~cd->cbits[c+cbit_type]; for (c = 0; c < table_limit; c++) start_bits[c] |= ~cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit != 32) for (c = 24; c < 32; c++) start_bits[c] = 0xff; if (table_limit != 32) for (c = 24; c < 32; c++) start_bits[c] = 0xff;
@ -695,7 +705,7 @@ function fails unless the result is SSB_DONE.
Arguments: Arguments:
code points to an expression code points to an expression
start_bits points to a 32-byte table, initialized to 0 start_bits points to a 32-byte table, initialized to 0
utf TRUE if in UTF-8 / UTF-16 mode utf TRUE if in UTF-8 / UTF-16 / UTF-32 mode
cd the block with char table pointers cd the block with char table pointers
Returns: SSB_FAIL => Failed to find any starting bytes Returns: SSB_FAIL => Failed to find any starting bytes
@ -708,7 +718,7 @@ static int
set_start_bits(const pcre_uchar *code, pcre_uint8 *start_bits, BOOL utf, set_start_bits(const pcre_uchar *code, pcre_uint8 *start_bits, BOOL utf,
compile_data *cd) compile_data *cd)
{ {
register int c; register pcre_uint32 c;
int yield = SSB_DONE; int yield = SSB_DONE;
#if defined SUPPORT_UTF && defined COMPILE_PCRE8 #if defined SUPPORT_UTF && defined COMPILE_PCRE8
int table_limit = utf? 16:32; int table_limit = utf? 16:32;
@ -984,8 +994,8 @@ do
identical. */ identical. */
case OP_HSPACE: case OP_HSPACE:
SET_BIT(0x09); SET_BIT(CHAR_HT);
SET_BIT(0x20); SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
if (utf) if (utf)
{ {
@ -994,46 +1004,46 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */ SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */ SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */ SET_BIT(0xE3); /* For U+3000 */
#endif #elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
#ifdef COMPILE_PCRE16
SET_BIT(0xA0); SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif /* COMPILE_PCRE[8|16|32] */
} }
else else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
{ {
#ifndef EBCDIC
SET_BIT(0xA0); SET_BIT(0xA0);
#ifdef COMPILE_PCRE16 #endif /* Not EBCDIC */
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif /* COMPILE_PCRE[16|32] */
} }
try_next = FALSE; try_next = FALSE;
break; break;
case OP_ANYNL: case OP_ANYNL:
case OP_VSPACE: case OP_VSPACE:
SET_BIT(0x0A); SET_BIT(CHAR_LF);
SET_BIT(0x0B); SET_BIT(CHAR_VT);
SET_BIT(0x0C); SET_BIT(CHAR_FF);
SET_BIT(0x0D); SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
if (utf) if (utf)
{ {
#ifdef COMPILE_PCRE8 #ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */ SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */ SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif #elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
#ifdef COMPILE_PCRE16 SET_BIT(CHAR_NEL);
SET_BIT(0x85);
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif /* COMPILE_PCRE[8|16|32] */
} }
else else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
{ {
SET_BIT(0x85); SET_BIT(CHAR_NEL);
#ifdef COMPILE_PCRE16 #if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif
} }
@ -1056,7 +1066,8 @@ do
break; break;
/* The cbit_space table has vertical tab as whitespace; we have to /* The cbit_space table has vertical tab as whitespace; we have to
ensure it is set as not whitespace. */ ensure it is set as not whitespace. Luckily, the code value is the same
(0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate bit. */
case OP_NOT_WHITESPACE: case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd); set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@ -1064,8 +1075,9 @@ do
try_next = FALSE; try_next = FALSE;
break; break;
/* The cbit_space table has vertical tab as whitespace; we have to /* The cbit_space table has vertical tab as whitespace; we have to not
not set it from the table. */ set it from the table. Luckily, the code value is the same (0x0b) in
ASCII and EBCDIC, so we can just adjust the appropriate bit. */
case OP_WHITESPACE: case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */ c = start_bits[1]; /* Save in case it was already set */
@ -1119,8 +1131,8 @@ do
return SSB_FAIL; return SSB_FAIL;
case OP_HSPACE: case OP_HSPACE:
SET_BIT(0x09); SET_BIT(CHAR_HT);
SET_BIT(0x20); SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
if (utf) if (utf)
{ {
@ -1129,38 +1141,38 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */ SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */ SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */ SET_BIT(0xE3); /* For U+3000 */
#endif #elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
#ifdef COMPILE_PCRE16
SET_BIT(0xA0); SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif /* COMPILE_PCRE[8|16|32] */
} }
else else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
#ifndef EBCDIC
SET_BIT(0xA0); SET_BIT(0xA0);
#endif /* Not EBCDIC */
break; break;
case OP_ANYNL: case OP_ANYNL:
case OP_VSPACE: case OP_VSPACE:
SET_BIT(0x0A); SET_BIT(CHAR_LF);
SET_BIT(0x0B); SET_BIT(CHAR_VT);
SET_BIT(0x0C); SET_BIT(CHAR_FF);
SET_BIT(0x0D); SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
if (utf) if (utf)
{ {
#ifdef COMPILE_PCRE8 #ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */ SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */ SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif #elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
#ifdef COMPILE_PCRE16 SET_BIT(CHAR_NEL);
SET_BIT(0x85);
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif /* COMPILE_PCRE16 */
} }
else else
#endif /* SUPPORT_UTF */ #endif /* SUPPORT_UTF */
SET_BIT(0x85); SET_BIT(CHAR_NEL);
break; break;
case OP_NOT_DIGIT: case OP_NOT_DIGIT:
@ -1172,7 +1184,9 @@ do
break; break;
/* The cbit_space table has vertical tab as whitespace; we have to /* The cbit_space table has vertical tab as whitespace; we have to
ensure it gets set as not whitespace. */ ensure it gets set as not whitespace. Luckily, the code value is the
same (0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate
bit. */
case OP_NOT_WHITESPACE: case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd); set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@ -1180,7 +1194,8 @@ do
break; break;
/* The cbit_space table has vertical tab as whitespace; we have to /* The cbit_space table has vertical tab as whitespace; we have to
avoid setting it. */ avoid setting it. Luckily, the code value is the same (0x0b) in ASCII
and EBCDIC, so we can just adjust the appropriate bit. */
case OP_WHITESPACE: case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */ c = start_bits[1]; /* Save in case it was already set */
@ -1214,7 +1229,7 @@ do
memset(start_bits+25, 0xff, 7); /* Bits for 0xc9 - 0xff */ memset(start_bits+25, 0xff, 7); /* Bits for 0xc9 - 0xff */
} }
#endif #endif
#ifdef COMPILE_PCRE16 #if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */ SET_BIT(0xFF); /* For characters > 255 */
#endif #endif
/* Fall through */ /* Fall through */
@ -1310,12 +1325,15 @@ Returns: pointer to a pcre[16]_extra block, with study_data filled in and
NULL on error or if no optimization possible NULL on error or if no optimization possible
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION
pcre_study(const pcre *external_re, int options, const char **errorptr) pcre_study(const pcre *external_re, int options, const char **errorptr)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN pcre16_extra * PCRE_CALL_CONVENTION PCRE_EXP_DEFN pcre16_extra * PCRE_CALL_CONVENTION
pcre16_study(const pcre16 *external_re, int options, const char **errorptr) pcre16_study(const pcre16 *external_re, int options, const char **errorptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN pcre32_extra * PCRE_CALL_CONVENTION
pcre32_study(const pcre32 *external_re, int options, const char **errorptr)
#endif #endif
{ {
int min; int min;
@ -1338,10 +1356,12 @@ if (re == NULL || re->magic_number != MAGIC_NUMBER)
if ((re->flags & PCRE_MODE) == 0) if ((re->flags & PCRE_MODE) == 0)
{ {
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
*errorptr = "argument is compiled in 16 bit mode"; *errorptr = "argument not compiled in 8 bit mode";
#else #elif defined COMPILE_PCRE16
*errorptr = "argument is compiled in 8 bit mode"; *errorptr = "argument not compiled in 16 bit mode";
#elif defined COMPILE_PCRE32
*errorptr = "argument not compiled in 32 bit mode";
#endif #endif
return NULL; return NULL;
} }
@ -1368,14 +1388,18 @@ if ((re->options & PCRE_ANCHORED) == 0 &&
tables = re->tables; tables = re->tables;
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
if (tables == NULL) if (tables == NULL)
(void)pcre_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES, (void)pcre_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables)); (void *)(&tables));
#else #elif defined COMPILE_PCRE16
if (tables == NULL) if (tables == NULL)
(void)pcre16_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES, (void)pcre16_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables)); (void *)(&tables));
#elif defined COMPILE_PCRE32
if (tables == NULL)
(void)pcre32_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
(void *)(&tables));
#endif #endif
compile_block.lcc = tables + lcc_offset; compile_block.lcc = tables + lcc_offset;
@ -1406,20 +1430,20 @@ switch(min = find_minlength(code, code, re->options, 0))
} }
/* If a set of starting bytes has been identified, or if the minimum length is /* If a set of starting bytes has been identified, or if the minimum length is
greater than zero, or if JIT optimization has been requested, get a greater than zero, or if JIT optimization has been requested, or if
pcre[16]_extra block and a pcre_study_data block. The study data is put in the PCRE_STUDY_EXTRA_NEEDED is set, get a pcre[16]_extra block and a
latter, which is pointed to by the former, which may also get additional data pcre_study_data block. The study data is put in the latter, which is pointed to
set later by the calling program. At the moment, the size of pcre_study_data by the former, which may also get additional data set later by the calling
is fixed. We nevertheless save it in a field for returning via the program. At the moment, the size of pcre_study_data is fixed. We nevertheless
pcre_fullinfo() function so that if it becomes variable in the future, save it in a field for returning via the pcre_fullinfo() function so that if it
we don't have to change that code. */ becomes variable in the future, we don't have to change that code. */
if (bits_set || min > 0 if (bits_set || min > 0 || (options & (
#ifdef SUPPORT_JIT #ifdef SUPPORT_JIT
|| (options & (PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE |
| PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE)) != 0 PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE |
#endif #endif
) PCRE_STUDY_EXTRA_NEEDED)) != 0)
{ {
extra = (PUBL(extra) *)(PUBL(malloc)) extra = (PUBL(extra) *)(PUBL(malloc))
(sizeof(PUBL(extra)) + sizeof(pcre_study_data)); (sizeof(PUBL(extra)) + sizeof(pcre_study_data));
@ -1473,7 +1497,8 @@ if (bits_set || min > 0
/* If JIT support was compiled and requested, attempt the JIT compilation. /* If JIT support was compiled and requested, attempt the JIT compilation.
If no starting bytes were found, and the minimum length is zero, and JIT If no starting bytes were found, and the minimum length is zero, and JIT
compilation fails, abandon the extra block and return NULL. */ compilation fails, abandon the extra block and return NULL, unless
PCRE_STUDY_EXTRA_NEEDED is set. */
#ifdef SUPPORT_JIT #ifdef SUPPORT_JIT
extra->executable_jit = NULL; extra->executable_jit = NULL;
@ -1484,13 +1509,15 @@ if (bits_set || min > 0
if ((options & PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE) != 0) if ((options & PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE) != 0)
PRIV(jit_compile)(re, extra, JIT_PARTIAL_HARD_COMPILE); PRIV(jit_compile)(re, extra, JIT_PARTIAL_HARD_COMPILE);
if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0) if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0 &&
(options & PCRE_STUDY_EXTRA_NEEDED) == 0)
{ {
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
pcre_free_study(extra); pcre_free_study(extra);
#endif #elif defined COMPILE_PCRE16
#ifdef COMPILE_PCRE16
pcre16_free_study(extra); pcre16_free_study(extra);
#elif defined COMPILE_PCRE32
pcre32_free_study(extra);
#endif #endif
extra = NULL; extra = NULL;
} }
@ -1511,12 +1538,15 @@ Argument: a pointer to the pcre[16]_extra block
Returns: nothing Returns: nothing
*/ */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_EXP_DEFN void
pcre_free_study(pcre_extra *extra) pcre_free_study(pcre_extra *extra)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_EXP_DEFN void
pcre16_free_study(pcre16_extra *extra) pcre16_free_study(pcre16_extra *extra)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void
pcre32_free_study(pcre32_extra *extra)
#endif #endif
{ {
if (extra == NULL) if (extra == NULL)
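
The hunks above describe start_bits as a 32-byte table and set entries in it by character value, and they save and restore start_bits[1] when handling vertical tab (0x0b). A minimal sketch of such a 256-bit map follows. It assumes one bit per character value with the lowest bit first in each byte; the helper names are hypothetical and are not taken from the PCRE source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for a 256-bit bitmap held in 32 bytes.
   Assumed layout: character c lives in byte c/8, bit c&7. */
static void bitmap_set(uint8_t *bits, unsigned int c)
{
  assert(c < 256);
  bits[c / 8] |= (uint8_t)(1u << (c & 7));
}

static int bitmap_test(const uint8_t *bits, unsigned int c)
{
  assert(c < 256);
  return (bits[c / 8] >> (c & 7)) & 1;
}

/* Marks the ASCII vertical whitespace characters LF, VT, FF and CR. */
static void mark_ascii_vspace(uint8_t *bits)
{
  bitmap_set(bits, 0x0a);  /* LF */
  bitmap_set(bits, 0x0b);  /* VT */
  bitmap_set(bits, 0x0c);  /* FF */
  bitmap_set(bits, 0x0d);  /* CR */
}
```

Under this layout all four characters fall in byte 1, which matches the hunks' use of start_bits[1] for the vertical tab adjustment.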

@ -45,7 +45,9 @@ uses macros to change their names from _pcre_xxx to xxxx, thereby avoiding name
clashes with the library. */ clashes with the library. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -56,6 +58,12 @@ the definition is next to the definition of the opcodes in pcre_internal.h. */
const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS }; const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
/* Tables of horizontal and vertical whitespace characters, suitable for
adding to classes. */
const pcre_uint32 PRIV(hspace_list)[] = { HSPACE_LIST };
const pcre_uint32 PRIV(vspace_list)[] = { VSPACE_LIST };
/************************************************* /*************************************************
@ -66,9 +74,9 @@ const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
character. */ character. */
#if (defined SUPPORT_UTF && defined COMPILE_PCRE8) \ #if (defined SUPPORT_UTF && defined COMPILE_PCRE8) \
|| (defined PCRE_INCLUDED && defined SUPPORT_PCRE16) || (defined PCRE_INCLUDED && (defined SUPPORT_PCRE16 || defined SUPPORT_PCRE32))
/* These tables are also required by pcretest in 16 bit mode. */ /* These tables are also required by pcretest in 16- or 32-bit mode. */
const int PRIV(utf8_table1)[] = const int PRIV(utf8_table1)[] =
{ 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff}; { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff};
@ -90,13 +98,13 @@ const pcre_uint8 PRIV(utf8_table4)[] = {
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 }; 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };
#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE16)*/ #endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE[16|32])*/
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF
/* Table to translate from particular type value to the general value. */ /* Table to translate from particular type value to the general value. */
const int PRIV(ucp_gentype)[] = { const pcre_uint32 PRIV(ucp_gentype)[] = {
ucp_C, ucp_C, ucp_C, ucp_C, ucp_C, /* Cc, Cf, Cn, Co, Cs */ ucp_C, ucp_C, ucp_C, ucp_C, ucp_C, /* Cc, Cf, Cn, Co, Cs */
ucp_L, ucp_L, ucp_L, ucp_L, ucp_L, /* Ll, Lu, Lm, Lo, Lt */ ucp_L, ucp_L, ucp_L, ucp_L, ucp_L, /* Ll, Lu, Lm, Lo, Lt */
ucp_M, ucp_M, ucp_M, /* Mc, Me, Mn */ ucp_M, ucp_M, ucp_M, /* Mc, Me, Mn */
@ -107,6 +115,66 @@ const int PRIV(ucp_gentype)[] = {
ucp_Z, ucp_Z, ucp_Z /* Zl, Zp, Zs */ ucp_Z, ucp_Z, ucp_Z /* Zl, Zp, Zs */
}; };
/* This table encodes the rules for finding the end of an extended grapheme
cluster. Every code point has a grapheme break property which is one of the
ucp_gbXX values defined in ucp.h. The 2-dimensional table is indexed by the
properties of two adjacent code points. The left property selects a word from
the table, and the right property selects a bit from that word like this:
ucp_gbtable[left-property] & (1 << right-property)
The value is non-zero if a grapheme break is NOT permitted between the relevant
two code points. The breaking rules are as follows:
1. Break at the start and end of text (pretty obviously).
2. Do not break between a CR and LF; otherwise, break before and after
controls.
3. Do not break Hangul syllable sequences, the rules for which are:
L may be followed by L, V, LV or LVT
LV or V may be followed by V or T
LVT or T may be followed by T
4. Do not break before extending characters.
The next two rules are only for extended grapheme clusters (but that's what we
are implementing).
5. Do not break before SpacingMarks.
6. Do not break after Prepend characters.
7. Otherwise, break everywhere.
*/
const pcre_uint32 PRIV(ucp_gbtable[]) = {
(1<<ucp_gbLF), /* 0 CR */
0, /* 1 LF */
0, /* 2 Control */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 3 Extend */
(1<<ucp_gbExtend)|(1<<ucp_gbPrepend)| /* 4 Prepend */
(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)|
(1<<ucp_gbV)|(1<<ucp_gbT)|(1<<ucp_gbLV)|
(1<<ucp_gbLVT)|(1<<ucp_gbOther),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 5 SpacingMark */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)| /* 6 L */
(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 7 V */
(1<<ucp_gbT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 8 T */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 9 LV */
(1<<ucp_gbT),
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 10 LVT */
(1<<ucp_gbRegionalIndicator), /* 11 RegionalIndicator */
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark) /* 12 Other */
};
#ifdef SUPPORT_JIT #ifdef SUPPORT_JIT
/* This table reverses PRIV(ucp_gentype). We can save the cost /* This table reverses PRIV(ucp_gentype). We can save the cost
of a memory load. */ of a memory load. */
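
The ucp_gbtable comment above describes a lookup of the form `ucp_gbtable[left-property] & (1 << right-property)`, where a non-zero result means no break is permitted. The sketch below shows the same lookup on a cut-down table. The enum values and the table contents are hypothetical and cover only a few of the rules; the real property numbers are defined in ucp.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical property numbering for this sketch only. */
enum { GB_CR, GB_LF, GB_CONTROL, GB_EXTEND, GB_OTHER, GB_COUNT };

/* The row is selected by the property of the left code point. A set bit means
   "no break before a right code point with that property". */
static const uint32_t gb_table[GB_COUNT] = {
  (1u << GB_LF),       /* CR: only CR LF stays together */
  0,                   /* LF: always break after */
  0,                   /* Control: always break after */
  (1u << GB_EXTEND),   /* Extend: do not break before another Extend */
  (1u << GB_EXTEND)    /* Other: do not break before Extend */
};

/* Returns 1 if a grapheme break is allowed between the two properties. */
static int break_allowed(int left, int right)
{
  assert(left >= 0 && left < GB_COUNT);
  assert(right >= 0 && right < GB_COUNT);
  return (gb_table[left] & (1u << right)) == 0;
}
```

Packing each row into one 32-bit word keeps the check to a single load, shift and mask, which suits a test that runs between every pair of adjacent code points.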

File diff suppressed because it is too large.

@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
strings. */ strings. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -90,6 +92,7 @@ PCRE_UTF8_ERR18 Overlong 5-byte sequence (won't ever occur)
PCRE_UTF8_ERR19 Overlong 6-byte sequence (won't ever occur) PCRE_UTF8_ERR19 Overlong 6-byte sequence (won't ever occur)
PCRE_UTF8_ERR20 Isolated 0x80 byte (not within UTF-8 character) PCRE_UTF8_ERR20 Isolated 0x80 byte (not within UTF-8 character)
PCRE_UTF8_ERR21 Byte with the illegal value 0xfe or 0xff PCRE_UTF8_ERR21 Byte with the illegal value 0xfe or 0xff
PCRE_UTF8_ERR22 Non-character
Arguments: Arguments:
string points to the string string points to the string
@ -114,7 +117,8 @@ if (length < 0)
for (p = string; length-- > 0; p++) for (p = string; length-- > 0; p++)
{ {
register int ab, c, d; register pcre_uchar ab, c, d;
pcre_uint32 v = 0;
c = *p; c = *p;
if (c < 128) continue; /* ASCII character */ if (c < 128) continue; /* ASCII character */
@ -183,6 +187,7 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - 2; *erroroffset = (int)(p - string) - 2;
return PCRE_UTF8_ERR14; return PCRE_UTF8_ERR14;
} }
v = ((c & 0x0f) << 12) | ((d & 0x3f) << 6) | (*p & 0x3f);
break; break;
/* 4-byte character. Check 3rd and 4th bytes for 0x80. Then check first 2 /* 4-byte character. Check 3rd and 4th bytes for 0x80. Then check first 2
@ -210,6 +215,7 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - 3; *erroroffset = (int)(p - string) - 3;
return PCRE_UTF8_ERR13; return PCRE_UTF8_ERR13;
} }
v = ((c & 0x07) << 18) | ((d & 0x3f) << 12) | ((p[-1] & 0x3f) << 6) | (*p & 0x3f);
break; break;
/* 5-byte and 6-byte characters are not allowed by RFC 3629, and will be /* 5-byte and 6-byte characters are not allowed by RFC 3629, and will be
@ -284,11 +290,20 @@ for (p = string; length-- > 0; p++)
*erroroffset = (int)(p - string) - ab; *erroroffset = (int)(p - string) - ab;
return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12; return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12;
} }
/* Reject non-characters. The pointer p is currently at the last byte of the
character. */
if ((v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef))
{
*erroroffset = (int)(p - string) - ab;
return PCRE_UTF8_ERR22;
}
} }
#else /* SUPPORT_UTF */ #else /* Not SUPPORT_UTF */
(void)(string); /* Keep picky compilers happy */ (void)(string); /* Keep picky compilers happy */
(void)(length); (void)(length);
(void)(erroroffset);
#endif #endif
return PCRE_UTF8_ERR0; /* This indicates success */ return PCRE_UTF8_ERR0; /* This indicates success */
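
The hunks above rebuild the code point v from the bytes of a 3-byte or 4-byte sequence and then reject non-characters with a mask test. The sketch below isolates those two steps using the same expressions. The function names are hypothetical, and `decode3` assumes its input bytes have already passed the structural checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combines a validated 3-byte UTF-8 sequence into a
   code point, using the same masks and shifts as the hunk above. */
static uint32_t decode3(uint8_t b0, uint8_t b1, uint8_t b2)
{
  assert((b0 & 0xf0) == 0xe0);  /* must be a 3-byte lead byte */
  return ((uint32_t)(b0 & 0x0f) << 12) |
         ((uint32_t)(b1 & 0x3f) << 6) |
         (uint32_t)(b2 & 0x3f);
}

/* Same test as in the hunk: the last two code points of every plane
   (xxFFFE and xxFFFF) plus the block U+FDD0 to U+FDEF. */
static int is_noncharacter(uint32_t v)
{
  return (v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef);
}
```

The mask 0xfffe ignores the plane bits and the lowest bit, so one comparison covers both xxFFFE and xxFFFF in all planes.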

@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
string that identifies the PCRE version that is in use. */ string that identifies the PCRE version that is in use. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -77,12 +79,15 @@ I could find no way of detecting that a macro is defined as an empty string at
pre-processor time. This hack uses a standard trick for avoiding calling pre-processor time. This hack uses a standard trick for avoiding calling
the STRING macro with an empty argument when doing the test. */ the STRING macro with an empty argument when doing the test. */
#ifdef COMPILE_PCRE8 #if defined COMPILE_PCRE8
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre_version(void) pcre_version(void)
#else #elif defined COMPILE_PCRE16
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre16_version(void) pcre16_version(void)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
pcre32_version(void)
#endif #endif
{ {
return (XSTRING(Z PCRE_PRERELEASE)[1] == 0)? return (XSTRING(Z PCRE_PRERELEASE)[1] == 0)?

@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
class. It is used by both pcre_exec() and pcre_dfa_exec(). */ class. It is used by both pcre_exec() and pcre_dfa_exec(). */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
#include "pcre_internal.h" #include "pcre_internal.h"
@ -62,9 +64,9 @@ Returns: TRUE if character matches, else FALSE
*/ */
BOOL BOOL
PRIV(xclass)(int c, const pcre_uchar *data, BOOL utf) PRIV(xclass)(pcre_uint32 c, const pcre_uchar *data, BOOL utf)
{ {
int t; pcre_uchar t;
BOOL negated = (*data & XCL_NOT) != 0; BOOL negated = (*data & XCL_NOT) != 0;
(void)utf; (void)utf;
@ -92,7 +94,7 @@ if ((*data++ & XCL_MAP) != 0) data += 32 / sizeof(pcre_uchar);
while ((t = *data++) != XCL_END) while ((t = *data++) != XCL_END)
{ {
int x, y; pcre_uint32 x, y;
if (t == XCL_SINGLE) if (t == XCL_SINGLE)
{ {
#ifdef SUPPORT_UTF #ifdef SUPPORT_UTF

@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
functions. */ functions. */
#ifdef HAVE_CONFIG_H
#include "config.h" #include "config.h"
#endif
/* Ensure that the PCREPOSIX_EXP_xxx macros are set appropriately for /* Ensure that the PCREPOSIX_EXP_xxx macros are set appropriately for
@ -155,11 +157,12 @@ static const int eint[] = {
REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */ REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */
REG_BADPAT, /* \N is not supported in a class */ REG_BADPAT, /* \N is not supported in a class */
REG_BADPAT, /* too many forward references */ REG_BADPAT, /* too many forward references */
REG_BADPAT, /* disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) */ REG_BADPAT, /* disallowed UTF-8/16/32 code point (>= 0xd800 && <= 0xdfff) */
REG_BADPAT, /* invalid UTF-16 string (should not occur) */ REG_BADPAT, /* invalid UTF-16 string (should not occur) */
/* 75 */ /* 75 */
REG_BADPAT, /* overlong MARK name */ REG_BADPAT, /* overlong MARK name */
REG_BADPAT /* character value in \u.... sequence is too large */ REG_BADPAT, /* character value in \u.... sequence is too large */
REG_BADPAT /* invalid UTF-32 string (should not occur) */
}; };
/* Table of texts corresponding to POSIX error codes */ /* Table of texts corresponding to POSIX error codes */
@ -257,6 +260,7 @@ const char *errorptr;
int erroffset; int erroffset;
int errorcode; int errorcode;
int options = 0; int options = 0;
int re_nsub = 0;
if ((cflags & REG_ICASE) != 0) options |= PCRE_CASELESS; if ((cflags & REG_ICASE) != 0) options |= PCRE_CASELESS;
if ((cflags & REG_NEWLINE) != 0) options |= PCRE_MULTILINE; if ((cflags & REG_NEWLINE) != 0) options |= PCRE_MULTILINE;
@ -280,7 +284,8 @@ if (preg->re_pcre == NULL)
} }
(void)pcre_fullinfo((const pcre *)preg->re_pcre, NULL, PCRE_INFO_CAPTURECOUNT, (void)pcre_fullinfo((const pcre *)preg->re_pcre, NULL, PCRE_INFO_CAPTURECOUNT,
&(preg->re_nsub)); &re_nsub);
preg->re_nsub = (size_t)re_nsub;
return 0; return 0;
} }
@ -312,7 +317,7 @@ int *ovector = NULL;
int small_ovector[POSIX_MALLOC_THRESHOLD * 3]; int small_ovector[POSIX_MALLOC_THRESHOLD * 3];
BOOL allocated_ovector = FALSE; BOOL allocated_ovector = FALSE;
BOOL nosub = BOOL nosub =
(((const pcre *)preg->re_pcre)->options & PCRE_NO_AUTO_CAPTURE) != 0; (REAL_PCRE_OPTIONS((const pcre *)preg->re_pcre) & PCRE_NO_AUTO_CAPTURE) != 0;
if ((eflags & REG_NOTBOL) != 0) options |= PCRE_NOTBOL; if ((eflags & REG_NOTBOL) != 0) options |= PCRE_NOTBOL;
if ((eflags & REG_NOTEOL) != 0) options |= PCRE_NOTEOL; if ((eflags & REG_NOTEOL) != 0) options |= PCRE_NOTEOL;
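
The regcomp() hunk above stops passing &(preg->re_nsub) to pcre_fullinfo() and instead reads the capture count into a local int, then casts it to size_t. The sketch below shows why that pattern is safer when the callee writes an int through a void pointer. `query_count`, `struct compiled` and `store_count` are hypothetical stand-ins, not PCRE or POSIX names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an API that reports a count by writing an int
   through a void pointer, as pcre_fullinfo() does for the capture count. */
static int query_count(void *where)
{
  assert(where != NULL);
  *(int *)where = 3;
  return 0;
}

/* Hypothetical result structure with a size_t field, like re_nsub. */
struct compiled { size_t nsub; };

/* Reads the count into an int of the size the callee expects, then widens it.
   Passing &c->nsub directly would let the callee overwrite only part of the
   field on platforms where size_t is wider than int. */
static void store_count(struct compiled *c)
{
  int count = 0;
  (void)query_count(&count);
  c->nsub = (size_t)count;
}
```

With the temporary, the stored value is correct regardless of what the field held before the call.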

@ -93,6 +93,7 @@ RC=0
---------------------------- Test 13 ----------------------------- ---------------------------- Test 13 -----------------------------
Here is the pattern again. Here is the pattern again.
That time it was on a line by itself. That time it was on a line by itself.
seventeen
This line contains pattern not on a line by itself. This line contains pattern not on a line by itself.
RC=0 RC=0
---------------------------- Test 14 ----------------------------- ---------------------------- Test 14 -----------------------------
@ -370,11 +371,11 @@ RC=2
---------------------------- Test 34 ----------------------------- ---------------------------- Test 34 -----------------------------
RC=2 RC=2
---------------------------- Test 35 ----------------------------- ---------------------------- Test 35 -----------------------------
./testdata/grepinput8
./testdata/grepinputx ./testdata/grepinputx
RC=0 RC=0
---------------------------- Test 36 ----------------------------- ---------------------------- Test 36 -----------------------------
./testdata/grepinput3 ./testdata/grepinput3
./testdata/grepinput8
./testdata/grepinputx ./testdata/grepinputx
RC=0 RC=0
---------------------------- Test 37 ----------------------------- ---------------------------- Test 37 -----------------------------
@ -643,6 +644,7 @@ testdata/grepinputv:fox jumps
testdata/grepinputx:complete pair testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair testdata/grepinputx:complete pair
testdata/grepinput3:triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
RC=0 RC=0
---------------------------- Test 85 ----------------------------- ---------------------------- Test 85 -----------------------------
./testdata/grepinput3:Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. ./testdata/grepinput3:Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
@ -668,3 +670,38 @@ RC=0
---------------------------- Test 93 ----------------------------- ---------------------------- Test 93 -----------------------------
The quick brown fx jumps over the lazy dog. The quick brown fx jumps over the lazy dog.
RC=0 RC=0
---------------------------- Test 94 -----------------------------
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 95 -----------------------------
testdata/grepinputx:complete pair
testdata/grepinputx:That was a complete pair
testdata/grepinputx:complete pair
RC=0
---------------------------- Test 96 -----------------------------
./testdata/grepinput3
./testdata/grepinput8
./testdata/grepinputx
RC=0
---------------------------- Test 97 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 98 -----------------------------
./testdata/grepinputx
RC=0
---------------------------- Test 99 -----------------------------
./testdata/grepinput3
./testdata/grepinputx
RC=0
---------------------------- Test 100 ------------------------------
./testdata/grepinput:zerothe.
./testdata/grepinput:zeroa
./testdata/grepinput:zerothe.
RC=0
---------------------------- Test 101 ------------------------------
./testdata/grepinput:.|zero|the|.
./testdata/grepinput:zero|a
./testdata/grepinput:.|zero|the|.
RC=0


@ -5262,4 +5262,45 @@ name were given. ---/
/((?>a?)*)*c/ /((?>a?)*)*c/
aac aac
/(?>.*?a)(?<=ba)/
aba
/(?:.*?a)(?<=ba)/
aba
/.*?a(*PRUNE)b/
aab
/.*?a(*PRUNE)b/s
aab
/^a(*PRUNE)b/s
aab
/.*?a(*SKIP)b/
aab
/(?>.*?a)b/s
aab
/(?>.*?a)b/
aab
/(?>^a)b/s
aab
/(?>.*?)(?<=(abcd)|(wxyz))/
alphabetabcd
endingwxyz
/(?>.*)(?<=(abcd)|(wxyz))/
alphabetabcd
endingwxyz
"(?>.*)foo"
abcdfooxyz
"(?>.*?)foo"
abcdfooxyz
/-- End of testinput1 --/ /-- End of testinput1 --/


@ -1026,4 +1026,312 @@
AA\P AA\P
AA\P\P AA\P\P
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
*These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
*These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
*Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
/ist/8i
ikt
/is+t/8i
iSs\x{17f}t
ikt
/is+?t/8i
ikt
/is?t/8i
ikt
/is{2}t/8i
iskt
/-- End of testinput10 --/ /-- End of testinput10 --/


@ -3768,5 +3768,46 @@ assertion, and therefore fails the entire subroutine call. --/
/((?=a(*COMMIT)b)ab|ac){0}(?:(?1)|a(c))/ /((?=a(*COMMIT)b)ab|ac){0}(?:(?1)|a(c))/
ac ac
/-- These are all run as real matches in test 1; here we are just checking the
settings of the anchored and startline bits. --/
/(?>.*?a)(?<=ba)/I
/(?:.*?a)(?<=ba)/I
/.*?a(*PRUNE)b/I
/.*?a(*PRUNE)b/sI
/^a(*PRUNE)b/sI
/.*?a(*SKIP)b/I
/(?>.*?a)b/sI
/(?>.*?a)b/I
/(?>^a)b/sI
/(?>.*?)(?<=(abcd)|(wxyz))/I
/(?>.*)(?<=(abcd)|(wxyz))/I
"(?>.*)foo"I
"(?>.*?)foo"I
/(?>^abc)/mI
/(?>.*abc)/mI
/(?:.*abc)/mI
/-- Check PCRE_STUDY_EXTRA_NEEDED --/
/.?/S-I
/.?/S!I
/-- End of testinput2 --/ /-- End of testinput2 --/


@ -1,6 +1,5 @@
/-- This set of tests is for Unicode property support. It is compatible with /-- This set of tests is for Unicode property support. It is compatible with
Perl >= 5.10, but not 5.8 because it tests some extra properties that are Perl >= 5.15. --/
not in the earlier release. --/
/^\pC\pL\pM\pN\pP\pS\pZ</8 /^\pC\pL\pM\pN\pP\pS\pZ</8
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}< \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
@ -406,7 +405,13 @@
A\x{300}\x{301}B\x{300}C\x{300}\x{301} A\x{300}\x{301}B\x{300}C\x{300}\x{301}
A\x{300}\x{301}B\x{300}C\x{300}\x{301}X A\x{300}\x{301}B\x{300}C\x{300}\x{301}X
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
/^\X/8
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
\x{300}
/^\p{Han}+/8 /^\p{Han}+/8
\x{2e81}\x{3007}\x{2f804}\x{31a0} \x{2e81}\x{3007}\x{2f804}\x{31a0}
** Failers ** Failers
@ -666,6 +671,7 @@
\x{65c} \x{65c}
\x{65d} \x{65d}
\x{65e} \x{65e}
\x{65f}
\x{66a} \x{66a}
\x{6e9} \x{6e9}
\x{6ef} \x{6ef}
@ -677,7 +683,6 @@
\x{653} \x{653}
\x{654} \x{654}
\x{655} \x{655}
\x{65f}
/^\p{Cyrillic}/8 /^\p{Cyrillic}/8
\x{1d2b} \x{1d2b}
@ -814,5 +819,501 @@
/Ⱥ/8i /Ⱥ/8i
Ⱥ Ⱥ
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
\x{34e}\x{34e}X
\x04X
\x{1100}X
\x{1100}\x{34e}X
\x{1b04}\x{1b04}X
*These match up to the roman letters
\x{1111}\x{1111}L,L
\x{1111}\x{1111}\x{1169}L,L,V
\x{1111}\x{ae4c}L, LV
\x{1111}\x{ad89}L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
*These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
\x{ae4c}\x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
\x{1169}\x{1111}V, L
\x{1169}\x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
\x{11fe}\x{1169}T, V
\x{11fe}\x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
*Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
\x0d\x{0711}CR, extend
\x0d\x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
\x0a\x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
\x09\x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
/-- Characters with more than one other case; test in classes --/
/[z\x{00b5}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{039c}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{03bc}]+/8i
\x{00b5}\x{039c}\x{03bc}
/[z\x{00c5}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{00e5}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{212b}]+/8i
\x{00c5}\x{00e5}\x{212b}
/[z\x{01c4}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c5}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c6}]+/8i
\x{01c4}\x{01c5}\x{01c6}
/[z\x{01c7}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c8}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01c9}]+/8i
\x{01c7}\x{01c8}\x{01c9}
/[z\x{01ca}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cb}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01cc}]+/8i
\x{01ca}\x{01cb}\x{01cc}
/[z\x{01f1}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f2}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{01f3}]+/8i
\x{01f1}\x{01f2}\x{01f3}
/[z\x{0345}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0399}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{03b9}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{1fbe}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/[z\x{0392}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{03b2}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{03d0}]+/8i
\x{0392}\x{03b2}\x{03d0}
/[z\x{0395}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{03b5}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{03f5}]+/8i
\x{0395}\x{03b5}\x{03f5}
/[z\x{0398}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03b8}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03d1}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{03f4}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/[z\x{039a}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03ba}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03f0}]+/8i
\x{039a}\x{03ba}\x{03f0}
/[z\x{03a0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03c0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03d6}]+/8i
\x{03a0}\x{03c0}\x{03d6}
/[z\x{03a1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03c1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03f1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
/[z\x{03a3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c2}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03c3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
/[z\x{03a6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03d5}]+/8i
\x{03a6}\x{03c6}\x{03d5}
/[z\x{03c9}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{03a9}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{2126}]+/8i
\x{03c9}\x{03a9}\x{2126}
/[z\x{1e60}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/[z\x{004b}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{006b}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{212a}]+/8i
\x{004b}\x{006b}\x{212a}
/[z\x{0053}]+/8i
\x{0053}\x{0073}\x{017f}
/[z\x{0073}]+/8i
\x{0053}\x{0073}\x{017f}
/[z\x{017f}]+/8i
\x{0053}\x{0073}\x{017f}
/-- --/
/(ΣΆΜΟΣ) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(σάμος) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
ΣΆΜΟΣ σάμος
σάμος σάμος
σάμος σάμοσ
σάμος ΣΆΜΟΣ
/(ΣΆΜΟΣ) \1*/8i
ΣΆΜΟΣ\x20
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
/-- End of testinput6 --/ /-- End of testinput6 --/


@ -89,7 +89,7 @@
/(\p{Yi}{0,3}+\277)*/ /(\p{Yi}{0,3}+\277)*/
/\p{Zl}{2,3}+/8BZ /\p{Zl}{2,3}+/8BZ
\xe2\x80\xa8\xe2\x80\xa8
\x{2028}\x{2028}\x{2028} \x{2028}\x{2028}\x{2028}
/\p{Zl}/8BZ /\p{Zl}/8BZ
@ -195,15 +195,6 @@ of case for anything other than the ASCII letters. --/
\x{c0} \x{c0}
\x{e0} \x{e0}
/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
/^\X/8
A
A\x{300}BC
A\x{300}\x{301}\x{302}BC
*** Failers
\x{300}
/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/ /-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
/^\p{Xan}/8 /^\p{Xan}/8
@ -622,4 +613,60 @@ of case for anything other than the ASCII letters. --/
AA\P AA\P
AA\P\P AA\P\P
/A\x{3a3}B/8iDZ
/\x{3a3}B/8iDZ
/[\x{3a3}]/8iBZ
/[^\x{3a3}]/8iBZ
/[\x{3a3}]+/8iBZ
/[^\x{3a3}]+/8iBZ
/a*\x{3a3}/8iBZ
/\x{3a3}+a/8iBZ
/\x{3a3}*\x{3c2}/8iBZ
/\x{3a3}{3}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}?/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}+./8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}++./8i+
** Failers
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}*\x{3c2}/8iBZ
/[^\x{3a3}]*\x{3c2}/8iBZ
/[^a]*\x{3c2}/8iBZ
/ist/8iBZ
ikt
/is+t/8i
iSs\x{17f}t
ikt
/is+?t/8i
ikt
/is?t/8i
ikt
/is{2}t/8i
iskt
/-- End of testinput7 --/ /-- End of testinput7 --/


@ -8733,4 +8733,66 @@ No match
0: aac 0: aac
1: 1:
/(?>.*?a)(?<=ba)/
aba
0: ba
/(?:.*?a)(?<=ba)/
aba
0: aba
/.*?a(*PRUNE)b/
aab
0: ab
/.*?a(*PRUNE)b/s
aab
0: ab
/^a(*PRUNE)b/s
aab
No match
/.*?a(*SKIP)b/
aab
0: ab
/(?>.*?a)b/s
aab
0: ab
/(?>.*?a)b/
aab
0: ab
/(?>^a)b/s
aab
No match
/(?>.*?)(?<=(abcd)|(wxyz))/
alphabetabcd
0:
1: abcd
endingwxyz
0:
1: <unset>
2: wxyz
/(?>.*)(?<=(abcd)|(wxyz))/
alphabetabcd
0: alphabetabcd
1: abcd
endingwxyz
0: endingwxyz
1: <unset>
2: wxyz
"(?>.*)foo"
abcdfooxyz
No match
"(?>.*?)foo"
abcdfooxyz
0: foo
/-- End of testinput1 --/ /-- End of testinput1 --/


@ -90,7 +90,7 @@ No match
9: ** 9: **
10: * 10: *
\x{300}\x{301}\x{302} \x{300}\x{301}\x{302}
No match 0: \x{300}\x{301}\x{302}
/\X?abc/8 /\X?abc/8
abc abc
@ -100,7 +100,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}abc 0: A\x{300}abc
\x{300}abc \x{300}abc
0: abc 0: \x{300}abc
*** Failers *** Failers
No match No match
@ -114,7 +114,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
No match No match
\x{300}abc \x{300}abc
No match 0: \x{300}abc
/\X*abc/8 /\X*abc/8
abc abc
@ -124,7 +124,7 @@ No match
A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abcxyz
0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc 0: A\x{300}\x{301}\x{302}A\x{300}A\x{300}A\x{300}abc
\x{300}abc \x{300}abc
0: abc 0: \x{300}abc
*** Failers *** Failers
No match No match
@ -138,7 +138,7 @@ No match
*** Failers *** Failers
No match No match
\x{300}abc \x{300}abc
No match 0: \x{300}abc
/^\pL?=./8 /^\pL?=./8
A=b A=b
@ -1133,7 +1133,7 @@ No match
*** Failers *** Failers
0: * 0: *
\x{300} \x{300}
No match 0: \x{300}
/^[\X]/8 /^[\X]/8
X123 X123
@ -2100,4 +2100,627 @@ Partial match: AA
AA\P\P AA\P\P
Partial match: AA Partial match: AA
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
*These match up to the roman letters
0: *
0+ These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
*These match just the first codepoint (invalid sequence)
0: *
0+ These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
*Test extend and spacing mark
0: *
0+ Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
0: *
0+ Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
0: *
0+ There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
1: \x{b5}\x{39c}
2: \x{b5}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
1: \x{c5}\x{e5}
2: \x{c5}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
1: \x{1c4}\x{1c5}
2: \x{1c4}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
1: \x{1c7}\x{1c8}
2: \x{1c7}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
1: \x{1ca}\x{1cb}
2: \x{1ca}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
1: \x{1f1}\x{1f2}
2: \x{1f1}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
1: \x{345}\x{399}\x{3b9}
2: \x{345}\x{399}
3: \x{345}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
1: \x{392}\x{3b2}
2: \x{392}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
1: \x{395}\x{3b5}
2: \x{395}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
1: \x{398}\x{3b8}\x{3d1}
2: \x{398}\x{3b8}
3: \x{398}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
1: \x{39a}\x{3ba}
2: \x{39a}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
1: \x{3a0}\x{3c0}
2: \x{3a0}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
1: \x{3a1}\x{3c1}
2: \x{3a1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
1: \x{3a3}\x{3c2}
2: \x{3a3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
1: \x{3a6}\x{3c6}
2: \x{3a6}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
1: \x{3c9}\x{3a9}
2: \x{3c9}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
1: \x{1e60}\x{1e61}
2: \x{1e60}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
1: \x{1e9e}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
1: \x{1f88}
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
1: Kk
2: K
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
1: Ss
2: S
/ist/8i
ikt
No match
/is+t/8i
iSs\x{17f}t
0: iSs\x{17f}t
ikt
No match
/is+?t/8i
ikt
No match
/is?t/8i
ikt
No match
/is{2}t/8i
iskt
No match
/-- End of testinput10 --/ /-- End of testinput10 --/


@ -768,7 +768,7 @@ Max lookbehind = 3
/(?>.*)(?<=(abcd)|(xyz))/I /(?>.*)(?<=(abcd)|(xyz))/I
Capturing subpattern count = 2 Capturing subpattern count = 2
No options No options
First char at start or follows newline No first char
No need char No need char
Max lookbehind = 4 Max lookbehind = 4
alphabetabcd alphabetabcd
@ -10110,7 +10110,7 @@ No set of starting bytes
"(?>.*/)foo"SI "(?>.*/)foo"SI
Capturing subpattern count = 0 Capturing subpattern count = 0
No options No options
First char at start or follows newline No first char
Need char = 'o' Need char = 'o'
Subject length lower bound = 4 Subject length lower bound = 4
No set of starting bytes No set of starting bytes
@ -12360,5 +12360,125 @@ assertion, and therefore fails the entire subroutine call. --/
/((?=a(*COMMIT)b)ab|ac){0}(?:(?1)|a(c))/ /((?=a(*COMMIT)b)ab|ac){0}(?:(?1)|a(c))/
ac ac
0: ac 0: ac
/-- These are all run as real matches in test 1; here we are just checking the
settings of the anchored and startline bits. --/
/(?>.*?a)(?<=ba)/I
Capturing subpattern count = 0
No options
No first char
Need char = 'a'
Max lookbehind = 2
/(?:.*?a)(?<=ba)/I
Capturing subpattern count = 0
No options
First char at start or follows newline
Need char = 'a'
Max lookbehind = 2
/.*?a(*PRUNE)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/.*?a(*PRUNE)b/sI
Capturing subpattern count = 0
Options: dotall
No first char
Need char = 'b'
/^a(*PRUNE)b/sI
Capturing subpattern count = 0
Options: anchored dotall
No first char
No need char
/.*?a(*SKIP)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/(?>.*?a)b/sI
Capturing subpattern count = 0
Options: dotall
No first char
Need char = 'b'
/(?>.*?a)b/I
Capturing subpattern count = 0
No options
No first char
Need char = 'b'
/(?>^a)b/sI
Capturing subpattern count = 0
Options: anchored dotall
No first char
No need char
/(?>.*?)(?<=(abcd)|(wxyz))/I
Capturing subpattern count = 2
No options
No first char
No need char
Max lookbehind = 4
/(?>.*)(?<=(abcd)|(wxyz))/I
Capturing subpattern count = 2
No options
No first char
No need char
Max lookbehind = 4
"(?>.*)foo"I
Capturing subpattern count = 0
No options
No first char
Need char = 'o'
"(?>.*?)foo"I
Capturing subpattern count = 0
No options
No first char
Need char = 'o'
/(?>^abc)/mI
Capturing subpattern count = 0
Options: multiline
First char at start or follows newline
Need char = 'c'
/(?>.*abc)/mI
Capturing subpattern count = 0
Options: multiline
No first char
Need char = 'c'
/(?:.*abc)/mI
Capturing subpattern count = 0
Options: multiline
First char at start or follows newline
Need char = 'c'
/-- Check PCRE_STUDY_EXTRA_NEEDED --/
/.?/S-I
Capturing subpattern count = 0
No options
No first char
No need char
Study returned NULL
/.?/S!I
Capturing subpattern count = 0
No options
No first char
No need char
Subject length lower bound = -1
No set of starting bytes
/-- End of testinput2 --/ /-- End of testinput2 --/


@ -276,7 +276,7 @@ No need char
/[\xFF]/DZ /[\xFF]/DZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
\xff \x{ff}
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -290,7 +290,7 @@ No need char
/[^\xFF]/DZ /[^\xFF]/DZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[^\xff] [^\x{ff}]
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -786,7 +786,7 @@ No match
/[\H]/8BZ /[\H]/8BZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}] [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}]
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -794,7 +794,7 @@ No match
/[\V]/8BZ /[\V]/8BZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}] [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}]
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -1594,7 +1594,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
/[\H\x{d7ff}]+/8BZ /[\H\x{d7ff}]+/8BZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]+ [\x00-\x08\x0a-\x1f!-\x9f\x{a1}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{10ffff}\x{d7ff}]+
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -1634,7 +1634,7 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 7
/[\V\x{d7ff}]+/8BZ /[\V\x{d7ff}]+/8BZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]+ [\x00-\x09\x0e-\x84\x{86}-\x{2027}\x{202a}-\x{10ffff}\x{d7ff}]+
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------


@ -1,6 +1,5 @@
/-- This set of tests is for Unicode property support. It is compatible with /-- This set of tests is for Unicode property support. It is compatible with
Perl >= 5.10, but not 5.8 because it tests some extra properties that are Perl >= 5.15. --/
not in the earlier release. --/
/^\pC\pL\pM\pN\pP\pS\pZ</8 /^\pC\pL\pM\pN\pP\pS\pZ</8
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}< \x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
@ -696,7 +695,17 @@ No match
A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X A\x{300}\x{301}B\x{300}C\x{300}\x{301}DA\x{300}X
0: A\x{300}\x{301}B\x{300}C 0: A\x{300}\x{301}B\x{300}C
1: C 1: C
/^\X/8
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
\x{300}
0: \x{300}
/^\p{Han}+/8 /^\p{Han}+/8
\x{2e81}\x{3007}\x{2f804}\x{31a0} \x{2e81}\x{3007}\x{2f804}\x{31a0}
0: \x{2e81}\x{3007}\x{2f804} 0: \x{2e81}\x{3007}\x{2f804}
@ -1136,6 +1145,8 @@ No match
0: \x{65d} 0: \x{65d}
\x{65e} \x{65e}
0: \x{65e} 0: \x{65e}
\x{65f}
0: \x{65f}
\x{66a} \x{66a}
0: \x{66a} 0: \x{66a}
\x{6e9} \x{6e9}
@ -1158,8 +1169,6 @@ No match
No match No match
\x{655} \x{655}
No match No match
\x{65f}
No match
/^\p{Cyrillic}/8 /^\p{Cyrillic}/8
\x{1d2b} \x{1d2b}
@ -1372,5 +1381,757 @@ No match
0: \x{23a} 0: \x{23a}
0: \x{2c65} 0: \x{2c65}
/-- These are tests for extended grapheme clusters --/
/^\X/8+
G\x{34e}\x{34e}X
0: G\x{34e}\x{34e}
0+ X
\x{34e}\x{34e}X
0: \x{34e}\x{34e}
0+ X
\x04X
0: \x{04}
0+ X
\x{1100}X
0: \x{1100}
0+ X
\x{1100}\x{34e}X
0: \x{1100}\x{34e}
0+ X
\x{1b04}\x{1b04}X
0: \x{1b04}\x{1b04}
0+ X
*These match up to the roman letters
0: *
0+ These match up to the roman letters
\x{1111}\x{1111}L,L
0: \x{1111}\x{1111}
0+ L,L
\x{1111}\x{1111}\x{1169}L,L,V
0: \x{1111}\x{1111}\x{1169}
0+ L,L,V
\x{1111}\x{ae4c}L, LV
0: \x{1111}\x{ae4c}
0+ L, LV
\x{1111}\x{ad89}L, LVT
0: \x{1111}\x{ad89}
0+ L, LVT
\x{1111}\x{ae4c}\x{1169}L, LV, V
0: \x{1111}\x{ae4c}\x{1169}
0+ L, LV, V
\x{1111}\x{ae4c}\x{1169}\x{1169}L, LV, V, V
0: \x{1111}\x{ae4c}\x{1169}\x{1169}
0+ L, LV, V, V
\x{1111}\x{ae4c}\x{1169}\x{11fe}L, LV, V, T
0: \x{1111}\x{ae4c}\x{1169}\x{11fe}
0+ L, LV, V, T
\x{1111}\x{ad89}\x{11fe}L, LVT, T
0: \x{1111}\x{ad89}\x{11fe}
0+ L, LVT, T
\x{1111}\x{ad89}\x{11fe}\x{11fe}L, LVT, T, T
0: \x{1111}\x{ad89}\x{11fe}\x{11fe}
0+ L, LVT, T, T
\x{ad89}\x{11fe}\x{11fe}LVT, T, T
0: \x{ad89}\x{11fe}\x{11fe}
0+ LVT, T, T
*These match just the first codepoint (invalid sequence)
0: *
0+ These match just the first codepoint (invalid sequence)
\x{1111}\x{11fe}L, T
0: \x{1111}
0+ \x{11fe}L, T
\x{ae4c}\x{1111}LV, L
0: \x{ae4c}
0+ \x{1111}LV, L
\x{ae4c}\x{ae4c}LV, LV
0: \x{ae4c}
0+ \x{ae4c}LV, LV
\x{ae4c}\x{ad89}LV, LVT
0: \x{ae4c}
0+ \x{ad89}LV, LVT
\x{1169}\x{1111}V, L
0: \x{1169}
0+ \x{1111}V, L
\x{1169}\x{ae4c}V, LV
0: \x{1169}
0+ \x{ae4c}V, LV
\x{1169}\x{ad89}V, LVT
0: \x{1169}
0+ \x{ad89}V, LVT
\x{ad89}\x{1111}LVT, L
0: \x{ad89}
0+ \x{1111}LVT, L
\x{ad89}\x{1169}LVT, V
0: \x{ad89}
0+ \x{1169}LVT, V
\x{ad89}\x{ae4c}LVT, LV
0: \x{ad89}
0+ \x{ae4c}LVT, LV
\x{ad89}\x{ad89}LVT, LVT
0: \x{ad89}
0+ \x{ad89}LVT, LVT
\x{11fe}\x{1111}T, L
0: \x{11fe}
0+ \x{1111}T, L
\x{11fe}\x{1169}T, V
0: \x{11fe}
0+ \x{1169}T, V
\x{11fe}\x{ae4c}T, LV
0: \x{11fe}
0+ \x{ae4c}T, LV
\x{11fe}\x{ad89}T, LVT
0: \x{11fe}
0+ \x{ad89}T, LVT
*Test extend and spacing mark
0: *
0+ Test extend and spacing mark
\x{1111}\x{ae4c}\x{0711}L, LV, extend
0: \x{1111}\x{ae4c}\x{711}
0+ L, LV, extend
\x{1111}\x{ae4c}\x{1b04}L, LV, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}
0+ L, LV, spacing mark
\x{1111}\x{ae4c}\x{1b04}\x{0711}\x{1b04}L, LV, spacing mark, extend, spacing mark
0: \x{1111}\x{ae4c}\x{1b04}\x{711}\x{1b04}
0+ L, LV, spacing mark, extend, spacing mark
*Test CR, LF, and control
0: *
0+ Test CR, LF, and control
\x0d\x{0711}CR, extend
0: \x{0d}
0+ \x{711}CR, extend
\x0d\x{1b04}CR, spacingmark
0: \x{0d}
0+ \x{1b04}CR, spacingmark
\x0a\x{0711}LF, extend
0: \x{0a}
0+ \x{711}LF, extend
\x0a\x{1b04}LF, spacingmark
0: \x{0a}
0+ \x{1b04}LF, spacingmark
\x0b\x{0711}Control, extend
0: \x{0b}
0+ \x{711}Control, extend
\x09\x{1b04}Control, spacingmark
0: \x{09}
0+ \x{1b04}Control, spacingmark
*There are no Prepend characters, so we can't test Prepend, CR
0: *
0+ There are no Prepend characters, so we can't test Prepend, CR
/^(?>\X{2})X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/^\X{2,4}?X/8+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0: \x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}\x{1111}\x{ae4c}X
0+
/-- --/
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{1e9e}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/[z\x{00df}]+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/[z\x{1f88}]+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/-- Characters with more than one other case; test in classes --/
/[z\x{00b5}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{039c}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{03bc}]+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/[z\x{00c5}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{00e5}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{212b}]+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/[z\x{01c4}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c5}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c6}]+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/[z\x{01c7}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c8}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01c9}]+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/[z\x{01ca}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cb}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01cc}]+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/[z\x{01f1}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f2}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{01f3}]+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/[z\x{0345}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0399}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{03b9}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{1fbe}]+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/[z\x{0392}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03b2}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{03d0}]+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/[z\x{0395}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03b5}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{03f5}]+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/[z\x{0398}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03b8}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03d1}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{03f4}]+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/[z\x{039a}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03ba}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03f0}]+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/[z\x{03a0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03c0}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03d6}]+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/[z\x{03a1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03c1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03f1}]+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/[z\x{03a3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c2}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03c3}]+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/[z\x{03a6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c6}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03d5}]+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/[z\x{03c9}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{03a9}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{2126}]+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/[z\x{1e60}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e61}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/[z\x{1e9b}]+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/[z\x{004b}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{006b}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{212a}]+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/[z\x{0053}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{0073}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/[z\x{017f}]+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/-- --/
/(ΣΆΜΟΣ) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(σάμος) \1/8i
ΣΆΜΟΣ ΣΆΜΟΣ
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ σάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
σάμος σάμος
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος σάμοσ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
σάμος ΣΆΜΟΣ
0: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
/(ΣΆΜΟΣ) \1*/8i
ΣΆΜΟΣ\x20
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
ΣΆΜΟΣ ΣΆΜΟΣσάμοςσάμος
0: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3} \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}\x{3c3}\x{3ac}\x{3bc}\x{3bf}\x{3c2}
1: \x{3a3}\x{386}\x{39c}\x{39f}\x{3a3}
/-- Perl matches these --/
/\x{00b5}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{039c}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{03bc}+/8i
\x{00b5}\x{039c}\x{03bc}
0: \x{b5}\x{39c}\x{3bc}
/\x{00c5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{00e5}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{212b}+/8i
\x{00c5}\x{00e5}\x{212b}
0: \x{c5}\x{e5}\x{212b}
/\x{01c4}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c5}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c6}+/8i
\x{01c4}\x{01c5}\x{01c6}
0: \x{1c4}\x{1c5}\x{1c6}
/\x{01c7}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c8}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01c9}+/8i
\x{01c7}\x{01c8}\x{01c9}
0: \x{1c7}\x{1c8}\x{1c9}
/\x{01ca}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cb}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01cc}+/8i
\x{01ca}\x{01cb}\x{01cc}
0: \x{1ca}\x{1cb}\x{1cc}
/\x{01f1}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f2}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{01f3}+/8i
\x{01f1}\x{01f2}\x{01f3}
0: \x{1f1}\x{1f2}\x{1f3}
/\x{0345}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0399}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{03b9}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{1fbe}+/8i
\x{0345}\x{0399}\x{03b9}\x{1fbe}
0: \x{345}\x{399}\x{3b9}\x{1fbe}
/\x{0392}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03b2}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{03d0}+/8i
\x{0392}\x{03b2}\x{03d0}
0: \x{392}\x{3b2}\x{3d0}
/\x{0395}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03b5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{03f5}+/8i
\x{0395}\x{03b5}\x{03f5}
0: \x{395}\x{3b5}\x{3f5}
/\x{0398}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03b8}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03d1}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{03f4}+/8i
\x{0398}\x{03b8}\x{03d1}\x{03f4}
0: \x{398}\x{3b8}\x{3d1}\x{3f4}
/\x{039a}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03ba}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03f0}+/8i
\x{039a}\x{03ba}\x{03f0}
0: \x{39a}\x{3ba}\x{3f0}
/\x{03a0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03c0}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03d6}+/8i
\x{03a0}\x{03c0}\x{03d6}
0: \x{3a0}\x{3c0}\x{3d6}
/\x{03a1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03c1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03f1}+/8i
\x{03a1}\x{03c1}\x{03f1}
0: \x{3a1}\x{3c1}\x{3f1}
/\x{03a3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c2}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03c3}+/8i
\x{03A3}\x{03C2}\x{03C3}
0: \x{3a3}\x{3c2}\x{3c3}
/\x{03a6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c6}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03d5}+/8i
\x{03a6}\x{03c6}\x{03d5}
0: \x{3a6}\x{3c6}\x{3d5}
/\x{03c9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{03a9}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{2126}+/8i
\x{03c9}\x{03a9}\x{2126}
0: \x{3c9}\x{3a9}\x{2126}
/\x{1e60}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e61}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9b}+/8i
\x{1e60}\x{1e61}\x{1e9b}
0: \x{1e60}\x{1e61}\x{1e9b}
/\x{1e9e}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{00df}+/8i
\x{1e9e}\x{00df}
0: \x{1e9e}\x{df}
/\x{1f88}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/\x{1f80}+/8i
\x{1f88}\x{1f80}
0: \x{1f88}\x{1f80}
/-- Perl 5.12.4 gets these wrong, but 5.15.3 is OK --/
/\x{004b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{006b}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{212a}+/8i
\x{004b}\x{006b}\x{212a}
0: Kk\x{212a}
/\x{0053}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{0073}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/\x{017f}+/8i
\x{0053}\x{0073}\x{017f}
0: Ss\x{17f}
/-- End of testinput6 --/ /-- End of testinput6 --/


@ -124,7 +124,7 @@ No match
/[z-\x{100}]/8iDZ /[z-\x{100}]/8iDZ
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[Z\x{39c}\x{178}z-\x{101}] [Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -162,7 +162,7 @@ No match
/[z-\x{100}]/8DZi /[z-\x{100}]/8DZi
------------------------------------------------------------------ ------------------------------------------------------------------
Bra Bra
[Z\x{39c}\x{178}z-\x{101}] [Z\x{39c}\x{3bc}\x{1e9e}\x{178}z-\x{101}]
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
@ -233,7 +233,7 @@ No need char
Ket Ket
End End
------------------------------------------------------------------ ------------------------------------------------------------------
\xe2\x80\xa8\xe2\x80\xa8
0: \x{2028}\x{2028} 0: \x{2028}\x{2028}
\x{2028}\x{2028}\x{2028} \x{2028}\x{2028}\x{2028}
0: \x{2028}\x{2028}\x{2028} 0: \x{2028}\x{2028}\x{2028}
@ -423,20 +423,6 @@ of case for anything other than the ASCII letters. --/
\x{e0} \x{e0}
0: \x{e0} 0: \x{e0}
/-- This should be Perl-compatible but Perl 5.11 gets \x{300} wrong. --/8
/^\X/8
A
0: A
A\x{300}BC
0: A\x{300}
A\x{300}\x{301}\x{302}BC
0: A\x{300}\x{301}\x{302}
*** Failers
0: *
\x{300}
No match
/-- These are PCRE's extra properties to help with Unicodizing \d etc. --/ /-- These are PCRE's extra properties to help with Unicodizing \d etc. --/
/^\p{Xan}/8 /^\p{Xan}/8
@ -1194,11 +1180,13 @@ No match
/^S(\X*)e(\X*)$/8 /^S(\X*)e(\X*)$/8
Stéréo Stéréo
No match 0: Ste\x{301}re\x{301}o
1: te\x{301}r
2: \x{301}o
/^\X/8 /^\X/8
́réo ́réo
No match 0: \x{301}
/^a\X41z/<JS> /^a\X41z/<JS>
aX41z aX41z
@ -1313,4 +1301,173 @@ Partial match: AA
AA\P\P AA\P\P
Partial match: AA Partial match: AA
/A\x{3a3}B/8iDZ
------------------------------------------------------------------
Bra
/i A
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf
First char = 'A' (caseless)
Need char = 'B' (caseless)
/\x{3a3}B/8iDZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
/i B
Ket
End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf
No first char
Need char = 'B' (caseless)
/[\x{3a3}]/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 +
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]+/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3 +
Ket
End
------------------------------------------------------------------
/a*\x{3a3}/8iBZ
------------------------------------------------------------------
Bra
/i a*+
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/\x{3a3}+a/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 ++
/i a
Ket
End
------------------------------------------------------------------
/\x{3a3}*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 *
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/\x{3a3}{3}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}
0+ \x{3a3}\x{3c3}\x{3c2}
/\x{3a3}{2,4}/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}\x{3a3}
0+ \x{3c3}\x{3c2}
/\x{3a3}{2,4}?/8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}
0+ \x{3c2}\x{3a3}\x{3c3}\x{3c2}
/\x{3a3}+./8i+
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0: \x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
0+
/\x{3a3}++./8i+
** Failers
No match
\x{3a3}\x{3c3}\x{3c2}\x{3a3}\x{3c3}\x{3c2}
No match
/\x{3a3}*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
clist 03a3 03c2 03c3 *
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^\x{3a3}]*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
not clist 03a3 03c2 03c3 *+
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/[^a]*\x{3c2}/8iBZ
------------------------------------------------------------------
Bra
/i [^a]*
clist 03a3 03c2 03c3
Ket
End
------------------------------------------------------------------
/ist/8iBZ
------------------------------------------------------------------
Bra
/i i
clist 0053 0073 017f
/i t
Ket
End
------------------------------------------------------------------
ikt
No match
/is+t/8i
iSs\x{17f}t
0: iSs\x{17f}t
ikt
No match
/is+?t/8i
ikt
No match
/is?t/8i
ikt
No match
/is{2}t/8i
iskt
No match
/-- End of testinput7 --/ /-- End of testinput7 --/
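
Note on the `clist` lines in the testinput7 output above: each one lists the code points that a single caseless character may match (for example 03a3, 03c2, 03c3 for sigma, and 0053, 0073, 017f for "s"). The following is a minimal sketch of that idea, assuming `clist` simply means "match any code point in this list". The names `in_case_list`, `sigma_cases` and `s_cases` are hypothetical and are not taken from the PCRE sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Code points copied from the "clist" lines in the test output. */
static const uint32_t sigma_cases[] = { 0x03a3, 0x03c2, 0x03c3 };
static const uint32_t s_cases[]     = { 0x0053, 0x0073, 0x017f };

/* Hypothetical helper: return 1 if cp is one of the n code points in list. */
static int in_case_list(uint32_t cp, const uint32_t *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (list[i] == cp) {
            return 1;
        }
    }
    return 0;
}
```

Under this reading, U+03C2 (final sigma) is accepted where the pattern has U+03A3, while "k" is not in the list for "s", which is consistent with `ikt` giving "No match" for `/ist/8i`.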


@ -7,7 +7,11 @@
/* This file contains definitions of the property values that are returned by /* This file contains definitions of the property values that are returned by
the UCD access macros. New values that are added for new releases of Unicode the UCD access macros. New values that are added for new releases of Unicode
should always be at the end of each enum, for backwards compatibility. */ should always be at the end of each enum, for backwards compatibility.
IMPORTANT: Note also that the specific numeric values of the enums have to be
the same as the values that are generated by the maint/MultiStage2.py script,
where the equivalent property descriptive names are listed in vectors. */
/* These are the general character categories. */ /* These are the general character categories. */
@ -21,7 +25,7 @@ enum {
ucp_Z /* Separator */ ucp_Z /* Separator */
}; };
/* These are the particular character types. */ /* These are the particular character categories. */
enum { enum {
ucp_Cc, /* Control */ ucp_Cc, /* Control */
@ -56,6 +60,26 @@ enum {
ucp_Zs /* Space separator */ ucp_Zs /* Space separator */
}; };
/* These are grapheme break properties. Note that the code for processing them
assumes that the values are less than 16. If more values are added that take
the number to 16 or more, the code will have to be rewritten. */
enum {
ucp_gbCR, /* 0 */
ucp_gbLF, /* 1 */
ucp_gbControl, /* 2 */
ucp_gbExtend, /* 3 */
ucp_gbPrepend, /* 4 */
ucp_gbSpacingMark, /* 5 */
ucp_gbL, /* 6 Hangul syllable type L */
ucp_gbV, /* 7 Hangul syllable type V */
ucp_gbT, /* 8 Hangul syllable type T */
ucp_gbLV, /* 9 Hangul syllable type LV */
ucp_gbLVT, /* 10 Hangul syllable type LVT */
ucp_gbRegionalIndicator, /* 11 */
ucp_gbOther /* 12 */
};
/* These are the script identifications. */ /* These are the script identifications. */
enum { enum {
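
Note on the "less than 16" remark in the grapheme break comment above: keeping the values below 16 means that, for each property, the set of properties allowed to follow it inside one cluster fits in a 16-bit mask. The sketch below illustrates that layout. It is not the table from the PCRE sources. The enum names are hypothetical copies of the `ucp_gb*` values, and the rules are an assumption based on the Unicode extended grapheme cluster rules, checked only against the pairs that appear in the testinput6 output earlier in this diff.

```c
#include <stdint.h>

/* Hypothetical mirror of the ucp_gb* enum; same order, same values. */
enum {
    gbCR, gbLF, gbControl, gbExtend, gbPrepend, gbSpacingMark,
    gbL, gbV, gbT, gbLV, gbLVT, gbRegionalIndicator, gbOther
};

#define GB_BIT(x) (1u << (x))
/* Extend and SpacingMark continue a cluster after any non-control character. */
#define GB_EXT    (GB_BIT(gbExtend) | GB_BIT(gbSpacingMark))

/* no_break_after[left] has bit `right` set when left and right stay in the
   same cluster. Thirteen property values, so every mask fits in 16 bits. */
static const uint16_t no_break_after[] = {
    GB_BIT(gbLF),                                            /* CR */
    0,                                                       /* LF */
    0,                                                       /* Control */
    GB_EXT,                                                  /* Extend */
    0x1FFFu & ~(GB_BIT(gbCR) | GB_BIT(gbLF) | GB_BIT(gbControl)), /* Prepend */
    GB_EXT,                                                  /* SpacingMark */
    GB_EXT | GB_BIT(gbL) | GB_BIT(gbV) | GB_BIT(gbLV) | GB_BIT(gbLVT), /* L */
    GB_EXT | GB_BIT(gbV) | GB_BIT(gbT),                      /* V */
    GB_EXT | GB_BIT(gbT),                                    /* T */
    GB_EXT | GB_BIT(gbV) | GB_BIT(gbT),                      /* LV */
    GB_EXT | GB_BIT(gbT),                                    /* LVT */
    GB_EXT | GB_BIT(gbRegionalIndicator),                    /* RegionalIndicator */
    GB_EXT                                                   /* Other */
};

/* Return 1 when no cluster boundary falls between left and right. */
static int same_cluster(int left, int right)
{
    return (no_break_after[left] >> right) & 1;
}
```

With this table, `same_cluster(gbL, gbLV)` holds while `same_cluster(gbL, gbT)` does not, matching the "L, LV" and "L, T" cases in the test output.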


@ -244,12 +244,19 @@ PHPAPI pcre_cache_entry* pcre_get_compiled_regex_cache(char *regex, int regex_le
int count = 0; int count = 0;
unsigned const char *tables = NULL; unsigned const char *tables = NULL;
#if HAVE_SETLOCALE #if HAVE_SETLOCALE
char *locale = setlocale(LC_CTYPE, NULL); char *locale;
#endif #endif
pcre_cache_entry *pce; pcre_cache_entry *pce;
pcre_cache_entry new_entry; pcre_cache_entry new_entry;
char *tmp = NULL; char *tmp = NULL;
#if HAVE_SETLOCALE
# if defined(PHP_WIN32) && defined(ZTS)
_configthreadlocale(_ENABLE_PER_THREAD_LOCALE);
# endif
locale = setlocale(LC_CTYPE, NULL);
#endif
/* Try to lookup the cached regex entry, and if successful, just pass /* Try to lookup the cached regex entry, and if successful, just pass
back the compiled pattern, otherwise go on and compile it. */ back the compiled pattern, otherwise go on and compile it. */
if (zend_hash_find(&PCRE_G(pcre_cache), regex, regex_len+1, (void **)&pce) == SUCCESS) { if (zend_hash_find(&PCRE_G(pcre_cache), regex, regex_len+1, (void **)&pce) == SUCCESS) {
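
Note on the locale lookup in the hunk above: calling `setlocale(LC_CTYPE, NULL)` only queries the current locale name and does not change it. The sketch below shows that query in isolation. Both helper names are hypothetical, and the comparison with "C" is an assumption about how a caller might decide whether locale-specific character tables are needed; it is not shown in this hunk.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: report the active LC_CTYPE locale without changing it.
   Passing NULL as the second argument makes setlocale() a pure query. */
static const char *current_ctype_locale(void)
{
    return setlocale(LC_CTYPE, NULL);
}

/* Hypothetical helper: nonzero when the active LC_CTYPE locale is not "C",
   i.e. when locale-specific character tables might be wanted (assumption). */
static int needs_locale_tables(void)
{
    const char *name = current_ctype_locale();
    return name != NULL && strcmp(name, "C") != 0;
}
```

A C program starts in the "C" locale, so `needs_locale_tables()` returns 0 until something calls `setlocale()` with another locale name.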


@ -28,6 +28,10 @@
# include "config.h" # include "config.h"
#endif #endif
#ifdef __APPLE__
#define __APPLE_USE_RFC_3542
#endif
#if HAVE_SOCKETS #if HAVE_SOCKETS
#include <php.h> #include <php.h>


@ -9,6 +9,9 @@ if (!defined('IPPROTO_IPV6')) {
die('skip IPv6 not available.'); die('skip IPv6 not available.');
} }
$s = socket_create(AF_INET6, SOCK_DGRAM, SOL_UDP); $s = socket_create(AF_INET6, SOCK_DGRAM, SOL_UDP);
if ($s === false) {
die("skip unable to create socket");
}
$br = socket_bind($s, '::', 3000); $br = socket_bind($s, '::', 3000);
/* On Linux, there is no route ff00::/8 by default on lo, which makes it /* On Linux, there is no route ff00::/8 by default on lo, which makes it
* troublesome to send multicast traffic from lo, which we must since * troublesome to send multicast traffic from lo, which we must since


@ -5,6 +5,11 @@ Test if socket_set_option() returns 'unable to set socket option' failure for in
if (!extension_loaded('sockets')) { if (!extension_loaded('sockets')) {
die('SKIP sockets extension not available.'); die('SKIP sockets extension not available.');
} }
if (PHP_OS == 'Darwin') {
die('skip Not for OSX');
}
$filename = dirname(__FILE__) . '/006_root_check.tmp'; $filename = dirname(__FILE__) . '/006_root_check.tmp';
$fp = fopen($filename, 'w'); $fp = fopen($filename, 'w');
fclose($fp); fclose($fp);


@ -62,8 +62,8 @@ int(16)
int(24) int(24)
-- Iteration 3 -- -- Iteration 3 --
1234000 0 120 1234000 3875820019684212736 120
int(25) int(34)
-- Iteration 4 -- -- Iteration 4 --
#1 0 $0 10 #1 0 $0 10

View File

@ -58,7 +58,7 @@ string(16) "1234567 342391 0"
string(24) "12345678900 u 1234 12345" string(24) "12345678900 u 1234 12345"
-- Iteration 3 -- -- Iteration 3 --
string(25) " 1234000 0 120" string(34) " 1234000 3875820019684212736 120"
-- Iteration 4 -- -- Iteration 4 --
string(10) "#1 0 $0 10" string(10) "#1 0 $0 10"