mirror of
https://github.com/php/php-src.git
synced 2024-12-26 02:10:46 +08:00
Merge branch 'PHP-5.5' of git.php.net:php-src into PHP-5.5
commit
e9a2642c89
15
NEWS
@@ -10,6 +10,9 @@ PHP NEWS
 . Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
 (Gustavo)

+- PCRE:
+. Merged PCRE 8.32. (Anatol)
+
 21 Feb 2013, PHP 5.5.0 Alpha 5

 - Core:
@@ -60,6 +63,18 @@ PHP NEWS
 - Filter:
 . Implemented FR #49180 - added MAC address validation. (Martin)

+- Phar:
+. Fixed timestamp update on Phar contents modification. (Dmitry)
+
+- SPL:
+. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
+. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
+(patch by kriss@krizalys.com, Laruence)
+. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
+(Nikita Popov)
+. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
+(Mike Willbanks)
+
 - SNMP:
 . Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)

344
NEWS-5.5
Normal file
@@ -0,0 +1,344 @@
+PHP NEWS
+|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+?? ??? 201?, PHP 5.5.0 Beta 1
+
+- Core:
+. Fixed bug #49348 (Uninitialized ++$foo->bar; does not cause a notice).
+(Stas)
+
+- Sockets:
+. Fixed bug #64287 (sendmsg/recvmsg shutdown handler causes segfault).
+(Gustavo)
+
+- PCRE:
+. Merged PCRE 8.32. (Anatol)
+
+21 Feb 2013, PHP 5.5.0 Alpha 5
+
+- Core:
+. Implemented FR #64175 (Added HTTP codes as of RFC 6585). (Jonh Wendell)
+. Fixed bug #64135 (Exceptions from set_error_handler are not always
+propagated). (Laruence)
+. Fixed bug #63830 (Segfault on undefined function call in nested generator).
+(Nikita Popov)
+. Fixed bug #60833 (self, parent, static behave inconsistently
+case-sensitive). (Stas, mario at include-once dot org)
+. Implemented FR #60524 (specify temp dir by php.ini). (ALeX Kazik).
+. Fixed bug #64142 (dval to lval different behavior on ppc64). (Remi)
+. Added ARMv7/v8 versions of various Zend arithmetic functions that are
+implemented using inline assembler (Ard Biesheuvel)
+. Fix undefined behavior when converting double variables to integers.
+The double is now always rounded towards zero, the remainder of its division
+by 2^32 or 2^64 (depending on sizeof(long)) is calculated and it's made
+signed assuming a two's complement representation. (Gustavo)
+
+- CLI server:
+. Fixed bug #64128 (built-in web server is broken on ppc64). (Remi)
+
+- cURL:
+. Implemented FR #46439 - added CURLFile for safer file uploads.
+(Stas)
+
+- Intl:
+. Cherry-picked UConverter wrapper, which had accidentally been committed only
+to master.
+
+- mysqli
+. Added mysqli_begin_transaction()/mysqli::begin_transaction(). Implemented
+all options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
+and ROLLBACK through options to mysqli_commit()/mysqli_rollback() and their
+respective OO counterparts. They work in libmysql and mysqlnd mode. (Andrey)
+. Added mysqli_savepoint(), mysqli_release_savepoint(). (Andrey)
+
+- mysqlnd
+. Add new begin_transaction() call to the connection object. Implemented all
+options, per MySQL 5.6, which can be used with START TRANSACTION, COMMIT
+and ROLLBACK. (Andrey)
+. Added mysqlnd_savepoint(), mysqlnd_release_savepoint(). (Andrey)
+
+- Sockets:
+. Added recvmsg() and sendmsg() wrappers. (Gustavo)
+See https://wiki.php.net/rfc/sendrecvmsg
+
+- Filter:
+. Implemented FR #49180 - added MAC address validation. (Martin)
+
+- Phar:
+. Fixed timestamp update on Phar contents modification. (Dmitry)
+
+- SPL:
+. Fixed bug #64264 (SPLFixedArray toArray problem). (Laruence)
+. Fixed bug #64228 (RecursiveDirectoryIterator always assumes SKIP_DOTS).
+(patch by kriss@krizalys.com, Laruence)
+. Fixed bug #64106 (Segfault on SplFixedArray[][x] = y when extended).
+(Nikita Popov)
+. Fixed bug #52861 (unset fails with ArrayObject and deep arrays).
+(Mike Willbanks)
+
+- SNMP:
+. Fixed bug #64124 (IPv6 malformed). (Boris Lytochkin)
+
+24 Jan 2013, PHP 5.5.0 Alpha 4
+
+- Core:
+. Fixed bug #63980 (object members get trimmed by zero bytes). (Laruence)
+. Implemented RFC for Class Name Resolution As Scalar Via "class" Keyword.
+(Ralph Schindler, Nikita Popov, Lars)
+
+- DateTime
+. Added DateTimeImmutable - a variant of DateTime that only returns the
+modified state instead of changing itself. (Derick)
+
+- FPM:
+. Fixed bug #63999 (php with fpm fails to build on Solaris 10 or 11). (Adam)
+
+- pgsql:
+. Bug #46408: Locale number format settings can cause pg_query_params to
+break with numerics. (asmecher, Lars)
+
+- dba:
+. Bug #62489: dba_insert not working as expected.
+(marc-bennewitz at arcor dot de, Lars)
+
+- Reflection:
+. Fixed bug #64007 (There is an ability to create instance of Generator by
+hand). (Laruence)
+
+10 Jan 2013, PHP 5.5.0 Alpha 3
+
+- General improvements:
+. Fixed bug #63874 (Segfault if php_strip_whitespace has heredoc). (Pierrick)
+. Fixed bug #63822 (Crash when using closures with ArrayAccess).
+(Nikita Popov)
+. Add Generator::throw() method. (Nikita Popov)
+. Bug #23955: allow specifying Max-Age attribute in setcookie() (narfbg, Lars)
+. Bug #52126: timestamp for mail.log (Martin Jansen, Lars)
+
+- mysqlnd
+. Fixed return value of mysqli_stmt_affected_rows() in the time after
+prepare() and before execute(). (Andrey)
+
+- cURL:
+. Added new functions curl_escape, curl_multi_setopt, curl_multi_strerror,
+curl_pause, curl_reset, curl_share_close, curl_share_init,
+curl_share_setopt, curl_strerror and curl_unescape. (Pierrick)
+. Added new curl options CURLOPT_TELNETOPTIONS, CURLOPT_GSSAPI_DELEGATION,
+CURLOPT_ACCEPTTIMEOUT_MS, CURLOPT_SSL_OPTIONS, CURLOPT_TCP_KEEPALIVE,
+CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL. (Pierrick)
+
+18 Dec 2012, PHP 5.5.0 Alpha 2
+
+- General improvements:
+. Added systemtap support by enabling systemtap compatible dtrace probes on
+linux. (David Soria Parra)
+. Added support for using empty() on the result of function calls and
+other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
+(Nikita Popov)
+. Optimized access to temporary and compiled VM variables. 8% fewer memory
+reads. (Dmitry)
+. The VM stacks for passing function arguments and syntactically nested calls
+were merged into a single stack. The stack size needed for op_array
+execution is calculated at compile time and preallocated at once. As a result
+all the stack push operations don't require checks for stack overflow
+any more. (Dmitry)
+
+- MySQL
+. This extension is now deprecated, and deprecation warnings will be generated
+when connections are established to databases via mysql_connect(),
+mysql_pconnect(), or through implicit connection: use MySQLi or PDO_MySQL
+instead (https://wiki.php.net/rfc/mysql_deprecation). (Adam)
+
+- Fileinfo:
+. Fixed bug #63590 (Different results in TS and NTS under Windows).
+(Anatoliy)
+
+- Apache2 Handler SAPI:
+. Enabled Apache 2.4 configure option for Windows (Pierre, Anatoliy)
+
+13 Nov 2012, PHP 5.5.0 Alpha 1
+
+- General improvements:
+. Added generators and coroutines (https://wiki.php.net/rfc/generators).
+(Nikita Popov)
+. Added "finally" keyword (https://wiki.php.net/rfc/finally). (Laruence)
+. Add simplified password hashing API
+(https://wiki.php.net/rfc/password_hash). (Anthony Ferrara)
+. Added support for list in foreach (https://wiki.php.net/rfc/foreachlist).
+(Laruence)
+. Added support for using empty() on the result of function calls and
+other expressions (https://wiki.php.net/rfc/empty_isset_exprs).
+(Nikita Popov)
+. Added support for constant array/string dereferencing. (Laruence)
+. Improve set_exception_handler while doing reset. (Laruence)
+. Remove php_logo_guid(), php_egg_logo_guid(), php_real_logo_guid(),
+zend_logo_guid(). (Adnrew Faulds)
+. Drop Windows XP and 2003 support. (Pierre)
+
+- Calendar:
+. Fixed bug #54254 (cal_from_jd returns month = 6 when there is only one Adar)
+(Stas, Eitan Mosenkis)
+
+- Core:
+. Added boolval(). (Jille Timmermans)
+. Added "Z" option to pack/unpack. (Gustavo)
+. Implemented FR #60738 (Allow 'set_error_handler' to handle NULL).
+(Laruence, Nikita Popov)
+. Added optional second argument for assert() to specify custom message. Patch
+by Lonny Kapelushnik (lonny@lonnylot.com). (Lars)
+. Fixed bug #18556 (Engine uses locale rules to handle class names). (Stas)
+. Fixed bug #61681 (Malformed grammar). (Nikita Popov, Etienne, Laruence)
+. Fixed bug #61038 (unpack("a5", "str\0\0") does not work as expected).
+(srgoogleguy, Gustavo)
+. Return previous handler when passing NULL to set_error_handler and
+set_exception_handler. (Nikita Popov)
+
+- cURL:
+. Added support for CURLOPT_FTP_RESPONSE_TIMEOUT, CURLOPT_APPEND,
+CURLOPT_DIRLISTONLY, CURLOPT_NEW_DIRECTORY_PERMS, CURLOPT_NEW_FILE_PERMS,
+CURLOPT_NETRC_FILE, CURLOPT_PREQUOTE, CURLOPT_KRBLEVEL, CURLOPT_MAXFILESIZE,
+CURLOPT_FTP_ACCOUNT, CURLOPT_COOKIELIST, CURLOPT_IGNORE_CONTENT_LENGTH,
+CURLOPT_CONNECT_ONLY, CURLOPT_LOCALPORT, CURLOPT_LOCALPORTRANGE,
+CURLOPT_FTP_ALTERNATIVE_TO_USER, CURLOPT_SSL_SESSIONID_CACHE,
+CURLOPT_FTP_SSL_CCC, CURLOPT_HTTP_CONTENT_DECODING,
+CURLOPT_HTTP_TRANSFER_DECODING, CURLOPT_PROXY_TRANSFER_MODE,
+CURLOPT_ADDRESS_SCOPE, CURLOPT_CRLFILE, CURLOPT_ISSUERCERT,
+CURLOPT_USERNAME, CURLOPT_PASSWORD, CURLOPT_PROXYUSERNAME,
+CURLOPT_PROXYPASSWORD, CURLOPT_NOPROXY, CURLOPT_SOCKS5_GSSAPI_NEC,
+CURLOPT_SOCKS5_GSSAPI_SERVICE, CURLOPT_TFTP_BLKSIZE,
+CURLOPT_SSH_KNOWNHOSTS, CURLOPT_FTP_USE_PRET, CURLOPT_MAIL_FROM,
+CURLOPT_MAIL_RCPT, CURLOPT_RTSP_CLIENT_CSEQ, CURLOPT_RTSP_SERVER_CSEQ,
+CURLOPT_RTSP_SESSION_ID, CURLOPT_RTSP_STREAM_URI, CURLOPT_RTSP_TRANSPORT,
+CURLOPT_RTSP_REQUEST, CURLOPT_RESOLVE, CURLOPT_ACCEPT_ENCODING,
+CURLOPT_TRANSFER_ENCODING, CURLOPT_DNS_SERVERS and CURLOPT_USE_SSL.
+(Pierrick)
+. Fixed bug #55635 (CURLOPT_BINARYTRANSFER no longer used. The constant
+still exists for backward compatibility but is doing nothing). (Pierrick)
+. Fixed bug #54995 (Missing CURLINFO_RESPONSE_CODE support). (Pierrick)
+
+- Datetime
+. Fixed bug #61642 (modify("+5 weekdays") returns Sunday).
+(Dmitri Iouchtchenko)
+
+- Hash
+. Added support for PBKDF2 via hash_pbkdf2(). (Anthony Ferrara)
+
+- Intl
+. The intl extension now requires ICU 4.0+.
+. Added intl.use_exceptions INI directive, which controls what happens when
+global errors are set together with intl.error_level. (Gustavo)
+. MessageFormatter::format() and related functions now accept named
+arguments and mixed numeric/named arguments in ICU 4.8+. (Gustavo)
+. MessageFormatter::format() and related functions now don't error out when
+an insufficient argument count is provided. Instead, the placeholders will
+remain unsubstituted. (Gustavo)
+. MessageFormatter::parse() and MessageFormatter::format() (and their static
+equivalents) don't throw away better than second precision in the arguments.
+(Gustavo)
+. IntlDateFormatter::__construct and datefmt_create() now accept for the
+$timezone argument time zone identifiers, IntlTimeZone objects, DateTimeZone
+objects and NULL. (Gustavo)
+. IntlDateFormatter::__construct and datefmt_create() no longer accept invalid
+timezone identifiers or empty strings. (Gustavo)
+. The default time zone used in IntlDateFormatter::__construct and
+datefmt_create() (when the corresponding argument is not passed or NULL is
+passed) is now the one given by date_default_timezone_get(), not the
+default ICU time zone. (Gustavo)
+. The time zone passed to the IntlDateFormatter is ignored if it is NULL and
+if the calendar passed is an IntlCalendar object -- in this case, the
+IntlCalendar's time zone will be used instead. Otherwise, the time zone
+specified in the $timezone argument is used instead. This does not affect
+old code, as IntlCalendar was introduced in this version. (Gustavo)
+. IntlDateFormatter::__construct and datefmt_create() now accept for the
+$calendar argument also IntlCalendar objects. (Gustavo)
+. IntlDateFormatter::getCalendar() and datefmt_get_calendar() return false
+if the IntlDateFormatter was set up with an IntlCalendar instead of the
+constants IntlDateFormatter::GREGORIAN/TRADITIONAL. IntlCalendar did not
+exist before this version. (Gustavo)
+. IntlDateFormatter::setCalendar() and datefmt_set_calendar() now also accept
+an IntlCalendar object, in which case its time zone is taken. Passing a
+constant is still allowed, and still keeps the time zone. (Gustavo)
+. IntlDateFormatter::setTimeZoneID() and datefmt_set_timezone_id() are
+deprecated. Use IntlDateFormatter::setTimeZone() or datefmt_set_timezone()
+instead. (Gustavo)
+. IntlDateFormatter::format() and datefmt_format() now also accept an
+IntlCalendar object for formatting. (Gustavo)
+. Added the classes: IntlCalendar, IntlGregorianCalendar, IntlTimeZone,
+IntlBreakIterator, IntlRuleBasedBreakIterator and
+IntlCodePointBreakIterator. (Gustavo)
+. Added the functions: intlcal_get_keyword_values_for_locale(),
+intlcal_get_now(), intlcal_get_available_locales(), intlcal_get(),
+intlcal_get_time(), intlcal_set_time(), intlcal_add(),
+intlcal_set_time_zone(), intlcal_after(), intlcal_before(), intlcal_set(),
+intlcal_roll(), intlcal_clear(), intlcal_field_difference(),
+intlcal_get_actual_maximum(), intlcal_get_actual_minimum(),
+intlcal_get_day_of_week_type(), intlcal_get_first_day_of_week(),
+intlcal_get_greatest_minimum(), intlcal_get_least_maximum(),
+intlcal_get_locale(), intlcal_get_maximum(),
+intlcal_get_minimal_days_in_first_week(), intlcal_get_minimum(),
+intlcal_get_time_zone(), intlcal_get_type(),
+intlcal_get_weekend_transition(), intlcal_in_daylight_time(),
+intlcal_is_equivalent_to(), intlcal_is_lenient(), intlcal_is_set(),
+intlcal_is_weekend(), intlcal_set_first_day_of_week(),
+intlcal_set_lenient(), intlcal_equals(),
+intlcal_get_repeated_wall_time_option(),
+intlcal_get_skipped_wall_time_option(),
+intlcal_set_repeated_wall_time_option(),
+intlcal_set_skipped_wall_time_option(), intlcal_from_date_time(),
+intlcal_to_date_time(), intlcal_get_error_code(),
+intlcal_get_error_message(), intlgregcal_create_instance(),
+intlgregcal_set_gregorian_change(), intlgregcal_get_gregorian_change() and
+intlgregcal_is_leap_year(). (Gustavo)
+. Added the functions: intltz_create_time_zone(), intltz_create_default(),
+intltz_get_id(), intltz_get_gmt(), intltz_get_unknown(),
+intltz_create_enumeration(), intltz_count_equivalent_ids(),
+intltz_create_time_zone_id_enumeration(), intltz_get_canonical_id(),
+intltz_get_region(), intltz_get_tz_data_version(),
+intltz_get_equivalent_id(), intltz_use_daylight_time(), intltz_get_offset(),
+intltz_get_raw_offset(), intltz_has_same_rules(), intltz_get_display_name(),
+intltz_get_dst_savings(), intltz_from_date_time_zone(),
+intltz_to_date_time_zone(), intltz_get_error_code(),
+intltz_get_error_message(). (Gustavo)
+. Added the methods: IntlDateFormatter::formatObject(),
+IntlDateFormatter::getCalendarObject(), IntlDateFormatter::getTimeZone(),
+IntlDateFormatter::setTimeZone(). (Gustavo)
+. Added the functions: datefmt_format_object(), datefmt_get_calendar_object(),
+datefmt_get_timezone(), datefmt_set_timezone(),
+datefmt_get_calendar_object(), intlcal_create_instance(). (Gustavo)
+
+- MCrypt
+. mcrypt_ecb(), mcrypt_cbc(), mcrypt_cfb() and mcrypt_ofb() now throw
+E_DEPRECATED. (GoogleGuy)
+
+- MySQLi
+. Dropped support for LOAD DATA LOCAL INFILE handlers when using libmysql.
+Known for stability problems. (Andrey)
+. Added support for SHA256 authentication available with MySQL 5.6.6+.
+(Andrey)
+
+- PCRE:
+. Deprecated the /e modifier
+(https://wiki.php.net/rfc/remove_preg_replace_eval_modifier). (Nikita Popov)
+. Fixed bug #63284 (Upgrade PCRE to 8.31). (Anatoliy)
+
+- pgsql
+. Added pg_escape_literal() and pg_escape_identifier() (Yasuo)
+
+- SPL
+. Fix bug #60560 (SplFixedArray un-/serialize, getSize(), count() return 0,
+keys are strings). (Adam)
+
+- Tokenizer:
+. Fixed bug #60097 (token_get_all fails to lex nested heredoc). (Nikita Popov)
+
+- Zip:
+. Upgraded libzip to 0.10.1 (Anatoliy)
+
+- Fileinfo:
+. Fixed bug #63248 (Load multiple magic files from a directory under Windows).
+(Anatoliy)
+
+- General improvements:
+. Implemented FR #46487 (Dereferencing process-handles no longer waits on
+those processes). (Jille Timmermans)
+
+<<< NOTE: Insert NEWS from last stable release here prior to actual release! >>>
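Reviewer note on the Alpha 5 Core entry in NEWS-5.5 about double-to-integer conversion: the entry describes three steps (round towards zero, take the remainder modulo 2^32 or 2^64, reinterpret the result as a signed two's complement value). The Python sketch below models only that arithmetic and is not the Zend Engine code. The name `wrap_double_to_long`, the `bits` parameter (standing in for 8 * sizeof(long)), and the choice to return 0 for NaN and infinities are assumptions made for illustration.

```python
import math


def wrap_double_to_long(value: float, bits: int = 64) -> int:
    """Model the conversion described in the NEWS entry (hypothetical helper)."""
    if math.isnan(value) or math.isinf(value):
        # Assumption: the NEWS entry does not say what non-finite input yields.
        return 0
    truncated = int(value)            # int() rounds towards zero
    modulus = 1 << bits               # 2^32 or 2^64
    remainder = truncated % modulus   # Python's % lands in [0, modulus)
    if remainder >= modulus // 2:
        remainder -= modulus          # top bit set: negative in two's complement
    return remainder
```

With these assumptions, `wrap_double_to_long(3.9)` gives 3, `wrap_double_to_long(-3.9)` gives -3, and `wrap_double_to_long(2.0**63)` wraps to -2**63 instead of depending on platform behaviour.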
File diff suppressed because it is too large
@@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
 $string = '1 2 3 4 5';
 var_dump(split($pattern, $string, 0));
 var_dump(split($pattern, $string, -10));
-var_dump(split($pattern, $string, 10E20));


 echo "Done";
@@ -35,9 +34,4 @@ array(1) {
 [0]=>
 string(9) "1 2 3 4 5"
 }
-Error: 8192 - Function split() is deprecated, %s(18)
-array(1) {
-[0]=>
-string(9) "1 2 3 4 5"
-}
 Done
@@ -18,7 +18,6 @@ $pattern = '[[:space:]]';
 $string = '1 2 3 4 5';
 var_dump(spliti($pattern, $string, 0));
 var_dump(spliti($pattern, $string, -10));
-var_dump(spliti($pattern, $string, 10E20));


 echo "Done";
@@ -35,9 +34,4 @@ array(1) {
 [0]=>
 string(9) "1 2 3 4 5"
 }
-Error: 8192 - Function spliti() is deprecated, %s(18)
-array(1) {
-[0]=>
-string(9) "1 2 3 4 5"
-}
 Done
@@ -10,3 +10,4 @@ AC_DEFINE('HAVE_BUNDLED_PCRE', 1, 'Using bundled PCRE library');
 AC_DEFINE('HAVE_PCRE', 1, 'Have PCRE library');
 PHP_PCRE="yes";
 PHP_INSTALL_HEADERS("ext/pcre", "php_pcre.h pcrelib/");
+ADD_FLAG("CFLAGS_PCRE", " /D HAVE_CONFIG_H");
@@ -59,7 +59,8 @@ PHP_ARG_WITH(pcre-regex,,
 pcrelib/pcre_ord2utf8.c pcrelib/pcre_refcount.c pcrelib/pcre_study.c \
 pcrelib/pcre_tables.c pcrelib/pcre_valid_utf8.c \
 pcrelib/pcre_version.c pcrelib/pcre_xclass.c"
-PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,-I@ext_srcdir@/pcrelib)
+PHP_PCRE_CFLAGS="-DHAVE_CONFIG_H -I@ext_srcdir@/pcrelib"
+PHP_NEW_EXTENSION(pcre, $pcrelib_sources php_pcre.c, no,,$PHP_PCRE_CFLAGS)
 PHP_ADD_BUILD_DIR($ext_builddir/pcrelib)
 PHP_INSTALL_HEADERS([ext/pcre], [php_pcre.h pcrelib/])
 AC_DEFINE(HAVE_BUNDLED_PCRE, 1, [ ])
@@ -1,6 +1,170 @@
 ChangeLog for PCRE
 ------------------

+Version 8.32 30-November-2012
+-----------------------------
+
+1. Improved JIT compiler optimizations for first character search and single
+character iterators.
+
+2. Supporting IBM XL C compilers for PPC architectures in the JIT compiler.
+Patch by Daniel Richard G.
+
+3. Single character iterator optimizations in the JIT compiler.
+
+4. Improved JIT compiler optimizations for character ranges.
+
+5. Rename the "leave" variable names to "quit" to improve WinCE compatibility.
+Reported by Giuseppe D'Angelo.
+
+6. The PCRE_STARTLINE bit, indicating that a match can occur only at the start
+of a line, was being set incorrectly in cases where .* appeared inside
+atomic brackets at the start of a pattern, or where there was a subsequent
+*PRUNE or *SKIP.
+
+7. Improved instruction cache flush for POWER/PowerPC.
+Patch by Daniel Richard G.
+
+8. Fixed a number of issues in pcregrep, making it more compatible with GNU
+grep:
+
+(a) There is now no limit to the number of patterns to be matched.
+
+(b) An error is given if a pattern is too long.
+
+(c) Multiple uses of --exclude, --exclude-dir, --include, and --include-dir
+are now supported.
+
+(d) --exclude-from and --include-from (multiple use) have been added.
+
+(e) Exclusions and inclusions now apply to all files and directories, not
+just to those obtained from scanning a directory recursively.
+
+(f) Multiple uses of -f and --file-list are now supported.
+
+(g) In a Windows environment, the default for -d has been changed from
+"read" (the GNU grep default) to "skip", because otherwise the presence
+of a directory in the file list provokes an error.
+
+(h) The documentation has been revised and clarified in places.
+
+9. Improve the matching speed of capturing brackets.
+
+10. Changed the meaning of \X so that it now matches a Unicode extended
+grapheme cluster.
+
+11. Patch by Daniel Richard G to the autoconf files to add a macro for sorting
+out POSIX threads when JIT support is configured.
+
+12. Added support for PCRE_STUDY_EXTRA_NEEDED.
+
+13. In the POSIX wrapper regcomp() function, setting re_nsub field in the preg
+structure could go wrong in environments where size_t is not the same size
+as int.
+
+14. Applied user-supplied patch to pcrecpp.cc to allow PCRE_NO_UTF8_CHECK to be
+set.
+
+15. The EBCDIC support had decayed; later updates to the code had included
+explicit references to (e.g.) \x0a instead of CHAR_LF. There has been a
+general tidy up of EBCDIC-related issues, and the documentation was also
+not quite right. There is now a test that can be run on ASCII systems to
+check some of the EBCDIC-related things (but it is not a full test).
+
+16. The new PCRE_STUDY_EXTRA_NEEDED option is now used by pcregrep, resulting
+in a small tidy to the code.
+
+17. Fix JIT tests when UTF is disabled and both 8 and 16 bit mode are enabled.
+
+18. If the --only-matching (-o) option in pcregrep is specified multiple
+times, each one causes appropriate output. For example, -o1 -o2 outputs the
+substrings matched by the 1st and 2nd capturing parentheses. A separating
+string can be specified by --om-separator (default empty).
+
+19. Improving the first n character searches.
+
+20. Turn case lists for horizontal and vertical white space into macros so that
+they are defined only once.
+
+21. This set of changes together give more compatible Unicode case-folding
+behaviour for characters that have more than one other case when UCP
+support is available.
+
+(a) The Unicode property table now has offsets into a new table of sets of
+three or more characters that are case-equivalent. The MultiStage2.py
+script that generates these tables (the pcre_ucd.c file) now scans
+CaseFolding.txt instead of UnicodeData.txt for character case
+information.
+
+(b) The code for adding characters or ranges of characters to a character
+class has been abstracted into a generalized function that also handles
+case-independence. In UTF-mode with UCP support, this uses the new data
+to handle characters with more than one other case.
+
+(c) A bug that is fixed as a result of (b) is that codepoints less than 256
+whose other case is greater than 256 are now correctly matched
+caselessly. Previously, the high codepoint matched the low one, but not
+vice versa.
+
+(d) The processing of \h, \H, \v, and \V in character classes now makes use
+of the new class addition function, using character lists defined as
+macros alongside the case definitions of 20 above.
+
+(e) Caseless back references now work with characters that have more than
+one other case.
+
+(f) General caseless matching of characters with more than one other case
+is supported.
+
+22. Unicode character properties were updated from Unicode 6.2.0
+
+23. Improved CMake support under Windows. Patch by Daniel Richard G.
+
+24. Add support for 32-bit character strings, and UTF-32
+
+25. Major JIT compiler update (code refactoring and bugfixing).
+Experimental Sparc 32 support is added.
+
+26. Applied a modified version of Daniel Richard G's patch to create
+pcre.h.generic and config.h.generic by "make" instead of in the
+PrepareRelease script.
+
+27. Added a definition for CHAR_NULL (helpful for the z/OS port), and use it in
+pcre_compile.c when checking for a zero character.
+
+28. Introducing a native interface for JIT. Through this interface, the compiled
+machine code can be directly executed. The purpose of this interface is to
+provide fast pattern matching, so several sanity checks are not performed.
+However, feature tests are still performed. The new interface provides
+1.4x speedup compared to the old one.
+
+29. If pcre_exec() or pcre_dfa_exec() was called with a negative value for
+the subject string length, the error given was PCRE_ERROR_BADOFFSET, which
+was confusing. There is now a new error PCRE_ERROR_BADLENGTH for this case.
+
+30. In 8-bit UTF-8 mode, pcretest failed to give an error for data codepoints
+greater than 0x7fffffff (which cannot be represented in UTF-8, even under
+the "old" RFC 2279). Instead, it ended up passing a negative length to
+pcre_exec().
+
+31. Add support for GCC's visibility feature to hide internal functions.
+
+32. Running "pcretest -C pcre8" or "pcretest -C pcre16" gave a spurious error
+"unknown -C option" after outputting 0 or 1.
+
+33. There is now support for generating a code coverage report for the test
+suite in environments where gcc is the compiler and lcov is installed. This
+is mainly for the benefit of the developers.
+
+34. If PCRE is built with --enable-valgrind, certain memory regions are marked
+unaddressable using valgrind annotations, allowing valgrind to detect
+invalid memory accesses. This is mainly for the benefit of the developers.
+
+35. (*UTF) can now be used to start a pattern in any of the three libraries.
+
+36. Give configure error if --enable-cpp but no C++ compiler found.
+
+
 Version 8.31 06-July-2012
 -------------------------

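Reviewer note on ChangeLog item 21: the "sets of three or more characters that are case-equivalent" can be approximated outside PCRE. The Python sketch below groups code points by their case-folded form and keeps groups with at least three members. It is only an illustration: it relies on Python's `str.casefold()` (full Unicode case folding from the interpreter's own Unicode database) rather than PCRE's generated tables, and the function name `case_equivalence_sets` is hypothetical.

```python
from collections import defaultdict


def case_equivalence_sets(code_points):
    """Group characters that share one single-character case folding."""
    groups = defaultdict(list)
    for cp in code_points:
        ch = chr(cp)
        folded = ch.casefold()
        if len(folded) == 1:  # skip multi-character foldings such as "ß" -> "ss"
            groups[folded].append(ch)
    # Keep only the sets with three or more case-equivalent characters.
    return [members for members in groups.values() if len(members) >= 3]
```

Calling `case_equivalence_sets(range(0x3000))` yields, among others, K / k / U+212A (KELVIN SIGN) and S / s / U+017F (LONG S), which are the kind of "codepoints less than 256 whose other case is greater than 256" that item 21(c) refers to.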
@ -49,16 +49,17 @@ complexity in Perl regular expressions, I couldn't do this. In any case, a
|
||||
first pass through the pattern is helpful for other reasons.
|
||||
|
||||
|
||||
Support for 16-bit data strings
|
||||
-------------------------------
|
||||
Support for 16-bit and 32-bit data strings
|
||||
-------------------------------------------
|
||||
|
||||
From release 8.30, PCRE supports 16-bit as well as 8-bit data strings, by being
|
||||
compilable in either 8-bit or 16-bit modes, or both. Thus, two different
|
||||
libraries can be created. In the description that follows, the word "short" is
|
||||
From release 8.30, PCRE supports 16-bit as well as 8-bit data strings; and from
|
||||
release 8.32, PCRE supports 32-bit data strings. The library can be compiled
|
||||
in any combination of 8-bit, 16-bit or 32-bit modes, creating different
|
||||
libraries. In the description that follows, the word "short" is
|
||||
used for a 16-bit data quantity, and the word "unit" is used for a quantity
|
||||
that is a byte in 8-bit mode and a short in 16-bit mode. However, so as not to
|
||||
over-complicate the text, the names of PCRE functions are given in 8-bit form
|
||||
only.
|
||||
that is a byte in 8-bit mode, a short in 16-bit mode and a 32-bit unsigned
|
||||
integer in 32-bit mode. However, so as not to over-complicate the text, the
|
||||
names of PCRE functions are given in 8-bit form only.
|
||||
|
||||
|
||||
Computing the memory requirement: how it was
|
||||
@ -138,9 +139,10 @@ Format of compiled patterns
|
||||
---------------------------
|
||||
|
||||
The compiled form of a pattern is a vector of units (bytes in 8-bit mode, or
|
||||
shorts in 16-bit mode), containing items of variable length. The first unit in
|
||||
an item contains an opcode, and the length of the item is either implicit in
|
||||
the opcode or contained in the data that follows it.
|
||||
shorts in 16-bit mode, 32-bit unsigned integers in 32-bit mode), containing
|
||||
items of variable length. The first unit in an item contains an opcode, and
|
||||
the length of the item is either implicit in the opcode or contained in the
|
||||
data that follows it.
|
||||
|
||||
In many cases listed below, LINK_SIZE data values are specified for offsets
|
||||
within the compiled pattern. LINK_SIZE always specifies a number of bytes. The
|
||||
@ -207,7 +209,8 @@ Matching literal characters
|
||||
|
||||
The OP_CHAR opcode is followed by a single character that is to be matched
|
||||
casefully. For caseless matching, OP_CHARI is used. In UTF-8 or UTF-16 modes,
|
||||
the character may be more than one unit long.
|
||||
the character may be more than one unit long. In UTF-32 mode, characters
|
||||
are always exactly one unit long.
|
||||
|
||||
|
||||
Repeating single characters
|
||||
@ -228,7 +231,8 @@ following opcodes, which come in caseful and caseless versions:
|
||||
OP_POSQUERY OP_POSQUERYI
|
||||
|
||||
Each opcode is followed by the character that is to be repeated. In ASCII mode,
|
||||
these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable.
|
||||
these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable; in
|
||||
UTF-32 mode these are one-unit items.
|
||||
Those with "MIN" in their names are the minimizing versions. Those with "POS"
|
||||
in their names are possessive versions. Other repeats make use of these
|
||||
opcodes:
|
||||
@ -299,7 +303,7 @@ bit map containing a 1 bit for every character that is acceptable. The bits are
|
||||
counted from the least significant end of each unit. In caseless mode, bits for
|
||||
both cases are set.
|
||||
|
||||
The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16 mode,
|
||||
The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16/32 mode,
|
||||
subject characters with values greater than 255 can be handled correctly. For
|
||||
OP_CLASS they do not match, whereas for OP_NCLASS they do.
|
||||
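The bit-map lookup and the OP_CLASS/OP_NCLASS distinction can be sketched as follows. This is a simplified illustration under stated assumptions, not PCRE's implementation: the struct class_sketch and the three functions are hypothetical names, the map is assumed to be 32 bytes covering character values 0 to 255, and the is_nclass flag stands in for the choice of opcode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a compiled character class. */
typedef struct {
    uint8_t map[32];  /* 256 bits: a 1 bit for every acceptable character */
    int is_nclass;    /* 0 models OP_CLASS, 1 models OP_NCLASS */
} class_sketch;

void class_init(class_sketch *cls, int is_nclass)
{
    memset(cls->map, 0, sizeof cls->map);
    cls->is_nclass = is_nclass;
}

/* Mark character c (0..255) as acceptable. Bits are counted from the
   least significant end of each unit. */
void class_add(class_sketch *cls, unsigned c)
{
    assert(c < 256);
    cls->map[c / 8] |= (uint8_t)(1u << (c % 8));
}

/* Values above 255 have no bit in the map: they never match in the
   OP_CLASS case and always match in the OP_NCLASS case. */
int class_matches(const class_sketch *cls, uint32_t c)
{
    if (c > 255) return cls->is_nclass;
    return (cls->map[c / 8] >> (c % 8)) & 1;
}
```

In this model a class such as [^a] would have every bit except the one for 'a' set in the map, with is_nclass set so that characters above 255 are also accepted.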
|
||||
@ -412,7 +416,8 @@ OP_ASSERTBACK and OP_ASSERTBACK_NOT, and the first opcode inside the assertion
|
||||
is OP_REVERSE, followed by a two byte (one short) count of the number of
|
||||
characters to move back the pointer in the subject string. In ASCII mode, the
|
||||
count is a number of units, but in UTF-8/16 mode each character may occupy more
|
||||
than one unit. A separate count is present in each alternative of a lookbehind
|
||||
than one unit; in UTF-32 mode each character occupies exactly one unit.
|
||||
A separate count is present in each alternative of a lookbehind
|
||||
assertion, allowing them to have different fixed lengths.
|
||||
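To see why the OP_REVERSE count is a number of characters rather than units, consider moving a subject pointer back in UTF-8 mode. The sketch below is an illustration only, assuming a valid UTF-8 subject; the function name utf8_back is hypothetical and is not how PCRE names or structures this step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Move p back by nchars characters within a valid UTF-8 buffer that
   begins at start. Returns the new position, or NULL if fewer than
   nchars characters precede p. Hypothetical helper for illustration. */
const uint8_t *utf8_back(const uint8_t *start, const uint8_t *p, size_t nchars)
{
    assert(start != NULL && p != NULL && p >= start);

    while (nchars-- > 0) {
        if (p == start) return NULL;   /* not enough characters before p */
        p--;
        /* Skip continuation bytes (10xxxxxx) to reach the lead byte. */
        while (p > start && (*p & 0xC0) == 0x80) p--;
    }
    return p;
}
```

In the 32-bit library the same step would reduce to subtracting the count from the pointer, because each character is exactly one unit.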
|
||||
|
||||
|
@ -1,6 +1,46 @@
|
||||
News about PCRE releases
|
||||
------------------------
|
||||
|
||||
Release 8.32 30-November-2012
|
||||
-----------------------------
|
||||
|
||||
This release fixes a number of bugs, but also has some new features. These are
|
||||
the highlights:
|
||||
|
||||
. There is now support for 32-bit character strings and UTF-32. Like the
|
||||
16-bit support, this is done by compiling a separate 32-bit library.
|
||||
|
||||
. \X now matches a Unicode extended grapheme cluster.
|
||||
|
||||
. Case-independent matching of Unicode characters that have more than one
|
||||
"other case" now makes all three (or more) characters equivalent. This
|
||||
applies, for example, to Greek Sigma, which has two lowercase versions.
|
||||
|
||||
. Unicode character properties are updated to Unicode 6.2.0.
|
||||
|
||||
. The EBCDIC support, which had decayed, has had a spring clean.
|
||||
|
||||
. A number of JIT optimizations have been added, which give faster JIT
|
||||
execution speed. In addition, a new direct interface to JIT execution is
|
||||
available. This bypasses some of the sanity checks of pcre_exec() to give a
|
||||
noticeable speed-up.
|
||||
|
||||
. A number of issues in pcregrep have been fixed, making it more compatible
|
||||
with GNU grep. In particular, --exclude and --include (and variants) apply
|
||||
to all files now, not just those obtained from scanning a directory
|
||||
recursively. In Windows environments, the default action for directories is
|
||||
now "skip" instead of "read" (which provokes an error).
|
||||
|
||||
. If the --only-matching (-o) option in pcregrep is specified multiple
|
||||
times, each one causes appropriate output. For example, -o1 -o2 outputs the
|
||||
substrings matched by the 1st and 2nd capturing parentheses. A separating
|
||||
string can be specified by --om-separator (default empty).
|
||||
|
||||
. When PCRE is built via Autotools using a version of gcc that has the
|
||||
"visibility" feature, it is used to hide internal library functions that are
|
||||
not part of the public API.
|
||||
|
||||
|
||||
Release 8.31 06-July-2012
|
||||
-------------------------
|
||||
|
||||
@ -9,7 +49,7 @@ This is mainly a bug-fixing release, with a small number of developments:
|
||||
. The JIT compiler now supports partial matching and the (*MARK) and
|
||||
(*COMMIT) verbs.
|
||||
|
||||
. PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehing in a
|
||||
. PCRE_INFO_MAXLOOKBEHIND can be used to find the longest lookbehind in a
|
||||
pattern.
|
||||
|
||||
. There should be a performance improvement when using the heap instead of the
|
||||
|
@ -35,9 +35,10 @@ The contents of this README file are:
|
||||
The PCRE APIs
|
||||
-------------
|
||||
|
||||
PCRE is written in C, and it has its own API. There are two sets of functions,
|
||||
one for the 8-bit library, which processes strings of bytes, and one for the
|
||||
16-bit library, which processes strings of 16-bit values. The distribution also
|
||||
PCRE is written in C, and it has its own API. There are three sets of functions,
|
||||
one for the 8-bit library, which processes strings of bytes, one for the
|
||||
16-bit library, which processes strings of 16-bit values, and one for the 32-bit
|
||||
library, which processes strings of 32-bit values. The distribution also
|
||||
includes a set of C++ wrapper functions (see the pcrecpp man page for details),
|
||||
courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
|
||||
C++.
|
||||
@ -183,8 +184,10 @@ library. They are also documented in the pcrebuild man page.
|
||||
(See also "Shared libraries on Unix-like systems" below.)
|
||||
|
||||
. By default, only the 8-bit library is built. If you add --enable-pcre16 to
|
||||
the "configure" command, the 16-bit library is also built. If you want only
|
||||
the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8".
|
||||
the "configure" command, the 16-bit library is also built. If you add
|
||||
--enable-pcre32 to the "configure" command, the 32-bit library is also built.
|
||||
If you want only the 16-bit or 32-bit library, use --disable-pcre8 to disable
|
||||
building the 8-bit library.
|
||||
|
||||
. If you are building the 8-bit library and want to suppress the building of
|
||||
the C++ wrapper library, you can add --disable-cpp to the "configure"
|
||||
@ -203,23 +206,24 @@ library. They are also documented in the pcrebuild man page.
|
||||
|
||||
. If you want to make use of the support for UTF-8 Unicode character strings in
|
||||
the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
|
||||
you must add --enable-utf to the "configure" command. Without it, the code
|
||||
for handling UTF-8 and UTF-16 is not included in the relevant library. Even
|
||||
or UTF-32 Unicode character strings in the 32-bit library, you must add
|
||||
--enable-utf to the "configure" command. Without it, the code for handling
|
||||
UTF-8, UTF-16 and UTF-32 is not included in the relevant library. Even
|
||||
when --enable-utf is included, the use of a UTF encoding still has to be
|
||||
enabled by an option at run time. When PCRE is compiled with this option, its
|
||||
input can only either be ASCII or UTF-8/16, even when running on EBCDIC
|
||||
input can only either be ASCII or UTF-8/16/32, even when running on EBCDIC
|
||||
platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
|
||||
the same time.
|
||||
|
||||
. There are no separate options for enabling UTF-8 and UTF-16 independently
|
||||
because that would allow ridiculous settings such as requesting UTF-16
|
||||
support while building only the 8-bit library. However, the option
|
||||
. There are no separate options for enabling UTF-8, UTF-16 and UTF-32
|
||||
independently because that would allow ridiculous settings such as requesting
|
||||
UTF-16 support while building only the 8-bit library. However, the option
|
||||
--enable-utf8 is retained for backwards compatibility with earlier releases
|
||||
that did not support 16-bit character strings. It is synonymous with
|
||||
that did not support 16-bit or 32-bit character strings. It is synonymous with
|
||||
--enable-utf. It is not possible to configure one library with UTF support
|
||||
and the other without in the same configuration.
|
||||
|
||||
. If, in addition to support for UTF-8/16 character strings, you want to
|
||||
. If, in addition to support for UTF-8/16/32 character strings, you want to
|
||||
include support for the \P, \p, and \X sequences that recognize Unicode
|
||||
character properties, you must add --enable-unicode-properties to the
|
||||
"configure" command. This adds about 30K to the size of the library (in the
|
||||
@ -281,7 +285,8 @@ library. They are also documented in the pcrebuild man page.
|
||||
library, PCRE then uses three bytes instead of two for offsets to different
|
||||
parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
|
||||
the same as --with-link-size=4, which (in both libraries) uses four-byte
|
||||
offsets. Increasing the internal link size reduces performance.
|
||||
offsets. Increasing the internal link size reduces performance. In the 32-bit
|
||||
library, the only supported link size is 4.
|
||||
|
||||
. You can build PCRE so that its internal match() function that is called from
|
||||
pcre_exec() does not call itself recursively. Instead, it uses memory blocks
|
||||
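As an illustration of the --with-link-size setting discussed in this hunk, the sketch below stores and reads an offset in 2, 3 or 4 bytes. It is modelled on the 8-bit library only and makes two assumptions that are not stated in the text above: that offsets are stored most significant byte first, and that the helper names put_link and get_link are acceptable stand-ins (they are hypothetical, not PCRE identifiers).

```c
#include <assert.h>
#include <stdint.h>

/* Store value in link_size bytes at code, most significant byte first.
   Hypothetical helper; the byte order is an assumption of this sketch. */
void put_link(uint8_t *code, int link_size, uint32_t value)
{
    int i;

    assert(link_size >= 2 && link_size <= 4);
    /* The value must fit in the chosen number of bytes. */
    assert(link_size == 4 || value < (1u << (8 * link_size)));

    for (i = link_size - 1; i >= 0; i--) {
        code[i] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
}

/* Read back an offset written by put_link(). */
uint32_t get_link(const uint8_t *code, int link_size)
{
    uint32_t value = 0;
    int i;

    assert(link_size >= 2 && link_size <= 4);

    for (i = 0; i < link_size; i++)
        value = (value << 8) | code[i];
    return value;
}
```

A larger link size lets an offset address a larger compiled pattern, at the cost of more bytes to read and write per offset, which is consistent with the performance note above.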
@ -310,13 +315,34 @@ library. They are also documented in the pcrebuild man page.
|
||||
pcre_chartables.c.dist. See "Character tables" below for further information.
|
||||
|
||||
. It is possible to compile PCRE for use on systems that use EBCDIC as their
|
||||
character code (as opposed to ASCII) by specifying
|
||||
character code (as opposed to ASCII/Unicode) by specifying
|
||||
|
||||
--enable-ebcdic
|
||||
|
||||
This automatically implies --enable-rebuild-chartables (see above). However,
|
||||
when PCRE is built this way, it always operates in EBCDIC. It cannot support
|
||||
both EBCDIC and UTF-8/16.
|
||||
both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
|
||||
which specifies that the code value for the EBCDIC NL character is 0x25
|
||||
instead of the default 0x15.
|
||||
|
||||
. In environments where valgrind is installed, if you specify
|
||||
|
||||
--enable-valgrind
|
||||
|
||||
PCRE will use valgrind annotations to mark certain memory regions as
|
||||
unaddressable. This allows it to detect invalid memory accesses, and is
|
||||
mostly useful for debugging PCRE itself.
|
||||
|
||||
. In environments where the gcc compiler is used and lcov version 1.6 or above
|
||||
is installed, if you specify
|
||||
|
||||
--enable-coverage
|
||||
|
||||
the build process implements a code coverage report for the test suite. The
|
||||
report is generated by running "make coverage". If ccache is installed on
|
||||
your system, it must be disabled when building PCRE for coverage reporting.
|
||||
You can do this by setting the environment variable CCACHE_DISABLE=1 before
|
||||
running "make" to build PCRE.
|
||||
|
||||
. The pcregrep program currently supports only 8-bit data files, and so
|
||||
requires the 8-bit PCRE library. It is possible to compile pcregrep to use
|
||||
@ -366,6 +392,7 @@ The "configure" script builds the following files for the basic C library:
|
||||
that were set for "configure"
|
||||
. libpcre.pc ) data for the pkg-config command
|
||||
. libpcre16.pc )
|
||||
. libpcre32.pc )
|
||||
. libpcreposix.pc )
|
||||
. libtool script that builds shared and/or static libraries
|
||||
|
||||
@ -385,8 +412,8 @@ The "configure" script also creates config.status, which is an executable
|
||||
script that can be run to recreate the configuration, and config.log, which
|
||||
contains compiler output from tests that "configure" runs.
|
||||
|
||||
Once "configure" has run, you can run "make". This builds either or both of the
|
||||
libraries libpcre and libpcre16, and a test program called pcretest. If you
|
||||
Once "configure" has run, you can run "make". This builds the libraries
|
||||
libpcre, libpcre16 and/or libpcre32, and a test program called pcretest. If you
|
||||
enabled JIT support with --enable-jit, a test program called pcre_jit_test is
|
||||
built as well.
|
||||
|
||||
@ -410,12 +437,14 @@ system. The following are installed (file names are all relative to the
|
||||
|
||||
Libraries (lib):
|
||||
libpcre16 (if 16-bit support is enabled)
|
||||
libpcre32 (if 32-bit support is enabled)
|
||||
libpcre (if 8-bit support is enabled)
|
||||
libpcreposix (if 8-bit support is enabled)
|
||||
libpcrecpp (if 8-bit and C++ support is enabled)
|
||||
|
||||
Configuration information (lib/pkgconfig):
|
||||
libpcre16.pc
|
||||
libpcre32.pc
|
||||
libpcre.pc
|
||||
libpcreposix.pc
|
||||
libpcrecpp.pc (if C++ support is enabled)
|
||||
@ -596,7 +625,7 @@ The RunTest script runs the pcretest test program (which is documented in its
|
||||
own man page) on each of the relevant testinput files in the testdata
|
||||
directory, and compares the output with the contents of the corresponding
|
||||
testoutput files. Some tests are relevant only when certain build-time options
|
||||
were selected. For example, the tests for UTF-8/16 support are run only if
|
||||
were selected. For example, the tests for UTF-8/16/32 support are run only if
|
||||
--enable-utf was used. RunTest outputs a comment when it skips a test.
|
||||
|
||||
Many of the tests that are not skipped are run up to three times. The second
|
||||
@ -605,9 +634,9 @@ tests that are marked "never study" (see the pcretest program for how this is
|
||||
done). If JIT support is available, the non-DFA tests are run a third time,
|
||||
this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
|
||||
|
||||
When both 8-bit and 16-bit support is enabled, the entire set of tests is run
|
||||
twice, once for each library. If you want to run just one set of tests, call
|
||||
RunTest with either the -8 or -16 option.
|
||||
The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
|
||||
libraries that are enabled. If you want to run just one set of tests, call
|
||||
RunTest with either the -8, -16 or -32 option.
|
||||
|
||||
RunTest uses a file called testtry to hold the main output from pcretest.
|
||||
Other files whose names begin with "test" are used as working files in some
|
||||
@ -658,13 +687,13 @@ RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
|
||||
Windows versions of test 2. More info on using RunTest.bat is included in the
|
||||
document entitled NON-UNIX-USE.]
|
||||
|
||||
The fourth and fifth tests check the UTF-8/16 support and error handling and
|
||||
The fourth and fifth tests check the UTF-8/16/32 support and error handling and
|
||||
internal UTF features of PCRE that are not relevant to Perl, respectively. The
|
||||
sixth and seventh tests do the same for Unicode character properties support.
|
||||
|
||||
The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
|
||||
matching function, in non-UTF-8/16 mode, UTF-8/16 mode, and UTF-8/16 mode with
|
||||
Unicode property support, respectively.
|
||||
matching function, in non-UTF-8/16/32 mode, UTF-8/16/32 mode, and UTF-8/16/32
|
||||
mode with Unicode property support, respectively.
|
||||
|
||||
The eleventh test checks some internal offsets and code size features; it is
|
||||
run only when the default "link size" of 2 is set (in other cases the sizes
|
||||
@ -675,16 +704,21 @@ test is run only when JIT support is not available. They test some JIT-specific
|
||||
features such as information output from pcretest about JIT compilation.
|
||||
|
||||
The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
|
||||
the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
|
||||
the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit mode.
|
||||
These are tests that generate different output in the two modes. They are for
|
||||
general cases, UTF-8/16 support, and Unicode property support, respectively.
|
||||
general cases, UTF-8/16/32 support, and Unicode property support, respectively.
|
||||
|
||||
The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
|
||||
features of the DFA matching engine.
|
||||
The twentieth test is run only in 16/32-bit mode. It tests some specific
|
||||
16/32-bit features of the DFA matching engine.
|
||||
|
||||
The twenty-first and twenty-second tests are run only in 16-bit mode, when the
|
||||
link size is set to 2. They test reloading pre-compiled patterns.
|
||||
The twenty-first and twenty-second tests are run only in 16/32-bit mode, when the
|
||||
link size is set to 2 for the 16-bit library. They test reloading pre-compiled patterns.
|
||||
|
||||
The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are for
|
||||
general cases, and UTF-16 support, respectively.
|
||||
|
||||
The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are for
|
||||
general cases, and UTF-32 support, respectively.
|
||||
|
||||
Character tables
|
||||
----------------
|
||||
@ -744,8 +778,8 @@ File manifest
|
||||
-------------
|
||||
|
||||
The distribution should contain the files listed below. Where a file name is
|
||||
given as pcre[16]_xxx it means that there are two files, one with the name
|
||||
pcre_xxx and the other with the name pcre16_xxx.
|
||||
given as pcre[16|32]_xxx it means that there are three files, one with the name
|
||||
pcre_xxx, one with the name pcre16_xxx, and a third with the name pcre32_xxx.
|
||||
|
||||
(A) Source files of the PCRE library functions and their headers:
|
||||
|
||||
@ -757,31 +791,33 @@ pcre_xxx and the other with the name pcre16_xxx.
|
||||
specified, by copying to pcre[16]_chartables.c
|
||||
|
||||
pcreposix.c )
|
||||
pcre[16]_byte_order.c )
|
||||
pcre[16]_compile.c )
|
||||
pcre[16]_config.c )
|
||||
pcre[16]_dfa_exec.c )
|
||||
pcre[16]_exec.c )
|
||||
pcre[16]_fullinfo.c )
|
||||
pcre[16]_get.c ) sources for the functions in the library,
|
||||
pcre[16]_globals.c ) and some internal functions that they use
|
||||
pcre[16]_jit_compile.c )
|
||||
pcre[16]_maketables.c )
|
||||
pcre[16]_newline.c )
|
||||
pcre[16]_refcount.c )
|
||||
pcre[16]_string_utils.c )
|
||||
pcre[16]_study.c )
|
||||
pcre[16]_tables.c )
|
||||
pcre[16]_ucd.c )
|
||||
pcre[16]_version.c )
|
||||
pcre[16]_xclass.c )
|
||||
pcre[16|32]_byte_order.c )
|
||||
pcre[16|32]_compile.c )
|
||||
pcre[16|32]_config.c )
|
||||
pcre[16|32]_dfa_exec.c )
|
||||
pcre[16|32]_exec.c )
|
||||
pcre[16|32]_fullinfo.c )
|
||||
pcre[16|32]_get.c ) sources for the functions in the library,
|
||||
pcre[16|32]_globals.c ) and some internal functions that they use
|
||||
pcre[16|32]_jit_compile.c )
|
||||
pcre[16|32]_maketables.c )
|
||||
pcre[16|32]_newline.c )
|
||||
pcre[16|32]_refcount.c )
|
||||
pcre[16|32]_string_utils.c )
|
||||
pcre[16|32]_study.c )
|
||||
pcre[16|32]_tables.c )
|
||||
pcre[16|32]_ucd.c )
|
||||
pcre[16|32]_version.c )
|
||||
pcre[16|32]_xclass.c )
|
||||
pcre_ord2utf8.c )
|
||||
pcre_valid_utf8.c )
|
||||
pcre16_ord2utf16.c )
|
||||
pcre16_utf16_utils.c )
|
||||
pcre16_valid_utf16.c )
|
||||
pcre32_utf32_utils.c )
|
||||
pcre32_valid_utf32.c )
|
||||
|
||||
pcre[16]_printint.c ) debugging function that is used by pcretest,
|
||||
pcre[16|32]_printint.c ) debugging function that is used by pcretest,
|
||||
) and can also be #included in pcre_compile()
|
||||
|
||||
pcre.h.in template for pcre.h when built by "configure"
|
||||
@ -847,6 +883,7 @@ pcre_xxx and the other with the name pcre16_xxx.
|
||||
doc/perltest.txt plain text documentation of Perl test program
|
||||
install-sh a shell script for installing files
|
||||
libpcre16.pc.in template for libpcre16.pc for pkg-config
|
||||
libpcre32.pc.in template for libpcre32.pc for pkg-config
|
||||
libpcre.pc.in template for libpcre.pc for pkg-config
|
||||
libpcreposix.pc.in template for libpcreposix.pc for pkg-config
|
||||
libpcrecpp.pc.in template for libpcrecpp.pc for pkg-config
|
||||
@ -895,4 +932,4 @@ pcre_xxx and the other with the name pcre16_xxx.
|
||||
Philip Hazel
|
||||
Email local part: ph10
|
||||
Email domain: cam.ac.uk
|
||||
Last updated: 18 June 2012
|
||||
Last updated: 27 October 2012
|
||||
|
@ -31,16 +31,17 @@
|
||||
/* config.h.in. Generated from configure.ac by autoheader. */
|
||||
|
||||
|
||||
/* On Unix-like systems config.h.in is converted by "configure" into config.h.
|
||||
Some other environments also support the use of "configure". PCRE is written in
|
||||
Standard C, but there are a few non-standard things it can cope with, allowing
|
||||
it to run on SunOS4 and other "close to standard" systems.
|
||||
/* PCRE is written in Standard C, but there are a few non-standard things it
|
||||
can cope with, allowing it to run on SunOS4 and other "close to standard"
|
||||
systems.
|
||||
|
||||
If you are going to build PCRE "by hand" on a system without "configure" you
|
||||
should copy the distributed config.h.generic to config.h, and then set up the
|
||||
macro definitions the way you need them. You must then add -DHAVE_CONFIG_H to
|
||||
all of your compile commands, so that config.h is included at the start of
|
||||
every source.
|
||||
In environments that support the facilities, config.h.in is converted by
|
||||
"configure", or config-cmake.h.in is converted by CMake, into config.h. If you
|
||||
are going to build PCRE "by hand" without using "configure" or CMake, you
|
||||
should copy the distributed config.h.generic to config.h, and then edit the
|
||||
macro definitions to be the way you need them. You must then add
|
||||
-DHAVE_CONFIG_H to all of your compile commands, so that config.h is included
|
||||
at the start of every source.
|
||||
|
||||
Alternatively, you can avoid editing by using -D on the compiler command line
|
||||
to set the macro values. In this case, you do not have to set -DHAVE_CONFIG_H.
|
||||
@ -50,19 +51,27 @@ HAVE_BCOPY is set to 1. If your system has neither bcopy() nor memmove(), set
|
||||
them both to 0; an emulation function will be used. */
|
||||
|
||||
/* By default, the \R escape sequence matches any Unicode line ending
|
||||
character or sequence of characters. If BSR_ANYCRLF is defined, this is
|
||||
changed so that backslash-R matches only CR, LF, or CRLF. The build- time
|
||||
default can be overridden by the user of PCRE at runtime. On systems that
|
||||
support it, "configure" can be used to override the default. */
|
||||
/* #undef BSR_ANYCRLF */
|
||||
character or sequence of characters. If BSR_ANYCRLF is defined (to any
|
||||
value), this is changed so that backslash-R matches only CR, LF, or CRLF.
|
||||
The build-time default can be overridden by the user of PCRE at runtime. */
|
||||
#undef BSR_ANYCRLF
|
||||
|
||||
/* If you are compiling for a system that uses EBCDIC instead of ASCII
|
||||
character codes, define this macro as 1. On systems that can use
|
||||
"configure", this can be done via --enable-ebcdic. PCRE will then assume
|
||||
that all input strings are in EBCDIC. If you do not define this macro, PCRE
|
||||
will assume input strings are ASCII or UTF-8 Unicode. It is not possible to
|
||||
build a version of PCRE that supports both EBCDIC and UTF-8. */
|
||||
/* #undef EBCDIC */
|
||||
character codes, define this macro to any value. You must also edit the
|
||||
NEWLINE macro below to set a suitable EBCDIC newline, commonly 21 (0x15).
|
||||
On systems that can use "configure" or CMake to set EBCDIC, NEWLINE is
|
||||
automatically adjusted. When EBCDIC is set, PCRE assumes that all input
|
||||
strings are in EBCDIC. If you do not define this macro, PCRE will assume
|
||||
input strings are ASCII or UTF-8/16/32 Unicode. It is not possible to build
|
||||
a version of PCRE that supports both EBCDIC and UTF-8/16/32. */
|
||||
#undef EBCDIC
|
||||
|
||||
/* In an EBCDIC environment, define this macro to any value to arrange for the
|
||||
NL character to be 0x25 instead of the default 0x15. NL plays the role that
|
||||
LF does in an ASCII/Unicode environment. The value must also be set in the
|
||||
NEWLINE macro below. On systems that can use "configure" or CMake to set
|
||||
EBCDIC_NL25, the adjustment of NEWLINE is automatic. */
|
||||
#undef EBCDIC_NL25
|
||||
|
||||
/* Define to 1 if you have the `bcopy' function. */
|
||||
#ifndef HAVE_BCOPY
|
||||
@ -87,6 +96,12 @@ them both to 0; an emulation function will be used. */
|
||||
#define HAVE_DLFCN_H 1
|
||||
#endif
|
||||
|
||||
/* Define to 1 if you have the <editline/readline.h> header file. */
|
||||
/*#undef HAVE_EDITLINE_READLINE_H*/
|
||||
|
||||
/* Define to 1 if you have the <edit/readline/readline.h> header file. */
|
||||
/* #undef HAVE_EDIT_READLINE_READLINE_H */
|
||||
|
||||
/* Define to 1 if you have the <inttypes.h> header file. */
|
||||
#ifndef HAVE_INTTYPES_H
|
||||
#define HAVE_INTTYPES_H 1
|
||||
@ -112,6 +127,11 @@ them both to 0; an emulation function will be used. */
|
||||
#define HAVE_MEMORY_H 1
|
||||
#endif
|
||||
|
||||
/* Define if you have POSIX threads libraries and header files. */
|
||||
#undef HAVE_PTHREAD
|
||||
|
||||
/* Have PTHREAD_PRIO_INHERIT. */
|
||||
#undef HAVE_PTHREAD_PRIO_INHERIT
|
||||
/* Define to 1 if you have the <readline/history.h> header file. */
|
||||
#ifndef HAVE_READLINE_HISTORY_H
|
||||
#define HAVE_READLINE_HISTORY_H 1
|
||||
@ -186,6 +206,10 @@ them both to 0; an emulation function will be used. */
|
||||
#define HAVE_UNSIGNED_LONG_LONG 1
|
||||
#endif
|
||||
|
||||
/* Define to 1 or 0, depending whether the compiler supports simple visibility
|
||||
declarations. */
|
||||
/* #undef HAVE_VISIBILITY */
|
||||
|
||||
/* Define to 1 if you have the <windows.h> header file. */
|
||||
/* #undef HAVE_WINDOWS_H */
|
||||
|
||||
@ -254,22 +278,28 @@ them both to 0; an emulation function will be used. */
|
||||
#define MAX_NAME_SIZE 32
|
||||
#endif
|
||||
|
||||
/* The value of NEWLINE determines the newline character sequence. On systems
|
||||
that support it, "configure" can be used to override the default, which is
|
||||
10. The possible values are 10 (LF), 13 (CR), 3338 (CRLF), -1 (ANY), or -2
|
||||
(ANYCRLF). */
|
||||
/* The value of NEWLINE determines the default newline character sequence.
|
||||
PCRE client programs can override this by selecting other values at run
|
||||
time. In ASCII environments, the value can be 10 (LF), 13 (CR), or 3338
|
||||
(CRLF); in EBCDIC environments the value can be 21 or 37 (LF), 13 (CR), or
|
||||
3349 or 3365 (CRLF) because there are two alternative codepoints (0x15 and
|
||||
0x25) that are used as the NL line terminator that is equivalent to ASCII
|
||||
LF. In both ASCII and EBCDIC environments the value can also be -1 (ANY),
|
||||
or -2 (ANYCRLF). */
|
||||
#ifndef NEWLINE
|
||||
#define NEWLINE 10
|
||||
#endif
|
||||
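The two-character NEWLINE values in the comment above are the two character codes packed as first * 256 + second, so 3338 is 13 * 256 + 10 (CR, LF), 3349 is 13 * 256 + 21 and 3365 is 13 * 256 + 37. A minimal sketch of unpacking such a value follows; the function name newline_chars is hypothetical and the code is an illustration, not an excerpt from PCRE.

```c
#include <assert.h>
#include <stddef.h>

/* Split a NEWLINE setting into its character codes.
   Returns the length of the newline sequence (1 or 2), or 0 for the
   special values -1 (ANY) and -2 (ANYCRLF), which name sets of
   sequences rather than one fixed sequence.
   Hypothetical helper for illustration. */
int newline_chars(int newline, int *first, int *second)
{
    assert(first != NULL && second != NULL);

    if (newline <= 0) return 0;

    if (newline > 255) {              /* two characters packed into one int */
        *first = (newline >> 8) & 255;
        *second = newline & 255;
        return 2;
    }

    *first = newline;                 /* single-character newline */
    *second = 0;
    return 1;
}
```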
|
||||
/* Define to 1 if your C compiler doesn't accept -c and -o together. */
|
||||
/* #undef NO_MINUS_C_MINUS_O */
|
||||
|
||||
/* PCRE uses recursive function calls to handle backtracking while matching.
|
||||
This can sometimes be a problem on systems that have stacks of limited
|
||||
size. Define NO_RECURSE to get a version that doesn't use recursion in the
|
||||
match() function; instead it creates its own stack by steam using
|
||||
pcre_recurse_malloc() to obtain memory from the heap. For more detail, see
|
||||
the comments and other stuff just above the match() function. On systems
|
||||
that support it, "configure" can be used to set this in the Makefile (use
|
||||
--disable-stack-for-recursion). */
|
||||
size. Define NO_RECURSE to any value to get a version that doesn't use
|
||||
recursion in the match() function; instead it creates its own stack by
|
||||
steam using pcre_recurse_malloc() to obtain memory from the heap. For more
|
||||
detail, see the comments and other stuff just above the match() function.
|
||||
*/
|
||||
/* #undef NO_RECURSE */
|
||||
|
||||
/* Name of package */
|
||||
@ -282,7 +312,7 @@ them both to 0; an emulation function will be used. */
|
||||
#define PACKAGE_NAME "PCRE"
|
||||
|
||||
/* Define to the full name and version of this package. */
|
||||
#define PACKAGE_STRING "PCRE 8.31"
|
||||
#define PACKAGE_STRING "PCRE 8.32"
|
||||
|
||||
/* Define to the one symbol short name of this package. */
|
||||
#define PACKAGE_TARNAME "pcre"
|
||||
@ -291,21 +321,46 @@ them both to 0; an emulation function will be used. */
|
||||
#define PACKAGE_URL ""
|
||||
|
||||
/* Define to the version of this package. */
|
||||
#define PACKAGE_VERSION "8.31"
|
||||
#define PACKAGE_VERSION "8.32"
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCRECPP_EXP_DECL */
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCRECPP_EXP_DEFN */
|
||||
|
||||
/* The value of PCREGREP_BUFSIZE determines the size of buffer used by
|
||||
pcregrep to hold parts of the file it is searching. This is also the
|
||||
minimum value. The actual amount of memory used by pcregrep is three times
|
||||
this number, because it allows for the buffering of "before" and "after"
|
||||
lines. */
|
||||
/* #undef PCREGREP_BUFSIZE */
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCREPOSIX_EXP_DECL */
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCREPOSIX_EXP_DEFN */
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCRE_EXP_DATA_DEFN */
|
||||
|
||||
/* to make a symbol visible */
|
||||
/* #undef PCRE_EXP_DECL */
|
||||
|
||||
|
||||
/* If you are compiling for a system other than a Unix-like system or
|
||||
Win32, and it needs some magic to be inserted before the definition
|
||||
of a function that is exported by the library, define this macro to
|
||||
contain the relevant magic. If you do not define this macro, it
|
||||
defaults to "extern" for a C compiler and "extern C" for a C++
|
||||
compiler on non-Win32 systems. This macro appears at the start of
|
||||
every exported function that is part of the external API. It does
|
||||
not appear on functions that are "external" in the C sense, but
|
||||
which are internal to the library. */
|
||||
contain the relevant magic. If you do not define this macro, a suitable
|
||||
__declspec value is used for Windows systems; in other environments
|
||||
"extern" is used for a C compiler and "extern C" for a C++ compiler.
|
||||
This macro appears at the start of every exported function that is part
|
||||
of the external API. It does not appear on functions that are "external"
|
||||
in the C sense, but which are internal to the library. */
|
||||
/* #undef PCRE_EXP_DEFN */
|
||||
|
||||
/* Define if linking statically (TODO: make nice with Libtool) */
|
||||
/* Define to any value if linking statically (TODO: make nice with Libtool) */
|
||||
/* #undef PCRE_STATIC */
|
||||
|
||||
/* When calling PCRE via the POSIX interface, additional working storage is
|
||||
@ -314,40 +369,68 @@ them both to 0; an emulation function will be used. */
|
||||
only two. If the number of expected substrings is small, the wrapper
|
||||
function uses space on the stack, because this is faster than using
|
||||
malloc() for each call. The threshold above which the stack is no longer
|
||||
used is defined by POSIX_MALLOC_THRESHOLD. On systems that support it,
|
||||
"configure" can be used to override this default. */
|
||||
used is defined by POSIX_MALLOC_THRESHOLD. */
|
||||
#ifndef POSIX_MALLOC_THRESHOLD
|
||||
#define POSIX_MALLOC_THRESHOLD 10
|
||||
#endif
|
||||
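The stack-versus-malloc() choice governed by POSIX_MALLOC_THRESHOLD can be sketched as below. This is an illustration under stated assumptions rather than the wrapper's actual code: SKETCH_MALLOC_THRESHOLD and with_offsets_buffer are hypothetical names, the figure of three ints per expected substring is assumed from the surrounding comment, and the fill loop is only a placeholder for the real matching work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for POSIX_MALLOC_THRESHOLD. */
#define SKETCH_MALLOC_THRESHOLD 10

/* Obtain working storage for nmatch expected substrings, use it, and
   release it. Sets *used_heap to 1 if malloc() was needed, else 0.
   Returns 0 on success or -1 if malloc() fails. */
int with_offsets_buffer(size_t nmatch, int *used_heap)
{
    int small[SKETCH_MALLOC_THRESHOLD * 3];  /* stack space for small cases */
    int *ovector = small;
    size_t i;

    assert(used_heap != NULL);
    *used_heap = 0;

    if (nmatch > SKETCH_MALLOC_THRESHOLD) {  /* too big for the stack array */
        ovector = malloc(sizeof(int) * nmatch * 3);
        if (ovector == NULL) return -1;
        *used_heap = 1;
    }

    /* Placeholder for the real work that would fill the offsets. */
    for (i = 0; i < nmatch * 3; i++) ovector[i] = -1;

    if (*used_heap) free(ovector);
    return 0;
}
```

With the default threshold of 10, a call expecting up to 10 substrings stays on the stack, and only larger requests pay for a heap allocation on each call.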
|
||||
/* Define to necessary symbol if this constant uses a non-standard name on
|
||||
your system. */
|
||||
/* #undef PTHREAD_CREATE_JOINABLE */
|
||||
|
||||
/* Define to 1 if you have the ANSI C header files. */
|
||||
#ifndef STDC_HEADERS
|
||||
#define STDC_HEADERS 1
|
||||
#endif
|
||||
|
||||
/* Define to allow pcregrep to be linked with libbz2, so that it is able to
|
||||
handle .bz2 files. */
|
||||
/* Define to allow pcretest and pcregrep to be linked with gcov, so that they
|
||||
are able to generate code coverage reports. */
|
||||
#undef SUPPORT_GCOV
|
||||
|
||||
/* Define to any value to enable support for Just-In-Time compiling. */
|
||||
#undef SUPPORT_JIT
|
||||
|
||||
/* Define to any value to allow pcregrep to be linked with libbz2, so that it
|
||||
is able to handle .bz2 files. */
|
||||
/* #undef SUPPORT_LIBBZ2 */
|
||||
|
||||
/* Define to allow pcretest to be linked with libreadline. */
|
||||
/* Define to any value to allow pcretest to be linked with libedit. */
|
||||
#undef SUPPORT_LIBEDIT
|
||||
|
||||
/* Define to any value to allow pcretest to be linked with libreadline. */
|
||||
/* #undef SUPPORT_LIBREADLINE */
|
||||
|
||||
/* Define to allow pcregrep to be linked with libz, so that it is able to
|
||||
handle .gz files. */
|
||||
/* Define to any value to allow pcregrep to be linked with libz, so that it is
|
||||
able to handle .gz files. */
|
||||
/* #undef SUPPORT_LIBZ */
|
||||
|
||||
/* Define to any value to enable the 16 bit PCRE library. */
|
||||
/* #undef SUPPORT_PCRE16 */
|
||||
|
||||
/* Define to any value to enable the 32 bit PCRE library. */
|
||||
/* #undef SUPPORT_PCRE32 */
|
||||
|
||||
/* Define to any value to enable the 8 bit PCRE library. */
|
||||
/* #undef SUPPORT_PCRE8 */
|
||||
|
||||
/* Define to any value to enable JIT support in pcregrep. */
|
||||
/* #undef SUPPORT_PCREGREP_JIT */
|
||||
|
||||
/* Define to enable support for Unicode properties */
|
||||
/* #undef SUPPORT_UCP */
|
||||
|
||||
/* Define to enable support for the UTF-8 Unicode encoding. This will work
|
||||
even in an EBCDIC environment, but it is incompatible with the EBCDIC
|
||||
macro. That is, PCRE can support *either* EBCDIC code *or* ASCII/UTF-8, but
|
||||
not both at once. */
|
||||
/* Define to any value to enable support for the UTF-8/16/32 Unicode encoding.
|
||||
This will work even in an EBCDIC environment, but it is incompatible with
|
||||
the EBCDIC macro. That is, PCRE can support *either* EBCDIC code *or*
|
||||
ASCII/UTF-8/16/32, but not both at once. */
|
||||
/* #undef SUPPORT_UTF8 */
|
||||
|
||||
/* Valgrind support to find invalid memory reads. */
|
||||
/* #undef SUPPORT_VALGRIND */
|
||||
|
||||
/* Version number of package */
|
||||
#ifndef VERSION
|
||||
#define VERSION "8.31"
|
||||
#define VERSION "8.32"
|
||||
#endif
|
||||
|
||||
/* Define to empty if `const' does not conform to ANSI C. */
|
||||
|
@ -43,7 +43,9 @@ character tables for PCRE. The tables are built according to the current
|
||||
locale. Now that pcre_maketables is a function visible to the outside world, we
|
||||
make use of its code from here in order to be consistent. */
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include <ctype.h>
|
||||
#include <stdio.h>
|
||||
@ -106,11 +108,24 @@ fprintf(f,
|
||||
"library and dead code stripping is activated. This leads to link errors.\n"
|
||||
"Pulling in the header ensures that the array gets flagged as \"someone\n"
|
||||
"outside this compilation unit might reference this\" and so it will always\n"
|
||||
"be supplied to the linker. */\n\n"
|
||||
"be supplied to the linker. */\n\n");
|
||||
|
||||
/* Force config.h in z/OS */
|
||||
|
||||
#if defined NATIVE_ZOS
|
||||
fprintf(f,
|
||||
"/* For z/OS, config.h is forced */\n"
|
||||
"#ifndef HAVE_CONFIG_H\n"
|
||||
"#define HAVE_CONFIG_H 1\n"
|
||||
"#endif\n\n");
|
||||
#endif
|
||||
|
||||
fprintf(f,
|
||||
"#ifdef HAVE_CONFIG_H\n"
|
||||
"#include \"config.h\"\n"
|
||||
"#endif\n\n"
|
||||
"#include \"pcre_internal.h\"\n\n");
|
||||
|
||||
fprintf(f,
|
||||
"const pcre_uint8 PRIV(default_tables)[] = {\n\n"
|
||||
"/* This table is a lower casing table. */\n\n");
|
||||
|
File diff suppressed because it is too large
@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
/* The current PCRE version information. */
|
||||
|
||||
#define PCRE_MAJOR 8
|
||||
#define PCRE_MINOR 31
|
||||
#define PCRE_MINOR 32
|
||||
#define PCRE_PRERELEASE
|
||||
#define PCRE_DATE 2012-07-06
|
||||
#define PCRE_DATE 2012-11-30
|
||||
|
||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||
imported have to be identified as such. When building PCRE, the appropriate
|
||||
@ -95,54 +95,70 @@ it is needed here for malloc. */
|
||||
extern "C" {
|
||||
#endif
|
||||
|
||||
/* Options. Some are compile-time only, some are run-time only, and some are
|
||||
both, so we keep them all distinct. However, almost all the bits in the options
|
||||
word are now used. In the long run, we may have to re-use some of the
|
||||
compile-time only bits for runtime options, or vice versa. In the comments
|
||||
below, "compile", "exec", and "DFA exec" mean that the option is permitted to
|
||||
be set for those functions; "used in" means that an option may be set only for
|
||||
compile, but is subsequently referenced in exec and/or DFA exec. Any of the
|
||||
/* Public options. Some are compile-time only, some are run-time only, and some
|
||||
are both, so we keep them all distinct. However, almost all the bits in the
|
||||
options word are now used. In the long run, we may have to re-use some of the
|
||||
compile-time only bits for runtime options, or vice versa. Any of the
|
||||
compile-time options may be inspected during studying (and therefore JIT
|
||||
compiling). */
|
||||
compiling).
|
||||
|
||||
#define PCRE_CASELESS 0x00000001 /* Compile */
|
||||
#define PCRE_MULTILINE 0x00000002 /* Compile */
|
||||
#define PCRE_DOTALL 0x00000004 /* Compile */
|
||||
#define PCRE_EXTENDED 0x00000008 /* Compile */
|
||||
#define PCRE_ANCHORED 0x00000010 /* Compile, exec, DFA exec */
|
||||
#define PCRE_DOLLAR_ENDONLY 0x00000020 /* Compile, used in exec, DFA exec */
|
||||
#define PCRE_EXTRA 0x00000040 /* Compile */
|
||||
#define PCRE_NOTBOL 0x00000080 /* Exec, DFA exec */
|
||||
#define PCRE_NOTEOL 0x00000100 /* Exec, DFA exec */
|
||||
#define PCRE_UNGREEDY 0x00000200 /* Compile */
|
||||
#define PCRE_NOTEMPTY 0x00000400 /* Exec, DFA exec */
|
||||
/* The next two are also used in exec and DFA exec */
|
||||
#define PCRE_UTF8 0x00000800 /* Compile (same as PCRE_UTF16) */
|
||||
#define PCRE_UTF16 0x00000800 /* Compile (same as PCRE_UTF8) */
|
||||
#define PCRE_NO_AUTO_CAPTURE 0x00001000 /* Compile */
|
||||
/* The next two are also used in exec and DFA exec */
|
||||
#define PCRE_NO_UTF8_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF16_CHECK) */
|
||||
#define PCRE_NO_UTF16_CHECK 0x00002000 /* Compile (same as PCRE_NO_UTF8_CHECK) */
|
||||
#define PCRE_AUTO_CALLOUT 0x00004000 /* Compile */
|
||||
#define PCRE_PARTIAL_SOFT 0x00008000 /* Exec, DFA exec */
|
||||
#define PCRE_PARTIAL 0x00008000 /* Backwards compatible synonym */
|
||||
#define PCRE_DFA_SHORTEST 0x00010000 /* DFA exec */
|
||||
#define PCRE_DFA_RESTART 0x00020000 /* DFA exec */
|
||||
#define PCRE_FIRSTLINE 0x00040000 /* Compile, used in exec, DFA exec */
|
||||
#define PCRE_DUPNAMES 0x00080000 /* Compile */
|
||||
#define PCRE_NEWLINE_CR 0x00100000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_NEWLINE_LF 0x00200000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_NEWLINE_CRLF 0x00300000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_NEWLINE_ANY 0x00400000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_BSR_ANYCRLF 0x00800000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_BSR_UNICODE 0x01000000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* Compile, used in exec */
|
||||
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* Compile, exec, DFA exec */
|
||||
#define PCRE_NO_START_OPTIMISE 0x04000000 /* Synonym */
|
||||
#define PCRE_PARTIAL_HARD 0x08000000 /* Exec, DFA exec */
|
||||
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* Exec, DFA exec */
|
||||
#define PCRE_UCP 0x20000000 /* Compile, used in exec, DFA exec */
|
||||
Some options for pcre_compile() change its behaviour but do not affect the
|
||||
behaviour of the execution functions. Other options are passed through to the
|
||||
execution functions and affect their behaviour, with or without affecting the
|
||||
behaviour of pcre_compile().
|
||||
|
||||
Options that can be passed to pcre_compile() are tagged Cx below, with these
|
||||
variants:
|
||||
|
||||
C1 Affects compile only
|
||||
C2 Does not affect compile; affects exec, dfa_exec
|
||||
C3 Affects compile, exec, dfa_exec
|
||||
C4 Affects compile, exec, dfa_exec, study
|
||||
C5 Affects compile, exec, study
|
||||
|
||||
Options that can be set for pcre_exec() and/or pcre_dfa_exec() are flagged with
|
||||
E and D, respectively. They take precedence over C3, C4, and C5 settings passed
|
||||
from pcre_compile(). Those that are compatible with JIT execution are flagged
|
||||
with J. */
|
||||
|
||||
#define PCRE_CASELESS 0x00000001 /* C1 */
|
||||
#define PCRE_MULTILINE 0x00000002 /* C1 */
|
||||
#define PCRE_DOTALL 0x00000004 /* C1 */
|
||||
#define PCRE_EXTENDED 0x00000008 /* C1 */
|
||||
#define PCRE_ANCHORED 0x00000010 /* C4 E D */
|
||||
#define PCRE_DOLLAR_ENDONLY 0x00000020 /* C2 */
|
||||
#define PCRE_EXTRA 0x00000040 /* C1 */
|
||||
#define PCRE_NOTBOL 0x00000080 /* E D J */
|
||||
#define PCRE_NOTEOL 0x00000100 /* E D J */
|
||||
#define PCRE_UNGREEDY 0x00000200 /* C1 */
|
||||
#define PCRE_NOTEMPTY 0x00000400 /* E D J */
|
||||
#define PCRE_UTF8 0x00000800 /* C4 ) */
|
||||
#define PCRE_UTF16 0x00000800 /* C4 ) Synonyms */
|
||||
#define PCRE_UTF32 0x00000800 /* C4 ) */
|
||||
#define PCRE_NO_AUTO_CAPTURE 0x00001000 /* C1 */
|
||||
#define PCRE_NO_UTF8_CHECK 0x00002000 /* C1 E D J ) */
|
||||
#define PCRE_NO_UTF16_CHECK 0x00002000 /* C1 E D J ) Synonyms */
|
||||
#define PCRE_NO_UTF32_CHECK 0x00002000 /* C1 E D J ) */
|
||||
#define PCRE_AUTO_CALLOUT 0x00004000 /* C1 */
|
||||
#define PCRE_PARTIAL_SOFT 0x00008000 /* E D J ) Synonyms */
|
||||
#define PCRE_PARTIAL 0x00008000 /* E D J ) */
|
||||
#define PCRE_DFA_SHORTEST 0x00010000 /* D */
|
||||
#define PCRE_DFA_RESTART 0x00020000 /* D */
|
||||
#define PCRE_FIRSTLINE 0x00040000 /* C3 */
|
||||
#define PCRE_DUPNAMES 0x00080000 /* C1 */
|
||||
#define PCRE_NEWLINE_CR 0x00100000 /* C3 E D */
|
||||
#define PCRE_NEWLINE_LF 0x00200000 /* C3 E D */
|
||||
#define PCRE_NEWLINE_CRLF 0x00300000 /* C3 E D */
|
||||
#define PCRE_NEWLINE_ANY 0x00400000 /* C3 E D */
|
||||
#define PCRE_NEWLINE_ANYCRLF 0x00500000 /* C3 E D */
|
||||
#define PCRE_BSR_ANYCRLF 0x00800000 /* C3 E D */
|
||||
#define PCRE_BSR_UNICODE 0x01000000 /* C3 E D */
|
||||
#define PCRE_JAVASCRIPT_COMPAT 0x02000000 /* C5 */
|
||||
#define PCRE_NO_START_OPTIMIZE 0x04000000 /* C2 E D ) Synonyms */
|
||||
#define PCRE_NO_START_OPTIMISE 0x04000000 /* C2 E D ) */
|
||||
#define PCRE_PARTIAL_HARD 0x08000000 /* E D J */
|
||||
#define PCRE_NOTEMPTY_ATSTART 0x10000000 /* E D J */
|
||||
#define PCRE_UCP 0x20000000 /* C3 */
|
||||
|
||||
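One detail of the option table above is easy to miss: the PCRE_NEWLINE_* values are not independent bits but settings of a small multi-bit field (CRLF is 0x00300000, which overlaps CR and LF), and the UTF8/UTF16/UTF32 names share one value. The sketch below uses local copies of the values listed above under hypothetical `EX_` names. The mask `EX_NEWLINE_MASK` is an assumption derived from those values, not a name taken from this diff.

```c
#include <assert.h>

/* Local copies of values from the table above; the EX_ names are hypothetical. */
#define EX_CASELESS         0x00000001
#define EX_UTF8             0x00000800
#define EX_UTF32            0x00000800   /* synonym: same bit as EX_UTF8 */
#define EX_NEWLINE_CR       0x00100000
#define EX_NEWLINE_LF       0x00200000
#define EX_NEWLINE_CRLF     0x00300000
#define EX_NEWLINE_ANY      0x00400000
#define EX_NEWLINE_ANYCRLF  0x00500000
/* Assumed mask covering the three bits that the newline settings occupy. */
#define EX_NEWLINE_MASK     0x00700000

/* Returns the newline setting stored in an options word, or 0 if none is set. */
static unsigned long ex_newline_setting(unsigned long options)
{
  return options & EX_NEWLINE_MASK;
}

/* Tests an ordinary single-bit flag. */
static int ex_has_flag(unsigned long options, unsigned long flag)
{
  assert(flag != 0);
  return (options & flag) != 0;
}
```

Comparing the masked field against a setting distinguishes CRLF from CR, whereas a plain bit test of `EX_NEWLINE_CR` against an options word holding `EX_NEWLINE_CRLF` would report a match.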
/* Exec-time and get/set-time error codes */
|
||||
|
||||
@ -156,8 +172,9 @@ compiling). */
|
||||
#define PCRE_ERROR_NOSUBSTRING (-7)
|
||||
#define PCRE_ERROR_MATCHLIMIT (-8)
|
||||
#define PCRE_ERROR_CALLOUT (-9) /* Never used by PCRE itself */
|
||||
#define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16 */
|
||||
#define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16 */
|
||||
#define PCRE_ERROR_BADUTF8 (-10) /* Same for 8/16/32 */
|
||||
#define PCRE_ERROR_BADUTF16 (-10) /* Same for 8/16/32 */
|
||||
#define PCRE_ERROR_BADUTF32 (-10) /* Same for 8/16/32 */
|
||||
#define PCRE_ERROR_BADUTF8_OFFSET (-11) /* Same for 8/16 */
|
||||
#define PCRE_ERROR_BADUTF16_OFFSET (-11) /* Same for 8/16 */
|
||||
#define PCRE_ERROR_PARTIAL (-12)
|
||||
@ -180,6 +197,8 @@ compiling). */
|
||||
#define PCRE_ERROR_BADMODE (-28)
|
||||
#define PCRE_ERROR_BADENDIANNESS (-29)
|
||||
#define PCRE_ERROR_DFA_BADRESTART (-30)
|
||||
#define PCRE_ERROR_JIT_BADOPTION (-31)
|
||||
#define PCRE_ERROR_BADLENGTH (-32)
|
||||
|
||||
/* Specific error codes for UTF-8 validity checks */
|
||||
|
||||
@ -205,6 +224,7 @@ compiling). */
|
||||
#define PCRE_UTF8_ERR19 19
|
||||
#define PCRE_UTF8_ERR20 20
|
||||
#define PCRE_UTF8_ERR21 21
|
||||
#define PCRE_UTF8_ERR22 22
|
||||
|
||||
/* Specific error codes for UTF-16 validity checks */
|
||||
|
||||
@ -214,6 +234,13 @@ compiling). */
|
||||
#define PCRE_UTF16_ERR3 3
|
||||
#define PCRE_UTF16_ERR4 4
|
||||
|
||||
/* Specific error codes for UTF-32 validity checks */
|
||||
|
||||
#define PCRE_UTF32_ERR0 0
|
||||
#define PCRE_UTF32_ERR1 1
|
||||
#define PCRE_UTF32_ERR2 2
|
||||
#define PCRE_UTF32_ERR3 3
|
||||
|
||||
/* Request types for pcre_fullinfo() */
|
||||
|
||||
#define PCRE_INFO_OPTIONS 0
|
||||
@ -236,6 +263,10 @@ compiling). */
|
||||
#define PCRE_INFO_JIT 16
|
||||
#define PCRE_INFO_JITSIZE 17
|
||||
#define PCRE_INFO_MAXLOOKBEHIND 18
|
||||
#define PCRE_INFO_FIRSTCHARACTER 19
|
||||
#define PCRE_INFO_FIRSTCHARACTERFLAGS 20
|
||||
#define PCRE_INFO_REQUIREDCHAR 21
|
||||
#define PCRE_INFO_REQUIREDCHARFLAGS 22
|
||||
|
||||
/* Request types for pcre_config(). Do not re-arrange, in order to remain
|
||||
compatible. */
|
||||
@ -252,6 +283,7 @@ compatible. */
|
||||
#define PCRE_CONFIG_JIT 9
|
||||
#define PCRE_CONFIG_UTF16 10
|
||||
#define PCRE_CONFIG_JITTARGET 11
|
||||
#define PCRE_CONFIG_UTF32 12
|
||||
|
||||
/* Request types for pcre_study(). Do not re-arrange, in order to remain
|
||||
compatible. */
|
||||
@ -259,8 +291,9 @@ compatible. */
|
||||
#define PCRE_STUDY_JIT_COMPILE 0x0001
|
||||
#define PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE 0x0002
|
||||
#define PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE 0x0004
|
||||
#define PCRE_STUDY_EXTRA_NEEDED 0x0008
|
||||
|
||||
/* Bit flags for the pcre[16]_extra structure. Do not re-arrange or redefine
|
||||
/* Bit flags for the pcre[16|32]_extra structure. Do not re-arrange or redefine
|
||||
these bits, just add new ones on the end, in order to remain compatible. */
|
||||
|
||||
#define PCRE_EXTRA_STUDY_DATA 0x0001
|
||||
@ -279,12 +312,18 @@ typedef struct real_pcre pcre;
|
||||
struct real_pcre16; /* declaration; the definition is private */
|
||||
typedef struct real_pcre16 pcre16;
|
||||
|
||||
struct real_pcre32; /* declaration; the definition is private */
|
||||
typedef struct real_pcre32 pcre32;
|
||||
|
||||
struct real_pcre_jit_stack; /* declaration; the definition is private */
|
||||
typedef struct real_pcre_jit_stack pcre_jit_stack;
|
||||
|
||||
struct real_pcre16_jit_stack; /* declaration; the definition is private */
|
||||
typedef struct real_pcre16_jit_stack pcre16_jit_stack;
|
||||
|
||||
struct real_pcre32_jit_stack; /* declaration; the definition is private */
|
||||
typedef struct real_pcre32_jit_stack pcre32_jit_stack;
|
||||
|
||||
/* If PCRE is compiled with 16 bit character support, PCRE_UCHAR16 must contain
|
||||
a 16 bit wide signed data type. Otherwise it can be a dummy data type since
|
||||
pcre16 functions are not implemented. There is a check for this in pcre_internal.h. */
|
||||
@ -296,6 +335,17 @@ pcre16 functions are not implemented. There is a check for this in pcre_internal
|
||||
#define PCRE_SPTR16 const PCRE_UCHAR16 *
|
||||
#endif
|
||||
|
||||
/* If PCRE is compiled with 32 bit character support, PCRE_UCHAR32 must contain
|
||||
a 32 bit wide signed data type. Otherwise it can be a dummy data type since
|
||||
pcre32 functions are not implemented. There is a check for this in pcre_internal.h. */
|
||||
#ifndef PCRE_UCHAR32
|
||||
#define PCRE_UCHAR32 unsigned int
|
||||
#endif
|
||||
|
||||
#ifndef PCRE_SPTR32
|
||||
#define PCRE_SPTR32 const PCRE_UCHAR32 *
|
||||
#endif
|
||||
|
||||
/* When PCRE is compiled as a C++ library, the subject pointer type can be
|
||||
replaced with a custom type. For conventional use, the public interface is a
|
||||
const char *. */
|
||||
@ -332,6 +382,19 @@ typedef struct pcre16_extra {
|
||||
void *executable_jit; /* Contains a pointer to a compiled jit code */
|
||||
} pcre16_extra;
|
||||
|
||||
/* Same structure as above, but with 32 bit char pointers. */
|
||||
|
||||
typedef struct pcre32_extra {
|
||||
unsigned long int flags; /* Bits for which fields are set */
|
||||
void *study_data; /* Opaque data from pcre_study() */
|
||||
unsigned long int match_limit; /* Maximum number of calls to match() */
|
||||
void *callout_data; /* Data passed back in callouts */
|
||||
const unsigned char *tables; /* Pointer to character tables */
|
||||
unsigned long int match_limit_recursion; /* Max recursive calls to match() */
|
||||
PCRE_UCHAR32 **mark; /* For passing back a mark pointer */
|
||||
void *executable_jit; /* Contains a pointer to a compiled jit code */
|
||||
} pcre32_extra;
|
||||
|
||||
/* The structure for passing out data via the pcre_callout_function. We use a
|
||||
structure so that new fields can be added on the end in future versions,
|
||||
without changing the API of the function, thereby allowing old clients to work
|
||||
@ -379,6 +442,28 @@ typedef struct pcre16_callout_block {
|
||||
/* ------------------------------------------------------------------ */
|
||||
} pcre16_callout_block;
|
||||
|
||||
/* Same structure as above, but with 32 bit char pointers. */
|
||||
|
||||
typedef struct pcre32_callout_block {
|
||||
int version; /* Identifies version of block */
|
||||
/* ------------------------ Version 0 ------------------------------- */
|
||||
int callout_number; /* Number compiled into pattern */
|
||||
int *offset_vector; /* The offset vector */
|
||||
PCRE_SPTR32 subject; /* The subject being matched */
|
||||
int subject_length; /* The length of the subject */
|
||||
int start_match; /* Offset to start of this match attempt */
|
||||
int current_position; /* Where we currently are in the subject */
|
||||
int capture_top; /* Max current capture */
|
||||
int capture_last; /* Most recently closed capture */
|
||||
void *callout_data; /* Data passed in with the call */
|
||||
/* ------------------- Added for Version 1 -------------------------- */
|
||||
int pattern_position; /* Offset to next item in the pattern */
|
||||
int next_item_length; /* Length of next item in the pattern */
|
||||
/* ------------------- Added for Version 2 -------------------------- */
|
||||
const PCRE_UCHAR32 *mark; /* Pointer to current mark or NULL */
|
||||
/* ------------------------------------------------------------------ */
|
||||
} pcre32_callout_block;
|
||||
|
||||
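The callout block above groups its fields by the version that introduced them, so a callback can check `version` before reading a later field. A minimal sketch under that reading follows; `ex_callout_block` is a hypothetical, trimmed-down structure and `ex_first_mark_unit` is an illustrative helper, neither of which is part of the PCRE API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened block; only the version-gating idea comes from the header. */
typedef struct ex_callout_block {
  int version;                 /* Identifies version of block */
  int callout_number;          /* Present since version 0 */
  int pattern_position;        /* Added for version 1 */
  const unsigned int *mark;    /* Added for version 2 */
} ex_callout_block;

/* Returns the first code unit of the mark, or -1 when the block predates
   version 2 or carries no mark. */
static long ex_first_mark_unit(const ex_callout_block *cb)
{
  assert(cb != NULL);
  if (cb->version < 2) return -1;     /* field did not exist in older blocks */
  if (cb->mark == NULL) return -1;    /* no mark was passed */
  return (long)cb->mark[0];
}
```

Checking the version first means a callback built against a newer header does not read a field that an older library never filled in.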
/* Indirection for store get and free functions. These can be set to
|
||||
alternative malloc/free functions if required. Special ones are used in the
|
||||
non-recursive case for "frames". There is also an optional callout function
|
||||
@ -397,6 +482,12 @@ PCRE_EXP_DECL void (*pcre16_free)(void *);
|
||||
PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
|
||||
PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
|
||||
PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *);
|
||||
|
||||
PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
|
||||
PCRE_EXP_DECL void (*pcre32_free)(void *);
|
||||
PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
|
||||
PCRE_EXP_DECL void (*pcre32_stack_free)(void *);
|
||||
PCRE_EXP_DECL int (*pcre32_callout)(pcre32_callout_block *);
|
||||
#else /* VPCOMPAT */
|
||||
PCRE_EXP_DECL void *pcre_malloc(size_t);
|
||||
PCRE_EXP_DECL void pcre_free(void *);
|
||||
@ -409,12 +500,19 @@ PCRE_EXP_DECL void pcre16_free(void *);
|
||||
PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
|
||||
PCRE_EXP_DECL void pcre16_stack_free(void *);
|
||||
PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *);
|
||||
|
||||
PCRE_EXP_DECL void *pcre32_malloc(size_t);
|
||||
PCRE_EXP_DECL void pcre32_free(void *);
|
||||
PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
|
||||
PCRE_EXP_DECL void pcre32_stack_free(void *);
|
||||
PCRE_EXP_DECL int pcre32_callout(pcre32_callout_block *);
|
||||
#endif /* VPCOMPAT */
|
||||
|
||||
/* User defined callback which provides a stack just before the match starts. */
|
||||
|
||||
typedef pcre_jit_stack *(*pcre_jit_callback)(void *);
|
||||
typedef pcre16_jit_stack *(*pcre16_jit_callback)(void *);
|
||||
typedef pcre32_jit_stack *(*pcre32_jit_callback)(void *);
|
||||
|
||||
/* Exported PCRE functions */
|
||||
|
||||
@ -422,83 +520,131 @@ PCRE_EXP_DECL pcre *pcre_compile(const char *, int, const char **, int *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL pcre16 *pcre16_compile(PCRE_SPTR16, int, const char **, int *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL pcre32 *pcre32_compile(PCRE_SPTR32, int, const char **, int *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL pcre *pcre_compile2(const char *, int, int *, const char **,
|
||||
int *, const unsigned char *);
|
||||
PCRE_EXP_DECL pcre16 *pcre16_compile2(PCRE_SPTR16, int, int *, const char **,
|
||||
int *, const unsigned char *);
|
||||
PCRE_EXP_DECL pcre32 *pcre32_compile2(PCRE_SPTR32, int, int *, const char **,
|
||||
int *, const unsigned char *);
|
||||
PCRE_EXP_DECL int pcre_config(int, void *);
|
||||
PCRE_EXP_DECL int pcre16_config(int, void *);
|
||||
PCRE_EXP_DECL int pcre32_config(int, void *);
|
||||
PCRE_EXP_DECL int pcre_copy_named_substring(const pcre *, const char *,
|
||||
int *, int, const char *, char *, int);
|
||||
PCRE_EXP_DECL int pcre16_copy_named_substring(const pcre16 *, PCRE_SPTR16,
|
||||
int *, int, PCRE_SPTR16, PCRE_UCHAR16 *, int);
|
||||
PCRE_EXP_DECL int pcre32_copy_named_substring(const pcre32 *, PCRE_SPTR32,
|
||||
int *, int, PCRE_SPTR32, PCRE_UCHAR32 *, int);
|
||||
PCRE_EXP_DECL int pcre_copy_substring(const char *, int *, int, int,
|
||||
char *, int);
|
||||
PCRE_EXP_DECL int pcre16_copy_substring(PCRE_SPTR16, int *, int, int,
|
||||
PCRE_UCHAR16 *, int);
|
||||
PCRE_EXP_DECL int pcre32_copy_substring(PCRE_SPTR32, int *, int, int,
|
||||
PCRE_UCHAR32 *, int);
|
||||
PCRE_EXP_DECL int pcre_dfa_exec(const pcre *, const pcre_extra *,
|
||||
const char *, int, int, int, int *, int , int *, int);
|
||||
PCRE_EXP_DECL int pcre16_dfa_exec(const pcre16 *, const pcre16_extra *,
|
||||
PCRE_SPTR16, int, int, int, int *, int , int *, int);
|
||||
PCRE_EXP_DECL int pcre32_dfa_exec(const pcre32 *, const pcre32_extra *,
|
||||
PCRE_SPTR32, int, int, int, int *, int , int *, int);
|
||||
PCRE_EXP_DECL int pcre_exec(const pcre *, const pcre_extra *, PCRE_SPTR,
|
||||
int, int, int, int *, int);
|
||||
PCRE_EXP_DECL int pcre16_exec(const pcre16 *, const pcre16_extra *,
|
||||
PCRE_SPTR16, int, int, int, int *, int);
|
||||
PCRE_EXP_DECL int pcre32_exec(const pcre32 *, const pcre32_extra *,
|
||||
PCRE_SPTR32, int, int, int, int *, int);
|
||||
PCRE_EXP_DECL int pcre_jit_exec(const pcre *, const pcre_extra *,
|
||||
PCRE_SPTR, int, int, int, int *, int,
|
||||
pcre_jit_stack *);
|
||||
PCRE_EXP_DECL int pcre16_jit_exec(const pcre16 *, const pcre16_extra *,
|
||||
PCRE_SPTR16, int, int, int, int *, int,
|
||||
pcre16_jit_stack *);
|
||||
PCRE_EXP_DECL int pcre32_jit_exec(const pcre32 *, const pcre32_extra *,
|
||||
PCRE_SPTR32, int, int, int, int *, int,
|
||||
pcre32_jit_stack *);
|
||||
PCRE_EXP_DECL void pcre_free_substring(const char *);
|
||||
PCRE_EXP_DECL void pcre16_free_substring(PCRE_SPTR16);
|
||||
PCRE_EXP_DECL void pcre32_free_substring(PCRE_SPTR32);
|
||||
PCRE_EXP_DECL void pcre_free_substring_list(const char **);
|
||||
PCRE_EXP_DECL void pcre16_free_substring_list(PCRE_SPTR16 *);
|
||||
PCRE_EXP_DECL void pcre32_free_substring_list(PCRE_SPTR32 *);
|
||||
PCRE_EXP_DECL int pcre_fullinfo(const pcre *, const pcre_extra *, int,
|
||||
void *);
|
||||
PCRE_EXP_DECL int pcre16_fullinfo(const pcre16 *, const pcre16_extra *, int,
|
||||
void *);
|
||||
PCRE_EXP_DECL int pcre32_fullinfo(const pcre32 *, const pcre32_extra *, int,
|
||||
void *);
|
||||
PCRE_EXP_DECL int pcre_get_named_substring(const pcre *, const char *,
|
||||
int *, int, const char *, const char **);
|
||||
PCRE_EXP_DECL int pcre16_get_named_substring(const pcre16 *, PCRE_SPTR16,
|
||||
int *, int, PCRE_SPTR16, PCRE_SPTR16 *);
|
||||
PCRE_EXP_DECL int pcre32_get_named_substring(const pcre32 *, PCRE_SPTR32,
|
||||
int *, int, PCRE_SPTR32, PCRE_SPTR32 *);
|
||||
PCRE_EXP_DECL int pcre_get_stringnumber(const pcre *, const char *);
|
||||
PCRE_EXP_DECL int pcre16_get_stringnumber(const pcre16 *, PCRE_SPTR16);
|
||||
PCRE_EXP_DECL int pcre32_get_stringnumber(const pcre32 *, PCRE_SPTR32);
|
||||
PCRE_EXP_DECL int pcre_get_stringtable_entries(const pcre *, const char *,
|
||||
char **, char **);
|
||||
PCRE_EXP_DECL int pcre16_get_stringtable_entries(const pcre16 *, PCRE_SPTR16,
|
||||
PCRE_UCHAR16 **, PCRE_UCHAR16 **);
|
||||
PCRE_EXP_DECL int pcre32_get_stringtable_entries(const pcre32 *, PCRE_SPTR32,
|
||||
PCRE_UCHAR32 **, PCRE_UCHAR32 **);
|
||||
PCRE_EXP_DECL int pcre_get_substring(const char *, int *, int, int,
|
||||
const char **);
|
||||
PCRE_EXP_DECL int pcre16_get_substring(PCRE_SPTR16, int *, int, int,
|
||||
PCRE_SPTR16 *);
|
||||
PCRE_EXP_DECL int pcre32_get_substring(PCRE_SPTR32, int *, int, int,
|
||||
PCRE_SPTR32 *);
|
||||
PCRE_EXP_DECL int pcre_get_substring_list(const char *, int *, int,
|
||||
const char ***);
|
||||
PCRE_EXP_DECL int pcre16_get_substring_list(PCRE_SPTR16, int *, int,
|
||||
PCRE_SPTR16 **);
|
||||
PCRE_EXP_DECL int pcre32_get_substring_list(PCRE_SPTR32, int *, int,
|
||||
PCRE_SPTR32 **);
|
||||
PCRE_EXP_DECL const unsigned char *pcre_maketables(void);
|
||||
PCRE_EXP_DECL const unsigned char *pcre16_maketables(void);
|
||||
PCRE_EXP_DECL const unsigned char *pcre32_maketables(void);
|
||||
PCRE_EXP_DECL int pcre_refcount(pcre *, int);
|
||||
PCRE_EXP_DECL int pcre16_refcount(pcre16 *, int);
|
||||
PCRE_EXP_DECL int pcre32_refcount(pcre32 *, int);
|
||||
PCRE_EXP_DECL pcre_extra *pcre_study(const pcre *, int, const char **);
|
||||
PCRE_EXP_DECL pcre16_extra *pcre16_study(const pcre16 *, int, const char **);
|
||||
PCRE_EXP_DECL pcre32_extra *pcre32_study(const pcre32 *, int, const char **);
|
||||
PCRE_EXP_DECL void pcre_free_study(pcre_extra *);
|
||||
PCRE_EXP_DECL void pcre16_free_study(pcre16_extra *);
|
||||
PCRE_EXP_DECL void pcre32_free_study(pcre32_extra *);
|
||||
PCRE_EXP_DECL const char *pcre_version(void);
|
||||
PCRE_EXP_DECL const char *pcre16_version(void);
|
||||
PCRE_EXP_DECL const char *pcre32_version(void);
|
||||
|
||||
/* Utility functions for byte order swaps. */
|
||||
PCRE_EXP_DECL int pcre_pattern_to_host_byte_order(pcre *, pcre_extra *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL int pcre16_pattern_to_host_byte_order(pcre16 *, pcre16_extra *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL int pcre32_pattern_to_host_byte_order(pcre32 *, pcre32_extra *,
|
||||
const unsigned char *);
|
||||
PCRE_EXP_DECL int pcre16_utf16_to_host_byte_order(PCRE_UCHAR16 *,
|
||||
PCRE_SPTR16, int, int *, int);
|
||||
PCRE_EXP_DECL int pcre32_utf32_to_host_byte_order(PCRE_UCHAR32 *,
|
||||
PCRE_SPTR32, int, int *, int);
|
||||
|
||||
/* JIT compiler related functions. */
|
||||
|
||||
PCRE_EXP_DECL pcre_jit_stack *pcre_jit_stack_alloc(int, int);
|
||||
PCRE_EXP_DECL pcre16_jit_stack *pcre16_jit_stack_alloc(int, int);
|
||||
PCRE_EXP_DECL pcre32_jit_stack *pcre32_jit_stack_alloc(int, int);
|
||||
PCRE_EXP_DECL void pcre_jit_stack_free(pcre_jit_stack *);
|
||||
PCRE_EXP_DECL void pcre16_jit_stack_free(pcre16_jit_stack *);
|
||||
PCRE_EXP_DECL void pcre32_jit_stack_free(pcre32_jit_stack *);
|
||||
PCRE_EXP_DECL void pcre_assign_jit_stack(pcre_extra *,
|
||||
pcre_jit_callback, void *);
|
||||
PCRE_EXP_DECL void pcre16_assign_jit_stack(pcre16_extra *,
|
||||
pcre16_jit_callback, void *);
|
||||
PCRE_EXP_DECL void pcre32_assign_jit_stack(pcre32_extra *,
|
||||
pcre32_jit_callback, void *);
|
||||
|
||||
#ifdef __cplusplus
|
||||
} /* extern "C" */
|
||||
|
@ -20,11 +20,13 @@ and dead code stripping is activated. This leads to link errors. Pulling in the
|
||||
header ensures that the array gets flagged as "someone outside this compilation
|
||||
unit might reference this" and so it will always be supplied to the linker. */
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
const unsigned char _pcre_default_tables[] = {
|
||||
const pcre_uint8 PRIV(default_tables)[] = {
|
||||
|
||||
/* This table is a lower casing table. */
|
||||
|
||||
|
File diff suppressed because it is too large
@ -41,7 +41,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
/* This module contains the external function pcre_config(). */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
/* Keep the original link size. */
|
||||
static int real_link_size = LINK_SIZE;
|
||||
@ -63,18 +65,21 @@ Arguments:
|
||||
Returns: 0 if data returned, negative on error
|
||||
*/
|
||||
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre_config(int what, void *where)
|
||||
#else
|
||||
#elif defined COMPILE_PCRE16
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre16_config(int what, void *where)
|
||||
#elif defined COMPILE_PCRE32
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre32_config(int what, void *where)
|
||||
#endif
|
||||
{
|
||||
switch (what)
|
||||
{
|
||||
case PCRE_CONFIG_UTF8:
|
||||
#if defined COMPILE_PCRE16
|
||||
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
|
||||
*((int *)where) = 0;
|
||||
return PCRE_ERROR_BADOPTION;
|
||||
#else
|
||||
@ -87,7 +92,20 @@ switch (what)
|
||||
#endif
|
||||
|
||||
case PCRE_CONFIG_UTF16:
|
||||
#if defined COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8 || defined COMPILE_PCRE32
|
||||
*((int *)where) = 0;
|
||||
return PCRE_ERROR_BADOPTION;
|
||||
#else
|
||||
#if defined SUPPORT_UTF
|
||||
*((int *)where) = 1;
|
||||
#else
|
||||
*((int *)where) = 0;
|
||||
#endif
|
||||
break;
|
||||
#endif
|
||||
|
||||
case PCRE_CONFIG_UTF32:
|
||||
#if defined COMPILE_PCRE8 || defined COMPILE_PCRE16
|
||||
*((int *)where) = 0;
|
||||
return PCRE_ERROR_BADOPTION;
|
||||
#else
|
||||
|
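The pcre_config() hunks above follow one calling pattern: the caller passes a request code and an output pointer, the function writes the answer through the pointer, and a negative return signals a request that does not apply to the library width in use. The sketch below imitates that pattern for an imagined 8-bit build with UTF support enabled. `ex_config` and the `EX_` constants are hypothetical and the numeric values are chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request codes and error value, modelled on the hunks above. */
#define EX_CONFIG_UTF8      0
#define EX_CONFIG_UTF32     12
#define EX_ERROR_BADOPTION  (-3)

/* Imitates the what/where convention for an imagined 8-bit library build.
   Returns 0 if data was returned, negative on error. */
static int ex_config(int what, void *where)
{
  assert(where != NULL);
  switch (what)
    {
    case EX_CONFIG_UTF8:
    *((int *)where) = 1;          /* supported in this imagined build */
    break;

    case EX_CONFIG_UTF32:         /* wrong width for this library */
    *((int *)where) = 0;
    return EX_ERROR_BADOPTION;

    default:                      /* unknown request: leave *where untouched */
    return EX_ERROR_BADOPTION;
    }
  return 0;
}
```

A caller therefore checks the return value before trusting the output, since a wrong-width request both zeroes the output and reports an error.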
File diff suppressed because it is too large
@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
information about a compiled pattern. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@ -63,14 +65,18 @@ Arguments:
|
||||
Returns: 0 if data returned, negative on error
|
||||
*/
|
||||
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre_fullinfo(const pcre *argument_re, const pcre_extra *extra_data,
|
||||
int what, void *where)
|
||||
#else
|
||||
#elif defined COMPILE_PCRE16
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre16_fullinfo(const pcre16 *argument_re, const pcre16_extra *extra_data,
|
||||
int what, void *where)
|
||||
#elif defined COMPILE_PCRE32
|
||||
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
|
||||
pcre32_fullinfo(const pcre32 *argument_re, const pcre32_extra *extra_data,
|
||||
int what, void *where)
|
||||
#endif
|
||||
{
|
||||
const REAL_PCRE *re = (const REAL_PCRE *)argument_re;
|
||||
@ -130,10 +136,21 @@ switch (what)
|
||||
|
||||
case PCRE_INFO_FIRSTBYTE:
|
||||
*((int *)where) =
|
||||
((re->flags & PCRE_FIRSTSET) != 0)? re->first_char :
|
||||
((re->flags & PCRE_FIRSTSET) != 0)? (int)re->first_char :
|
||||
((re->flags & PCRE_STARTLINE) != 0)? -1 : -2;
|
||||
break;
|
||||
|
||||
case PCRE_INFO_FIRSTCHARACTER:
|
||||
*((pcre_uint32 *)where) =
|
||||
(re->flags & PCRE_FIRSTSET) != 0 ? re->first_char : 0;
|
||||
break;
|
||||
|
||||
case PCRE_INFO_FIRSTCHARACTERFLAGS:
|
||||
*((int *)where) =
|
||||
((re->flags & PCRE_FIRSTSET) != 0) ? 1 :
|
||||
((re->flags & PCRE_STARTLINE) != 0) ? 2 : 0;
|
||||
break;
|
||||
|
||||
/* Make sure we pass back the pointer to the bit vector in the external
|
||||
block, not the internal copy (with flipped integer fields). */
|
||||
|
||||
@ -157,7 +174,17 @@ switch (what)
|
||||
|
||||
case PCRE_INFO_LASTLITERAL:
|
||||
*((int *)where) =
|
||||
((re->flags & PCRE_REQCHSET) != 0)? re->req_char : -1;
|
||||
((re->flags & PCRE_REQCHSET) != 0)? (int)re->req_char : -1;
|
||||
break;
|
||||
|
||||
case PCRE_INFO_REQUIREDCHAR:
|
||||
*((pcre_uint32 *)where) =
|
||||
((re->flags & PCRE_REQCHSET) != 0) ? re->req_char : 0;
|
||||
break;
|
||||
|
||||
case PCRE_INFO_REQUIREDCHARFLAGS:
|
||||
*((int *)where) =
|
||||
((re->flags & PCRE_REQCHSET) != 0);
|
||||
break;
|
||||
|
||||
case PCRE_INFO_NAMEENTRYSIZE:
|
||||
|
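The pcre_fullinfo() hunk above keeps PCRE_INFO_FIRSTBYTE, which packs a character and two negative sentinels into one int, and adds PCRE_INFO_FIRSTCHARACTER plus PCRE_INFO_FIRSTCHARACTERFLAGS, which report the value and the status separately. The sketch below mirrors the three expressions side by side. The `EX_FIRSTSET` and `EX_STARTLINE` bit values are hypothetical, since the real flag values are internal to PCRE. A plausible motivation for the split, not stated in the diff, is that a full 32-bit character cannot share an int with negative sentinels.

```c
#include <assert.h>

/* Hypothetical flag bits standing in for the internal PCRE_FIRSTSET / PCRE_STARTLINE. */
#define EX_FIRSTSET   0x0010u
#define EX_STARTLINE  0x0100u

/* Legacy encoding: the character itself, -1 for start-of-line, -2 otherwise. */
static int ex_firstbyte(unsigned int flags, unsigned int first_char)
{
  return ((flags & EX_FIRSTSET) != 0) ? (int)first_char :
         ((flags & EX_STARTLINE) != 0) ? -1 : -2;
}

/* Newer encoding, value part: the full-width character, 0 when none is set. */
static unsigned int ex_firstcharacter(unsigned int flags, unsigned int first_char)
{
  return ((flags & EX_FIRSTSET) != 0) ? first_char : 0;
}

/* Newer encoding, status part: 1 = character set, 2 = start-of-line, 0 = neither. */
static int ex_firstcharacterflags(unsigned int flags)
{
  return ((flags & EX_FIRSTSET) != 0) ? 1 :
         ((flags & EX_STARTLINE) != 0) ? 2 : 0;
}
```

With the newer pair, a caller reads the flags first and only interprets the character value when the flags result is 1.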
@ -43,7 +43,9 @@ from the subject string after a regex match has succeeded. The original idea
|
||||
for these functions came from Scott Wimer. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@@ -63,12 +65,15 @@ Returns: the number of the named parentheses, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringnumber(const pcre *code, const char *stringname)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringnumber(const pcre16 *code, PCRE_SPTR16 stringname)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringnumber(const pcre32 *code, PCRE_SPTR32 stringname)
#endif
{
int rc;
@@ -96,6 +101,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;

if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif

bot = 0;
while (top > bot)
@@ -130,14 +145,18 @@ Returns: the length of each entry, or a negative number
(PCRE_ERROR_NOSUBSTRING) if not found
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_stringtable_entries(const pcre *code, const char *stringname,
char **firstptr, char **lastptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_stringtable_entries(const pcre16 *code, PCRE_SPTR16 stringname,
PCRE_UCHAR16 **firstptr, PCRE_UCHAR16 **lastptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_stringtable_entries(const pcre32 *code, PCRE_SPTR32 stringname,
PCRE_UCHAR32 **firstptr, PCRE_UCHAR32 **lastptr)
#endif
{
int rc;
@@ -165,6 +184,16 @@ if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0
if ((rc = pcre16_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif
#ifdef COMPILE_PCRE32
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMECOUNT, &top)) != 0)
return rc;
if (top <= 0) return PCRE_ERROR_NOSUBSTRING;

if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMEENTRYSIZE, &entrysize)) != 0)
return rc;
if ((rc = pcre32_fullinfo(code, NULL, PCRE_INFO_NAMETABLE, &nametable)) != 0)
return rc;
#endif

lastentry = nametable + entrysize * (top - 1);
bot = 0;
@@ -190,12 +219,15 @@ while (top > bot)
(pcre_uchar *)(last + entrysize + IMM2_SIZE)) != 0) break;
last += entrysize;
}
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*firstptr = (char *)first;
*lastptr = (char *)last;
#else
#elif defined COMPILE_PCRE16
*firstptr = (PCRE_UCHAR16 *)first;
*lastptr = (PCRE_UCHAR16 *)last;
#elif defined COMPILE_PCRE32
*firstptr = (PCRE_UCHAR32 *)first;
*lastptr = (PCRE_UCHAR32 *)last;
#endif
return entrysize;
}
@@ -224,31 +256,40 @@ Returns: the number of the first that is set,
or a negative number on error
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
static int
get_first_set(const pcre *code, const char *stringname, int *ovector)
#else
#elif defined COMPILE_PCRE16
static int
get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector)
#elif defined COMPILE_PCRE32
static int
get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector)
#endif
{
const REAL_PCRE *re = (const REAL_PCRE *)code;
int entrysize;
pcre_uchar *entry;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
char *first, *last;
#else
#elif defined COMPILE_PCRE16
PCRE_UCHAR16 *first, *last;
#elif defined COMPILE_PCRE32
PCRE_UCHAR32 *first, *last;
#endif

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre_get_stringnumber(code, stringname);
entrysize = pcre_get_stringtable_entries(code, stringname, &first, &last);
#else
#elif defined COMPILE_PCRE16
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre16_get_stringnumber(code, stringname);
entrysize = pcre16_get_stringtable_entries(code, stringname, &first, &last);
#elif defined COMPILE_PCRE32
if ((re->options & PCRE_DUPNAMES) == 0 && (re->flags & PCRE_JCHANGED) == 0)
return pcre32_get_stringnumber(code, stringname);
entrysize = pcre32_get_stringtable_entries(code, stringname, &first, &last);
#endif
if (entrysize <= 0) return entrysize;
for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
@@ -289,14 +330,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, char *buffer, int size)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_UCHAR32 *buffer, int size)
#endif
{
int yield;
@@ -340,24 +385,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_copy_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname,
char *buffer, int size)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_copy_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_UCHAR16 *buffer, int size)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_copy_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_UCHAR32 *buffer, int size)
#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
#else
#elif defined COMPILE_PCRE16
return pcre16_copy_substring(subject, ovector, stringcount, n, buffer, size);
#elif defined COMPILE_PCRE32
return pcre32_copy_substring(subject, ovector, stringcount, n, buffer, size);
#endif
}

@@ -384,14 +436,18 @@ Returns: if successful: 0
PCRE_ERROR_NOMEMORY (-6) failed to get store
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring_list(const char *subject, int *ovector, int stringcount,
const char ***listptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring_list(PCRE_SPTR16 subject, int *ovector, int stringcount,
PCRE_SPTR16 **listptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring_list(PCRE_SPTR32 subject, int *ovector, int stringcount,
PCRE_SPTR32 **listptr)
#endif
{
int i;
@@ -406,10 +462,12 @@ for (i = 0; i < double_count; i += 2)
stringlist = (pcre_uchar **)(PUBL(malloc))(size);
if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*listptr = (const char **)stringlist;
#else
#elif defined COMPILE_PCRE16
*listptr = (PCRE_SPTR16 *)stringlist;
#elif defined COMPILE_PCRE32
*listptr = (PCRE_SPTR32 *)stringlist;
#endif
p = (pcre_uchar *)(stringlist + stringcount + 1);

@@ -440,12 +498,15 @@ Argument: the result of a previous pcre_get_substring_list()
Returns: nothing
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring_list(const char **pointer)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring_list(PCRE_SPTR16 *pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring_list(PCRE_SPTR32 *pointer)
#endif
{
(PUBL(free))((void *)pointer);
@@ -478,14 +539,18 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) substring not present
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_substring(const char *subject, int *ovector, int stringcount,
int stringnumber, const char **stringptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_substring(PCRE_SPTR16 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_substring(PCRE_SPTR32 subject, int *ovector, int stringcount,
int stringnumber, PCRE_SPTR32 *stringptr)
#endif
{
int yield;
@@ -498,10 +563,12 @@ substring = (pcre_uchar *)(PUBL(malloc))(IN_UCHARS(yield + 1));
if (substring == NULL) return PCRE_ERROR_NOMEMORY;
memcpy(substring, subject + ovector[stringnumber], IN_UCHARS(yield));
substring[yield] = 0;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
*stringptr = (const char *)substring;
#else
#elif defined COMPILE_PCRE16
*stringptr = (PCRE_SPTR16)substring;
#elif defined COMPILE_PCRE32
*stringptr = (PCRE_SPTR32)substring;
#endif
return yield;
}
@@ -535,24 +602,31 @@ Returns: if successful:
PCRE_ERROR_NOSUBSTRING (-7) no such captured substring
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_get_named_substring(const pcre *code, const char *subject,
int *ovector, int stringcount, const char *stringname,
const char **stringptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_get_named_substring(const pcre16 *code, PCRE_SPTR16 subject,
int *ovector, int stringcount, PCRE_SPTR16 stringname,
PCRE_SPTR16 *stringptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_get_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
int *ovector, int stringcount, PCRE_SPTR32 stringname,
PCRE_SPTR32 *stringptr)
#endif
{
int n = get_first_set(code, stringname, ovector);
if (n <= 0) return n;
#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
#else
#elif defined COMPILE_PCRE16
return pcre16_get_substring(subject, ovector, stringcount, n, stringptr);
#elif defined COMPILE_PCRE32
return pcre32_get_substring(subject, ovector, stringcount, n, stringptr);
#endif
}

@@ -571,12 +645,15 @@ Argument: the result of a previous pcre_get_substring()
Returns: nothing
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre_free_substring(const char *pointer)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre16_free_substring(PCRE_SPTR16 pointer)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN void PCRE_CALL_CONVENTION
pcre32_free_substring(PCRE_SPTR32 pointer)
#endif
{
(PUBL(free))((void *)pointer);
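The stringnumber hunks in this file show only the edges of the name lookup (the fullinfo queries, then `bot = 0; while (top > bot)`). A rough standalone sketch of such a binary search over a name table is given below. It assumes the 8-bit layout in which every entry is `entrysize` bytes, starts with a two-byte group number (most significant byte first), and continues with a zero-terminated name, with entries sorted by name. The function name toy_stringnumber and the -1 error value are illustrative only and not part of PCRE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up `name` in a sorted table of fixed-size entries.
   Returns the group number, or -1 when the name is absent
   (the real functions return PCRE_ERROR_NOSUBSTRING instead). */
static int toy_stringnumber(const unsigned char *table, int count,
                            int entrysize, const char *name)
{
  int bot = 0, top = count;
  assert(table != NULL && name != NULL && entrysize > 2);
  while (top > bot)
    {
    int mid = (top + bot) / 2;
    const unsigned char *entry = table + entrysize * mid;
    /* The name starts after the two-byte group number. */
    int c = strcmp(name, (const char *)(entry + 2));
    if (c == 0) return (entry[0] << 8) | entry[1];
    if (c > 0) bot = mid + 1; else top = mid;
    }
  return -1;
}
```

In the 16-bit and 32-bit builds the number occupies a single code unit rather than two bytes, which is why the real code compares at an offset of IMM2_SIZE instead of a literal 2.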

@@ -52,7 +52,9 @@ a local function is used.
Also, when compiling for Virtual Pascal, things are done differently, and
global variables are not used. */

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "pcre_internal.h"


File diff suppressed because it is too large

@@ -45,7 +45,9 @@ compilation of dftables.c, in which case the macro DFTABLES is defined. */


#ifndef DFTABLES
# ifdef HAVE_CONFIG_H
# include "config.h"
# endif
# include "pcre_internal.h"
#endif

@@ -64,12 +66,15 @@ Arguments: none
Returns: pointer to the contiguous block of data
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
const unsigned char *
pcre_maketables(void)
#else
#elif defined COMPILE_PCRE16
const unsigned char *
pcre16_maketables(void)
#elif defined COMPILE_PCRE32
const unsigned char *
pcre32_maketables(void)
#endif
{
unsigned char *yield, *p;
@@ -125,7 +130,7 @@ within regexes. */
for (i = 0; i < 256; i++)
{
int x = 0;
if (i != 0x0b && isspace(i)) x += ctype_space;
if (i != CHAR_VT && isspace(i)) x += ctype_space;
if (isalpha(i)) x += ctype_letter;
if (isdigit(i)) x += ctype_digit;
if (isxdigit(i)) x += ctype_xdigit;
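The loop in the hunk above builds a per-character type table and deliberately leaves vertical tab out of the space class. A self-contained sketch of the same idea follows. The bit values TOY_SPACE, TOY_LETTER, TOY_DIGIT and TOY_XDIGIT and the function toy_make_ctypes are hypothetical; the real ctype_* constants are defined in pcre_internal.h, and the real table carries further bits not visible in this hunk.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical bit values standing in for ctype_space and friends. */
#define TOY_SPACE  0x01
#define TOY_LETTER 0x02
#define TOY_DIGIT  0x04
#define TOY_XDIGIT 0x08

/* Fills a 256-entry table using the current C locale, skipping
   vertical tab when marking whitespace, as the hunk above does. */
static void toy_make_ctypes(unsigned char *out)
{
  int i;
  assert(out != NULL);
  for (i = 0; i < 256; i++)
    {
    int x = 0;
    if (i != '\v' && isspace(i)) x += TOY_SPACE;
    if (isalpha(i)) x += TOY_LETTER;
    if (isdigit(i)) x += TOY_DIGIT;
    if (isxdigit(i)) x += TOY_XDIGIT;
    out[i] = (unsigned char)x;
    }
}
```

Using a named constant such as CHAR_VT instead of the literal 0x0b keeps the intent readable; the sketch uses '\v' for the same reason.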

@@ -47,7 +47,9 @@ and NLTYPE_ANY. The full list of Unicode newline characters is taken from
http://unicode.org/unicode/reports/tr18/. */


#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "pcre_internal.h"

@@ -74,7 +76,7 @@ BOOL
PRIV(is_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR endptr, int *lenptr,
BOOL utf)
{
int c;
pcre_uint32 c;
(void)utf;
#ifdef SUPPORT_UTF
if (utf)
@@ -85,11 +87,13 @@ else
#endif /* SUPPORT_UTF */
c = *ptr;

/* Note that this function is called only for ANY or ANYCRLF. */

if (type == NLTYPE_ANYCRLF) switch(c)
{
case 0x000a: *lenptr = 1; return TRUE; /* LF */
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1;
return TRUE; /* CR */
case CHAR_LF: *lenptr = 1; return TRUE;
case CHAR_CR: *lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE;
default: return FALSE;
}

@@ -97,20 +101,29 @@ if (type == NLTYPE_ANYCRLF) switch(c)

else switch(c)
{
case 0x000a: /* LF */
case 0x000b: /* VT */
case 0x000c: *lenptr = 1; return TRUE; /* FF */
case 0x000d: *lenptr = (ptr < endptr - 1 && ptr[1] == 0x0a)? 2 : 1;
return TRUE; /* CR */
#ifdef EBCDIC
case CHAR_NEL:
#endif
case CHAR_LF:
case CHAR_VT:
case CHAR_FF: *lenptr = 1; return TRUE;

case CHAR_CR:
*lenptr = (ptr < endptr - 1 && ptr[1] == CHAR_LF)? 2 : 1;
return TRUE;

#ifndef EBCDIC
#ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */
case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else
case 0x0085: /* NEL */
#else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case CHAR_NEL:
case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */
#endif /* Not EBCDIC */

default: return FALSE;
}
}
@@ -138,7 +151,7 @@ BOOL
PRIV(was_newline)(PCRE_PUCHAR ptr, int type, PCRE_PUCHAR startptr, int *lenptr,
BOOL utf)
{
int c;
pcre_uint32 c;
(void)utf;
ptr--;
#ifdef SUPPORT_UTF
@@ -151,30 +164,45 @@ else
#endif /* SUPPORT_UTF */
c = *ptr;

/* Note that this function is called only for ANY or ANYCRLF. */

if (type == NLTYPE_ANYCRLF) switch(c)
{
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1;
return TRUE; /* LF */
case 0x000d: *lenptr = 1; return TRUE; /* CR */
case CHAR_LF:
*lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
return TRUE;

case CHAR_CR: *lenptr = 1; return TRUE;
default: return FALSE;
}

/* NLTYPE_ANY */

else switch(c)
{
case 0x000a: *lenptr = (ptr > startptr && ptr[-1] == 0x0d)? 2 : 1;
return TRUE; /* LF */
case 0x000b: /* VT */
case 0x000c: /* FF */
case 0x000d: *lenptr = 1; return TRUE; /* CR */
case CHAR_LF:
*lenptr = (ptr > startptr && ptr[-1] == CHAR_CR)? 2 : 1;
return TRUE;

#ifdef EBCDIC
case CHAR_NEL:
#endif
case CHAR_VT:
case CHAR_FF:
case CHAR_CR: *lenptr = 1; return TRUE;

#ifndef EBCDIC
#ifdef COMPILE_PCRE8
case 0x0085: *lenptr = utf? 2 : 1; return TRUE; /* NEL */
case CHAR_NEL: *lenptr = utf? 2 : 1; return TRUE;
case 0x2028: /* LS */
case 0x2029: *lenptr = 3; return TRUE; /* PS */
#else
case 0x0085: /* NEL */
#else /* COMPILE_PCRE16 || COMPILE_PCRE32 */
case CHAR_NEL:
case 0x2028: /* LS */
case 0x2029: *lenptr = 1; return TRUE; /* PS */
#endif /* COMPILE_PCRE8 */
#endif /* NotEBCDIC */

default: return FALSE;
}
}
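The forward newline test in this file distinguishes two modes: ANYCRLF accepts only CR, LF and CRLF, while ANY also accepts VT, FF, NEL and, in the Unicode-aware builds, LS and PS. A reduced sketch for the simplest case (8-bit, non-UTF, ASCII) is shown below. The names toy_is_newline, TOY_ANY and TOY_ANYCRLF are made up for illustration, and the UTF, EBCDIC and wide-character branches of the real function are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for NLTYPE_ANY and NLTYPE_ANYCRLF. */
enum { TOY_ANY, TOY_ANYCRLF };

/* Returns 1 and stores the newline length in *lenptr when ptr points
   at a newline sequence of the requested type, otherwise returns 0.
   Non-UTF 8-bit case only: NEL (0x85) is a single byte here. */
static int toy_is_newline(const unsigned char *ptr, int type,
                          const unsigned char *endptr, int *lenptr)
{
  assert(ptr != NULL && ptr < endptr && lenptr != NULL);
  switch (*ptr)
    {
    case '\n': *lenptr = 1; return 1;
    case '\r':
      /* CR followed by LF counts as one two-character newline. */
      *lenptr = (ptr < endptr - 1 && ptr[1] == '\n') ? 2 : 1;
      return 1;
    case '\v': case '\f': case 0x85:
      if (type == TOY_ANYCRLF) return 0;
      *lenptr = 1; return 1;
    default: return 0;
    }
}
```

The backward-looking counterpart differs only in where it checks for the CRLF pair: it is the LF case that looks at the preceding character.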

@@ -41,17 +41,20 @@ POSSIBILITY OF SUCH DAMAGE.
/* This file contains a private PCRE function that converts an ordinal
character value into a UTF8 string. */

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#define COMPILE_PCRE8

#include "pcre_internal.h"


/*************************************************
* Convert character value to UTF-8 *
*************************************************/

/* This function takes an integer value in the range 0 - 0x10ffff
and encodes it as a UTF-8 character in 1 to 6 pcre_uchars.
and encodes it as a UTF-8 character in 1 to 4 pcre_uchars.

Arguments:
cvalue the character value
@@ -60,6 +63,7 @@ Arguments:
Returns: number of characters placed in the buffer
*/

unsigned
int
PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)
{
@@ -67,11 +71,6 @@ PRIV(ord2utf)(pcre_uint32 cvalue, pcre_uchar *buffer)

register int i, j;

/* Checking invalid cvalue character, encoded as invalid UTF-16 character.
Should never happen in practice. */
if ((cvalue & 0xf800) == 0xd800 || cvalue >= 0x110000)
cvalue = 0xfffe;

for (i = 0; i < PRIV(utf8_table1_size); i++)
if ((int)cvalue <= PRIV(utf8_table1)[i]) break;
buffer += i;
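The updated comment in this file limits the encoder to values up to 0x10ffff, which fit in 1 to 4 UTF-8 code units. A standalone sketch of that encoding is given below. Note that it uses explicit range checks rather than the table-driven loop over utf8_table1 that the real function uses, and the name toy_ord2utf8 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes cvalue (assumed to be at most 0x10ffff) as UTF-8 into
   buffer and returns the number of bytes written (1 to 4). */
static int toy_ord2utf8(uint32_t cvalue, unsigned char *buffer)
{
  assert(buffer != NULL && cvalue <= 0x10ffffu);
  if (cvalue < 0x80)
    {
    buffer[0] = (unsigned char)cvalue;
    return 1;
    }
  if (cvalue < 0x800)
    {
    buffer[0] = (unsigned char)(0xc0 | (cvalue >> 6));
    buffer[1] = (unsigned char)(0x80 | (cvalue & 0x3f));
    return 2;
    }
  if (cvalue < 0x10000)
    {
    buffer[0] = (unsigned char)(0xe0 | (cvalue >> 12));
    buffer[1] = (unsigned char)(0x80 | ((cvalue >> 6) & 0x3f));
    buffer[2] = (unsigned char)(0x80 | (cvalue & 0x3f));
    return 3;
    }
  buffer[0] = (unsigned char)(0xf0 | (cvalue >> 18));
  buffer[1] = (unsigned char)(0x80 | ((cvalue >> 12) & 0x3f));
  buffer[2] = (unsigned char)(0x80 | ((cvalue >> 6) & 0x3f));
  buffer[3] = (unsigned char)(0x80 | (cvalue & 0x3f));
  return 4;
}
```

The sketch does not reject surrogate values (0xd800 to 0xdfff); like the patched function, it relies on the caller to pass a valid code point.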

@@ -44,7 +44,9 @@ pattern data block. This might be helpful in applications where the block is
shared by different users. */


#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "pcre_internal.h"

@@ -66,12 +68,15 @@ Returns: the (possibly updated) count value (a non-negative number), or
a negative error number
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre_refcount(pcre *argument_re, int adjust)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre16_refcount(pcre16 *argument_re, int adjust)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN int PCRE_CALL_CONVENTION
pcre32_refcount(pcre32 *argument_re, int adjust)
#endif
{
REAL_PCRE *re = (REAL_PCRE *)argument_re;

@@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
supporting functions. */


#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "pcre_internal.h"

@@ -96,7 +98,7 @@ for (;;)
{
int d, min;
pcre_uchar *cs, *ce;
register int op = *cc;
register pcre_uchar op = *cc;

switch (op)
{
@@ -321,15 +323,19 @@ for (;;)

/* Check a class for variable quantification */

#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
case OP_XCLASS:
cc += GET(cc, 1) - PRIV(OP_lengths)[OP_CLASS];
/* Fall through */
#endif

case OP_CLASS:
case OP_NCLASS:
#if defined SUPPORT_UTF || defined COMPILE_PCRE16 || defined COMPILE_PCRE32
case OP_XCLASS:
/* The original code caused an unsigned overflow in 64 bit systems,
so now we use a conditional statement. */
if (op == OP_XCLASS)
cc += GET(cc, 1);
else
cc += PRIV(OP_lengths)[OP_CLASS];
#else
cc += PRIV(OP_lengths)[OP_CLASS];
#endif

switch (*cc)
{
@@ -536,7 +542,7 @@ Arguments:
p points to the character
caseless the caseless flag
cd the block with char table pointers
utf TRUE for UTF-8 / UTF-16 mode
utf TRUE for UTF-8 / UTF-16 / UTF-32 mode

Returns: pointer after the character
*/
@@ -545,7 +551,7 @@ static const pcre_uchar *
set_table_bit(pcre_uint8 *start_bits, const pcre_uchar *p, BOOL caseless,
compile_data *cd, BOOL utf)
{
unsigned int c = *p;
pcre_uint32 c = *p;

#ifdef COMPILE_PCRE8
SET_BIT(c);
@@ -562,18 +568,20 @@ if (utf && c > 127)
(void)PRIV(ord2utf)(c, buff);
SET_BIT(buff[0]);
}
#endif
#endif /* Not SUPPORT_UCP */
return p;
}
#endif
#else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */

/* Not UTF-8 mode, or character is less than 127. */

if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1;
#endif
#endif /* COMPILE_PCRE8 */

#ifdef COMPILE_PCRE16
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
if (c > 0xff)
{
c = 0xff;
@@ -593,10 +601,12 @@ if (utf && c > 127)
c = 0xff;
SET_BIT(c);
}
#endif
#endif /* SUPPORT_UCP */
return p;
}
#endif
#else /* Not SUPPORT_UTF */
(void)(utf); /* Stops warning for unused parameter */
#endif /* SUPPORT_UTF */

if (caseless && (cd->ctypes[c] & ctype_letter) != 0) SET_BIT(cd->fcc[c]);
return p + 1;
@@ -626,10 +636,10 @@ Returns: nothing
*/

static void
set_type_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
set_type_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit == 32) return;
@@ -668,10 +678,10 @@ Returns: nothing
*/

static void
set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, int table_limit,
set_nottype_bits(pcre_uint8 *start_bits, int cbit_type, unsigned int table_limit,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
for (c = 0; c < table_limit; c++) start_bits[c] |= ~cd->cbits[c+cbit_type];
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
if (table_limit != 32) for (c = 24; c < 32; c++) start_bits[c] = 0xff;
@@ -695,7 +705,7 @@ function fails unless the result is SSB_DONE.
Arguments:
code points to an expression
start_bits points to a 32-byte table, initialized to 0
utf TRUE if in UTF-8 / UTF-16 mode
utf TRUE if in UTF-8 / UTF-16 / UTF-32 mode
cd the block with char table pointers

Returns: SSB_FAIL => Failed to find any starting bytes
@@ -708,7 +718,7 @@ static int
set_start_bits(const pcre_uchar *code, pcre_uint8 *start_bits, BOOL utf,
compile_data *cd)
{
register int c;
register pcre_uint32 c;
int yield = SSB_DONE;
#if defined SUPPORT_UTF && defined COMPILE_PCRE8
int table_limit = utf? 16:32;
@@ -984,8 +994,8 @@ do
identical. */

case OP_HSPACE:
SET_BIT(0x09);
SET_BIT(0x20);
SET_BIT(CHAR_HT);
SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF
if (utf)
{
@@ -994,46 +1004,46 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */
#endif
#ifdef COMPILE_PCRE16
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
{
#ifndef EBCDIC
SET_BIT(0xA0);
#ifdef COMPILE_PCRE16
#endif /* Not EBCDIC */
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[16|32] */
}
try_next = FALSE;
break;

case OP_ANYNL:
case OP_VSPACE:
SET_BIT(0x0A);
SET_BIT(0x0B);
SET_BIT(0x0C);
SET_BIT(0x0D);
SET_BIT(CHAR_LF);
SET_BIT(CHAR_VT);
SET_BIT(CHAR_FF);
SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF
if (utf)
{
#ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif
#ifdef COMPILE_PCRE16
SET_BIT(0x85);
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(CHAR_NEL);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
{
SET_BIT(0x85);
#ifdef COMPILE_PCRE16
SET_BIT(CHAR_NEL);
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
}
@@ -1056,7 +1066,8 @@ do
break;

/* The cbit_space table has vertical tab as whitespace; we have to
ensure it is set as not whitespace. */
ensure it is set as not whitespace. Luckily, the code value is the same
(0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate bit. */

case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@@ -1064,8 +1075,9 @@ do
try_next = FALSE;
break;

/* The cbit_space table has vertical tab as whitespace; we have to
not set it from the table. */
/* The cbit_space table has vertical tab as whitespace; we have to not
set it from the table. Luckily, the code value is the same (0x0b) in
ASCII and EBCDIC, so we can just adjust the appropriate bit. */

case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */
@@ -1119,8 +1131,8 @@ do
return SSB_FAIL;

case OP_HSPACE:
SET_BIT(0x09);
SET_BIT(0x20);
SET_BIT(CHAR_HT);
SET_BIT(CHAR_SPACE);
#ifdef SUPPORT_UTF
if (utf)
{
@@ -1129,38 +1141,38 @@ do
SET_BIT(0xE1); /* For U+1680, U+180E */
SET_BIT(0xE2); /* For U+2000 - U+200A, U+202F, U+205F */
SET_BIT(0xE3); /* For U+3000 */
#endif
#ifdef COMPILE_PCRE16
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xA0);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE[8|16|32] */
}
else
#endif /* SUPPORT_UTF */
#ifndef EBCDIC
SET_BIT(0xA0);
#endif /* Not EBCDIC */
break;

case OP_ANYNL:
case OP_VSPACE:
SET_BIT(0x0A);
SET_BIT(0x0B);
SET_BIT(0x0C);
SET_BIT(0x0D);
SET_BIT(CHAR_LF);
SET_BIT(CHAR_VT);
SET_BIT(CHAR_FF);
SET_BIT(CHAR_CR);
#ifdef SUPPORT_UTF
if (utf)
{
#ifdef COMPILE_PCRE8
SET_BIT(0xC2); /* For U+0085 */
SET_BIT(0xE2); /* For U+2028, U+2029 */
#endif
#ifdef COMPILE_PCRE16
SET_BIT(0x85);
#elif defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(CHAR_NEL);
SET_BIT(0xFF); /* For characters > 255 */
#endif
#endif /* COMPILE_PCRE16 */
}
else
#endif /* SUPPORT_UTF */
SET_BIT(0x85);
SET_BIT(CHAR_NEL);
break;

case OP_NOT_DIGIT:
@@ -1172,7 +1184,9 @@ do
break;

/* The cbit_space table has vertical tab as whitespace; we have to
ensure it gets set as not whitespace. */
ensure it gets set as not whitespace. Luckily, the code value is the
same (0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate
bit. */

case OP_NOT_WHITESPACE:
set_nottype_bits(start_bits, cbit_space, table_limit, cd);
@@ -1180,7 +1194,8 @@ do
break;

/* The cbit_space table has vertical tab as whitespace; we have to
avoid setting it. */
avoid setting it. Luckily, the code value is the same (0x0b) in ASCII
and EBCDIC, so we can just adjust the appropriate bit. */

case OP_WHITESPACE:
c = start_bits[1]; /* Save in case it was already set */
@@ -1214,7 +1229,7 @@ do
memset(start_bits+25, 0xff, 7); /* Bits for 0xc9 - 0xff */
}
#endif
#ifdef COMPILE_PCRE16
#if defined COMPILE_PCRE16 || defined COMPILE_PCRE32
SET_BIT(0xFF); /* For characters > 255 */
#endif
/* Fall through */
@@ -1310,12 +1325,15 @@ Returns: pointer to a pcre[16]_extra block, with study_data filled in and
NULL on error or if no optimization possible
*/

#ifdef COMPILE_PCRE8
#if defined COMPILE_PCRE8
PCRE_EXP_DEFN pcre_extra * PCRE_CALL_CONVENTION
pcre_study(const pcre *external_re, int options, const char **errorptr)
#else
#elif defined COMPILE_PCRE16
PCRE_EXP_DEFN pcre16_extra * PCRE_CALL_CONVENTION
pcre16_study(const pcre16 *external_re, int options, const char **errorptr)
#elif defined COMPILE_PCRE32
PCRE_EXP_DEFN pcre32_extra * PCRE_CALL_CONVENTION
pcre32_study(const pcre32 *external_re, int options, const char **errorptr)
#endif
{
int min;
@ -1338,10 +1356,12 @@ if (re == NULL || re->magic_number != MAGIC_NUMBER)
|
||||
|
||||
if ((re->flags & PCRE_MODE) == 0)
|
||||
{
|
||||
#ifdef COMPILE_PCRE8
|
||||
*errorptr = "argument is compiled in 16 bit mode";
|
||||
#else
|
||||
*errorptr = "argument is compiled in 8 bit mode";
|
||||
#if defined COMPILE_PCRE8
|
||||
*errorptr = "argument not compiled in 8 bit mode";
|
||||
#elif defined COMPILE_PCRE16
|
||||
*errorptr = "argument not compiled in 16 bit mode";
|
||||
#elif defined COMPILE_PCRE32
|
||||
*errorptr = "argument not compiled in 32 bit mode";
|
||||
#endif
|
||||
return NULL;
|
||||
}
|
||||
@ -1368,14 +1388,18 @@ if ((re->options & PCRE_ANCHORED) == 0 &&
|
||||
|
||||
tables = re->tables;
|
||||
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
if (tables == NULL)
|
||||
(void)pcre_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
|
||||
(void *)(&tables));
|
||||
#else
|
||||
#elif defined COMPILE_PCRE16
|
||||
if (tables == NULL)
|
||||
(void)pcre16_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
|
||||
(void *)(&tables));
|
||||
#elif defined COMPILE_PCRE32
|
||||
if (tables == NULL)
|
||||
(void)pcre32_fullinfo(external_re, NULL, PCRE_INFO_DEFAULT_TABLES,
|
||||
(void *)(&tables));
|
||||
#endif
|
||||
|
||||
compile_block.lcc = tables + lcc_offset;
|
||||
@ -1406,20 +1430,20 @@ switch(min = find_minlength(code, code, re->options, 0))
|
||||
}
|
||||
|
||||
/* If a set of starting bytes has been identified, or if the minimum length is
|
||||
greater than zero, or if JIT optimization has been requested, get a
|
||||
pcre[16]_extra block and a pcre_study_data block. The study data is put in the
|
||||
latter, which is pointed to by the former, which may also get additional data
|
||||
set later by the calling program. At the moment, the size of pcre_study_data
|
||||
is fixed. We nevertheless save it in a field for returning via the
|
||||
pcre_fullinfo() function so that if it becomes variable in the future,
|
||||
we don't have to change that code. */
|
||||
greater than zero, or if JIT optimization has been requested, or if
|
||||
PCRE_STUDY_EXTRA_NEEDED is set, get a pcre[16]_extra block and a
|
||||
pcre_study_data block. The study data is put in the latter, which is pointed to
|
||||
by the former, which may also get additional data set later by the calling
|
||||
program. At the moment, the size of pcre_study_data is fixed. We nevertheless
|
||||
save it in a field for returning via the pcre_fullinfo() function so that if it
|
||||
becomes variable in the future, we don't have to change that code. */
|
||||
|
||||
if (bits_set || min > 0
|
||||
if (bits_set || min > 0 || (options & (
|
||||
#ifdef SUPPORT_JIT
|
||||
|| (options & (PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
|
||||
| PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE)) != 0
|
||||
PCRE_STUDY_JIT_COMPILE | PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE |
|
||||
PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE |
|
||||
#endif
|
||||
)
|
||||
PCRE_STUDY_EXTRA_NEEDED)) != 0)
|
||||
{
|
||||
extra = (PUBL(extra) *)(PUBL(malloc))
|
||||
(sizeof(PUBL(extra)) + sizeof(pcre_study_data));
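
The revised condition above can be read as a single predicate: allocate the extra block if any starting bits were found, if the minimum length is positive, or if any of the relevant study options is set, where the JIT options only count in a build with SUPPORT_JIT. The following is a minimal sketch of that predicate, not the library's code. The `DEMO_` flag values, the `jit_supported` parameter (standing in for the `#ifdef SUPPORT_JIT` guard), and the function name are assumptions made for illustration; the real constants are defined in pcre.h.

```c
#include <assert.h>

/* Illustrative flag values only; the real constants are defined in pcre.h. */
#define DEMO_STUDY_JIT_COMPILE               0x0001
#define DEMO_STUDY_JIT_PARTIAL_SOFT_COMPILE  0x0002
#define DEMO_STUDY_JIT_PARTIAL_HARD_COMPILE  0x0004
#define DEMO_STUDY_EXTRA_NEEDED              0x0008

/* Hypothetical helper: returns 1 if an extra block would be allocated.
   jit_supported models whether the build was compiled with SUPPORT_JIT. */
static int needs_extra_block(int bits_set, int min, int options,
                             int jit_supported)
{
  int mask = DEMO_STUDY_EXTRA_NEEDED;

  /* This sketch only knows the four flags defined above. */
  assert((options & ~0x000f) == 0);

  if (jit_supported)
    mask |= DEMO_STUDY_JIT_COMPILE |
            DEMO_STUDY_JIT_PARTIAL_SOFT_COMPILE |
            DEMO_STUDY_JIT_PARTIAL_HARD_COMPILE;

  return bits_set || min > 0 || (options & mask) != 0;
}
```

Building the mask first keeps the final test in one expression, which is the same shape the patched `if` takes once the preprocessor has removed or kept the JIT flags.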
|
||||
@ -1473,7 +1497,8 @@ if (bits_set || min > 0
|
||||
|
||||
/* If JIT support was compiled and requested, attempt the JIT compilation.
|
||||
If no starting bytes were found, and the minimum length is zero, and JIT
|
||||
compilation fails, abandon the extra block and return NULL. */
|
||||
compilation fails, abandon the extra block and return NULL, unless
|
||||
PCRE_STUDY_EXTRA_NEEDED is set. */
|
||||
|
||||
#ifdef SUPPORT_JIT
|
||||
extra->executable_jit = NULL;
|
||||
@ -1484,13 +1509,15 @@ if (bits_set || min > 0
|
||||
if ((options & PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE) != 0)
|
||||
PRIV(jit_compile)(re, extra, JIT_PARTIAL_HARD_COMPILE);
|
||||
|
||||
if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0)
|
||||
if (study->flags == 0 && (extra->flags & PCRE_EXTRA_EXECUTABLE_JIT) == 0 &&
|
||||
(options & PCRE_STUDY_EXTRA_NEEDED) == 0)
|
||||
{
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
pcre_free_study(extra);
|
||||
#endif
|
||||
#ifdef COMPILE_PCRE16
|
||||
#elif defined COMPILE_PCRE16
|
||||
pcre16_free_study(extra);
|
||||
#elif defined COMPILE_PCRE32
|
||||
pcre32_free_study(extra);
|
||||
#endif
|
||||
extra = NULL;
|
||||
}
|
||||
@ -1511,12 +1538,15 @@ Argument: a pointer to the pcre[16]_extra block
|
||||
Returns: nothing
|
||||
*/
|
||||
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
PCRE_EXP_DEFN void
|
||||
pcre_free_study(pcre_extra *extra)
|
||||
#else
|
||||
#elif defined COMPILE_PCRE16
|
||||
PCRE_EXP_DEFN void
|
||||
pcre16_free_study(pcre16_extra *extra)
|
||||
#elif defined COMPILE_PCRE32
|
||||
PCRE_EXP_DEFN void
|
||||
pcre32_free_study(pcre32_extra *extra)
|
||||
#endif
|
||||
{
|
||||
if (extra == NULL)
|
||||
|
@ -45,7 +45,9 @@ uses macros to change their names from _pcre_xxx to xxxx, thereby avoiding name
|
||||
clashes with the library. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@ -56,6 +58,12 @@ the definition is next to the definition of the opcodes in pcre_internal.h. */
|
||||
|
||||
const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
|
||||
|
||||
/* Tables of horizontal and vertical whitespace characters, suitable for
|
||||
adding to classes. */
|
||||
|
||||
const pcre_uint32 PRIV(hspace_list)[] = { HSPACE_LIST };
|
||||
const pcre_uint32 PRIV(vspace_list)[] = { VSPACE_LIST };
|
||||
|
||||
|
||||
|
||||
/*************************************************
|
||||
@ -66,9 +74,9 @@ const pcre_uint8 PRIV(OP_lengths)[] = { OP_LENGTHS };
|
||||
character. */
|
||||
|
||||
#if (defined SUPPORT_UTF && defined COMPILE_PCRE8) \
|
||||
|| (defined PCRE_INCLUDED && defined SUPPORT_PCRE16)
|
||||
|| (defined PCRE_INCLUDED && (defined SUPPORT_PCRE16 || defined SUPPORT_PCRE32))
|
||||
|
||||
/* These tables are also required by pcretest in 16 bit mode. */
|
||||
/* These tables are also required by pcretest in 16- or 32-bit mode. */
|
||||
|
||||
const int PRIV(utf8_table1)[] =
|
||||
{ 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff};
|
||||
@ -90,13 +98,13 @@ const pcre_uint8 PRIV(utf8_table4)[] = {
|
||||
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
|
||||
3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5 };
|
||||
|
||||
#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE16)*/
|
||||
#endif /* (SUPPORT_UTF && COMPILE_PCRE8) || (PCRE_INCLUDED && SUPPORT_PCRE[16|32])*/
|
||||
|
||||
#ifdef SUPPORT_UTF
|
||||
|
||||
/* Table to translate from particular type value to the general value. */
|
||||
|
||||
const int PRIV(ucp_gentype)[] = {
|
||||
const pcre_uint32 PRIV(ucp_gentype)[] = {
|
||||
ucp_C, ucp_C, ucp_C, ucp_C, ucp_C, /* Cc, Cf, Cn, Co, Cs */
|
||||
ucp_L, ucp_L, ucp_L, ucp_L, ucp_L, /* Ll, Lu, Lm, Lo, Lt */
|
||||
ucp_M, ucp_M, ucp_M, /* Mc, Me, Mn */
|
||||
@ -107,6 +115,66 @@ const int PRIV(ucp_gentype)[] = {
|
||||
ucp_Z, ucp_Z, ucp_Z /* Zl, Zp, Zs */
|
||||
};
|
||||
|
||||
/* This table encodes the rules for finding the end of an extended grapheme
|
||||
cluster. Every code point has a grapheme break property which is one of the
|
||||
ucp_gbXX values defined in ucp.h. The 2-dimensional table is indexed by the
|
||||
properties of two adjacent code points. The left property selects a word from
|
||||
the table, and the right property selects a bit from that word like this:
|
||||
|
||||
ucp_gbtable[left-property] & (1 << right-property)
|
||||
|
||||
The value is non-zero if a grapheme break is NOT permitted between the relevant
|
||||
two code points. The breaking rules are as follows:
|
||||
|
||||
1. Break at the start and end of text (pretty obviously).
|
||||
|
||||
2. Do not break between a CR and LF; otherwise, break before and after
|
||||
controls.
|
||||
|
||||
3. Do not break Hangul syllable sequences, the rules for which are:
|
||||
|
||||
L may be followed by L, V, LV or LVT
|
||||
LV or V may be followed by V or T
|
||||
LVT or T may be followed by T
|
||||
|
||||
4. Do not break before extending characters.
|
||||
|
||||
The next two rules are only for extended grapheme clusters (but that's what we
|
||||
are implementing).
|
||||
|
||||
5. Do not break before SpacingMarks.
|
||||
|
||||
6. Do not break after Prepend characters.
|
||||
|
||||
7. Otherwise, break everywhere.
|
||||
*/
|
||||
|
||||
const pcre_uint32 PRIV(ucp_gbtable[]) = {
|
||||
(1<<ucp_gbLF), /* 0 CR */
|
||||
0, /* 1 LF */
|
||||
0, /* 2 Control */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 3 Extend */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbPrepend)| /* 4 Prepend */
|
||||
(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)|
|
||||
(1<<ucp_gbV)|(1<<ucp_gbT)|(1<<ucp_gbLV)|
|
||||
(1<<ucp_gbLVT)|(1<<ucp_gbOther),
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 5 SpacingMark */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)| /* 6 L */
|
||||
(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT),
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 7 V */
|
||||
(1<<ucp_gbT),
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 8 T */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 9 LV */
|
||||
(1<<ucp_gbT),
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbT), /* 10 LVT */
|
||||
(1<<ucp_gbRegionalIndicator), /* 11 RegionalIndicator */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark) /* 12 Other */
|
||||
};
|
||||
|
||||
#ifdef SUPPORT_JIT
|
||||
/* This table reverses PRIV(ucp_gentype). We can save the cost
|
||||
of a memory load. */
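
The grapheme break table added in this hunk is indexed by the left property, then tested with a bit selected by the right property. The following is a minimal sketch of that lookup and of how it could be used to find the end of a cluster. The `GB_` enum, the `gbtable` copy, and both function names are hypothetical stand-ins for the `ucp_gb` values in ucp.h and for `PRIV(ucp_gbtable)`; the numbering simply follows the row comments in the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property numbering, following the row comments in the table. */
enum gb { GB_CR, GB_LF, GB_CONTROL, GB_EXTEND, GB_PREPEND, GB_SPACINGMARK,
          GB_L, GB_V, GB_T, GB_LV, GB_LVT, GB_REGIONALINDICATOR, GB_OTHER,
          GB_COUNT };

#define BIT(p) (UINT32_C(1) << (p))

/* Local copy of the table contents shown in the diff. */
static const uint32_t gbtable[GB_COUNT] = {
  BIT(GB_LF),                                              /* CR */
  0,                                                       /* LF */
  0,                                                       /* Control */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK),                      /* Extend */
  BIT(GB_EXTEND)|BIT(GB_PREPEND)|BIT(GB_SPACINGMARK)|      /* Prepend */
    BIT(GB_L)|BIT(GB_V)|BIT(GB_T)|BIT(GB_LV)|BIT(GB_LVT)|BIT(GB_OTHER),
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK),                      /* SpacingMark */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)|BIT(GB_L)|            /* L */
    BIT(GB_V)|BIT(GB_LV)|BIT(GB_LVT),
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)|BIT(GB_V)|BIT(GB_T),  /* V */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)|BIT(GB_T),            /* T */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)|BIT(GB_V)|BIT(GB_T),  /* LV */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)|BIT(GB_T),            /* LVT */
  BIT(GB_REGIONALINDICATOR),                               /* RegionalIndicator */
  BIT(GB_EXTEND)|BIT(GB_SPACINGMARK)                       /* Other */
};

/* Non-zero if a grapheme break is NOT permitted between left and right. */
static int no_break_between(enum gb left, enum gb right)
{
  assert((int)left < GB_COUNT && (int)right < GB_COUNT);
  return (gbtable[left] & BIT(right)) != 0;
}

/* Number of code points in the first extended grapheme cluster of props[0..n). */
static size_t first_cluster_length(const enum gb *props, size_t n)
{
  size_t i;
  if (n == 0) return 0;
  for (i = 1; i < n; i++)
    if (!no_break_between(props[i - 1], props[i])) break;
  return i;
}
```

Because each row is a 32-bit word, one load and one mask answer the "may I break here?" question for any pair of adjacent code points, which is why the table is stored as `pcre_uint32`.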
|
||||
|
File diff suppressed because it is too large
@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
strings. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@ -90,6 +92,7 @@ PCRE_UTF8_ERR18 Overlong 5-byte sequence (won't ever occur)
|
||||
PCRE_UTF8_ERR19 Overlong 6-byte sequence (won't ever occur)
|
||||
PCRE_UTF8_ERR20 Isolated 0x80 byte (not within UTF-8 character)
|
||||
PCRE_UTF8_ERR21 Byte with the illegal value 0xfe or 0xff
|
||||
PCRE_UTF8_ERR22 Non-character
|
||||
|
||||
Arguments:
|
||||
string points to the string
|
||||
@ -114,7 +117,8 @@ if (length < 0)
|
||||
|
||||
for (p = string; length-- > 0; p++)
|
||||
{
|
||||
register int ab, c, d;
|
||||
register pcre_uchar ab, c, d;
|
||||
pcre_uint32 v = 0;
|
||||
|
||||
c = *p;
|
||||
if (c < 128) continue; /* ASCII character */
|
||||
@ -183,6 +187,7 @@ for (p = string; length-- > 0; p++)
|
||||
*erroroffset = (int)(p - string) - 2;
|
||||
return PCRE_UTF8_ERR14;
|
||||
}
|
||||
v = ((c & 0x0f) << 12) | ((d & 0x3f) << 6) | (*p & 0x3f);
|
||||
break;
|
||||
|
||||
/* 4-byte character. Check 3rd and 4th bytes for 0x80. Then check first 2
|
||||
@ -210,6 +215,7 @@ for (p = string; length-- > 0; p++)
|
||||
*erroroffset = (int)(p - string) - 3;
|
||||
return PCRE_UTF8_ERR13;
|
||||
}
|
||||
v = ((c & 0x07) << 18) | ((d & 0x3f) << 12) | ((p[-1] & 0x3f) << 6) | (*p & 0x3f);
|
||||
break;
|
||||
|
||||
/* 5-byte and 6-byte characters are not allowed by RFC 3629, and will be
|
||||
@ -284,11 +290,20 @@ for (p = string; length-- > 0; p++)
|
||||
*erroroffset = (int)(p - string) - ab;
|
||||
return (ab == 4)? PCRE_UTF8_ERR11 : PCRE_UTF8_ERR12;
|
||||
}
|
||||
|
||||
/* Reject non-characters. The pointer p is currently at the last byte of the
|
||||
character. */
|
||||
if ((v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef))
|
||||
{
|
||||
*erroroffset = (int)(p - string) - ab;
|
||||
return PCRE_UTF8_ERR22;
|
||||
}
|
||||
}
|
||||
|
||||
#else /* SUPPORT_UTF */
|
||||
#else /* Not SUPPORT_UTF */
|
||||
(void)(string); /* Keep picky compilers happy */
|
||||
(void)(length);
|
||||
(void)(erroroffset);
|
||||
#endif
|
||||
|
||||
return PCRE_UTF8_ERR0; /* This indicates success */
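
The hunks above make the validator assemble the code point value `v` for 3-byte and 4-byte sequences and then reject non-characters with PCRE_UTF8_ERR22. The following is a minimal sketch of those two steps for the 3-byte case, using the same shift-and-mask expression and the same non-character test as the patch. The function names are hypothetical, and unlike the library code this sketch assumes the bytes have already been checked for well-formedness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode an already validated 3-byte UTF-8 sequence. */
static uint32_t decode_utf8_3byte(const unsigned char *s)
{
  assert((s[0] & 0xf0) == 0xe0);   /* lead byte must be 1110xxxx */
  return ((uint32_t)(s[0] & 0x0f) << 12) |
         ((uint32_t)(s[1] & 0x3f) << 6) |
          (uint32_t)(s[2] & 0x3f);
}

/* Same test as in the patch: U+xxFFFE and U+xxFFFF in every plane,
   plus the block U+FDD0..U+FDEF. */
static int is_noncharacter(uint32_t v)
{
  return (v & 0xfffeu) == 0xfffeu || (v >= 0xfdd0 && v <= 0xfdef);
}
```

Masking with `0xfffe` drops the lowest bit and the plane number, so a single comparison covers both the FFFE and FFFF code points of all planes.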
|
||||
|
@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
string that identifies the PCRE version that is in use. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@ -77,12 +79,15 @@ I could find no way of detecting that a macro is defined as an empty string at
|
||||
pre-processor time. This hack uses a standard trick for avoiding calling
|
||||
the STRING macro with an empty argument when doing the test. */
|
||||
|
||||
#ifdef COMPILE_PCRE8
|
||||
#if defined COMPILE_PCRE8
|
||||
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
|
||||
pcre_version(void)
|
||||
#else
|
||||
#elif defined COMPILE_PCRE16
|
||||
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
|
||||
pcre16_version(void)
|
||||
#elif defined COMPILE_PCRE32
|
||||
PCRE_EXP_DEFN const char * PCRE_CALL_CONVENTION
|
||||
pcre32_version(void)
|
||||
#endif
|
||||
{
|
||||
return (XSTRING(Z PCRE_PRERELEASE)[1] == 0)?
|
||||
|
@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
class. It is used by both pcre_exec() and pcre_def_exec(). */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
#include "pcre_internal.h"
|
||||
|
||||
@ -62,9 +64,9 @@ Returns: TRUE if character matches, else FALSE
|
||||
*/
|
||||
|
||||
BOOL
|
||||
PRIV(xclass)(int c, const pcre_uchar *data, BOOL utf)
|
||||
PRIV(xclass)(pcre_uint32 c, const pcre_uchar *data, BOOL utf)
|
||||
{
|
||||
int t;
|
||||
pcre_uchar t;
|
||||
BOOL negated = (*data & XCL_NOT) != 0;
|
||||
|
||||
(void)utf;
|
||||
@ -92,7 +94,7 @@ if ((*data++ & XCL_MAP) != 0) data += 32 / sizeof(pcre_uchar);
|
||||
|
||||
while ((t = *data++) != XCL_END)
|
||||
{
|
||||
int x, y;
|
||||
pcre_uint32 x, y;
|
||||
if (t == XCL_SINGLE)
|
||||
{
|
||||
#ifdef SUPPORT_UTF
|
||||
|
@ -42,7 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||
functions. */
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include "config.h"
|
||||
#endif
|
||||
|
||||
|
||||
/* Ensure that the PCREPOSIX_EXP_xxx macros are set appropriately for
|
||||
@ -155,11 +157,12 @@ static const int eint[] = {
|
||||
REG_BADPAT, /* internal error: unknown opcode in find_fixedlength() */
|
||||
REG_BADPAT, /* \N is not supported in a class */
|
||||
REG_BADPAT, /* too many forward references */
|
||||
REG_BADPAT, /* disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) */
|
||||
REG_BADPAT, /* disallowed UTF-8/16/32 code point (>= 0xd800 && <= 0xdfff) */
|
||||
REG_BADPAT, /* invalid UTF-16 string (should not occur) */
|
||||
/* 75 */
|
||||
REG_BADPAT, /* overlong MARK name */
|
||||
REG_BADPAT /* character value in \u.... sequence is too large */
|
||||
REG_BADPAT, /* character value in \u.... sequence is too large */
|
||||
REG_BADPAT /* invalid UTF-32 string (should not occur) */
|
||||
};
|
||||
|
||||
/* Table of texts corresponding to POSIX error codes */
|
||||
@ -257,6 +260,7 @@ const char *errorptr;
|
||||
int erroffset;
|
||||
int errorcode;
|
||||
int options = 0;
|
||||
int re_nsub = 0;
|
||||
|
||||
if ((cflags & REG_ICASE) != 0) options |= PCRE_CASELESS;
|
||||
if ((cflags & REG_NEWLINE) != 0) options |= PCRE_MULTILINE;
|
||||
@ -280,7 +284,8 @@ if (preg->re_pcre == NULL)
|
||||
}
|
||||
|
||||
(void)pcre_fullinfo((const pcre *)preg->re_pcre, NULL, PCRE_INFO_CAPTURECOUNT,
|
||||
&(preg->re_nsub));
|
||||
&re_nsub);
|
||||
preg->re_nsub = (size_t)re_nsub;
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -312,7 +317,7 @@ int *ovector = NULL;
|
||||
int small_ovector[POSIX_MALLOC_THRESHOLD * 3];
|
||||
BOOL allocated_ovector = FALSE;
|
||||
BOOL nosub =
|
||||
(((const pcre *)preg->re_pcre)->options & PCRE_NO_AUTO_CAPTURE) != 0;
|
||||
(REAL_PCRE_OPTIONS((const pcre *)preg->re_pcre) & PCRE_NO_AUTO_CAPTURE) != 0;
|
||||
|
||||
if ((eflags & REG_NOTBOL) != 0) options |= PCRE_NOTBOL;
|
||||
if ((eflags & REG_NOTEOL) != 0) options |= PCRE_NOTEOL;
|
||||
|
39
ext/pcre/pcrelib/testdata/grepoutput
vendored
@ -93,6 +93,7 @@ RC=0
|
||||
---------------------------- Test 13 -----------------------------
|
||||
Here is the pattern again.
|
||||
That time it was on a line by itself.
|
||||
seventeen
|
||||
This line contains pattern not on a line by itself.
|
||||
RC=0
|
||||
---------------------------- Test 14 -----------------------------
|
||||
@ -370,11 +371,11 @@ RC=2
|
||||
---------------------------- Test 34 -----------------------------
|
||||
RC=2
|
||||
---------------------------- Test 35 -----------------------------
|
||||
./testdata/grepinput8
|
||||
./testdata/grepinputx
|
||||
RC=0
|
||||
---------------------------- Test 36 -----------------------------
|
||||
./testdata/grepinput3
|
||||
./testdata/grepinput8
|
||||
./testdata/grepinputx
|
||||
RC=0
|
||||
---------------------------- Test 37 -----------------------------
|
||||
@ -643,6 +644,7 @@ testdata/grepinputv:fox jumps
|
||||
testdata/grepinputx:complete pair
|
||||
testdata/grepinputx:That was a complete pair
|
||||
testdata/grepinputx:complete pair
|
||||
testdata/grepinput3:triple: t7_txt s1_tag s_txt p_tag p_txt o_tag o_txt
|
||||
RC=0
|
||||
---------------------------- Test 85 -----------------------------
|
||||
./testdata/grepinput3:Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
|
||||
@ -668,3 +670,38 @@ RC=0
|
||||
---------------------------- Test 93 -----------------------------
|
||||
The quick brown f |