Commit Graph

31 Commits

Author SHA1 Message Date
Nikita Popov
e74b84a9fe Fix Intl constructor leaks
Drop the Z_OBJ(return_value) = NULL hack and return status code
from ctor function instead.
2015-04-17 10:33:59 +02:00
Nikita Popov
25affa8a0f Fix leak in transliterator_transliterate() 2015-04-17 10:33:58 +02:00
Dmitry Stogov
9e70d7672d Move zend_object->guards into additional slot of zend_object->properties_table[]. As result size of objects without __get/__set/__unset/__isset magic methods is reduced. 2015-02-04 15:24:13 +03:00
Stanislav Malyshev
b7a7b1a624 trailing whitespace removal 2015-01-10 15:07:38 -08:00
Stanislav Malyshev
82f3d36583 cleanup intl types 2014-12-29 14:06:12 -08:00
Anatol Belski
bdeb220f48 first shot remove TSRMLS_* things 2014-12-13 23:06:14 +01:00
Nikita Popov
a770d29df7 Add smart_str_append for appending zend_strings
Also replaces usages in Zend/ and ext/standard
2014-09-21 20:58:31 +02:00
Nikita Popov
e33f3d3b7c Move smart_str implementation into Zend/
So we can use it there as well...

For now I've retained the zend_smart_str_public.h header, though
it would probably be better to just move that one struct into
zend_types.h.
2014-09-21 20:49:39 +02:00
Johannes Schlüter
d0cb715373 s/PHP 5/PHP 7/ 2014-09-19 18:33:14 +02:00
Anatol Belski
6db8d4f829 's' works with size_t round 3 2014-08-27 20:49:36 +02:00
Anatol Belski
3234480827 first show to make 's' work with size_t 2014-08-27 20:49:31 +02:00
Anatol Belski
c3e3c98ec6 master renames phase 1 2014-08-25 19:24:55 +02:00
Anatol Belski
113e3c1eeb fix zpp 2014-08-21 15:42:48 +02:00
Anatol Belski
063079b62e ported ext/intl, bugfixes to go 2014-08-19 22:57:17 +02:00
Anatol Belski
63d3f0b844 basic macro replacements, all at once 2014-08-19 08:07:31 +02:00
Xinchen Hui
238a3167e6 Fixed valgrind issues 2014-07-31 18:15:47 +08:00
Dmitry Stogov
2ed8a17045 Refactored run_time_cache usage in object handlers 2014-07-07 20:54:31 +04:00
Xinchen Hui
0b12c08363 Fixed object properties init 2014-06-29 15:39:45 +08:00
Xinchen Hui
e4e9e80067 Fixed segfault while starting up 2014-06-28 20:14:12 +08:00
Xinchen Hui
4fbaddb4f8 Refactoring ext/intl (incompleted) 2014-06-28 00:02:50 +08:00
Dmitry Stogov
050d7e38ad Cleanup (1-st round) 2014-04-15 15:40:40 +04:00
Stanislav Malyshev
0c6d903ce7 fix bug #49348 - issue notice on get_property_ptr_ptr when used for read 2013-02-18 20:56:02 -08:00
Gustavo Lopes
befe4ab479 Merge branch 'PHP-5.4'
* PHP-5.4:
  Fixed defective cloning in ext/intl classes
  NEWS for commit 72c807a
  Allow Spoofchecker to be registered on ICU 49.1
  Announce on NEWS change in 1ce572c
2012-08-26 23:42:57 +02:00
Gustavo Lopes
886a50a619 Fixed defective cloning in ext/intl classes
See also bug #62915
2012-08-26 23:42:13 +02:00
Gustavo André dos Santos Lopes
f5b421621d BreakIterator and RuleBasedBreakiterator added
This commit adds wrappers for the classes BreakIterator and
RuleBasedbreakIterator. The C++ ICU classes are described here:
<http://icu-project.org/apiref/icu4c/classBreakIterator.html>
<http://icu-project.org/apiref/icu4c/classRuleBasedBreakIterator.html>

Additionally, a tutorial is available at:
<http://userguide.icu-project.org/boundaryanalysis>

This implementation wraps UTF-8 text in a UText. The text is
iterated without any copying or conversion to UTF-16. There is
also no validation that the input is actually UTF-8; where there
are malformed sequences, the UText will simply U+FFFD.

The class BreakIterator cannot be instantiated directly (has a
private constructor). It provides the interface exposed by the ICU
abstract class with the same name. The PHP class is not abstract
because we may use it to wrap native subclasses of BreakIterator
that we don't know how to wrap. This class includes methods to
move the iterator position to the beginning (first()), to the
end (last()), forward (next()), backwards (previous()), to the
boundary preceding a certain position (preceding()) and following
a certain position (following()) and to obtain the current position
(current()). next() can also be used to advance or recede an
arbitrary number of positions.

BreakIterator also exposes other native methods:
getAvailableLocales(), getLocale() and factory methods to build
several predefined types of BreakIterators: createWordInstance()
for word boundaries, createCharacterInstance() for locale
dependent notions of "characters", createSentenceInstance() for
sentences, createLineInstance() and createTitleInstance() -- for
title casing breaks. These factories currently return
RuleBasedbreakIterators where the names of the rule sets are found
in the ICU data, observing the passed locale (although the locale
is taken into considering there are very few exceptions to the
root rules).

The clone and compare_object PHP object handlers are also
implemented, though the comparison does not yield meaningful results
when used with >, <, >= and <=.

Note that BreakIterator is an iterator only in the sense of the
first 'Iterator' in 'IteratorIterator', i.e., it does not
implement the Iterator interface. The reason is that there is
no sensible implementation for Iterator::key(). Using it for
an ordinal of the current boundary is not feasible because
we are allowed to move to any boundary at any time. It we were
to determine the current ordinal when last() is called we'd
have to traverse the whole input text to find out how many
breaks there were before. Therefore, BreakIterator implements
only Traversable. It can be wrapped in an IteratorIterator,
but the usual warnings apply.

Finally, I added a convenience method to BreakIterator:
getPartsIterator(). This provides an IntlIterator, backed
by the BreakIterator PHP object (i.e. moving the pointer or
changing the text in BreakIterator affects the iterator
and also moving the iterator affects the backing BreakIterator),
which allows traversing the text between each boundary.
This iterator uses the original text to retrieve the text
between two positions, not the code points returned by the
wrapping UText. Therefore, if the text includes invalid code
unit sequences, these invalid sequences will be in the output
of this iterator, not U+FFFD code points.

The class RuleBasedIterator exposes a constructor that allows
building an iterator from arbitrary compiled or non-compiled
rules. The form of these rules in described in the tutorial linked
above. The rest of the methods allow retrieving the rules --
getRules() and getCompiledRules() --, a hash code of the rule set
(hashCode()) and the rules statuses (getRuleStatus() and
getRuleStatusVec()).

Because the RuleBasedBreakIterator constructor may return parse
errors, I reuse the UParseError to text function that was in the
transliterator files. Therefore, I move that function to
intl_error.c.

common_enum.cpp was also changed, mainly to expose previously
static functions. This avoided code duplication when implementing
the BreakIterator iterator and the IntlIterator returned by
BreakIterator::getPartsIterator().
2012-06-04 22:25:07 +02:00
Felipe Pena
783b05326a - Added missing PHP_FE_END/ZEND_FE_END 2011-08-06 01:22:27 +00:00
Felipe Pena
9c289189d3 - Added missing PHP_FE_END/ZEND_FE_END 2011-08-06 01:22:27 +00:00
Felipe Pena
d8f08192ef - Fixed possible efree(NULL) (bug #55296) 2011-08-04 00:59:43 +00:00
Felipe Pena
463de70efd - Fixed possible efree(NULL) (bug #55296) 2011-08-04 00:59:43 +00:00
Felipe Pena
189b042a01 - Fixed compiler warnings 2011-01-12 00:29:59 +00:00
Gustavo André dos Santos Lopes
e283f7a7fe - Added support for ICU Transformations (Transliterator).
- Changes request #52986 to "to be documented".
2010-10-06 18:53:27 +00:00