Commit Graph

55 Commits

Author SHA1 Message Date
Máté Kocsis
ed0f1f04b9
Declare ext/standard constants in stubs - part 8 (#9615) 2022-09-30 13:51:18 +02:00
George Peter Banyard
dd62ec065e
Refactor php_next_utf8_char() to use zend_result 2022-03-13 13:48:21 +00:00
KsaR
01b3fc03c3
Update http->https in license (#6945)
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
2021-05-06 12:16:35 +02:00
Nikita Popov
3e01f5afb1 Replace zend_bool uses with bool
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.

Of course, zend_bool is retained as an alias.
2021-01-15 12:33:06 +01:00
twosee
88355dd338 Constify char * arguments of APIs
Closes GH-5676.
2020-06-08 10:38:45 +02:00
Nikita Popov
c50cfc4d3d Add quiet parameter to internal HTML entities API
In some places, we need to make sure that no warnings are thrown
due to unknown encoding. The error reporting code tried to avoid
this by determining a "safe charset", but this introduces subtle
discrepancies in which charset is picked (normally
internal_encoding takes precedence). Avoid this by suppressing
the warning in the first place.

While here, use the fallback logic to print error messages with
substitution characters more consistently, to avoid skipping
parts of the error message entirely.
2020-05-07 15:46:08 +02:00
Nikita Popov
2bfcd8825c Remove now unnecessary PHP_FUNCTION() declarations 2020-04-03 15:41:41 +02:00
Gabriel Caruso
5d6e923d46
Remove mention of PHP major version in Copyright headers
Closes GH-4732.
2019-09-25 14:51:43 +02:00
Zeev Suraski
38c337f22e Remove year range from copyright notice 2019-01-30 11:00:23 +02:00
Peter Kokot
8d3f8ca12a Remove unused Git attributes ident
The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.

In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.

This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
2018-07-25 00:53:25 +02:00
Xinchen Hui
a6519d0514 year++ 2018-01-02 12:57:58 +08:00
Dmitry Stogov
68dc754998 Avoid string reallocations in html_entity_decode() and htmlspecialchars_decode() 2017-06-06 16:09:26 +03:00
Sammy Kaye Powers
9e29f841ce Update copyright headers to 2017 2017-01-02 09:30:12 -06:00
Lior Kaplan
ed35de784f Merge branch 'PHP-5.6' into PHP-7.0
* PHP-5.6:
  Happy new year (Update copyright to 2016)
2016-01-01 19:48:25 +02:00
Lior Kaplan
49493a2dcf Happy new year (Update copyright to 2016) 2016-01-01 19:21:47 +02:00
Xinchen Hui
fc33f52d8c bump year 2015-01-15 23:27:30 +08:00
Xinchen Hui
0579e8278d bump year 2015-01-15 23:26:37 +08:00
Stanislav Malyshev
b7a7b1a624 trailing whitespace removal 2015-01-10 15:07:38 -08:00
Anatol Belski
bdeb220f48 first shot remove TSRMLS_* things 2014-12-13 23:06:14 +01:00
Johannes Schlüter
d0cb715373 s/PHP 5/PHP 7/ 2014-09-19 18:33:14 +02:00
Dmitry Stogov
40e053e7f3 Use better data structures (incomplete) 2014-02-13 17:54:23 +04:00
Xinchen Hui
c081ce628f Bump year 2014-01-03 11:08:10 +08:00
Xinchen Hui
a666285bc2 Happy New Year 2013-01-01 16:37:09 +08:00
Felipe Pena
8775a37559 - Year++ 2012-01-01 13:15:04 +00:00
Felipe Pena
0203cc3d44 - Year++ 2011-01-01 02:17:06 +00:00
Gustavo André dos Santos Lopes
e69b1ff2c4 - Fixed bug #49687 (utf8_decode vulnerabilities and deficiencies in the number
of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.
2010-10-27 18:13:25 +00:00
Gustavo André dos Santos Lopes
91727cb844 - Completed rewrite of html.c. Except for determine_charset, almost nothing
remains.
- Fixed bug on determine_charset that was preventing correct detection in
  combination with internal mbstring encoding "none", "pass" or "auto".
- Added profiles for entity encode/decode for HTMl 4.01, XHTML 1.0, XML 1.0
  and HTML 5. Added the constants ENT_HTML401, ENT_XML1, ENT_XHTML and
  ENT_HTML5.
- htmlentities()/htmlspecialchars(), when told not to double encode, verify
  the correctness of the existenting entities more thoroughly.
  It is checked whether the numerical entity represents a valid unicode code
  point (number is between 0 and 0x10FFFF). If using the flag ENT_DISALLOWED,
  it is also checked whether that numerical entity is valid in selected
  document. In HTML 4.01, all the numerical entities that represent a Unicode
  code point (< U+10FFFFFF) are valid, but that's not the case with other
  document types. If the entity is not valid, & is encoded to &amp;.
  For named entities, the check is also more thorough. While before the only
  check would be to determine if the entity was constituted by alphanumeric
  characters, now it is checked whether that entity is necessarily defined for
  the target document type. Otherwise, & is encoded to &amp;.
- For html_entity_decode(), only valid numerical and named entities (as defined
  above for htmlentities()/htmlspecialchars() + !double_encode) are decoded.
  But there is in this case one additional check. Entities that represent
  non-SGML or otherwise invalid characters are not decoded. Note that, in
  HTML5, U+000D is a valid literal character, but the entity &#x0D is not
  valid and is therefore not decoded.
- The hash tables lazily created for decoding in html_entity_decode() that were
  added recently were substituted by static hash tables. Instead of 1 hash
  table per encoding, there's only one hash table per document type defined in
  terms of unicode code points. This means that for charsets other than UTF-8
  and ISO-8859-1, a conversion to unicode code points is necessary before
  decoding.
- On the encoding side, the ad hoc ranges of entities of the translation
  tables, which mapped (in general) non-unicode code points to HTML entities
  were replaced by three-stage tables for HTML 4 and HTML 5. This mapping
  tables are defined only in terms of unicode code points, so a conversion
  is necessary for charsets other than UTF-8 and ISO-8859-1. Even so, the
  multi-stage table is much faster than the previous method, by a factor
  of 5; the conversion to unicode is a small penalty because it's just a
  simple table lookup.
  XML 1.0/htmlspecialchars() uses a simple table instead of a three-stage
  table.
- Added the flag ENT_SUBSTITUTE, which makes htmlentities()/htmlspecialchars()
  replace the invalid multibyte sequences with U+FFFD (UTF-8) or &#FFFD;
  (other encodings).
- Added the flag ENT_DISALLOWED. Implements FR #52860. Characters that cannot
  appear literally are replaced by U+FFFD (UTF-8) or &#FFFD; (otherwise).
  An alternative implementation would be to encode those characters into
  numerical entities, but that would only work in HTML 4.01 due to limitations
  on the values of numerical entities in other document types. See also the
  effects on htmlentities()/htmlspecialchars() with !double_encode above.
2010-10-24 15:01:02 +00:00
Sebastian Bergmann
9ba1e81665 sed -i "s#1997-2009#1997-2010#g" **/*.c **/*.h **/*.php 2010-01-03 09:23:27 +00:00
Sebastian Bergmann
08659c2dcd MFH: Bump copyright year, 3 of 3. 2008-12-31 11:15:49 +00:00
Arnaud Le Blanc
18794addbd MFH: Added ENT_IGNORE as a compatibility flag for htmlentities() and
htmlspecialchars() to skip multibyte sequences intead of returning an
empty string (as iconv's //IGNORE). These functions will still never
return an invalid or incomplete multibyte sequence.
Fixes #43896
2008-11-26 03:00:06 +00:00
Sebastian Bergmann
d1dded8751 MFH: Bump copyright year, 2 of 2. 2007-12-31 07:17:19 +00:00
Ilia Alshanetsky
c98cbb6020 [DOC] Added a 4th parameter flag to htmlspecialchars() and htmlentities()
that makes the function not encode existing html entities. The feature is
disabled by default and can be activated by passing FALSE as the 4th param
2007-05-22 12:37:00 +00:00
Sebastian Bergmann
4223aa4d5e MFH: Bump year. 2007-01-01 09:36:18 +00:00
Antony Dovgal
aaf120127e add php_unescape_html_entities() proto to the header
(fixes #39665)
2006-11-28 20:41:07 +00:00
foobar
5bd93221a8 bump year and license version 2006-01-01 12:51:34 +00:00
foobar
23e671a51e - Bumber up year 2005-08-03 14:08:58 +00:00
Ilia Alshanetsky
975ff6f5d5 Added htmlspecialchars_decode() function for fast conversion from
htmlspecialchars() generated entities back to characters.
2005-03-07 19:37:27 +00:00
foobar
ccfc46b0aa - Happy new year and PHP 5 for rest of the files too..
# Should the LICENSE and Zend/LICENSE dates be updated too?
2004-01-08 17:33:29 +00:00
James Cox
f68c7ff249 updating license information in the headers. 2003-06-10 20:04:29 +00:00
Sebastian Bergmann
b506f5c8f8 Bump year. 2002-12-31 16:08:15 +00:00
Sebastian Bergmann
b5d4b5496d Fix ZTS build. 2002-09-26 18:13:32 +00:00
Wez Furlong
a184f5d1d3 * formatting, plus remove some old fopen wrappers 2002-03-16 01:34:52 +00:00
Wez Furlong
0f65280cb5 New PHP streams... 2002-03-15 21:03:08 +00:00
Sebastian Bergmann
90613d2282 Maintain headers. 2002-02-28 08:29:35 +00:00
Sebastian Bergmann
38933514e1 Update headers. 2001-12-11 15:32:16 +00:00
Wez Furlong
d38cba8697 Added charset awareness to htmlentities() and htmlspecialchars(); use an
optional third parameter to specify the charset; otherwise tries to determine
it from the LC_CTYPE locale setting.
2001-05-28 11:00:06 +00:00
Andrei Zmievski
07a5e3fb9c * Made ENT_* defines availabe to other functions.
* The key/variable names in WDDX are now html escaped to not break XML.
@- Fixed WDDX serialization to HTML-escape key/variable names so as not to
@  break the XML packet. (Andrei)
2001-04-25 20:14:29 +00:00
Andi Gutmans
eb6ba01d1c - Fix copyright notices with 2001 2001-02-26 06:11:02 +00:00
Rasmus Lerdorf
d23ad61dc3 Clean up htmlspecialchars/htmlentities inconsistencies.
@Clean up htmlspecialchars/htmlentities inconsistencies. (Rasmus)
2000-09-12 17:22:37 +00:00
David Croft
83513d9580 Changed lots of PHP 3 licence headers to PHP 4, mainly in .h files.
Added a few RCS $Id$ tags.

# Note: I have avoided changing any .h files if the corresponding .c file
# had not already been changed as I am not sure if there are any legal
# issues here. So some extensions still have PHP 3 headers.
2000-07-24 01:40:02 +00:00