Commit Graph

490 Commits

Author SHA1 Message Date
Ilija Tovilo
9bf119832d
Implement nullsafe ?-> operator
RFC: https://wiki.php.net/rfc/nullsafe_operator

Closes GH-5619.

Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>
2020-07-24 10:05:03 +02:00
Nikita Popov
7a3dcc3e33 Treat namespaced names as single token
Namespace names are now lexed as single tokens of type
T_NAME_QUALIFIED, T_NAME_FULLY_QUALIFIED or T_NAME_RELATIVE.

RFC: https://wiki.php.net/rfc/namespaced_names_as_token

Closes GH-5827.
2020-07-22 12:36:05 +02:00
Ilija Tovilo
9fa1d13301
Implement match expression
RFC: https://wiki.php.net/rfc/match_expression_v2

Closes GH-5371.
2020-07-09 23:52:17 +02:00
Nikita Popov
47cf18ba4e Don't include trailing newline in comment token
Don't include a trailing newline in T_COMMENT tokens, instead leave
it for a following T_WHITESPACE token. The newline does not belong
to the comment logically, and this makes for an ugly special case,
as other tokens do not include trailing newlines.

Whitespace-sensitive tooling will want to either forward or backward
emulate this change.

Closes GH-5182.
2020-06-25 11:25:22 +02:00
Nikita Popov
5571765609 Forbid use of <?= as a semi-reserved identifier
One of the weirdest pieces of PHP code I've ever seen. In terms
of tokens, this gets internally translated to

    use x as y; echo as my_echo;

On master it crashes because this "echo" does not have attached
identifier metadata. Make sure it is added and then reject the
use of "<?=" as an identifier inside zend_lex_tstring.

Fixes oss-fuzz #23547.
2020-06-19 09:29:58 +02:00
Nikita Popov
c7ad8a8738 Initialize indentation_uses_spaces field
This avoids reading a trap representation from _Bool,
but shouldn't matter as far as behavior is concerned.
2020-06-12 11:23:48 +02:00
Nikita Popov
b03cafd19c Fix bug #77966: Cannot alias a method named "namespace"
This is a bit tricky: In this cases we have "namespace as", which
means that we will only recognize "namespace" as an identifier when
the lookahead token is already at the "as". This means that
zend_lex_tstring picks up the wrong identifier.

We solve this by actually assigning the identifier as the semantic
value on the parser stack -- as in almost all cases we will not
actually need the identifier, this is just an (offset, size)
reference, not a copy of the string.

Additionally, we need to teach the lexer feedback mechanism used
by tokenizer TOKEN_PARSE mode to apply feedback to something
other than the very last token. To that purpose we pass through
the token text and check the tokens in reverse order to find the
right one.

Closes GH-5668.
2020-06-08 12:55:14 +02:00
Alex Dowad
80598f1250 Syntax errors caused by unclosed {, [, ( mention specific location
Aside from a few very specific syntax errors for which detailed exceptions are
thrown, generally PHP just emits the default error messages generated by bison on syntax
error. These messages are very uninformative; they just say "Unexpected ... at line ...".

This is most problematic with constructs which can span an arbitrary number of lines, such
as blocks of code delimited by { }, 'if' conditions delimited by ( ), and so on. If a closing
delimiter is missed, the block will run for the entire remainder of the source file (which
could be thousands of lines), and then at the end, a parse error will be thrown with the
dreaded words: "Unexpected end of file".

Therefore, track the positions of opening and closing delimiters and ensure that they match
up correctly. If any mismatch or missing delimiter is detected, immediately throw a parse
error which points the user to the offending line. This is best done in the *lexer* and not
in the parser.

Thanks to Nikita Popov and George Peter Banyard for suggesting improvements.

Fixes bug #79368.
Closes GH-5364.
2020-04-14 11:22:23 +02:00
Máté Kocsis
3709e74b5e
Store default parameter values of internal functions in arg info
Closes GH-5353. From now on, PHP will have reflection information
about default values of parameters of internal functions.

Co-authored-by: Nikita Popov <nikita.ppv@gmail.com>
2020-04-08 18:37:51 +02:00
Nikita Popov
4565a7f269 Fix assertion when (real) is used
And bring back a test for it...
2020-03-16 18:37:01 +01:00
George Peter Banyard
c9db32271a Remove deprecated (real) cast
Closes GH-5220
2020-03-12 15:40:21 +01:00
Nikita Popov
30c2388738 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fixed bug #79062
2020-02-28 17:08:27 +01:00
Nikita Popov
375191aae7 Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fixed bug #79062
2020-02-28 17:07:36 +01:00
Nikita Popov
6c48da9a50 Fixed bug #79062
Back up the doc comment when performing heredoc scanahead.
2020-02-28 17:06:05 +01:00
Nikita Popov
d5c886ab7d Merge branch 'PHP-7.4'
* PHP-7.4:
  Properly propagate url_stat exceptions during include
2019-12-30 22:57:07 +01:00
Nikita Popov
f77747b06c Properly propagate url_stat exceptions during include
Make sure we abort operations early, and that we don't emit
additional warnings or errors if an exception has been thrown.
2019-12-30 22:56:42 +01:00
Nikita Popov
e0a401335d Make "unterminated comment" into a parse error 2019-10-30 11:00:27 +01:00
Nikita Popov
4e563e6c95 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix handling of overflowing invalid octal in tokenizer
2019-10-14 16:37:27 +02:00
Nikita Popov
641f9615cc Fix handling of overflowing invalid octal in tokenizer
If token_get_all() is used, we still need to correctly distinguish
LNUMBER vs DNUMBER here for backwards compatibility.
2019-10-14 16:36:27 +02:00
Nikita Popov
595e8c6773 Merge branch 'PHP-7.4' 2019-10-10 11:17:55 +02:00
Nikita Popov
6878c583b0 Report error if stream_read is not implemented
We need to return -1 in this case. Slightly restructure the code
to avoid unnecessary conditions.
2019-10-10 11:13:10 +02:00
Nikita Popov
e9f4676b3f Merge branch 'PHP-7.4' 2019-10-07 17:31:12 +02:00
Nikita Popov
2f64844495 Fix leak when include fails in a read operation
Usually it will already fail when opening, but reads can also
fail since PHP 7.4, in which case we still need to place the
file handle in open_files to make sure the destructor will run
on it.
2019-10-07 17:29:33 +02:00
Nikita Popov
5050507282 Merge branch 'PHP-7.4' 2019-10-04 22:47:10 +02:00
Nikita Popov
b078ae6c01 Merge branch 'PHP-7.3' into PHP-7.4 2019-10-04 22:46:53 +02:00
Nikita Popov
239e2dda64 Make sure T_ERROR is returned for all lexer exceptions
This originally manifested as a leak in oss-fuzz #18000. The following
is a reduced test case:

    <?php
    [
        5 => 1,
        "foo" > 1,
        "      " => "" == 0
    ];
    <<<BAR
    $x
     BAR;

Because this particular error condition did not return T_ERROR,
EG(exception) was set while performing binary operation constant
evaluation, which checks exceptions for cast failures.

Instead of adding this indirect test case, I'm adding an assertion
that the lexer has to return T_ERROR if EG(exception) is set.
2019-10-04 22:42:14 +02:00
Nikita Popov
d44cf9b539 Replace "unexpected character" warning with ParseError
Closes GH-4767.
2019-10-04 11:28:58 +02:00
Nikita Popov
7e77617533 Merge branch 'PHP-7.4' 2019-09-30 10:42:47 +02:00
Nikita Popov
19e7e4b197 Fixed bug #78604
<?php followed by EOF is valid since PHP 7.4.
2019-09-30 10:41:14 +02:00
Nikita Popov
38387d7cfc Merge branch 'PHP-7.4' 2019-09-28 17:17:36 +02:00
Nikita Popov
1b08ab26f7 Merge branch 'PHP-7.3' into PHP-7.4 2019-09-28 17:17:18 +02:00
Nikita Popov
7df50ef147 Don't throw warnings during heredoc scan-ahead
Otherwise these warnings will turn up twice (or more...)
2019-09-28 17:15:36 +02:00
Nikita Popov
3372a24b99 Merge branch 'PHP-7.4' 2019-09-14 12:10:15 +02:00
Nikita Popov
3f76f9416f Fix double-free on invalid large octal with separators
To clean up the mess here a bit, check for invalid octal digits
with an explicit loop instead of mixing this into the string to
number conversion.

Also clean up some type usage.
2019-09-14 12:10:06 +02:00
Christoph M. Becker
c8ec166cda Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix #78454: Consecutive numeric separators cause OOM error
2019-08-25 22:46:55 +02:00
Theodore Brown
1a78bdab27 Fix #78454: Consecutive numeric separators cause OOM error
Resolves out of memory error when consecutive numeric separators follow a binary/hex literal.
2019-08-25 22:46:18 +02:00
Christoph M. Becker
c53064fbff Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix #78441: Parse error due to heredoc identifier followed by digit
2019-08-21 22:55:25 +02:00
Christoph M. Becker
1eb75f2937 Merge branch 'PHP-7.3' into PHP-7.4
* PHP-7.3:
  Fix #78441: Parse error due to heredoc identifier followed by digit
2019-08-21 22:54:52 +02:00
Christoph M. Becker
310708845f Fix #78441: Parse error due to heredoc identifier followed by digit
Since digits are allowed for identifiers, we have to cater to them as
well.
2019-08-21 22:51:51 +02:00
Nikita Popov
173ebe4c44 Merge branch 'PHP-7.4' 2019-07-24 10:44:40 +02:00
Nikita Popov
d9680272c7 Revert "Drop free_filename field from zend_file_handle"
This reverts commit e0eca26285.

free_filename is used by the wincache extension, restore this
field for PHP 7.4.
2019-07-24 10:43:37 +02:00
Nikita Popov
36db71df47 Merge branch 'PHP-7.4' 2019-07-22 12:28:40 +02:00
Nikita Popov
e41b7f6db4 Deprecate (real) cast 2019-07-22 11:39:52 +02:00
Nikita Popov
c4a6998c62 Merge branch 'PHP-7.4' 2019-07-16 17:45:03 +02:00
Nikita Popov
e0eca26285 Drop free_filename field from zend_file_handle
free_filename was always zero.
2019-07-16 17:07:26 +02:00
Nikita Popov
3faa903d47 Merge branch 'PHP-7.4' 2019-07-16 16:44:46 +02:00
Nikita Popov
49bac9b77b Introduce zend_stream_init_filename()
Avoid more ad-hoc initialization of zend_file_handle structures.
2019-07-16 16:44:37 +02:00
Nikita Popov
ce972ba349 Merge branch 'PHP-7.4' 2019-07-16 11:54:40 +02:00
Nikita Popov
c9acc90186 Support <?php followed by EOF
This is an annoying edge-case for canonicalization.
2019-07-16 11:53:48 +02:00
Nikita Popov
90d5276e1b Merge branch 'PHP-7.4' 2019-07-15 17:29:45 +02:00