Handle type-casting in template parameter list when hashing symbols

Due to a logic bug in gdb/cp-support.c:cp_search_name_hash(), GDB may
fail to find a symbol that the user asks for.  See the accompanying
test for a demonstration.

cp_search_name_hash() cannot correctly handle a (demangled) symbol
whose template parameter list begins with a type cast, e.g.:

  foo<(enum_test)0>(int, int)

In this example, the processing logic in cp_search_name_hash() hashes
the "foo<" string instead of "foo".  This is due to faulty logic in the
processing loop, which _keeps_ hashing a '<' char whenever it is
followed by '(', '<', '=', ' ', or the null byte:

---------------------------------------------------------------------
for (const char *string = search_name; *string != '\0'; ++string)
  {
    ...

    if (*string == '(')
      break;

    ...

    /* Ignore template parameter list.  */
    if (string[0] == '<'
        && string[1] != '(' && string[1] != '<' && string[1] != '='
        && string[1] != ' ' && string[1] != '\0')
      break;

    ...
    hash = SYMBOL_HASH_NEXT (hash, *string);
  }
---------------------------------------------------------------------

Ostensibly, this logic strives to bail out of the processing loop as
soon as the beginning of an argument list is encountered, "(int, int)"
in the example, or the beginning of a template parameter list, the
"<(enum_test)0>" in the example.  However, when "string" is pointing
at '<', the following incorrect logic takes precedence:

---------------------------------------------------------------------
for (const char *string = search_name; *string != '\0'; ++string)
  {
    if (*string == '(')
      break;
    ...
    if (string[0] == '<' && string[1] != '(' ...)
      break;

    hash = SYMBOL_HASH_NEXT (hash, *string);
  }
---------------------------------------------------------------------

In "foo<(enum_test)0>(int, int)", the '(' char that is positioned after
the '<' char causes the "if" condition at the end of the loop not to
"break".  As a result, the '<' is considered for hashing and at the
beginning of the next iteration, the loop is exited because "string"
points to '(' char.
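
The failure can be reproduced outside of GDB with a minimal sketch of
the pre-patch loop.  This is not the GDB code: "hashed_prefix_old" is a
hypothetical helper that collects the characters that would be fed to
SYMBOL_HASH_NEXT into a std::string, so that the hashed prefix becomes
visible.  The ABI tag handling is left out, and an early exit on the
null byte is added to keep the sketch safe on trailing spaces.

```cpp
#include <cctype>
#include <string>

/* Hypothetical helper, not part of GDB: return the characters that the
   pre-patch loop would have hashed for SEARCH_NAME.  */
static std::string
hashed_prefix_old (const char *search_name)
{
  std::string hashed;

  for (const char *string = search_name; *string != '\0'; ++string)
    {
      /* Stand-in for skip_spaces ().  */
      while (std::isspace ((unsigned char) *string))
        ++string;

      /* Safety exit added for this sketch only.  */
      if (*string == '\0')
        break;

      /* Beginning of the argument list.  */
      if (*string == '(')
        break;

      /* The faulty check: a '<' followed by '(' is not treated as the
         beginning of a template parameter list.  */
      if (string[0] == '<'
          && string[1] != '(' && string[1] != '<' && string[1] != '='
          && string[1] != ' ' && string[1] != '\0')
        break;

      /* Stand-in for "hash = SYMBOL_HASH_NEXT (hash, *string)".  */
      hashed += *string;
    }

  return hashed;
}
```

With this sketch, "foo<int>(int)" yields "foo", whereas
"foo<(enum_test)0>(int, int)" yields "foo<", so the two names would end
up with different hashes.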

The intention of the "if" condition at the end of the loop body is
evidently to handle cases where the method name is "operator<",
"operator<<", or "operator<=".  While fixing the issue, I've rewritten
the logic to make that explicit.  The complexity of the function
remains O(n).  It is worth mentioning that find_toplevel_char() in the
same file follows the same explicit logic.
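
The explicit handling can be sketched in isolation as follows.  This is
a simplified illustration rather than the committed code:
"hashed_prefix_new", "operator_str" and "operator_len" are hypothetical
names, the characters are collected into a std::string instead of being
passed to SYMBOL_HASH_NEXT, the ABI tag handling is omitted, and an
extra exit on the null byte after skipping spaces is added for safety.

```cpp
#include <cctype>
#include <cstring>
#include <string>

/* Hypothetical stand-ins for CP_OPERATOR_STR and CP_OPERATOR_LEN.  */
static const char operator_str[] = "operator";
static const std::size_t operator_len = sizeof (operator_str) - 1;

/* Hypothetical helper, not part of GDB: return the characters that the
   rewritten loop would hash for SEARCH_NAME.  */
static std::string
hashed_prefix_new (const char *search_name)
{
  std::string hashed;

  for (const char *string = search_name; *string != '\0'; ++string)
    {
      const char *before_skip = string;
      while (std::isspace ((unsigned char) *string))
        ++string;

      /* Safety exit added for this sketch only.  */
      if (*string == '\0')
        break;

      if (*string == '(')
        break;

      /* "operator" counts only at the very beginning of the name or
         right after skipped spaces.  */
      if ((string != before_skip || string == search_name)
          && std::strncmp (string, operator_str, operator_len) == 0)
        {
          hashed.append (string, operator_len);
          string += operator_len;
          while (std::isspace ((unsigned char) *string))
            ++string;

          if (*string == '\0')
            break;

          /* Consume the '<' or "<<" that belongs to the operator.  */
          if (*string == '<')
            {
              hashed += *string;
              if (string[1] == '<')
                hashed += *++string;
              continue;
            }
        }

      /* Any other '<' begins a template parameter list.  */
      if (*string == '<')
        break;

      hashed += *string;
    }

  return hashed;
}
```

Under these assumptions, "foo<(enum_test)0>(int, int)" yields "foo",
while "operator<" and "operator <" both yield "operator<".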

Reviewed-By: Lancelot SIX <lancelot.six@amd.com>
Reviewed-By: Pedro Alves <pedro@palves.net>
Approved-by: Tom Tromey <tom@tromey.com>
Change-Id: I64cbdbe79671e070cc5da465d1cce7989c58074e
Shahab Vahedi 2024-10-25 17:18:24 +02:00
parent 0fb43ab598
commit af3f129d92
3 changed files with 164 additions and 4 deletions

@@ -1706,21 +1706,60 @@ cp_search_name_hash (const char *search_name)
unsigned int hash = 0;
for (const char *string = search_name; *string != '\0'; ++string)
{
const char *before_skip = string;
string = skip_spaces (string);
if (*string == '(')
break;
/* Could it be the beginning of a function name?
If yes, does it begin with the keyword "operator"? */
if ((string != before_skip || string == search_name)
&& (string[0] == 'o' && startswith (string, CP_OPERATOR_STR)))
{
/* Hash the "operator" part. */
for (size_t i = 0; i < CP_OPERATOR_LEN; ++i)
hash = SYMBOL_HASH_NEXT (hash, *string++);
string = skip_spaces (string);
/* If no more data to process, stop right now. This is specially
intended for SEARCH_NAMEs that end with "operator". In such
cases, the whole string is processed and STRING is pointing to a
null-byte. Letting the loop body resume naturally would lead to
a "++string" that causes STRING to point past the null-byte. */
if (string[0] == '\0')
break;
/* "<" and "<<" are sequences of interest here. This covers
"operator{<,<<,<=,<=>}". In the last 2 cases, the "=" and "=>"
parts are handled by the next iterations of the loop like other
input chars. The goal is to process all the operator-related '<'
chars, so that later if a '<' is visited it can be inferred for
sure that it is the beginning of a template parameter list.
STRING is a null-byte terminated string. If string[0] is not
a null-byte, according to the previous check, string[1] is not
past the end of the allocation and can be referenced safely. */
if (string[0] == '<')
{
hash = SYMBOL_HASH_NEXT (hash, *string);
if (string[1] == '<')
hash = SYMBOL_HASH_NEXT (hash, *++string);
continue;
}
}
/* Ignore ABI tags such as "[abi:cxx11]. */
if (*string == '['
&& startswith (string + 1, "abi:")
&& string[5] != ':')
break;
-/* Ignore template parameter lists. */
-if (string[0] == '<'
-&& string[1] != '(' && string[1] != '<' && string[1] != '='
-&& string[1] != ' ' && string[1] != '\0')
+/* Ignore template parameter lists. The likely "operator{<,<<,<=,<=>}"
+are already taken care of. Therefore, any encounter of '<' character
+at this point is related to template lists. */
+if (*string == '<')
break;
hash = SYMBOL_HASH_NEXT (hash, *string);
@@ -1728,6 +1767,44 @@ cp_search_name_hash (const char *search_name)
return hash;
}
#if GDB_SELF_TEST
namespace selftests {
static void
test_cp_search_name_hash ()
{
SELF_CHECK (cp_search_name_hash ("void func<(enum_test)0>(int*, int)")
== cp_search_name_hash ("void func"));
SELF_CHECK (cp_search_name_hash ("operator")
!= cp_search_name_hash ("operator<"));
SELF_CHECK (cp_search_name_hash ("operator")
!= cp_search_name_hash ("operator<<"));
SELF_CHECK (cp_search_name_hash ("operator<")
!= cp_search_name_hash ("operator<<"));
SELF_CHECK (cp_search_name_hash ("operator<")
== cp_search_name_hash ("operator <"));
SELF_CHECK (cp_search_name_hash ("operator")
!= cp_search_name_hash ("foo_operator"));
SELF_CHECK (cp_search_name_hash ("operator")
!= cp_search_name_hash ("operator_foo"));
SELF_CHECK (cp_search_name_hash ("operator<")
!= cp_search_name_hash ("foo_operator"));
SELF_CHECK (cp_search_name_hash ("operator<")
!= cp_search_name_hash ("operator_foo"));
SELF_CHECK (cp_search_name_hash ("operator<<")
!= cp_search_name_hash ("foo_operator"));
SELF_CHECK (cp_search_name_hash ("operator<<")
!= cp_search_name_hash ("operator_foo"));
SELF_CHECK (cp_search_name_hash ("func")
== cp_search_name_hash ("func[abi:cxx11]"));
}
} /* namespace selftests */
#endif /* GDB_SELF_TEST */
/* Helper for cp_symbol_name_matches (i.e., symbol_name_matcher_ftype
implementation for symbol_name_match_type::WILD matching). Split
to a separate function for unit-testing convenience.
@@ -2340,5 +2417,7 @@ display the offending symbol."),
selftests::test_cp_symbol_name_matches);
selftests::register_test ("cp_remove_params",
selftests::test_cp_remove_params);
selftests::register_test ("cp_search_name_hash",
selftests::test_cp_search_name_hash);
#endif
}

@@ -0,0 +1,52 @@
/* This testcase is part of GDB, the GNU debugger.
Copyright 2024 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
/* Construct "foo<(enum_test)*>()" symbols. These symbols can be tricky
to handle, because of having a type cast inside the template parameter
list, the "<(enum_test)*>" part. The "<(" sequence in there can throw
a wrench in "cp_search_name_hash()" function that tries to process
things like "operator<(...)" while ignoring template parameter lists
at the same time.
If a breakpoint can be set on "foo", then all is in good order. */
enum enum_test
{
zero = 0,
one
};
/* A template with a non-type parameter. */
template <enum_test test>
void
foo ()
{
}
int
main ()
{
/* Instantiate a "foo<(enum_test)1>()" symbol explicitly. */
foo<(enum_test)1> ();
/* Some compilers, like g++, transform "enum_test::zero" to
"(enum_test)0". For such compilers, this "foo" instance
would become "foo<(enum_test)0>()". */
foo<enum_test::zero> ();
return 0;
}

@@ -0,0 +1,29 @@
# Copyright 2024 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# Check if gdb can set breakpoint on a function like "foo" while
# the full symbol name is something like "foo<(type)0>()".
require allow_cplus_tests
standard_testfile .cc
if { [prepare_for_testing "failed to prepare" $testfile "$srcfile"\
{debug c++}] } {
return -1
}
gdb_test "break foo" \
"Breakpoint $decimal at $hex: foo\\. \\(2 locations\\)"