binutils-gdb/gdb/testsuite/README
Sergio Durigan Junior fb6a751f5f Improve analysis of racy testcases
This is an initial attempt to introduce some mechanisms to identify
racy testcases present in our testsuite.  As can be seen in previous
discussions, racy tests are really bothersome and cause our BuildBot
to pollute the gdb-testers mailing list with hundreds of
false-positive messages every month.  Hopefully, identifying these
racy tests in advance (and automatically) will contribute to the
reduction of noise traffic to gdb-testers, maybe to the point where we
will be able to send the failure messages directly to the authors of
the commits.

I spent some time trying to decide the best way to tackle this
problem, and decided that there is no silver bullet.  Racy tests are
tricky and it is difficult to catch them, so the best solution I could
find (for now?) is to run our testsuite a number of times in a row,
and then compare the results (i.e., the gdb.sum files generated during
each run).  The more times you run the tests, the more racy tests you
are likely to detect (at the expense of waiting longer and longer).
You can also run the tests in parallel, which makes things faster (and
contributes to catching more racy tests, because your machine will have
fewer resources for each test and some of them are likely to fail when
this happens).  I did some tests on my machine (8-core i7, 16GB RAM),
and running the whole GDB testsuite 5 times using -j6 took 23 minutes.
Not bad.

In order to run the racy test machinery, you need to specify the
RACY_ITER environment variable.  You will assign a number to this
variable, which represents the number of times you want to run the
tests.  So, for example, if you want to run the whole testsuite 3
times in parallel (using 2 cores), you will do:

  make check RACY_ITER=3 -j2

It is also possible to use the TESTS variable and specify which tests
you want to run:

  make check TESTS='gdb.base/default.exp' RACY_ITER=3 -j2

And so on.  The output files will be put in the directory
gdb/testsuite/racy_outputs/.

After make invokes the necessary rules to run the tests, it finally
runs a Python script that will analyze the resulting gdb.sum files.
This Python script will read each file, and construct a series of sets
based on the results of the tests (one set for FAIL's, one for
PASS'es, one for KFAIL's, etc.).  It will then do some set operations
and come up with a list of unique, sorted testcases that are racy.
The algorithm behind this is:

  for state in PASS, FAIL, XFAIL, XPASS, ...; do
    if a test's state in every sumfile is $state; then
      it is not racy
    else
      it is racy
    fi
  done

(The algorithm is actually a bit more complex than that, because it
takes into account other things in order to decide whether the test
should be ignored or not).

IOW, a test must have the same state in every sumfile.

After processing everything, the script prints the racy tests it could
identify on stdout.  I am redirecting this to a file named racy.sum.

Something else that I wasn't sure how to deal with was non-unique
messages in our testsuite.  I decided to do the same thing I do in our
BuildBot: include a unique identifier at the end of the message, like:

  gdb.base/xyz.exp: non-unique message
  gdb.base/xyz.exp: non-unique message <<2>>

This means that you will have to be careful about them when you use
the racy.sum file.

I ran the script several times here, and it did a good job catching
some well-known racy tests.  Overall, I am satisfied with this
approach and I think it will be helpful to have it upstreamed.  I
also intend to extend our BuildBot and create new, specialized
builders that will be responsible for detecting the racy tests every X
number of days.

2016-03-05  Sergio Durigan Junior  <sergiodj@redhat.com>

	* Makefile.in (DEFAULT_RACY_ITER): New variable.
	(CHECK_TARGET_TMP): Likewise.
	(check-single-racy): New rule.
	(check-parallel-racy): Likewise.
	(TEST_TARGETS): Adjust rule to account for RACY_ITER.
	(do-check-parallel-racy): New rule.
	(check-racy/%.exp): Likewise.
	* README (Racy testcases): New section.
	* analyze-racy-logs.py: New file.

This is a collection of tests for GDB.
The file gdb/README contains basic instructions on how to run the
testsuite, while this file documents additional options and controls
that are available. The GDB wiki may also have some pages with ideas
and suggestions.
Running the Testsuite
*********************
There are two ways to run the testsuite and pass additional parameters
to DejaGnu. The first is to do `make check' in the main build
directory, specifying the makefile variable `RUNTESTFLAGS':
make check RUNTESTFLAGS='TRANSCRIPT=y gdb.base/a2-run.exp'
The second is to cd to the testsuite directory and invoke the DejaGnu
`runtest' command directly.
cd testsuite
make site.exp
runtest TRANSCRIPT=y
(The `site.exp' file contains a handful of useful variables like host
and target triplets, and pathnames.)
Parallel testing
****************
If not testing with a remote host (in DejaGnu's sense), you can run
the GDB test suite in a fully parallel mode. In this mode, each .exp
file runs separately, possibly simultaneously with others. The test suite ensures
that all the temporary files created by the test suite do not clash,
by putting them into separate directories. This mode is primarily
intended for use by the Makefile.
For GNU make, the Makefile tries to run the tests in parallel mode if
any -j option is given. For a non-GNU make, tests are not
parallelized.
If RUNTESTFLAGS is not empty, then by default the tests are
serialized. This can be overridden by either using the
`check-parallel' target in the Makefile, or by setting FORCE_PARALLEL
to any non-empty value:
make check-parallel RUNTESTFLAGS="--target_board=native-gdbserver"
make check RUNTESTFLAGS="--target_board=native-gdbserver" FORCE_PARALLEL=1
If you want to use runtest directly instead of using the Makefile, see
the description of GDB_PARALLEL below.
Racy testcases
**************
Sometimes, new testcases are added to the testsuite that are not
entirely deterministic, and can randomly pass or fail. We call them
"racy testcases", and they can be bothersome when one is comparing
different testsuite runs. In order to help identify them, it is
possible to run the tests several times in a row and ask the testsuite
machinery to analyze the results. To do that, you need to specify the
RACY_ITER environment variable to make:
make check RACY_ITER=5 -j4
The value assigned to RACY_ITER represents the number of times you
wish to run the tests in sequence (in the example above, the entire
testsuite will be executed 5 times in a row, in parallel). It is also
possible to check just a specific test:
make check TESTS='gdb.base/default.exp' RACY_ITER=3
One can also decide to call the Makefile rules by hand inside the
gdb/testsuite directory, e.g.:
make check-parallel-racy -j4
In which case the value of the DEFAULT_RACY_ITER variable (inside
gdb/testsuite/Makefile.in) will be used to determine how many
iterations will be run.
After running the tests, you shall see a file named 'racy.sum' in the
gdb/testsuite directory. You can also inspect the generated *.log and
*.sum files by looking into the gdb/testsuite/racy_outputs directory.
If you already have *.sum files generated from previous testsuite runs
and you would like to analyze them without having to run the testsuite
again, you can also use the 'analyze-racy-logs.py' script directly.
It is located in the gdb/testsuite/ directory, and it expects a list
of two or more *.sum files to be provided as its argument. For
example:
./gdb/testsuite/analyze-racy-logs.py testsuite-01/gdb.sum \
testsuite-02/gdb.sum testsuite-03/gdb.sum
The script will output its analysis report to the standard output.
Running the Performance Tests
*****************************
The GDB testsuite includes performance test cases, which are not run
together with the other test cases because they are slow and need a
quiet system. There are two ways to run the performance test cases.
The first is to do `make check-perf' in the main build directory:
make check-perf RUNTESTFLAGS="solib.exp SOLIB_COUNT=8"
The second is to cd to the testsuite directory and invoke the DejaGnu
`runtest' command directly.
cd testsuite
make site.exp
runtest GDB_PERFTEST_MODE=both GDB_PERFTEST_TIMEOUT=4000 --directory=gdb.perf solib.exp SOLIB_COUNT=8
Only "compile", "run" and "both" are valid to GDB_PERFTEST_MODE. They
stand for "compile tests only", "run tests only", and "compile and run
tests" respectively. "both" is the default. GDB_PERFTEST_TIMEOUT
specify the timeout, which is 3000 in default. The result of
performance test is appended in `testsuite/perftest.log'.
Testsuite Parameters
********************
The following parameters are DejaGNU variables that you can set to
affect the testsuite run globally.
TRANSCRIPT
You may find it useful to have a transcript of the commands that the
testsuite sends to GDB, for instance if GDB crashes during the run,
and you want to reconstruct the sequence of commands.
If the DejaGNU variable TRANSCRIPT is set (to any value), each
invocation of GDB during the test run will get a transcript file
written into the DejaGNU output directory. The file will have the
name transcript.<n>, where <n> is an integer. The first line of the
file shows the invocation command with all the options passed to it,
while subsequent lines are the GDB commands. A `make check' might
look like this:
make check RUNTESTFLAGS=TRANSCRIPT=y
The transcript may not be complete, as for instance tests of command
completion may show only partial command lines.
GDB
By default, the testsuite exercises the GDB in the build directory,
but you can set GDB to be a pathname to a different version. For
instance,
make check RUNTESTFLAGS=GDB=/usr/bin/gdb
runs the testsuite on the GDB in /usr/bin.
GDBSERVER
You can set GDBSERVER to be a particular GDBserver of interest, so for
instance
make check RUNTESTFLAGS="GDB=/usr/bin/gdb GDBSERVER=/usr/bin/gdbserver"
checks both the installed GDB and GDBserver.
INTERNAL_GDBFLAGS
Command line options passed to all GDB invocations.
The default is "-nw -nx".
`-nw' disables any of the windowed interfaces.
`-nx' disables ~/.gdbinit, so that it doesn't interfere with
the tests.
This is actually considered an internal variable, and you
won't normally want to change it. However, in some situations,
this may be tweaked as a last resort if the testsuite doesn't
have direct support for the specifics of your environment.
The testsuite does not override a value provided by the user.
As an example, when testing an installed GDB that has been
configured with `--with-system-gdbinit' (as is commonly the case),
you do not want ~/.gdbinit to interfere with tests, but you
may want the system .gdbinit file loaded. As there's no way to
ask the testsuite, or GDB, to load the system gdbinit but
not ~/.gdbinit, a workaround is then to remove `-nx' from
INTERNAL_GDBFLAGS, and point $HOME at a directory without
a .gdbinit. For example:
cd testsuite
HOME=`pwd` runtest \
GDB=/usr/bin/gdb \
GDBSERVER=/usr/bin/gdbserver \
INTERNAL_GDBFLAGS=-nw
GDB_PARALLEL
To use parallel testing mode without using the Makefile, set
GDB_PARALLEL on the runtest command line to "yes". Before starting
the tests, you must ensure that the directories cache, outputs, and
temp in the test suite build directory are either empty or have been
deleted. cache in particular is used to share data across invocations
of runtest, and files there may affect the test results. The Makefile
automatically does these deletions.
FORCE_PARALLEL
Setting FORCE_PARALLEL to any non-empty value forces parallel testing
mode even if RUNTESTFLAGS is not empty.
GDB_INOTIFY
For debugging parallel mode, it is handy to be able to see when a test
case writes to a file outside of its designated output directory.
If you have the inotify-tools package installed, you can set the
GDB_INOTIFY variable on the runtest command line. This will cause the
test suite to watch for parallel-unsafe file creations and report
them, both to stdout and in the test suite log file.
This setting is only meaningful in conjunction with GDB_PARALLEL.
TESTS
This variable is used to specify which set of tests to run.
It is passed to make (not runtest) and its contents are a space separated
list of tests to run.
If using GNU make then the contents are wildcard-expanded using
GNU make's $(wildcard) function. Test paths must be fully specified,
relative to the "testsuite" subdirectory. This allows one to run all
tests in a subdirectory by passing "gdb.subdir/*.exp", or more simply
by using the check-gdb.subdir target in the Makefile.
If for some strange reason one wanted to run all tests that begin with
the letter "d" that is also possible: TESTS="*/d*.exp".
Do not write */*.exp to specify all tests (assuming all tests are only
nested one level deep, which is not necessarily true). This will pick up
.exp files in ancillary directories like "lib" and "config".
Instead write gdb.*/*.exp.
Example:
make -j10 check TESTS="gdb.server/[s-w]*.exp */x*.exp"
If not using GNU make then the value is passed directly to runtest.
If not specified, all tests are run.
READ1
This make (not runtest) variable is used to specify whether the
testsuite preloads the read1.so library into expect. Any non-empty
value means true. See "Race detection" below.
Race detection
**************
The testsuite includes a mechanism that helps detect test races.
For example, say the program running under expect outputs "abcd", and
a test does something like this:
expect {
"a.*c" {
}
"b" {
}
"a" {
}
}
Which case happens to match depends on what expect manages to read
into its internal buffer in one go. If it manages to read three bytes
or more, then the first case matches. If it manages to read two
bytes, then the second case matches. If it manages to read only one
byte, then the third case matches.
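For comparison, a version of this example that only cares about the
complete output could match it with a single pattern; since expect
keeps reading until a pattern matches (or it times out), such a test
is not sensitive to how the bytes happen to arrive:
expect {
"abcd" {
}
}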
To help detect these cases, the race detection mechanism preloads a
library into expect that forces the `read' system call to always
return at most 1 byte.
To enable this, either pass a non-empty value in the READ1 make
variable, or use the check-read1 make target instead of check.
Examples:
make -j10 check-read1 TESTS="*/paginate-*.exp"
make -j10 check READ1="1"
Testsuite Configuration
***********************
It is possible to adjust the behavior of the testsuite by defining
the global variables listed below, either in a `site.exp' file,
or in a board file.
gdb_test_timeout
Defining this variable changes the default timeout duration used
during communication with GDB. More specifically, the global variable
used during testing is `timeout', but this variable gets reset to
`gdb_test_timeout' at the beginning of each testcase, which ensures
that any local change to `timeout' in a testcase does not affect
subsequent testcases.
This global variable comes in handy when the debugger is slower than
normal due to the testing environment, triggering unexpected `TIMEOUT'
test failures. Examples include when testing on a remote machine, or
against a system where communications are slow.
If not specifically defined, this variable gets automatically defined
to the same value as `timeout' during the testsuite initialization.
The default value of the timeout is defined in the file
`testsuite/config/unix.exp' (at least for Unix hosts; board files may
have their own values).
gdb_reverse_timeout
Defining this variable changes the default timeout duration when tests
under the gdb.reverse directory are running. Process record and reverse
debugging are slow enough that their tests can hit unexpected `TIMEOUT'
test failures. This global variable is useful to bump up the value of
`timeout' for the gdb.reverse tests without delaying the rest of the
testsuite when actual failures happen.
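For instance, a `site.exp' or board file could set both variables
along these lines (the values are purely illustrative):
# Illustrative values only; pick what suits your environment.
set gdb_test_timeout 120
set gdb_reverse_timeout 600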
Board Settings
**************
DejaGNU includes the concept of a "board file", which specifies
testing details for a particular target (which are often bare circuit
boards, thus the name).
In the GDB testsuite specifically, the board file may include a
number of "board settings" that test cases may check before deciding
whether to exercise a particular feature. For instance, a board
lacking any I/O devices, or perhaps simply having its I/O devices
not wired up, should set `noinferiorio'.
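For instance, a testcase might guard signal-related tests by checking
the `gdb,nosignals' setting described below, while a board file would
set that entry with `set_board_info'; a minimal sketch:
# In a testcase: skip the work if the board cannot deliver signals.
if [target_info exists gdb,nosignals] {
    unsupported "this board does not support signals"
    return
}
# In a board file:
set_board_info gdb,nosignals 1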
Here are the supported board settings:
gdb,cannot_call_functions
The board does not support inferior call, that is, invoking inferior
functions in GDB.
gdb,can_reverse
The board supports reverse execution.
gdb,no_hardware_watchpoints
The board does not support hardware watchpoints.
gdb,nofileio
GDB is unable to intercept target file operations in remote debugging
and perform them on the host.
gdb,noinferiorio
The board is unable to provide I/O capability to the inferior.
gdb,noresults
A program will not return an exit code or result code (or the value
of the result is undefined, and should not be looked at).
gdb,nosignals
The board does not support signals.
gdb,skip_huge_test
Skip time-consuming tests on boards with a slow connection.
gdb,skip_float_tests
Skip tests related to floating point.
gdb,use_precord
The board supports process record.
gdb_init_command
gdb_init_commands
Commands to send to GDB every time a program is about to be run. The
first of these settings defines a single command as a string. The
second defines a TCL list of commands, each of which is a string. The
commands are sent one by one in sequence, first the one from
`gdb_init_command', if any, followed by the individual commands from
`gdb_init_commands', if any, in the list's order (see the example
after this list).
gdb_server_prog
The location of GDBserver. If a GDBserver other than the one in its
default location is to be used in testing, specify its location in this
variable. The location is a file name for GDBserver, and may be
either absolute or relative to the testsuite subdirectory of the build
directory.
in_proc_agent
The location of the in-process agent (used for fast tracepoints and
other special tests). If the in-process agent of interest is anywhere
other than its default location, set this variable. The location is a
filename, and may be either absolute or relative to the testsuite
subdirectory of the build directory.
noargs
GDB does not support passing arguments to the inferior.
no_long_long
The board does not support type long long.
use_cygmon
The board is running the monitor Cygmon.
use_gdb_stub
The tests are running with a GDB stub.
exit_is_reliable
Set to true if GDB can assume that letting the program run to end
reliably results in program exits being reported as such, as opposed
to, e.g., the program ending in an infinite loop or the board
crashing/resetting. If not set, this defaults to $use_gdb_stub. In
other words, native targets are assumed reliable by default, and
remote stubs assumed unreliable.
gdb,predefined_tsv
The predefined trace state variables the board has.
gdb,no_thread_names
The target doesn't support thread names.
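As an example of the `gdb_init_command' and `gdb_init_commands'
settings described above, a board file might contain something like
the following (the GDB commands shown are only illustrative):
# Hypothetical board file fragment.
set_board_info gdb_init_command "set remotetimeout 10"
set_board_info gdb_init_commands \
    [list "set breakpoint pending on" "set print pretty on"]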
Testsuite Organization
**********************
The testsuite is entirely contained in `gdb/testsuite'. The main
directory of the testsuite includes some makefiles and configury, but
these are minimal, and used for little besides cleaning up, since the
tests themselves handle the compilation of the programs that GDB will
run.
The file `testsuite/lib/gdb.exp' contains common utility procs useful
for all GDB tests, while the directory testsuite/config contains
configuration-specific files, typically used for special-purpose
definitions of procs like `gdb_load' and `gdb_start'.
The tests themselves are to be found in directories named
`testsuite/gdb.*' and subdirectories of those. The names of the test
files must always end with ".exp". DejaGNU collects the test files by
wildcarding in the test directories, so both subdirectories and
individual files typically get chosen and run in alphabetical order.
The following lists some notable types of subdirectories and what they
are for. Since DejaGNU finds test files no matter where they are
located, and since each test file sets up its own compilation and
execution environment, this organization is simply for convenience and
intelligibility.
gdb.base
This is the base testsuite. The tests in it should apply to all
configurations of GDB (but generic native-only tests may live here).
The test programs should be in the subset of C that is both valid
ANSI/ISO C, and C++.
gdb.<lang>
Language-specific tests for any language besides C. Examples are
gdb.cp for C++ and gdb.java for Java.
gdb.<platform>
Non-portable tests. The tests are specific to a specific
configuration (host or target), such as eCos.
gdb.arch
Architecture-specific tests that are (usually) cross-platform.
gdb.<subsystem>
Tests that exercise a specific GDB subsystem in more depth. For
instance, gdb.disasm exercises various disassemblers, while
gdb.stabs tests pathways through the stabs symbol reader.
gdb.perf
GDB performance tests.
Writing Tests
*************
In many areas, the GDB tests are already quite comprehensive; you
should be able to copy existing tests to handle new cases. Be aware
that older tests may use obsolete practices but have not yet been
updated.
You should try to use `gdb_test' whenever possible, since it includes
cases to handle all the unexpected errors that might happen. However,
it doesn't cost anything to add new test procedures; for instance,
gdb.base/exprs.exp defines a `test_expr' that calls `gdb_test'
multiple times.
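For reference, a minimal `gdb_test' call passes the command, a regexp
for the expected output, and the test name (the command and pattern
below are hypothetical):
gdb_test "print 2 + 3" " = 5" "print a simple sum"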
Only use `send_gdb' and `gdb_expect' when absolutely necessary. Even
if GDB has several valid responses to a command, you can use
`gdb_test_multiple'. Like `gdb_test', `gdb_test_multiple' recognizes
internal errors and unexpected prompts.
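A sketch of the usual `gdb_test_multiple' idiom follows; the command,
the two accepted outputs, and the test name are hypothetical:
set test "print the flag"
gdb_test_multiple "print flag" $test {
    -re " = true\r\n$gdb_prompt $" {
        pass $test
    }
    -re " = false\r\n$gdb_prompt $" {
        # Also a valid response; record it under a distinct name.
        pass "$test (flag clear)"
    }
}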
Do not write tests which expect a literal tab character from GDB. On
some operating systems (e.g. OpenBSD) the TTY layer expands tabs to
spaces, so by the time GDB's output reaches `expect' the tab is gone.
The source language programs do *not* need to be in a consistent
style. Since GDB is used to debug programs written in many different
styles, it's worth having a mix of styles in the testsuite; for
instance, some GDB bugs involving the display of source lines might
never manifest themselves if the test programs used GNU coding style
uniformly.
Some testcase results need more detailed explanation:
KFAIL
Use KFAIL for a known problem in GDB itself. You must specify the GDB
bug report number, as in these sample tests:
kfail "gdb/13392" "continue to marker 2"
or
setup_kfail gdb/13392 "*-*-*"
kfail "continue to marker 2"
XFAIL
Short for "expected failure", this indicates a known problem with the
environment. This could include limitations of the operating system,
compiler version, and other components.
This example from gdb.base/attach-pie-misread.exp is a sanity check
for the target environment:
# On x86_64 it is commonly about 4MB.
if {$stub_size > 25000000} {
xfail "stub size $stub_size is too large"
return
}
You should provide the bug report number for the failing component of
the environment, if such a bug report is available, as in this example
referring to a GCC problem:
if {[test_compiler_info {gcc-[0-3]-*}]
|| [test_compiler_info {gcc-4-[0-5]-*}]} {
setup_xfail "gcc/46955" *-*-*
}
gdb_test "python print ttype.template_argument(2)" "&C::c"
Note that it is also acceptable, and often preferable, to avoid
running the test at all. This is the better option if the limitation
is intrinsic to the environment, rather than a bug expected to be
fixed in the near future.