/* GNU/Linux native-dependent code for debugging multiple forks.

   Copyright (C) 2005-2024 Free Software Foundation, Inc.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#ifndef LINUX_FORK_H
#define LINUX_FORK_H

struct fork_info;

Fix detach bug when lwp has exited/terminated

When using GDB on native Linux, it can happen that, while attempting
to detach an inferior, the inferior may have exited or been killed,
yet still be in the list of lwps.  Should that happen, the
assert in x86_linux_update_debug_registers in
gdb/nat/x86-linux-dregs.c will trigger.  The line in question looks
like this:

  gdb_assert (lwp_is_stopped (lwp));

For this case, the lwp isn't stopped - it's dead.

The bug which brought this problem to my attention is one in which the
pwntools library uses GDB to debug a process; as the script is
shutting things down, it kills the process that GDB is debugging and
also sends GDB a SIGTERM signal, which causes GDB to detach all
inferiors prior to exiting.  Here's a link to the bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=2192169

The following shell command mimics part of what the pwntools
reproducer script does (with regard to shutting things down), but
reproduces the bug much less reliably.  I have found it necessary to
run the command a bunch of times before seeing the bug.  (I usually
see it within 5-10 repetitions.)  If you choose to try this command,
make sure that you have no running "cat" or "gdb" processes first!

  cat </dev/zero >/dev/null & \
  (sleep 5; (kill -KILL `pgrep cat` & kill -TERM `pgrep gdb`)) & \
  sleep 1 ; \
  gdb -q -iex 'set debuginfod enabled off' -ex 'set height 0' \
      -ex c /usr/bin/cat `pgrep cat`

So, basically, the idea here is to kill both gdb and cat at roughly
the same time.  If we happen to attempt the detach before the process
lwp has been deleted from GDB's (Linux native) LWP data structures,
then the assert will trigger.  The relevant part of the backtrace
looks like this:

  #8  0x00000000008a83ae in x86_linux_update_debug_registers (lwp=0x1873280)
      at gdb/nat/x86-linux-dregs.c:146
  #9  0x00000000008a862f in x86_linux_prepare_to_resume (lwp=0x1873280)
      at gdb/nat/x86-linux.c:81
  #10 0x000000000048ea42 in x86_linux_nat_target::low_prepare_to_resume (
      this=0x121eee0 <the_amd64_linux_nat_target>, lwp=0x1873280)
      at gdb/x86-linux-nat.h:70
  #11 0x000000000081a452 in detach_one_lwp (lp=0x1873280, signo_p=0x7fff8ca3441c)
      at gdb/linux-nat.c:1374
  #12 0x000000000081a85f in linux_nat_target::detach (
      this=0x121eee0 <the_amd64_linux_nat_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-nat.c:1450
  #13 0x000000000083a23b in thread_db_target::detach (
      this=0x1206ae0 <the_thread_db_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-thread-db.c:1385
  #14 0x0000000000a66722 in target_detach (inf=0x16e8f70, from_tty=0)
      at gdb/target.c:2526
  #15 0x0000000000a8f0ad in kill_or_detach (inf=0x16e8f70, from_tty=0)
      at gdb/top.c:1659
  #16 0x0000000000a8f4fa in quit_force (exit_arg=0x0, from_tty=0)
      at gdb/top.c:1762
  #17 0x000000000070829c in async_sigterm_handler (arg=0x0)
      at gdb/event-top.c:1141

My colleague, Andrew Burgess, has done some recent work on other
problems with detach.  Upon hearing of this problem, he came up with a
test case which reliably reproduces the problem and tests for a few
other problems as well.  In addition to testing detach when the
inferior has terminated due to a signal, it also tests detach when the
inferior has exited normally.  Andrew observed that the
Linux-native-only "checkpoint" command would be affected too, so the
test also tests those cases when there's an active checkpoint.

The LWP exit / termination case with no checkpoint is handled via
newly added checks of the waitstatus in detach_one_lwp in linux-nat.c.

For the checkpoint detach problem, I chose to pass the lwp_info
to linux_fork_detach in linux-fork.c.  With that in place, suitable
tests were added before attempting a PTRACE_DETACH operation.

I added a few asserts at the beginning of linux_fork_detach and
modified the caller code so that the newly added asserts shouldn't
trigger.  (That's what the 'pid == inferior_ptid.pid' check is about
in gdb/linux-nat.c.)

Lastly, I'll note that the checkpoint code needs some work with regard
to background execution.  This patch doesn't attempt to fix that
problem, but it doesn't make it any worse.  It does slightly improve
the situation with detach because, due to the check noted above,
linux_fork_detach() won't be called for the wrong inferior when there
are multiple inferiors.  (There are at least two other problems with
the checkpoint code when there are multiple inferiors.  See:
https://sourceware.org/bugzilla/show_bug.cgi?id=31065)

This commit also adds a new test,
gdb.base/process-dies-while-detaching.exp.  Andrew Burgess is the
primary author of this test case.  Its design is similar to that of
gdb.threads/main-thread-exit-during-detach.exp, which was also written
by Andrew.

This test checks that GDB correctly handles several cases that can
occur when GDB attempts to detach an inferior process.  The process
can exit or be terminated (e.g. via SIGKILL) prior to GDB's event
loop getting a chance to remove it from GDB's internal data
structures.  To complicate things even more, detach works differently
when a checkpoint (created via GDB's "checkpoint" command) exists for
the inferior.  This test checks all four possibilities: process exit
with no checkpoint, process termination with no checkpoint, process
exit with a checkpoint, and process termination with a checkpoint.

Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
2023-12-03 11:25:31 +08:00
struct lwp_info;
extern void add_fork (pid_t);
extern struct fork_info *find_fork_pid (pid_t);
extern void linux_fork_killall (void);
extern void linux_fork_mourn_inferior (void);
extern void linux_fork_detach (int, lwp_info *);
gdb/
2009-07-02 Pedro Alves <pedro@codesourcery.com>
* linux-nat.c (linux_child_follow_fork): If we're staying attached
to the child process, enable event reporting on it. Don't handle
checkpoints here. Instead, add the child fork to the lwp thread
and inferior lists without clobbering the previous inferior. Let
the thread_db layer learn about a new child process, even if
following the parent.
(linux_nat_switch_fork): Delete lwps of the current inferior only,
instead of clearing the whole list. Use thread_change_ptid to
give the core the illusion the new checkpoint is still the same
inferior. Clear the register cache.
(linux_handle_extended_wait): Handle checkpoints here.
(linux_multi_process): Turn on.
* linux-fork.c (struct fork_info) <pc>: Remove field.
(init_fork_list): Do not delete the checkpoint from the inferior
list (it is not there).
(fork_load_infrun_state): Don't switch inferior_ptid here. Pass
the new checkpoint's ptid to linux_nat_switch_fork.
(fork_save_infrun_state): Make static. Don't store the pc field of
fork_info, it's gone.
(linux_fork_mourn_inferior): Don't delete the checkpoint from the
inferior list, it's not there.
(linux_fork_detach): Ditto.
(delete_fork_command): Replace mention of fork/checkpoint by
checkpoint only.
(detach_fork_command): Likewise. Don't delete the checkpoint from
the inferior list.
(info_forks_command): Adjust.
(restore_detach_fork): Delete.
(checkpointing_pid): New.
(linux_fork_checkpointing_p): New.
(save_detach_fork): Delete.
(checkpoint_command): Delete temp_detach_fork. Don't remove
breakpoints, that's a nop. Store the pid of the process we're
checkpointing, and use make_cleanup_restore_integer to restore it.
Don't reinsert breakpoints here.
(process_command, fork_command): Delete.
(restart_command): Update comments to only mention checkpoints,
not forks.
(_initialize_linux_fork): Delete "fork", "process", "info forks"
commands.
* linux-fork.h (fork_save_infrun_state, fork_list): Delete
declarations.
(linux_fork_checkpointing_p): Declare.
* cli/cli-cmds.c (killlist): New.
* cli/cli-cmds.h (killlist): Declare.
* gdbcmd.h (killlist): Declare.
* inferior.c: Include "gdbthread.h".
(detach_inferior_command, kill_inferior_command)
(inferior_command): New.
(info_inferiors_command): Allow specifying a specific inferior id.
(_initialize_inferiors): Register "inferior", "kill inferior" and
"detach inferior" commands.
* infcmd.c (_initialize_infcmd): Make "kill" a prefix command.
* gdbthread.h (any_thread_of_process): Declare.
* thread.c (any_thread_of_process): New.
* NEWS: Mention multi-inferior debugging. Mention 'info
inferiors', 'inferior', 'detach inferior' and 'kill inferior' as
new commands.
(Removed commands): New section, mentioning that 'info forks',
'fork', 'process', 'delete fork' and 'detach fork' are now gone.
gdb/testsuite/
2009-07-02 Pedro Alves <pedro@codesourcery.com>
* gdb.base/multi-forks.exp: Only run detach-on-fork tests on
linux. Adjust to use "inferior", "info inferiors", "detach
inferior" and "kill inferior" instead of "restart", "info fork",
"detach fork" and "delete fork".
* gdb.base/ending-run.exp: Spell out "info".
* gdb.base/help.exp: Adjust to use test_prefix_command_help for
the "kill" command.
gdb/doc/
2009-07-02 Pedro Alves <pedro@codesourcery.com>
* gdb.texinfo (Debugging multiple inferiors): Document the
"inferior", "detach inferior" and "kill inferior" commands.
(Debugging Programs with Multiple Processes): Adjust to mention
generic "inferior" commands. Delete mention of "detach fork" and
"delete fork". Cross reference to "Debugging multiple inferiors"
section.
2009-07-03 05:57:28 +08:00
extern int forks_exist_p (void);
extern int linux_fork_checkpointing_p (int);

#endif /* LINUX_FORK_H */