Currently, when remote debugging, if you type Ctrl-C just while the
target stopped for an internal event, and GDB is busy doing something
that takes a while (e.g., fetching chunks of a shared library off of
the target, with vFile, to process ELF headers and debug info), the
Ctrl-C is lost.
The patch hooks up the QUIT macro to a new target method that lets the
target react to the double-Ctrl-C before the event loop is reached,
which allows reacting to a double-Ctrl-C even when GDB is busy doing
some long operation and not waiting for a stop reply. That end result
is:
(gdb) c
Continuing.
^C
^C
Interrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) info threads
Id Target Id Frame
* 1 Thread 11673 0x00007ffff7deb240 in _dl_debug_state () from target:/lib64/ld-linux-x86-64.so.2
(gdb)
If, however, GDB is waiting for a stop reply (because the target has
been resumed, with e.g., vCont;c), but the target isn't responding, we
now get:
(gdb) c
Continuing.
^C
^C
The target is not responding to interrupt requests.
Stop debugging it? (y or n) y
Disconnected from target.
(gdb) info threads
No threads.
This offers to disconnect, because when we're waiting for a stop
reply, there's nothing else we can send the target other than an
interrupt request. And if that doesn't work, there's nothing else we
can do.
The Ctrl-C is presently lost because until we get to a user-visible
stop, the SIGINT handler that is installed is the one that forwards
the interrupt to the remote side, with the \003 "packet" [1]. But,
gdbserver ignores an interrupt request if the program is stopped.
Still, even if it didn't, the server can only report back a
stop-because-of-SIGINT when the program is next resumed. And it may
take a while to actually re-resume the target.
[1] - In the old sync days, the remote target would react to a
double-Ctrl-C by asking users whether they wanted to give up waiting
and disconnect. The code is still there, but it it isn't reacheable
on most hosts, which support serial connections in async mode
(probably only DJGPP doesn't). Even then, in sync mode, remote.c's
SIGINT handler is only installed while the target is resumed, and is
removed as soon as the target sends back a stop reply. That means
that a Ctrl-C just while GDB is processing an internal event can end
up with an odd "Quit" at the prompt instead of "Program stopped by
SIGINT". In contrast, in async mode, remote.c's SIGINT handler is set
up as long as target_terminal_inferior or
target_terminal_ours_for_output are in effect (IOW, until we get a
user-visible stop and call target_terminal_ours), so the user
shouldn't get back a spurious Quit. However, it's still desirable to
be able to interrupt a long-running GDB operation, if GDB takes a
while to re-resume the target or get back to the event loop.
Tested on x86_64 Fedora 20.
gdb/ChangeLog:
2015-08-24 Pedro Alves <palves@redhat.com>
* defs.h (maybe_quit): Declare.
(QUIT): Now calls maybe_quit.
* event-loop.c (clear_async_signal_handler)
(async_signal_handler_is_marked): New functions.
* event-loop.h (async_signal_handler_is_marked)
(clear_async_signal_handler): New declarations.
* remote.c (remote_check_pending_interrupt): New function.
(interrupt_query): Use make_cleanup_restore_target_terminal. No
longer check whether the target is async. If waiting for a stop
reply, and a Ctrl-C as been sent to the target, offer to
disconnect, and throw TARGET_CLOSE_ERROR instead of a quit.
Otherwise do not disconnect and throw a quit.
(_initialize_remote): Install remote_check_pending_interrupt as
to_check_pending_interrupt.
* target.c (target_check_pending_interrupt): New function.
* target.h (struct target_ops) <to_check_pending_interrupt>: New
field.
(target_check_pending_interrupt): New declaration.
* utils.c (maybe_quit): New function.
* target-delegates.c: Regenerate.
Even with the previous patch installed, we'll still see
sigall-reverse.exp occasionally fail. The problem is that the event
loop's event handling processing is done in two steps:
#1 - poll all event sources, and push new event objects to the event
queue, until all event sources are drained.
#2 - go through the event queue, processing each event object at a
time. For each event, call the associated callback, and deletes the
event object from the queue.
and then bad things happen if between #1 and #2 something decides that
events from an event source that has already queued events shouldn't
be processed yet. To do that, we either remove the event source from
the list of event sources, or clear its "have events" flag. However,
if an event for that source has meanwhile already been pushed in the
event queue, #2 will still process it and call the associated
callback...
One way to fix it that I considered was to do something to the event
objects already in the event queue when an event source is no longer
interesting. But then I couldn't find any good reason for the
two-step process in the first place. It's much simpler (and less
code) to call the event source callbacks as we poll the sources and
find events.
Tested on x86-64 Fedora 20, native and gdbserver.
gdb/
2015-02-03 Pedro Alves <palves@redhat.com>
* event-loop.c: Don't declare nor define a queue type for
gdb_event_p.
(event_queue): Delete.
(create_event, create_file_event, gdb_event_xfree)
(initialize_event_loop, process_event): Delete.
(gdb_do_one_event): Return as soon as one event is handled.
(handle_file_event): Change prototype. Used the passed in
file_handler pointer and ready_mask instead of looping over all
file handlers.
(gdb_wait_for_event): Update the poll/select timeouts before
blocking. Run event handlers directly instead of queueing events.
Return as soon as one event is handled.
(struct async_event_handler_data): Delete.
(invoke_async_event_handler): Delete.
(check_async_event_handlers): Change return type to int. Run
event handlers directly instead of queueing events. Return as
soon as one event is handled.
(handle_timer_event): Delete.
(update_wait_timeout): New function, factored out from
poll_timers.
(poll_timers): Reimplement.
* event-loop.h (initialize_event_loop): Delete declaration.
* top.c (gdb_init): Don't call initialize_event_loop.
The sigall-reverse.exp test occasionally fails with something like this:
(gdb) PASS: gdb.reverse/sigall-reverse.exp: send signal TERM
continue
Continuing.
The next instruction is syscall exit_group. It will make the program exit. Do you want to stop the program?([y] or n) FAIL: gdb.reverse/sigall-reverse.exp: continue to signal exit (timeout)
FAIL: gdb.reverse/sigall-reverse.exp: reverse to handler of TERM (timeout)
FAIL: gdb.reverse/sigall-reverse.exp: reverse to gen_TERM (timeout)
This is another event-loop/async related problem exposed by the patch
that made 'query' use gdb_readline_wrapper (588dcc3edb).
The problem is that even though gdb_readline_wrapper disables
target-async while the secondary prompt is in progress, the record
target's async event source is left marked. So when
gdb_readline_wrapper nests an event loop to process input, it may
happen that that event loop ends up processing a target event while
GDB is not really ready for it. Here's the relevant part of the
backtrace showing the root issue in action:
...
#14 0x000000000061cb48 in fetch_inferior_event (client_data=0x0) at src/gdb/infrun.c:4158
#15 0x0000000000642917 in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at src/gdb/inf-loop.c:57
#16 0x000000000077ca5c in record_full_async_inferior_event_handler (data=0x0) at src/gdb/record-full.c:791
#17 0x0000000000640fdf in invoke_async_event_handler (data=...) at src/gdb/event-loop.c:1067
#18 0x000000000063fb01 in process_event () at src/gdb/event-loop.c:339
#19 0x000000000063fb2a in gdb_do_one_event () at src/gdb/event-loop.c:360
#20 0x000000000074d607 in gdb_readline_wrapper (prompt=0x3588f40 "The next instruction is syscall exit_group. It will make the program exit. Do you want to stop the program?([y] or n) ") at src/gdb/top.c:842
#21 0x0000000000750bd9 in defaulted_query (ctlstr=0x8c6588 "The next instruction is syscall exit_group. It will make the program exit. Do you want to stop the program?", defchar=121 'y', args=0x7fff70524410) at src/gdb/utils.c:1279
#22 0x0000000000750e4c in yquery (ctlstr=0x8c6588 "The next instruction is syscall exit_group. It will make the program exit. Do you want to stop the program?") at src/gdb/utils.c:1358
#23 0x00000000004b020e in record_linux_system_call (syscall=gdb_sys_exit_group, regcache=0x3529450, tdep=0xd6c840 <amd64_linux_record_tdep>) at src/gdb/linux-record.c:1933
With my all-stop-on-top-of-non-stop series, I'm also seeing
gdb.server/ext-attach.exp fail occasionally due to the same issue.
The first part of the fix is for target_async implementations to make
sure to remove/unmark all target-related event sources from the event
loop.
Tested on x86_64 Fedora 20, native and gdbserver.
gdb/
2015-02-03 Pedro Alves <palves@redhat.com>
* event-loop.c (clear_async_event_handler): New function.
* event-loop.h (clear_async_event_handler): New declaration.
* record-btrace.c (record_btrace_async): New function.
(init_record_btrace_ops): Install record_btrace_async.
* record-full.c (record_full_async): New function.
(record_full_resume): Don't mark the async event source here.
(init_record_full_ops): Install record_full_async.
(record_full_core_resume): Don't mark the async event source here.
(init_record_full_core_ops): Install record_full_async.
* remote.c (remote_async): Mark and clear the async stop reply
queue event-loop token as appropriate.
Two modifications:
1. The addition of 2013 to the copyright year range for every file;
2. The use of a single year range, instead of potentially multiple
year ranges, as approved by the FSF.
* event-loop.c: ... here.
* tui/tui-io.c (tui_readline_output): Rename parameter `code' to
`error' for clarity.
(tui_getc): Pass correct value for `error' parameter to
tui_readline_output.
(async_event_handler): Forward declare.
(async_event_handler_func): New typedef.
(create_async_event_handler, delete_async_event_handler)
(mark_async_event_handler): Declare.
* event-loop.c (event_data): New.
(event_handler_func): Take an event_data instead of an integer.
(struct gdb_event): Replace the integer file descriptor by a
generic event_data.
(async_event_handler): New.
(async_handler_ready): Delete.
(async_event_handler_list): New.
(create_event): New.
(create_file_event): Use it.
(process_event): Adjust.
(gdb_do_one_event): Poll from the event sources in round-robin
fashion across calls. Be sure to consult all sources before
blocking.
(handle_file_event): Take an event_data instead of an integer.
Adjust.
(gdb_wait_for_event): Add `block' argument. Handle it.
(mark_async_signal_handler): Remove unneeded cast.
(invoke_async_signal_handler): Rename to ...
(invoke_async_signal_handlres): ... this. Return true if any was
handled.
(check_async_ready): Delete
(create_async_event_handler): New.
(mark_async_event_handler): New.
(struct async_event_handler_data): New.
(invoke_async_event_handler): New.
(check_async_event_handlers): New.
(delete_async_event_handler): New.
(handle_timer_event): Adjust.
* defs.h (struct continuation_arg): Change type of field 'data'
from PTR to void *.
* event-loop.h: Eliminate uses of PTR, use 'void *' instead.
* event-top.c: Ditto.