Fix AIX core dump while main thread exits.

Consider the test case:
#include <iostream>
#include <pthread.h>
#include <unistd.h>

void *thread_main(void *) {
  std::cout << getpid() << std::endl;
  sleep(20);
  return nullptr;
}

int main(void) {
  pthread_t thread;
  pthread_create(&thread, nullptr, thread_main, nullptr);
  pthread_join(thread, nullptr);

  return 0;
}

In this program, main () creates a thread that sleeps for 20 seconds and
then waits for it to finish.

When we debug this with GDB, we get:
Reading symbols from ./test...
(gdb) b main
Breakpoint 1 at 0x10000934: file test.c, line 11.
(gdb) r
Starting program: /read_only_gdb/binutils-gdb/gdb/test

Breakpoint 1, main () at test.c:11
11   pthread_create(&thread, nullptr, thread_main, nullptr);
(gdb) c
Continuing.
15335884
[New Thread 258 (tid 31130079)]
Thread 2 received signal SIGINT, Interrupt.
[Switching to Thread 258 (tid 31130079)]
0xd0611d70 in _p_nsleep () from /usr/lib/libpthread.a(_shr_xpg5.o)
(gdb) thread 1
[Switching to thread 1 (Thread 1 (tid 25493845))]
(gdb) c
Continuing.
[Thread 1 (tid 25493845) exited]
[Thread 258 (tid 31130079) exited]
inferior.c:405: internal-error: find_inferior_pid: Assertion `pid != 0' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----

There are two bugs here. One is the core dump. The other is that the
information about the main thread is not captured.

While debugging the first part, I found that the cause was the last for
loop in sync_threadlists ().

Once both my threads exit we delete them as below:

for (struct thread_info *it : all_threads ())
  {
    aix_thread_info *priv = get_aix_thread_info (it);
    if (in_queue_threads.count (priv->pdtid) == 0
        && in_thread_list (proc_target, it->ptid)
        && pid == it->ptid.pid ())
      {
        delete_thread (it);
        data->exited_threads.insert (priv->pdtid);
      }
  }

But once these two threads are deleted, all_threads ()
has one more thread whose tid and pid are 0.

(gdb) c
Continuing.
In for loop 8782296 is pid, 19857879 is tid
[Thread 1 (tid 19857879) exited]
In for loop 8782296 is pid, 30933401 is tid
[Thread 258 (tid 30933401) exited]
In for loop 0 is pid, 0 is tid
[Inferior 1 (process 8782296) exited normally]
(gdb) q

I added a printf to the for loop mentioned above to illustrate this.

The loop is entered a third time, with 0 as the pid.

The reason is that the threads are removed but not deleted, since they are
not deletable ().

Hence we use the all_threads_safe () iterator instead, which is safe to use
when the thread currently being iterated over is deleted.
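
To illustrate why a "safe" iterator helps when entries are deleted inside
the loop, here is a minimal sketch. It is not GDB's implementation:
fake_thread and delete_exited are hypothetical names, and a std::list stands
in for the thread list. The idea shown is only that the next position is
fetched before the current entry is erased.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for a thread list entry; not GDB's thread_info.
struct fake_thread
{
  int tid;
  bool exited;
};

// Erase every exited entry while walking the list and return the tids
// that were erased.  The iterator to the next element is saved *before*
// the current element is erased, so the loop never advances through an
// iterator that has been invalidated.
inline std::vector<int>
delete_exited (std::list<fake_thread> &threads)
{
  std::vector<int> deleted;
  for (auto it = threads.begin (); it != threads.end ();)
    {
      auto next = std::next (it);   // Saved before any erase.
      if (it->exited)
        {
          deleted.push_back (it->tid);
          threads.erase (it);       // Invalidates 'it' only, not 'next'.
        }
      it = next;
    }
  return deleted;
}
```

With a list holding exited entries 1 and 258 plus one live entry, the call
erases the first two and leaves the live one, without touching a stale
iterator.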

The second part of the bug is the missing information about the main thread.
Andrew was right here (https://sourceware.org/pipermail/gdb-patches/2024-September/211875.html).
Thank you, Andrew.

The thread was loaded, but the ptrace () call made from
fetch_regs_kernel_thread failed, with errno set to EPERM.

if (!ptrace32 (PTT_READ_GPRS, tid, (uintptr_t) gprs32, 0, NULL))
        memset (gprs32, 0, sizeof (gprs32));

Hence all registers were set to 0 and we did not get the required information.
This issue will be fixed within the AIX ptrace call.
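
The effect of the fallback above can be reproduced in isolation. This is
only a sketch under stated assumptions: fake_read_gprs is a hypothetical
stand-in for the failing ptrace32 (PTT_READ_GPRS, ...) call, and gpr_result
and fetch_gprs are illustrative names that do not exist in GDB.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the failing register read.  It always fails
// and sets errno to EPERM, as was observed for the main thread.
inline int
fake_read_gprs (uint32_t *)
{
  errno = EPERM;
  return 0;   // 0 means failure, matching the '!' test in the snippet.
}

// Illustrative result type; not part of GDB.
struct gpr_result
{
  bool ok;
  int saved_errno;
  uint32_t gprs[32];
};

// Mirror the fallback: when the read fails, zero the register buffer.
// The errno value is saved so the caller can still see why it failed.
inline gpr_result
fetch_gprs ()
{
  gpr_result r;
  std::memset (r.gprs, 0xff, sizeof (r.gprs));   // Pretend stale contents.
  r.ok = true;
  r.saved_errno = 0;

  errno = 0;
  if (!fake_read_gprs (r.gprs))
    {
      r.ok = false;
      r.saved_errno = errno;
      std::memset (r.gprs, 0, sizeof (r.gprs));
    }
  return r;
}
```

After the failed read every register in the buffer is 0, which is why the
main thread showed no useful information.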

Approved By: Ulrich Weigand <ulrich.weigand@de.ibm.com>.
This commit is contained in:
Aditya Vidyadhar Kamath 2024-11-04 02:42:05 -06:00
parent 55e32b3c68
commit 9247f05284

@@ -854,7 +854,7 @@ sync_threadlists (pid_t pid)
      thread exits and gets into a PST_UNKNOWN state. So this thread
      will not run in the above for loop. Therefore the below for loop
      is to manually delete such threads. */
-  for (struct thread_info *it : all_threads ())
+  for (struct thread_info *it : all_threads_safe ())
     {
       aix_thread_info *priv = get_aix_thread_info (it);
       if (in_queue_threads.count (priv->pdtid) == 0