korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-24 12:44:11 +08:00

Linus Torvalds b5683a37c8 vfs-6.9.pidfd

-----BEGIN PGP SIGNATURE-----

iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZem4/wAKCRCRxhvAZXjc
opnBAQCaQWwxjT0VLHebPniw6tel/KYlZ9jH9kBQwLrk1pembwEA+BsCY2C8YS4a
75v9jOPxr+Z8j1SjxwwubcONPyqYXwQ=
=+Wa3
-----END PGP SIGNATURE-----

Merge tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfd updates from Christian Brauner:

 - Until now pidfds could only be created for thread-group leaders but
   not for threads. There was no technical reason for this. We simply
   had no users that needed support for this. Now we do have users
   that need support for this.

   This introduces a new PIDFD_THREAD flag for pidfd_open(). If that
   flag is set pidfd_open() creates a pidfd that refers to a specific
   thread.

   In addition, we now allow clone() and clone3() to be called with
   CLONE_PIDFD | CLONE_THREAD which wasn't possible before.

   A pidfd that refers to an individual thread differs from a pidfd
   that refers to a thread-group leader:

    (1) Pidfds are pollable. A task may poll a pidfd and get notified
        when the task has exited.

        For thread-group leader pidfds the polling task is woken if
        the thread-group is empty. In other words, if the thread-group
        leader task exits when there are still threads alive in its
        thread-group the polling task will not be woken when the
        thread-group leader exits but rather when the last thread in
        the thread-group exits.

        For thread-specific pidfds the polling task is woken if the
        thread exits.

    (2) Passing a thread-group leader pidfd to pidfd_send_signal()
        will generate thread-group directed signals like kill(2) does.

        Passing a thread-specific pidfd to pidfd_send_signal() will
        generate thread-specific signals like tgkill(2) does.

        The default scope of the signal is thus determined by the type
        of the pidfd.

        Since use-cases exist where the default scope of the provided
        pidfd needs to be overridden, the following flags are added to
        pidfd_send_signal():

         - PIDFD_SIGNAL_THREAD
           Send a thread-specific signal.

         - PIDFD_SIGNAL_THREAD_GROUP
           Send a thread-group directed signal.

         - PIDFD_SIGNAL_PROCESS_GROUP
           Send a process-group directed signal.

        The scope change will only work if the struct pid is actually
        used for this scope.

        For example, in order to send a thread-group directed signal
        the provided pidfd must be used as a thread-group leader and
        similarly for PIDFD_SIGNAL_PROCESS_GROUP the struct pid must
        be used as a process group leader.

 - Move pidfds from the anonymous inode infrastructure to a tiny
   pseudo filesystem. This will unblock further work that we weren't
   able to do simply because of the very justified limitations of
   anonymous inodes. Moving pidfds to a tiny pseudo filesystem allows
   for statx on pidfds to become useful for the first time. They can
   now be compared by inode number, which is unique for the system
   lifetime.

   Instead of stashing struct pid in file->private_data we can now
   stash it in inode->i_private. This makes it possible to introduce
   concepts that operate on a process once all file descriptors have
   been closed. A concrete example is kill-on-last-close. Another
   side-effect is that file->private_data is now freed up for per-file
   options for pidfds.

   Now, each struct pid will refer to a different inode but the same
   struct pid will refer to the same inode if it's opened multiple
   times. In contrast to now where each struct pid refers to the same
   inode.

   The tiny pseudo filesystem is not visible anywhere in userspace
   exactly like e.g., pipefs and sockfs. There's no lookup, there's no
   complex inode operations, nothing. Dentries and inodes are always
   deleted when the last pidfd is closed.

   We allocate a new inode and dentry for each struct pid and we reuse
   that inode and dentry for all pidfds that refer to the same struct
   pid. The code is entirely optional and fairly small. If it's not
   selected we fall back to anonymous inodes. Heavily inspired by nsfs.

   The dentry and inode allocation mechanism is moved into generic
   infrastructure that is now shared between nsfs and pidfs. The
   path_from_stashed() helper must be provided with a stashing
   location, an inode number, a mount, and the private data that is
   supposed to be used and it will provide a path that can be passed
   to dentry_open().

   The helper will try to retrieve an existing dentry from the
   provided stashing location. If a valid dentry is found it is
   reused. If not a new one is allocated and we try to stash it in the
   provided location. If this fails we retry until we either find an
   existing dentry or the newly allocated dentry could be stashed.
   Subsequent openers of the same namespace or task are then able to
   reuse it.

 - Currently it is only possible to get notified when a task has
   exited, i.e., become a zombie and userspace gets notified with
   EPOLLIN. We now also support waiting until the task has been
   reaped, notifying userspace with EPOLLHUP.

 - Ensure that ESRCH is reported for getfd if a task is exiting
   instead of the confusing EBADF.

 - Various smaller cleanups to pidfd functions.

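The polling behaviour described in the list above can be tried from userspace. The following is a minimal sketch rather than code from this merge: `pidfd_open_compat()` and `wait_child_via_pidfd()` are hypothetical helper names, the fallback definition of `PIDFD_THREAD` assumes the value used by the v6.9 uapi header `<linux/pidfd.h>`, and the fallback syscall number assumes x86-64 or another architecture with the unified numbering. On kernels older than 6.9 the flag is rejected with EINVAL, so the sketch retries with a plain thread-group leader pidfd.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434      /* assumption: unified syscall numbering */
#endif
#ifndef PIDFD_THREAD
#define PIDFD_THREAD O_EXCL     /* assumption: value from v6.9 <linux/pidfd.h> */
#endif

/* Hypothetical wrapper: raw pidfd_open(2), independent of libc support. */
static int pidfd_open_compat(pid_t pid, unsigned int flags)
{
    return (int)syscall(SYS_pidfd_open, pid, flags);
}

/*
 * Fork a child that exits immediately, then wait for its exit through a
 * pidfd. Returns 0 if poll() reported POLLIN, 1 if pidfds are unavailable
 * (old kernel or sandbox), and -1 on any unexpected failure.
 */
int wait_child_via_pidfd(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);

    /* The child has a single thread, so its TID equals its PID. */
    int fd = pidfd_open_compat(child, PIDFD_THREAD);
    if (fd < 0 && errno == EINVAL)
        fd = pidfd_open_compat(child, 0);   /* flag unknown before 6.9 */
    if (fd < 0) {
        int unsupported = (errno == ENOSYS || errno == EPERM);
        waitpid(child, NULL, 0);
        return unsupported ? 1 : -1;
    }

    /* Exit (zombie state) is reported as POLLIN; wait at most 5 seconds. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, 5000);
    int notified = (ready == 1) && (pfd.revents & POLLIN);

    waitpid(child, NULL, 0);    /* reap the child */
    close(fd);
    return notified ? 0 : -1;
}
```

Calling `wait_child_via_pidfd()` from a `main()` function is enough to exercise it. A pidfd opened on a zombie is still valid, so the sketch does not depend on whether the child exits before or after `pidfd_open()` runs.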
* tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  libfs: improve path_from_stashed()
  libfs: add stashed_dentry_prune()
  libfs: improve path_from_stashed() helper
  pidfs: convert to path_from_stashed() helper
  nsfs: convert to path_from_stashed() helper
  libfs: add path_from_stashed()
  pidfd: add pidfs
  pidfd: move struct pidfd_fops
  pidfd: allow to override signal scope in pidfd_send_signal()
  pidfd: change pidfd_send_signal() to respect PIDFD_THREAD
  signal: fill in si_code in prepare_kill_siginfo()
  selftests: add ESRCH tests for pidfd_getfd()
  pidfd: getfd should always report ESRCH if a task is exiting
  pidfd: clone: allow CLONE_THREAD | CLONE_PIDFD together
  pidfd: exit: kill the no longer used thread_group_exited()
  pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN))
  pid: kill the obsolete PIDTYPE_PID code in transfer_pid()
  pidfd: kill the no longer needed do_notify_pidfd() in de_thread()
  pidfd_poll: report POLLHUP when pid_task() == NULL
  pidfd: implement PIDFD_THREAD flag for pidfd_open()
  ...

2024-03-11 10:21:06 -07:00
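The signal-scope override added by "pidfd: allow to override signal scope in pidfd_send_signal()" can be probed without delivering a signal by sending signal 0. This is a rough sketch under stated assumptions: `probe_scope()` and `probe_self_thread_group()` are hypothetical names, the flag values are assumed to match the v6.9 uapi header `<linux/pidfd.h>`, and the fallback syscall numbers assume the unified numbering used on x86-64 and most other architectures. Kernels older than 6.9 reject any non-zero flags with EINVAL, which the sketch reports as "unsupported" rather than as a failure.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424   /* assumption: unified syscall numbering */
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434          /* assumption: unified syscall numbering */
#endif

/* Assumption: flag values as in the v6.9 uapi header <linux/pidfd.h>. */
#ifndef PIDFD_SIGNAL_THREAD
#define PIDFD_SIGNAL_THREAD        (1UL << 0)
#define PIDFD_SIGNAL_THREAD_GROUP  (1UL << 1)
#define PIDFD_SIGNAL_PROCESS_GROUP (1UL << 2)
#endif

/*
 * Hypothetical helper: send signal 0 with an explicit scope. Signal 0 only
 * runs the permission and existence checks; nothing is delivered.
 */
static int probe_scope(int pidfd, unsigned long scope)
{
    return (int)syscall(SYS_pidfd_send_signal, pidfd, 0, NULL, scope);
}

/*
 * Open a pidfd for the calling process (a thread-group leader) and probe it
 * with an explicit thread-group scope. Returns 0 if the kernel accepted the
 * scope flag, 1 if pidfds or the flag are unsupported here, -1 otherwise.
 */
int probe_self_thread_group(void)
{
    int fd = (int)syscall(SYS_pidfd_open, getpid(), 0);
    if (fd < 0)
        return (errno == ENOSYS || errno == EPERM) ? 1 : -1;

    int rc = probe_scope(fd, PIDFD_SIGNAL_THREAD_GROUP);
    int err = errno;            /* save before close() can clobber it */
    close(fd);

    if (rc == 0)
        return 0;
    return (err == EINVAL || err == ENOSYS || err == EPERM) ? 1 : -1;
}
```

The pidfd here refers to a thread-group leader, so requesting `PIDFD_SIGNAL_THREAD_GROUP` matches the rule in the merge message that the struct pid must actually be used for the requested scope.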
arch	linux_kselftest-next-6.9-rc1	2024-03-11 09:25:33 -07:00
block	vfs-6.9.iomap	2024-03-11 10:07:03 -07:00
certs	This update includes the following changes:	2023-11-02 16:15:30 -10:00
crypto	crypto: lskcipher - Copy IV in lskcipher glue code always	2024-02-24 08:37:24 +08:00
Documentation	vfs-6.9.ntfs	2024-03-11 09:55:17 -07:00
drivers	linux_kselftest-kunit-6.9-rc1	2024-03-11 09:32:28 -07:00
fs	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
include	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
init	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
io_uring	io_uring/net: fix multishot accept overflow handling	2024-02-14 18:30:19 -07:00
ipc	shm: Slim down dependencies	2023-12-20 19:26:31 -05:00
kernel	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
lib	vfs-6.9.misc	2024-03-11 09:38:17 -07:00
LICENSES	LICENSES: Add the copyleft-next-0.3.1 license	2022-11-08 15:44:01 +01:00
mm	vfs-6.9.misc	2024-03-11 09:38:17 -07:00
net	linux_kselftest-kunit-6.9-rc1	2024-03-11 09:32:28 -07:00
rust	Rust changes for v6.8	2024-01-11 13:05:41 -08:00
samples	work around gcc bugs with 'asm goto' with outputs	2024-02-09 15:57:48 -08:00
scripts	6 hotfixes. 4 are cc:stable and the remainder pertain to post-6.7	2024-03-07 17:16:38 -08:00
security	integrity-v6.8-fix	2024-03-05 13:21:30 -08:00
sound	ASoC: Fixes for v6.8	2024-03-08 08:53:36 +01:00
tools	vfs-6.9.pidfd	2024-03-11 10:21:06 -07:00
usr	Kbuild updates for v6.8	2024-01-18 17:57:07 -08:00
virt	KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY	2024-02-22 17:07:06 -08:00
.clang-format	clang-format: Update with v6.7-rc4's `for_each` macro list	2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig	Add .editorconfig file for basic formatting	2023-12-28 16:22:47 +09:00
.get_maintainer.ignore	get_maintainer: add Alan to .get_maintainer.ignore	2022-08-20 15:17:44 -07:00
.gitattributes	.gitattributes: set diff driver for Rust source code files	2023-05-31 17:48:25 +02:00
.gitignore	Add .editorconfig file for basic formatting	2023-12-28 16:22:47 +09:00
.mailmap	drm fixes for 6.8 final	2024-03-08 12:44:56 -08:00
.rustfmt.toml	rust: add `.rustfmt.toml`	2022-09-28 09:02:20 +02:00
COPYING	COPYING: state that all contributions really are covered by this file	2020-02-10 13:32:20 -08:00
CREDITS	vfs-6.9.ntfs	2024-03-11 09:55:17 -07:00
Kbuild	Kbuild updates for v6.1	2022-10-10 12:00:45 -07:00
Kconfig	kbuild: ensure full rebuild when the compiler is updated	2020-05-12 13:28:33 +09:00
MAINTAINERS	vfs-6.9.ntfs	2024-03-11 09:55:17 -07:00
Makefile	Linux 6.8	2024-03-10 13:38:09 -07:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.