linux/fs/proc
Ivan Babrou f1f1f25699 proc: report open files as size in stat() for /proc/pid/fd
Many monitoring tools include open file count as a metric.  Currently the
only way to get this number is to enumerate the files in /proc/pid/fd.

The problem with the current approach is that it does many things people
generally don't care about when they need one number for a metric.  In our
tests for cadvisor, which reports open file counts per cgroup, we observed
that reading the number of open files is slow.  Out of 35.23% of CPU time
spent in `proc_readfd_common`, we see 29.43% spent in `proc_fill_cache`,
which is responsible for filling dentry info.  Some of this extra time is
spinlock contention, but it's a contention for the lock we don't want to
take to begin with.

We considered putting the number of open files in /proc/pid/status. 
Unfortunately, counting the number of fds involves iterating the
open_files bitmap, which has a linear complexity in proportion with the
number of open files (bitmap slots really, but it's close).  We don't want
to make /proc/pid/status any slower, so instead we put this info in
/proc/pid/fd as a size member of the stat syscall result.  Previously the
reported number was zero, so there's very little risk of breaking
anything, while still providing a somewhat logical way to count the open
files with a fallback if it's zero.

RFC for this patch included iterating open fds under RCU.  Thanks to Frank
Hofmann for the suggestion to use the bitmap instead.

Previously:

```
$ sudo stat /proc/1/fd | head -n2
  File: /proc/1/fd
  Size: 0         	Blocks: 0          IO Block: 1024   directory
```

With this patch:

```
$ sudo stat /proc/1/fd | head -n2
  File: /proc/1/fd
  Size: 65        	Blocks: 0          IO Block: 1024   directory
```

Correctness check:

```
$ sudo ls /proc/1/fd | wc -l
65
```

I added the docs for /proc/<pid>/fd while I'm at it.

[ivan@cloudflare.com: use bitmap_weight() to count the bits]
  Link: https://lkml.kernel.org/r/20221018045844.37697-1-ivan@cloudflare.com
[akpm@linux-foundation.org: include linux/bitmap.h for bitmap_weight()]
[ivan@cloudflare.com: return errno from proc_fd_getattr() instead of setting negative size]
  Link: https://lkml.kernel.org/r/20221024173140.30673-1-ivan@cloudflare.com
Link: https://lkml.kernel.org/r/20220922224027.59266-1-ivan@cloudflare.com
Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-18 13:55:07 -08:00
..
array.c ucounts: Split rlimit and ucount values and max values 2022-10-09 16:24:05 -07:00
base.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
bootconfig.c proc: bootconfig: Add null pointer check 2022-04-02 08:40:09 -04:00
cmdline.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
consoles.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 191 2019-05-30 11:29:21 -07:00
cpuinfo.c x86/aperfmperf: Replace aperfmperf_get_khz() 2022-04-27 20:22:19 +02:00
devices.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
fd.c proc: report open files as size in stat() for /proc/pid/fd 2022-11-18 13:55:07 -08:00
fd.h fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
generic.c proc: fix dentry/inode overinstantiating under /proc/${pid}/net 2022-05-09 18:29:19 -07:00
inode.c take care to handle NULL ->proc_lseek() 2022-08-14 15:16:18 -04:00
internal.h - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
interrupts.c proc: introduce proc_create_seq{,_data} 2018-05-16 07:23:35 +02:00
Kconfig proc: make config PROC_CHILDREN depend on PROC_FS 2022-10-03 14:21:43 -07:00
kcore.c fs/proc/kcore.c: remove check of list iterator against head past the loop body 2022-04-29 14:37:59 -07:00
kmsg.c printk changes for 6.1 2022-10-10 11:24:19 -07:00
loadavg.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
Makefile proc: bootconfig: Add /proc/bootconfig to show boot config list 2020-01-13 13:19:39 -05:00
meminfo.c - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
namespaces.c Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-01-29 11:20:24 -08:00
nommu.c proc: delete unused <linux/uaccess.h> includes 2022-07-17 17:31:39 -07:00
page.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
proc_net.c proc: add some (hopefully) insightful comments 2022-07-29 18:12:35 -07:00
proc_sysctl.c kernel/sysctl.c: move sysctl_vals and sysctl_long_vals to sysctl.c 2022-09-08 16:56:45 -07:00
proc_tty.c proc: delete unused <linux/uaccess.h> includes 2022-07-17 17:31:39 -07:00
root.c proc: add some (hopefully) insightful comments 2022-07-29 18:12:35 -07:00
self.c Revert "proc: don't allow async path resolution of /proc/self components" 2021-02-23 20:32:11 -07:00
softirqs.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
stat.c fs/proc/uptime.c: Fix idle time reporting in /proc/uptime 2021-10-05 15:51:35 +02:00
task_mmu.c mm: /proc/pid/smaps_rollup: fix maple tree search 2022-10-20 21:27:23 -07:00
task_nommu.c proc: remove VMA rbtree use from nommu 2022-09-26 19:46:16 -07:00
thread_self.c Revert "proc: don't allow async path resolution of /proc/thread-self components" 2021-02-23 20:32:11 -07:00
uptime.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
util.c fs/proc/util.c: include fs/proc/internal.h for name_to_int() 2019-01-04 13:13:45 -08:00
version.c proc: mark more files as permanent 2022-10-03 14:21:45 -07:00
vmcore.c proc/vmcore: fix potential memory leak in vmcore_init() 2022-11-18 13:55:06 -08:00