2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-21 19:53:59 +08:00
linux-next/fs
NeilBrown a37b0715dd mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the
loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a
daemon needs to write to one bdi (the final bdi) in order to free up
writes queued to another bdi (the client bdi).

The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
pages, so that it can still dirty pages after other processses have been
throttled.  The purpose of this is to avoid deadlock that happen when
the PF_LESS_THROTTLE process must write for any dirty pages to be freed,
but it is being thottled and cannot write.

This approach was designed when all threads were blocked equally,
independently on which device they were writing to, or how fast it was.
Since that time the writeback algorithm has changed substantially with
different threads getting different allowances based on non-trivial
heuristics.  This means the simple "add 25%" heuristic is no longer
reliable.

The important issue is not that the daemon needs a *larger* dirty page
allowance, but that it needs a *private* dirty page allowance, so that
dirty pages for the "client" bdi that it is helping to clear (the bdi
for an NFS filesystem or loop block device etc) do not affect the
throttling of the daemon writing to the "final" bdi.

This patch changes the heuristic so that the task is not throttled when
the bdi it is writing to has a dirty page count below below (or equal
to) the free-run threshold for that bdi.  This ensures it will always be
able to have some pages in flight, and so will not deadlock.

In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might
still be throttled by global threshold, but that is acceptable as it is
only the deadlock state that is interesting for this flag.

This approach of "only throttle when target bdi is busy" is consistent
with the other use of PF_LESS_THROTTLE in current_may_throttle(), were
it causes attention to be focussed only on the target bdi.

So this patch
 - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
 - removes the 25% bonus that that flag gives, and
 - If PF_LOCAL_THROTTLE is set, don't delay at all unless the
   global and the local free-run thresholds are exceeded.

Note that previously realtime threads were treated the same as
PF_LESS_THROTTLE threads.  This patch does *not* change the behvaiour
for real-time threads, so it is now different from the behaviour of nfsd
and loop tasks.  I don't know what is wanted for realtime.

[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Chuck Lever <chuck.lever@oracle.com>	[nfsd]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:08 -07:00
..
9p 9p: read only once on O_NONBLOCK 2020-03-27 09:29:56 +00:00
adfs fs/adfs: bigdir: Fix an error code in adfs_fplus_read() 2020-01-25 11:31:59 -05:00
affs affs: fix a memory leak in affs_remount 2019-11-18 14:26:43 +01:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-23 17:16:18 -07:00
autofs LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat() 2020-03-13 21:08:17 -04:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
btrfs btrfs: use attach/detach_page_private 2020-06-02 10:59:07 -07:00
cachefiles cachefiles: Fix race between read_waiter and read_copier involving op->to_do 2020-05-08 23:01:10 +01:00
ceph ceph: flush release queue when handling caps for unknown inode 2020-05-27 13:03:57 +02:00
cifs cifs: fix leaked reference on requeued write 2020-05-14 17:47:01 -05:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: fix config_item refcnt leak in configfs_rmdir() 2020-04-27 08:17:10 +02:00
cramfs cramfs: switch to use of errofc() et.al. 2020-02-07 14:48:41 -05:00
crypto fscrypt updates for 5.8 2020-06-01 12:10:17 -07:00
debugfs debugfs: remove return value of debugfs_create_u32() 2020-04-17 17:08:50 +02:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm dlm: use SO_SNDTIMEO_NEW instead of SO_SNDTIMEO_OLD 2019-12-18 18:07:31 +01:00
ecryptfs ecryptfs: use crypto_shash_tfm_digest() 2020-05-08 15:32:15 +10:00
efivarfs efi: Use more granular check for availability for variable services 2020-02-23 21:59:42 +01:00
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: convert compressed files from readpages to readahead 2020-06-02 10:59:07 -07:00
exfat fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
exportfs race in exportfs_decode_fh() 2019-11-11 09:21:59 -05:00
ext2 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
ext4 ext4: pass the inode to ext4_mpage_readpages 2020-06-02 10:59:07 -07:00
f2fs f2fs: use attach/detach_page_private 2020-06-02 10:59:07 -07:00
fat fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
fuse fuse: convert from readpages to readahead 2020-06-02 10:59:07 -07:00
gfs2 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
hfs hfs/hfsplus: use 64-bit inode timestamps 2019-12-18 18:07:32 +01:00
hfsplus hfsplus: fix crash and filesystem corruption when deleting files 2020-04-10 15:36:20 -07:00
hostfs hostfs: Use kasprintf() instead of fixed buffer formatting 2020-03-29 23:23:00 +02:00
hpfs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
hugetlbfs hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race 2020-04-02 09:35:32 -07:00
iomap iomap: use attach/detach_page_private 2020-06-02 10:59:07 -07:00
isofs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
jbd2 jbd2: improve comments about freeing data buffers whose page mapping is NULL 2020-03-05 20:25:05 -05:00
jffs2 fs_parse: fold fs_parameter_desc/fs_parameter_spec 2020-02-07 14:48:37 -05:00
jfs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
kernfs kernfs: Add option to enable user xattrs 2020-03-16 15:53:47 -04:00
lockd proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
minix fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
nfs NFSv3: fix rpc receive buffer size for MOUNT call 2020-05-14 18:42:44 -04:00
nfs_common
nfsd mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE 2020-06-02 10:59:08 -07:00
nilfs2 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
nls
notify fanotify: turn off support for FAN_DIR_MODIFY 2020-05-27 18:55:54 +02:00
ntfs ntfs: replace attach_page_buffers with attach_page_private 2020-06-02 10:59:07 -07:00
ocfs2 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
omfs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs orangefs: use attach/detach_page_private 2020-06-02 10:59:08 -07:00
overlayfs ovl: potential crash in ovl_fid_to_fh() 2020-05-13 11:10:57 +02:00
proc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-04-25 12:25:32 -07:00
pstore pstore/blk: Introduce "best_effort" mode 2020-05-31 19:49:01 -07:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
quota \n 2020-01-30 15:37:41 -08:00
ramfs fs_parse: fold fs_parameter_desc/fs_parameter_spec 2020-02-07 14:48:37 -05:00
reiserfs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
romfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
squashfs squashfs: migrate from ll_rw_block usage to BIO 2020-06-02 10:59:05 -07:00
sysfs sysfs: remove redundant __compat_only_sysfs_link_entry_to_kobj fn 2020-04-05 11:34:35 -07:00
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs simple_recursive_removal(): kernel-side rm -rf for ramfs-style filesystems 2019-12-10 22:29:58 -05:00
ubifs Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-06-01 12:00:10 -07:00
udf fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
ufs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
unicode .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
vboxsf vboxsf: don't use the source name in the bdi name 2020-05-07 08:45:47 -06:00
verity fs-verity: remove unnecessary extern keywords 2020-05-12 16:44:00 -07:00
xfs iomap: convert from readpages to readahead 2020-06-02 10:59:07 -07:00
zonefs iomap: convert from readpages to readahead 2020-06-02 10:59:07 -07:00
aio.c aio: prevent potential eventfd recursion on poll 2020-02-03 17:27:47 -07:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c utimes: Clamp the timestamps in notify_change() 2019-12-08 19:10:50 -05:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c y2038: elfcore: Use __kernel_old_timeval for process times 2019-11-15 14:38:29 +01:00
binfmt_elf.c fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() 2020-05-28 11:35:40 -07:00
binfmt_em86.c
binfmt_flat.c fs/binfmt_flat.c: remove set but not used variable 'inode' 2019-07-16 19:23:22 -07:00
binfmt_misc.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
binfmt_script.c
block_dev.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
buffer.c fs/buffer.c: use attach/detach_page_private 2020-06-02 10:59:07 -07:00
char_dev.c chardev: Avoid potential use-after-free in 'chrdev_open()' 2020-01-06 20:10:26 +01:00
compat_binfmt_elf.c y2038: elfcore: Use __kernel_old_timeval for process times 2019-11-15 14:38:29 +01:00
compat.c
coredump.c coredump: fix crash when umh is disabled 2020-04-28 17:54:13 +02:00
d_path.c [PATCH] fix d_absolute_path() interplay with fsmount() 2019-08-30 19:31:09 -04:00
dax.c dax,iomap: Add helper dax_iomap_zero() to zero a range 2020-04-02 19:15:03 -07:00
dcache.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-12-08 11:08:28 -08:00
dcookies.c
direct-io.c fs/direct-io.c: include fs/internal.h for missing prototype 2020-01-04 13:55:09 -08:00
drop_caches.c fs: avoid softlockups in s_inodes iterators 2019-12-18 00:03:01 -05:00
eventfd.c eventfd: track eventfd_signal() recursion depth 2020-02-03 17:27:38 -07:00
eventpoll.c epoll: call final ep_events_available() check under the lock 2020-05-14 10:00:35 -07:00
exec.c exec: Move would_dump into flush_old_exec 2020-05-17 10:48:24 -05:00
fcntl.c fcntl: Distribute switch variables for initialization 2020-03-03 10:55:06 -05:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file_table.c vfs: track per-sb writeback errors and report them to syncfs 2020-06-02 10:59:05 -07:00
file.c fix multiplication overflow in copy_fdtable() 2020-05-19 18:29:36 -04:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-10 15:36:22 -07:00
fs_context.c add prefix to fs_context->log 2020-02-07 14:48:35 -05:00
fs_parser.c fs_parse: remove pr_notice() about each validation 2020-04-02 09:35:26 -07:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c
fs_types.c
fs-writeback.c memcg: fix a crash in wb_workfn when a device disappears 2020-01-31 10:30:36 -08:00
fsopen.c add prefix to fs_context->log 2020-02-07 14:48:35 -05:00
inode.c futex: Fix inode life-time issue 2020-03-06 11:06:15 +01:00
internal.h Merge branch 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-04-02 12:30:08 -07:00
io_uring.c io_uring: reset -EBUSY error when io sq thread is waken up 2020-05-20 07:26:47 -06:00
io-wq.c io_uring: use io-wq manager as backup task if task is exiting 2020-04-03 11:35:57 -06:00
io-wq.h io_uring: use io-wq manager as backup task if task is exiting 2020-04-03 11:35:57 -06:00
ioctl.c fibmap: Warn and return an error in case of block > INT_MAX 2020-04-30 07:57:46 -07:00
Kconfig exfat: add Kconfig and Makefile 2020-03-05 21:00:40 -05:00
Kconfig.binfmt
libfs.c libfs: fix infoleak in simple_attr_read() 2020-03-24 13:27:16 +01:00
locks.c locks: reinstate locks_delete_block optimization 2020-03-18 13:03:38 -07:00
Makefile exfat: add Kconfig and Makefile 2020-03-05 21:00:40 -05:00
mbcache.c
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c fix a braino in legitimize_path() 2020-04-06 10:38:59 -04:00
namespace.c LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat() 2020-03-13 21:08:17 -04:00
no-block.c
nsfs.c fs/nsfs.c: Added ns_match 2020-03-12 17:33:11 -07:00
open.c vfs: track per-sb writeback errors and report them to syncfs 2020-06-02 10:59:05 -07:00
pipe.c mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page() 2020-04-02 09:35:28 -07:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-04-27 10:37:14 -04:00
pnode.h
posix_acl.c fs/posix_acl.c: fix kernel-doc warnings 2020-01-04 13:55:09 -08:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro 2020-04-03 00:09:59 +11:00
readdir.c readdir: make user_access_begin() use the real access range 2020-01-23 10:15:28 -08:00
select.c y2038: syscalls: change remaining timeval to __kernel_old_timeval 2019-11-15 14:38:29 +01:00
seq_file.c fs/seq_file.c: seq_read(): add info message about buggy .next functions 2020-04-10 15:36:22 -07:00
signalfd.c
splice.c pipe: Fix pipe_full() test in opipe_prep(). 2020-05-20 10:54:29 -07:00
stack.c sched/rt, fs: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
stat.c fs: make two stat prep helpers available 2020-01-20 17:03:54 -07:00
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c Fix use after free in get_tree_bdev() 2020-04-28 14:37:40 -07:00
sync.c vfs: track per-sb writeback errors and report them to syncfs 2020-06-02 10:59:05 -07:00
timerfd.c timerfd: Make timerfd_settime() time namespace aware 2020-01-14 12:20:53 +01:00
userfaultfd.c userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally 2020-04-07 10:43:40 -07:00
utimes.c utimes: Clamp the timestamps in notify_change() 2019-12-08 19:10:50 -05:00
xattr.c xattr: fix uninitialized out-param 2020-04-09 15:33:09 -04:00