linux/fs
Rongwei Wang 55fc0d9174 mm, thp: lock filemap when truncating page cache
Patch series "fix two bugs for file THP".

This patch (of 2):

Transparent huge page has supported read-only non-shmem files.  The
file- backed THP is collapsed by khugepaged and truncated when written
(for shared libraries).

However, there is a race when multiple writers truncate the same page
cache concurrently.

In that case, subpage(s) of file THP can be revealed by find_get_entry
in truncate_inode_pages_range, which will trigger PageTail BUG_ON in
truncate_inode_page, as follows:

    page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff
    head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0
    flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head)
    raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400
    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
    head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000
    head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000
    page dumped because: VM_BUG_ON_PAGE(PageTail(page))
    ------------[ cut here ]------------
    kernel BUG at mm/truncate.c:213!
    Internal error: Oops - BUG: 0 [#1] SMP
    Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ...
    CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ...
    Hardware name: ECS, BIOS 0.0.0 02/06/2015
    pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
    Call trace:
     truncate_inode_page+0x64/0x70
     truncate_inode_pages_range+0x550/0x7e4
     truncate_pagecache+0x58/0x80
     do_dentry_open+0x1e4/0x3c0
     vfs_open+0x38/0x44
     do_open+0x1f0/0x310
     path_openat+0x114/0x1dc
     do_filp_open+0x84/0x134
     do_sys_openat2+0xbc/0x164
     __arm64_sys_openat+0x74/0xc0
     el0_svc_common.constprop.0+0x88/0x220
     do_el0_svc+0x30/0xa0
     el0_svc+0x20/0x30
     el0_sync_handler+0x1a4/0x1b0
     el0_sync+0x180/0x1c0
    Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000)

This patch mainly to lock filemap when one enter truncate_pagecache(),
avoiding truncating the same page cache concurrently.

Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com
Fixes: eb6ecbed0a ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Song Liu <song@kernel.org>
Cc: Collin Fijalkovich <cfijalkovich@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
..
9p 9p: Fix a bunch of kerneldoc warnings shown up by W=1 2021-10-04 22:07:46 +01:00
adfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
affs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
afs netfslib, cachefiles and afs fixes 2021-10-07 11:20:08 -07:00
autofs autofs: fix wait name hash calculation in autofs_wait() 2021-10-20 21:09:02 -04:00
befs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
bfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
btrfs for-5.15-rc7-tag 2021-10-29 10:46:59 -07:00
cachefiles cachefiles: Change %p in format strings to something else 2021-08-27 13:34:02 +01:00
ceph ceph: fix handling of "meta" errors 2021-10-19 09:36:06 +02:00
cifs cifs: fix incorrect check for null pointer in header_assemble 2021-09-23 21:12:53 -05:00
coda coda: fix reference counting in coda_file_mmap error path 2021-04-23 14:42:39 -07:00
configfs configfs: fix a race in configfs_lookup() 2021-08-25 07:58:49 +02:00
cramfs
crypto fscrypt: align Base64 encoding with RFC 4648 base64url 2021-07-25 20:47:05 -07:00
debugfs debugfs: debugfs_create_file_size(): use IS_ERR to check for error 2021-09-21 09:09:06 +02:00
devpts
dlm fs: dlm: avoid comms shutdown delay in release_lockspace 2021-09-01 11:29:14 -05:00
ecryptfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
efivarfs efivars: convert to fileattr 2021-04-12 15:04:29 +02:00
efs
erofs erofs: clear compacted_2b if compacted_4b_initial > totalidx 2021-09-23 23:23:04 +08:00
exfat Description for this pull request: 2021-07-06 11:06:04 -07:00
exportfs
ext2 ext2: fix sleeping in atomic bugs on error 2021-09-22 13:05:23 +02:00
ext4 Fix a number of ext4 bugs in fast_commit, inline data, and delayed 2021-10-03 13:56:53 -07:00
f2fs f2fs-for-5.15-rc1 2021-09-04 10:48:47 -07:00
fat linux-kselftest-kunit-5.15-rc1 2021-09-02 12:32:12 -07:00
freevxfs
fscache fscache: Remove an unused static variable 2021-10-04 22:13:12 +01:00
fuse fuse: clean up error exits in fuse_fill_super() 2021-10-21 10:01:39 +02:00
gfs2 Merge branch 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-09-09 12:45:26 -07:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-15 10:13:49 -07:00
hfsplus hfsplus: report create_date to kstat.btime 2021-07-01 11:06:06 -07:00
hostfs hostfs: support splice_write 2021-08-26 22:28:02 +02:00
hpfs hpfs: use iomap_fiemap to implement ->fiemap 2021-07-27 11:00:36 +02:00
hugetlbfs hugetlbfs: fix mount mode command line processing 2021-07-23 17:43:28 -07:00
iomap iomap: standardize tracepoint formatting and storage 2021-08-26 09:18:53 -07:00
isofs isofs: joliet: Fix iocharset=utf8 mount option 2021-08-12 16:07:14 +02:00
jbd2 jbd2: add sparse annotations for add_transaction_credits() 2021-08-30 23:36:50 -04:00
jffs2 vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
jfs vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
kernfs kernfs: don't create a negative dentry if inactive node exists 2021-10-04 10:27:18 +02:00
ksmbd ksmbd: add buffer validation in session setup 2021-10-20 00:07:10 -05:00
lockd Critical bug fixes: 2021-09-22 09:21:02 -07:00
minix mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
netfs netfs: Fix READ/WRITE confusion when calling iov_iter_xarray() 2021-10-05 11:22:06 +01:00
nfs NFS Client Updates for Linux 5.15 2021-09-04 10:25:26 -07:00
nfs_common nfs: Fix kerneldoc warning shown up by W=1 2021-10-04 22:02:17 +01:00
nfsd Bug fixes for NFSD error handling paths 2021-10-07 14:11:40 -07:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
nls
notify fsnotify: fix sb_connectors leak 2021-09-10 09:46:48 -07:00
ntfs Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:30:04 -07:00
ntfs3 Fixed xfstests generic/016 generic/021 generic/022 generic/041 generic/274 generic/423, 2021-10-15 09:58:11 -04:00
ocfs2 ocfs2: do not zero pages beyond i_size 2021-11-06 13:30:32 -07:00
omfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
openpromfs openpromfs: don't do unlock_new_inode() until the new inode is set up 2021-03-12 22:15:22 -05:00
orangefs vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
overlayfs ovl: fix IOCB_DIRECT if underlying fs doesn't support direct IO 2021-09-28 09:16:12 +02:00
proc mm/smaps: simplify shmem handling of pte holes 2021-11-06 13:30:33 -07:00
pstore for-5.14/drivers-2021-06-29 2021-06-30 12:21:16 -07:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-21 08:36:48 -07:00
qnx6
quota quota: remove unnecessary oom message 2021-06-22 10:40:52 +02:00
ramfs fs: move ramfs_aops to libfs 2021-06-29 10:53:48 -07:00
reiserfs Kbuild updates for v5.15 2021-09-03 15:33:47 -07:00
romfs
smbfs_common cifs: remove pathname for file from SPDX header 2021-09-13 14:51:10 -05:00
squashfs squashfs: use bvec_virt 2021-08-16 10:50:32 -06:00
sysfs sysfs: Allow deferred execution of iomem_get_mapping() 2021-08-06 13:05:28 +02:00
sysv mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
tracefs tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
ubifs ubifs: report correct st_size for encrypted symlinks 2021-07-25 20:01:07 -07:00
udf udf_get_extendedattr() had no boundary checks. 2021-08-23 13:35:19 +02:00
ufs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
unicode .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
vboxsf vboxfs: fix broken legacy mount signature checking 2021-09-27 11:26:21 -07:00
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-09-22 10:56:34 -07:00
xfs libnvdimm for v5.15 2021-09-09 11:39:57 -07:00
zonefs \n 2021-08-30 10:24:50 -07:00
aio.c eventfd: Make signal recursion protection a task bit 2021-08-28 01:33:02 +02:00
anon_inodes.c
attr.c fs: Move notify_change permission checks into may_setattr 2021-08-13 00:41:05 -04:00
bad_inode.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
binfmt_aout.c binfmt: a.out: Fix bogus semicolon 2021-09-05 10:15:05 -07:00
binfmt_elf_fdpic.c binfmt: remove in-tree usage of MAP_DENYWRITE 2021-09-03 18:42:01 +02:00
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings 2021-10-03 14:02:58 -07:00
binfmt_flat.c binfmt: remove in-tree usage of MAP_EXECUTABLE 2021-06-29 10:53:50 -07:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-13 11:27:30 -08:00
binfmt_script.c
buffer.c mm: fs: invalidate bh_lrus for only cold path 2021-09-24 16:13:35 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: fix memleak in dump_vma_snapshot() 2021-09-08 11:50:27 -07:00
d_path.c d_path: fix Kernel doc validator complaining 2021-11-06 13:30:32 -07:00
dax.c New code for 5.15: 2021-08-31 11:13:35 -07:00
dcache.c useful constants: struct qstr for ".." 2021-04-15 22:36:45 -04:00
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-09 14:54:23 -07:00
drop_caches.c fs: drop_caches: fix skipping over shadow cache inodes 2021-09-03 09:58:10 -07:00
eventfd.c eventfd: Export eventfd_wake_count to modules 2021-09-06 07:20:56 -04:00
eventpoll.c ARM development updates for 5.15: 2021-09-09 13:25:49 -07:00
exec.c Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux 2021-09-04 11:35:47 -07:00
fcntl.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
fhandle.c switch file_open_root() to struct path 2021-04-07 13:56:43 -04:00
file_table.c
file.c virtio,vdpa,vhost: features, fixes 2021-09-11 14:48:42 -07:00
filesystems.c fs: simplify get_filesystem_list / get_all_fs_names 2021-08-23 01:25:40 -04:00
fs_context.c memcg: charge fs_context and legacy_fs_context 2021-09-03 09:58:12 -07:00
fs_parser.c namei: Standardize callers of filename_lookup() 2021-09-07 16:07:47 -04:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
fsopen.c
init.c
inode.c mm: Fully initialize invalidate_lock, amend lock class later 2021-09-17 13:39:23 +02:00
internal.h block: move fs/block_dev.c to block/bdev.c 2021-09-07 08:39:40 -06:00
io_uring.c io_uring: apply worker limits to previous users 2021-10-21 11:19:38 -06:00
io-wq.c io-wq: max_worker fixes 2021-10-19 17:09:34 -06:00
io-wq.h io-wq: provide a way to limit max number of workers 2021-08-29 07:55:55 -06:00
ioctl.c New code for 5.15: 2021-08-31 11:06:32 -07:00
Kconfig 4 cifs/smb3 fixes, one for DFS reconnect, and one to begin creating common headers for server and client and the other two to rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb which is more accurate 2021-09-12 10:10:21 -07:00
Kconfig.binfmt binfmt: remove support for em86 (alpha only) 2021-07-25 22:33:03 -07:00
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-18 20:22:03 -10:00
libfs.c fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
locks.c Revert "memcg: enable accounting for file lock caches" 2021-09-07 11:21:48 -07:00
Makefile 4 cifs/smb3 fixes, one for DFS reconnect, and one to begin creating common headers for server and client and the other two to rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb which is more accurate 2021-09-12 10:10:21 -07:00
mbcache.c
mount.h
mpage.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
namei.c putname(): IS_ERR_OR_NULL() is wrong here 2021-09-07 16:14:05 -04:00
namespace.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
no-block.c
nsfs.c
open.c mm, thp: lock filemap when truncating page cache 2021-11-06 13:30:41 -07:00
pipe.c Revert "mm/gup: remove try_get_page(), call try_get_compound_head() directly" 2021-09-07 11:03:45 -07:00
pnode.c
pnode.h
posix_acl.c fs/posix_acl.c: avoid -Wempty-body warning 2021-11-06 13:30:32 -07:00
proc_namespace.c
read_write.c fs: clean up after mandatory file locking support removal 2021-08-24 07:52:45 -04:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-17 11:39:49 -07:00
remap_range.c fs: remove mandatory file locking support 2021-08-23 06:15:36 -04:00
select.c Revert "memcg: enable accounting for pollfd and select bits arrays" 2021-09-07 11:26:23 -07:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-19 17:18:48 -07:00
signalfd.c signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency 2021-07-23 13:16:43 -05:00
splice.c
stack.c
stat.c fs: add generic helper for filling statx attribute flags 2021-08-17 11:47:43 +02:00
statfs.c
super.c fs: explicitly unregister per-superblock BDIs 2021-11-06 13:30:34 -07:00
sync.c
timerfd.c timerfd: Provide timerfd_resume() 2021-08-10 17:57:22 +02:00
userfaultfd.c userfaultfd: fix a race between writeprotect and exit_mmap() 2021-10-18 20:22:02 -10:00
utimes.c
xattr.c xattr: fix kernel-doc for mnt_userns and vfs xattr helpers 2021-03-23 11:20:26 +01:00