linux/fs
Damien Le Moal 09fad23a1a zonefs: Improve error handling
commit 14db5f64a9 upstream.

Write error handling is racy and can sometime lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value  which results in garbage data being readable at the end
of a file. There are 2 problems:

1) zonefs_file_dio_write() updates a zone file write pointer offset
   after issuing a direct IO with iomap_dio_rw(). This update is done
   only if the IO succeed for synchronous direct writes. However, for
   asynchronous direct writes, the update is done without waiting for
   the IO completion so that the next asynchronous IO can be
   immediately issued. However, if an asynchronous IO completes with a
   failure right before the i_truncate_mutex lock protecting the update,
   the update may change the value of the inode write pointer offset
   that was corrected by the error path (zonefs_io_error() function).

2) zonefs_io_error() is called when a read or write error occurs. This
   function executes a report zone operation using the callback function
   zonefs_io_error_cb(), which does all the error recovery handling
   based on the current zone condition, write pointer position and
   according to the mount options being used. However, depending on the
   zoned device being used, a report zone callback may be executed in a
   context that is different from the context of __zonefs_io_error(). As
   a result, zonefs_io_error_cb() may be executed without the inode
   truncate mutex lock held, which can lead to invalid error processing.

Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
  direct write is issued with iomap_dio_rw(). This is safe to do as
  partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
  set) and any failed IO will trigger the execution of zonefs_io_error()
  which will correct the inode write pointer offset to reflect the
  current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
  and call this function directly from __zonefs_io_error() after
  obtaining the zone information using blkdev_report_zones() with a
  simple callback function that copies to a local stack variable the
  struct blk_zone obtained from the device. This ensures that error
  handling is performed holding the inode truncate mutex.
  This change also simplifies error handling for conventional zone files
  by bypassing the execution of report zones entirely. This is safe to
  do because the condition of conventional zones cannot be read-only or
  offline and conventional zone files are always fully mapped with a
  constant file size.

Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 8dcc1a9d90 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-23 09:12:45 +01:00
..
9p 9p: Fix initialisation of netfs_inode for 9p 2024-02-05 20:12:59 +00:00
adfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:34:08 +01:00
afs afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*() 2024-02-05 20:12:48 +00:00
autofs autofs: fix memory leak of waitqueues in autofs_catatonic_mode 2023-09-23 11:10:59 +02:00
befs befs: Convert befs_symlink_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
bfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
btrfs btrfs: don't drop extent_map for free space inode on write error 2024-02-23 09:12:29 +01:00
cachefiles mm, netfs, fscache: stop read optimisation when folio removed from pagecache 2024-01-10 17:10:31 +01:00
ceph ceph: fix invalid pointer access if get_quota_realm return ERR_PTR 2024-02-05 20:12:59 +00:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-10 09:33:52 +01:00
configfs configfs: fix possible memory leak in configfs_create_dir() 2022-12-31 13:32:22 +01:00
cramfs fs/cramfs/inode.c: initialize file_ra_state 2023-03-10 09:34:09 +01:00
crypto blk-crypto: add a blk_crypto_config_supported_natively helper 2023-05-11 23:03:00 +09:00
debugfs debugfs: fix automount d_fsdata usage 2024-01-20 11:50:04 +01:00
devpts
dlm fs: dlm: don't put dlm_local_addrs on heap 2024-02-16 19:06:29 +01:00
ecryptfs ecryptfs: Reject casefold directory inodes 2024-02-05 20:12:49 +00:00
efivarfs efivarfs: Free s_fs_info on unmount 2024-01-25 15:27:20 -08:00
efs efs: Convert efs symlinks to read_folio 2022-05-09 16:21:45 -04:00
erofs erofs: fix ztailpacking for subpage compressed blocks 2024-02-05 20:12:48 +00:00
exfat exfat: support handle zero-size directory 2023-11-28 17:07:00 +00:00
exportfs Change calling conventions for filldir_t 2022-08-17 17:25:04 -04:00
ext2 ext2: fix datatype of block number in ext2_xattr_set2() 2023-09-23 11:11:05 +02:00
ext4 ext4: avoid bb_free and bb_fragments inconsistency in mb_free_blocks() 2024-02-23 09:12:39 +01:00
f2fs f2fs: add helper to check compression level 2024-02-16 19:06:31 +01:00
fat treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
freevxfs freevxfs: Convert vxfs_immed_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
fscache netfs, fscache: Prevent Oops in fscache_put_cache() 2024-01-31 16:17:04 -08:00
fuse fuse: share lookup state between submount and its parent 2024-01-01 12:39:08 +00:00
gfs2 gfs2: Fix kernel NULL pointer dereference in gfs2_rgrp_dump 2024-01-25 15:27:22 -08:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-10 09:34:07 +01:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-05-24 17:32:34 +01:00
hostfs hostfs: move from strlcpy with unused retval to strscpy 2022-09-19 22:46:25 +02:00
hpfs hpfs: Convert symlinks to read_folio 2022-05-09 16:21:45 -04:00
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2022-12-31 13:33:05 +01:00
iomap iomap: update ki_pos a little later in iomap_dio_complete 2023-12-08 08:51:20 +01:00
isofs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
jbd2 jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2024-01-20 11:50:07 +01:00
jffs2 jffs2: reduce stack usage in jffs2_build_xattr_subsystem() 2023-07-19 16:22:11 +02:00
jfs jfs: fix array-index-out-of-bounds in diNewExt 2024-02-05 20:12:48 +00:00
kernfs fs/kernfs/dir: obey S_ISGID 2024-02-05 20:12:58 +00:00
lockd fs: lockd: avoid possible wrong NULL parameter 2023-09-13 09:42:49 +02:00
minix vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-10-06 14:56:32 +02:00
nfs pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
nfs_common
nfsd Revert "nfsd: separate nfsd_last_thread() from nfsd_put()" 2024-01-15 18:54:50 +01:00
nilfs2 nilfs2: fix hang in nilfs_lookup_dirty_data_buffers() 2024-02-23 09:12:44 +01:00
nls fs/nls: make load_nls() take a const parameter 2023-09-13 09:42:22 +02:00
notify fanotify: disallow mount/sb marks on kernel internal pseudo fs 2023-07-19 16:22:05 +02:00
ntfs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
ntfs3 fs/ntfs3: Fix an NULL dereference bug 2024-02-16 19:06:28 +01:00
ocfs2 fs: ocfs2: namei: check return value of ocfs2_add_entry() 2023-09-13 09:42:33 +02:00
omfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
openpromfs
orangefs use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
overlayfs ima: detect changes to the backing overlay file 2023-11-28 17:07:12 +00:00
proc watchdog: move softlockup_panic back to early_param 2023-11-28 17:07:09 +00:00
pstore pstore/ram: Fix crash when setting number of cpus to an odd number 2024-02-05 20:12:48 +00:00
qnx4 fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
qnx6 fs/qnx6: delete unnecessary checks before brelse() 2022-09-11 21:55:07 -07:00
quota quota: explicitly forbid quota files from being encrypted 2023-11-28 17:07:13 +00:00
ramfs shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs 2023-07-19 16:22:11 +02:00
reiserfs reiserfs: Check the return value from __getblk() 2023-09-13 09:42:27 +02:00
romfs romfs: Convert romfs to read_folio 2022-05-09 16:21:46 -04:00
smb ksmbd: free aux buffer if ksmbd_iov_pin_rsp_read fails 2024-02-23 09:12:41 +01:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:59:50 +01:00
sysfs
sysv fs/sysv: Null check to prevent null-ptr-deref bug 2023-08-11 12:08:23 +02:00
tracefs tracefs: Add missing lockdown check to tracefs_create_dir() 2023-09-23 11:11:12 +02:00
ubifs ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path 2024-01-31 16:17:02 -08:00
udf udf: initialize newblock to 0 2023-09-13 09:43:05 +02:00
ufs ufs: replace ll_rw_block() 2022-09-11 20:26:07 -07:00
unicode
vboxsf vboxsf: Convert vboxsf to read_folio 2022-05-09 16:21:46 -04:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-09-13 09:43:03 +02:00
xfs xfs: read only mounts with fsopen mount API are busted 2024-01-31 16:17:08 -08:00
zonefs zonefs: Improve error handling 2024-02-23 09:12:45 +01:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:59:46 +01:00
anon_inodes.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
attr.c attr: block mode changes of symlinks 2023-09-23 11:11:10 +02:00
bad_inode.c vfs: open inside ->tmpfile() 2022-09-24 07:00:00 +02:00
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-10-06 14:57:06 +02:00
binfmt_elf_test.c
binfmt_elf.c mm: always expand the stack with the mmap write lock held 2023-07-01 13:16:25 +02:00
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-31 13:32:57 +01:00
binfmt_script.c
buffer.c - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-31 13:32:41 +01:00
compat_binfmt_elf.c
coredump.c coredump: Move dump_emit_page() to kill unused warning 2023-02-22 12:59:50 +01:00
d_path.c d_path.c: typo fix... 2022-08-20 11:34:33 -04:00
dax.c Merge branch 'for-6.0/dax' into libnvdimm-fixes 2022-09-24 18:14:12 -07:00
dcache.c fast_dput(): handle underflows gracefully 2024-02-05 20:12:54 +00:00
direct-io.c block: remove PSI accounting from the bio layer 2022-09-20 08:24:38 -06:00
drop_caches.c
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-09-13 09:42:27 +02:00
eventpoll.c epoll: ep_autoremove_wake_function should use list_del_init_careful 2023-06-21 16:00:54 +02:00
exec.c exec: Fix error handling in begin_new_exec() 2024-01-31 16:17:07 -08:00
fcntl.c keep iocb_flags() result cached in struct file 2022-06-10 16:10:23 -04:00
fhandle.c do_sys_name_to_handle(): constify path 2022-09-01 17:36:39 -04:00
file_table.c locks: fix TOCTOU race when granting write lease 2022-08-16 10:59:54 -04:00
file.c file: reinstate f_pos locking optimization for regular files 2023-08-11 12:08:23 +02:00
filesystems.c
fs_context.c vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing 2023-09-13 09:42:28 +02:00
fs_parser.c ext4: journal_path mount options should follow links 2023-01-07 11:11:59 +01:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs 2023-11-20 11:51:50 +01:00
fsopen.c uninline may_mount() and don't opencode it in fspick(2)/fsopen(2) 2022-05-19 23:25:10 -04:00
init.c
inode.c filemap: add a per-mapping stable writes flag 2024-01-10 17:10:32 +01:00
internal.h nfs: use vfs setgid helper 2023-08-30 16:11:10 +02:00
ioctl.c lsm: new security_file_ioctl_compat() hook 2024-01-31 16:17:00 -08:00
Kconfig smb: move client and server files to common directory fs/smb 2023-06-28 11:12:40 +02:00
Kconfig.binfmt Xtensa updates for v6.1 2022-10-10 14:21:11 -07:00
kernel_read_file.c fs/kernel_read_file: allow to read files up-to ssize_t 2022-06-16 19:58:21 -07:00
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2022-12-31 13:31:58 +01:00
locks.c locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock 2023-09-23 11:11:00 +02:00
Makefile smb: move client and server files to common directory fs/smb 2023-06-28 11:12:40 +02:00
mbcache.c ext4: fix deadlock due to mbcache entry corruption 2023-01-07 11:12:02 +01:00
mount.h switch try_to_unlazy_next() to __legitimize_mnt() 2022-07-05 16:18:21 -04:00
mpage.c Folio changes for 6.0 2022-08-03 10:35:43 -07:00
namei.c rename(): fix the locking of subdirectories 2024-01-31 16:17:02 -08:00
namespace.c fs: indicate request originates from old mount API 2024-01-25 15:27:22 -08:00
no-block.c
nsfs.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
open.c open: make RESOLVE_CACHED correctly test for O_TMPFILE 2023-08-11 12:08:22 +02:00
pipe.c pipe: wakeup wr_wait after setting max_usage 2024-01-31 16:17:09 -08:00
pnode.c pnode: terminate at peers of source 2023-01-04 11:29:01 +01:00
pnode.h
posix_acl.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
proc_namespace.c vfs: escape hash as well 2022-06-28 13:58:05 -04:00
read_write.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
readdir.c Change calling conventions for filldir_t 2022-08-17 17:25:04 -04:00
remap_range.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
signalfd.c
splice.c mm: merge folio_has_private()/filemap_release_folio() call pairs 2024-01-10 17:10:31 +01:00
stack.c
stat.c vfs: support STATX_DIOALIGN on block devices 2022-09-11 19:47:12 -05:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-24 17:32:51 +01:00
super.c fs: Protect reconfiguration of sb read-write from racing writes 2023-08-11 12:08:24 +02:00
sync.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
sysctls.c
timerfd.c
userfaultfd.c Revert "userfaultfd: don't fail on unrecognized features" 2023-04-26 14:28:37 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2022-12-31 13:31:55 +01:00