Commit Graph

93147 Commits

Author SHA1 Message Date
Stephen Rothwell
71778d2059 Merge branch 'fs-current' of linux-next 2024-08-29 08:32:50 +10:00
Stephen Rothwell
30d8c4b01f Merge branch 'mm-hotfixes-unstable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm 2024-08-29 08:32:49 +10:00
Stephen Rothwell
6af5dfc5df Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git 2024-08-29 08:13:54 +10:00
Lizhi Xu
eb1fdd9da4 ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
When doing cleanup, if flags without OCFS2_BH_READAHEAD, it may trigger
NULL pointer dereference in the following ocfs2_set_buffer_uptodate() if
bh is NULL.

Link: https://lkml.kernel.org/r/20240821061450.3478602-1-lizhi.xu@windriver.com
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Reported-by: Heming Zhao <heming.zhao@suse.com>
Suggested-by: Heming Zhao <heming.zhao@suse.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-28 14:35:05 -07:00
Lizhi Xu
fa4129ba77 ocfs2: remove unreasonable unlock
There was a lock release before exiting, so remove the unreasonable
unlock.

Link: https://lkml.kernel.org/r/20240820094512.2228159-1-lizhi.xu@windriver.com
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Reported-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ab134185af9ef88dfed5
Tested-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-28 14:35:04 -07:00
Ryusuke Konishi
75fb207c1e nilfs2: fix state management in error path of log writing function
After commit a694291a62 ("nilfs2: separate wait function from
nilfs_segctor_write") was applied, the log writing function
nilfs_segctor_do_construct() was able to issue I/O requests continuously
even if user data blocks were split into multiple logs across segments,
but two potential flaws were introduced in its error handling.

First, if nilfs_segctor_begin_construction() fails while creating the
second or subsequent logs, the log writing function returns without
calling nilfs_segctor_abort_construction(), so the writeback flag set on
pages/folios will remain uncleared.  This causes page cache operations to
hang waiting for the writeback flag.  For example,
truncate_inode_pages_final(), which is called via nilfs_evict_inode() when
an inode is evicted from memory, will hang.

Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. 
As a result, if the next log write involves checkpoint creation, that's
fine, but if a partial log write is performed that does not, inodes with
NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files"
list, and their data and b-tree blocks may not be written to the device,
corrupting the block mapping.

Fix these issues by uniformly calling nilfs_segctor_abort_construction()
on failure of each step in the loop in nilfs_segctor_do_construct(),
having it clean up logs and segment usages according to progress, and
correcting the conditions for calling nilfs_redirty_inodes() to ensure
that the NILFS_I_COLLECTED flag is cleared.

Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com
Fixes: a694291a62 ("nilfs2: separate wait function from nilfs_segctor_write")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-28 14:35:03 -07:00
Ryusuke Konishi
283a3c52d0 nilfs2: fix missing cleanup on rollforward recovery error
In an error injection test of a routine for mount-time recovery, KASAN
found a use-after-free bug.

It turned out that if data recovery was performed using partial logs
created by dsync writes, but an error occurred before starting the log
writer to create a recovered checkpoint, the inodes whose data had been
recovered were left in the ns_dirty_files list of the nilfs object and
were not freed.

Fix this issue by cleaning up inodes that have read the recovery data if
the recovery routine fails midway before the log writer starts.

Link: https://lkml.kernel.org/r/20240810065242.3701-1-konishi.ryusuke@gmail.com
Fixes: 0f3e1c7f23 ("nilfs2: recovery functions")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-28 14:35:03 -07:00
Ryusuke Konishi
ba4a8337b3 nilfs2: protect references to superblock parameters exposed in sysfs
The superblock buffers of nilfs2 can not only be overwritten at runtime
for modifications/repairs, but they are also regularly swapped, replaced
during resizing, and even abandoned when degrading to one side due to
backing device issues.  So, accessing them requires mutual exclusion using
the reader/writer semaphore "nilfs->ns_sem".

Some sysfs attribute show methods read this superblock buffer without the
necessary mutual exclusion, which can cause problems with pointer
dereferencing and memory access, so fix it.

Link: https://lkml.kernel.org/r/20240811100320.9913-1-konishi.ryusuke@gmail.com
Fixes: da7141fb78 ("nilfs2: add /sys/fs/nilfs2/<device> group")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-28 14:35:02 -07:00
Linus Torvalds
a18093afa3 nfsd-6.11 fixes:
- Fix a number of crashers
 - Update email address for an NFSD reviewer
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmbPLJQACgkQM2qzM29m
 f5edLBAAveHRyP1XKgLIFtIX40GxK029NEHT2vq9ageaLFsFGylAaAgXSB5rUXER
 3FfhOrNuAr/ZAV2bqm/qnA3yDuh/OuD9MIxEA+C2cNJ4bjEKXaqq5iePTK+AeNGH
 cnvE3UlI9rQkflpuZxbliJZm+u+mxaKnMO2riFbZeKrQh5C6Mn6z4fXp88CZj65U
 7oDeONqyjMEtkjJPutzJZr0gJbPjeGgZlrsVgMMg/nki3y+Fal6Rt0hDO2u9evV9
 3zTyJ7S7yUnsZ0b2JTx061EfJLd7KFmefWG4UKRKYk1XtiDbHt/cUSi4/QBA0EWw
 6VK5aJWUF2OwUpkAYohU0o4/qApoce1raR0cpwrRwzLINDdwPTkfz9L8dpjRPJ68
 ubUhWP7D/xASD5RhSrbH6lG8XY3ISXkRk2knwKXFtJtq2uIz2Gxc6F1gzzPE/whR
 N6JxdiMrUaKoO4qiwOtrXwiYACR9+qsVSW/QxNG1xdQhNXyLb5L6uUSe384MPybw
 nVIkrOdwO4uors6DzkFoHI/Au2QrTi9pzn53PK01RkhrJKUIS3lMPBnPbf5RAIQN
 EowYaJNSm52Px+omXzZzHoOgI0h0P3UWiuXoui1Zcy/xCT6uEbj5QtKxM6iKSXyd
 4KU3DnC+9nr6+3ld47MFn38f7gmPNrvB56XMCDQKmr9x9OmtB5k=
 =fRqF
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix a number of crashers

 - Update email address for an NFSD reviewer

* tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  fs/nfsd: fix update of inode attrs in CB_GETATTR
  nfsd: fix potential UAF in nfsd4_cb_getattr_release
  nfsd: hold reference to delegation when updating it for cb_getattr
  MAINTAINERS: Update Olga Kornievskaia's email address
  nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open
  nfsd: ensure that nfsd4_fattr_args.context is zeroed out
2024-08-29 06:20:44 +12:00
Linus Torvalds
2840526875 for-6.11-rc5-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmbPBfoACgkQxWXV+ddt
 WDvzwxAAlL4L2EA3UhFJNHXqvDfGUKldyNv3pn9p6fT5fsSrJGMiVVbF0BDTtrw7
 ZekH/ELxNaGCndR7qPScahez9Qll9JiRHVx68yBuanAqjCU7UfE6omKG5wo82+NX
 vjNNovKcyXEdErsD6TFVpv7abErEEHmVVUxKjKB4nJxux6OuvZZDzmN0pm4WrmIm
 226az0nxPi+MQjROKMT3qcyccxDxUzDLRnzCunkHSJzojBC3KZimx707/rZSQi05
 w4HVL1QoxfhKLRA5qXWp27YWbz78UehFv68HlAZLabnJq6khWdHaVUpfkfdJurn9
 j/ZAFlb8vKWFRIh2WQE3twJ2nJTW/h2zvnMV0NbXFz3LGlG6krOPYuFARD0WUNAI
 OdZANhi1YJQTlgBjiKYU+bP6dN5hA49TFL6/LGNrgAHJzVw8Mf7ovsG4L9gwYRoU
 D9Ed2IFZj3SYJj92u8/1VHD0yejg8tZNBYT6fiSBdY+Da6q5rZ2fvxxJIHv7m3A6
 crkm5APdBLGCCQRm5Vnqy/K8PSf2vu5moLebS81pKsJWxA7/d8jLcnMhHfoBMw8X
 LpLxpsXnaatSUYysclmzaRazpChvTYbE4z5qAYc0kUNI/wbx/q/A42ixOaGnsPWX
 pnfDs6jb6VHxT1bPgDjipFUJAeB/og8C6nyX4Tuq53gmWuR6q2k=
 =+IpE
 -----END PGP SIGNATURE-----

Merge tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix use-after-free when submitting bios for read, after an error and
   partially submitted bio the original one is freed while it can be
   still be accessed again

 - fix fstests case btrfs/301, with enabled quotas wait for delayed
   iputs when flushing delalloc

 - fix periodic block group reclaim, an unitialized value can be
   returned if there are no block groups to reclaim

 - fix build warning (-Wmaybe-uninitialized)

* tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix uninitialized return value from btrfs_reclaim_sweep()
  btrfs: fix a use-after-free when hitting errors inside btrfs_submit_chunk()
  btrfs: initialize last_extent_end to fix -Wmaybe-uninitialized warning in extent_fiemap()
  btrfs: run delayed iputs when flushing delalloc
2024-08-29 06:17:46 +12:00
Linus Torvalds
86987d84b9 four cifs.ko client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbOBa8ACgkQiiy9cAdy
 T1GHBAwAmSK9FTCg4x5tTRBwiS9VSrC3KQ2TwLrkXjeWXhJycZjsDQHnbHG68Q+t
 Sbq711RClpdvwMWLUqiryjd+VVqPzG/9jZLOPeeW7SIljyksUzxaQXGbGcquz57N
 hnZjrjyyquU5NhtOALyVeO4lNYboYTH+fETsrMoJIGNoI0yBHZSM/eRQO1heLRBn
 629yKbqp0m/5/A/w3s1nKljO74sG//6LKDZld6es7tmxgku8TFNEqsI7SONw3pUg
 dYgM2kIPf4rwpqupfxSriylz0xlHIEmITn5wkvygS+TvcWXsG855TVLjtD1I6uX3
 JYOZ9gfqubGNXkT5SbbfsmAOma8PBm54oT0UWwJVUj/5Ed3D9EyBl2jlqjNLQjSU
 qJ0/ha+AyFCn2vviPA+vVnHd5I2Y82JlI4VrwrSyHG5E/6UMHNQ2Do8GbA/eRdcp
 HeqR57V4VNzNzVfKCp4XygwBuifbXdRX+yrUdBPDZm+CMLPD6wGZdUQu4FaESYv6
 i24UdVG9
 =BDrz
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - two RDMA/smbdirect fixes and a minor cleanup

 - punch hole fix

* tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix FALLOC_FL_PUNCH_HOLE support
  smb/client: fix rdma usage in smb2_async_writev()
  smb/client: remove unused rq_iter_size from struct smb_rqst
  smb/client: avoid dereferencing rdata=NULL in smb2_new_read_req()
2024-08-28 15:05:02 +12:00
Filipe Manana
ecb54277cb btrfs: fix uninitialized return value from btrfs_reclaim_sweep()
The return variable 'ret' at btrfs_reclaim_sweep() is never assigned if
none of the space infos is reclaimable (for example if periodic reclaim
is disabled, which is the default), so we return an undefined value.

This can be fixed my making btrfs_reclaim_sweep() not return any value
as well as do_reclaim_sweep() because:

1) do_reclaim_sweep() always returns 0, so we can make it return void;

2) The only caller of btrfs_reclaim_sweep() (btrfs_reclaim_bgs()) doesn't
   care about its return value, and in its context there's nothing to do
   about any errors anyway.

Therefore remove the return value from btrfs_reclaim_sweep() and
do_reclaim_sweep().

Fixes: e4ca3932ae ("btrfs: periodic block_group reclaim")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-27 16:42:09 +02:00
Linus Torvalds
3e9bff3bbe vfs-6.11-rc6.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZsxg5QAKCRCRxhvAZXjc
 olSiAQDvFvim4YtMmUDagC3yWTBsf+o3lYdAIuzNE0NtSn4vpAEAl/HVhQCaEDjv
 mcE3jokEsbvyXLnzs78PrY0Heua2mQg=
 =AHAd
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.11-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "VFS:

   - Ensure that backing files uses file->f_ops->splice_write() for
     splice

  netfs:

   - Revert the removal of PG_private_2 from netfs_release_folio() as
     cephfs still relies on this

   - When AS_RELEASE_ALWAYS is set on a mapping the folio needs to
     always be invalidated during truncation

   - Fix losing untruncated data in a folio by making letting
     netfs_release_folio() return false if the folio is dirty

   - Fix trimming of streaming-write folios in netfs_inval_folio()

   - Reset iterator before retrying a short read

   - Fix interaction of streaming writes with zero-point tracker

  afs:

   - During truncation afs currently calls truncate_setsize() which sets
     i_size, expands the pagecache and truncates it. The first two
     operations aren't needed because they will have already been done.
     So call truncate_pagecache() instead and skip the redundant parts

  overlayfs:

   - Fix checking of the number of allowed lower layers so 500 layers
     can actually be used instead of just 499

   - Add missing '\n' to pr_err() output

   - Pass string to ovl_parse_layer() and thus allow it to be used for
     Opt_lowerdir as well

  pidfd:

   - Revert blocking the creation of pidfds for kthread as apparently
     userspace relies on this. Specifically, it breaks systemd during
     shutdown

  romfs:

   - Fix romfs_read_folio() to use the correct offset with
     folio_zero_tail()"

* tag 'vfs-6.11-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix interaction of streaming writes with zero-point tracker
  netfs: Fix missing iterator reset on retry of short read
  netfs: Fix trimming of streaming-write folios in netfs_inval_folio()
  netfs: Fix netfs_release_folio() to say no if folio dirty
  afs: Fix post-setattr file edit to do truncation correctly
  mm: Fix missing folio invalidation calls during truncation
  ovl: ovl_parse_param_lowerdir: Add missed '\n' for pr_err
  ovl: fix wrong lowerdir number check for parameter Opt_lowerdir
  ovl: pass string to ovl_parse_layer()
  backing-file: convert to using fops->splice_write
  Revert "pidfd: prevent creation of pidfds for kthreads"
  romfs: fix romfs_read_folio()
  netfs, ceph: Partially revert "netfs: Replace PG_fscache by setting folio->private and marking dirty"
2024-08-27 16:57:35 +12:00
Qu Wenruo
10d9d8c351 btrfs: fix a use-after-free when hitting errors inside btrfs_submit_chunk()
[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ #76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 852eee62d3 ("btrfs: allow btrfs_submit_bio to split bios")
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-27 01:34:08 +02:00
Jeff Layton
7e8ae8486e fs/nfsd: fix update of inode attrs in CB_GETATTR
Currently, we copy the mtime and ctime to the in-core inode and then
mark the inode dirty. This is fine for certain types of filesystems, but
not all. Some require a real setattr to properly change these values
(e.g. ceph or reexported NFS).

Fix this code to call notify_change() instead, which is the proper way
to effect a setattr. There is one problem though:

In this case, the client is holding a write delegation and has sent us
attributes to update our cache. We don't want to break the delegation
for this since that would defeat the purpose. Add a new ATTR_DELEG flag
that makes notify_change bypass the try_break_deleg call.

Fixes: c5967721e1 ("NFSD: handle GETATTR conflict with write delegation")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-26 19:04:00 -04:00
Jeff Layton
1116e0e372 nfsd: fix potential UAF in nfsd4_cb_getattr_release
Once we drop the delegation reference, the fields embedded in it are no
longer safe to access. Do that last.

Fixes: c5967721e1 ("NFSD: handle GETATTR conflict with write delegation")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-26 11:53:05 -04:00
Jeff Layton
da05ba23d4 nfsd: hold reference to delegation when updating it for cb_getattr
Once we've dropped the flc_lock, there is nothing that ensures that the
delegation that was found will still be around later. Take a reference
to it while holding the lock and then drop it when we've finished with
the delegation.

Fixes: c5967721e1 ("NFSD: handle GETATTR conflict with write delegation")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-26 11:52:40 -04:00
David Sterba
33f58a0480 btrfs: initialize last_extent_end to fix -Wmaybe-uninitialized warning in extent_fiemap()
There's a warning (probably on some older compiler version):

fs/btrfs/fiemap.c: warning: 'last_extent_end' may be used uninitialized in this function [-Wmaybe-uninitialized]:  => 822:19

Initialize the variable to 0 although it's not necessary as it's either
properly set or not used after an error. The called function is in the
same file so this is a false alert but we want to fix all
-Wmaybe-uninitialized reports.

Link: https://lore.kernel.org/all/20240819070639.2558629-1-geert@linux-m68k.org/
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-26 16:58:13 +02:00
Josef Bacik
2d34472610 btrfs: run delayed iputs when flushing delalloc
We have transient failures with btrfs/301, specifically in the part
where we do

  for i in $(seq 0 10); do
	  write 50m to file
	  rm -f file
  done

Sometimes this will result in a transient quota error, and it's because
sometimes we start writeback on the file which results in a delayed
iput, and thus the rm doesn't actually clean the file up.  When we're
flushing the quota space we need to run the delayed iputs to make sure
all the unlinks that we think have completed have actually completed.
This removes the small window where we could fail to find enough space
in our quota.

CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-25 19:15:34 +02:00
David Howells
416871f4fb cifs: Fix FALLOC_FL_PUNCH_HOLE support
The cifs filesystem doesn't quite emulate FALLOC_FL_PUNCH_HOLE correctly
(note that due to lack of protocol support, it can't actually implement it
directly).  Whilst it will (partially) invalidate dirty folios in the
pagecache, it doesn't write them back first, and so the EOF marker on the
server may be lower than inode->i_size.

This presents a problem, however, as if the punched hole invalidates the
tail of the locally cached dirty data, writeback won't know it needs to
move the EOF over to account for the hole punch (which isn't supposed to
move the EOF).  We could just write zeroes over the punched out region of
the pagecache and write that back - but this is supposed to be a
deallocatory operation.

Fix this by manually moving the EOF over on the server after the operation
if the hole punched would corrupt it.

Note that the FSCTL_SET_ZERO_DATA RPC and the setting of the EOF should
probably be compounded to stop a third party interfering (or, at least,
massively reduce the chance).

This was reproducible occasionally by using fsx with the following script:

	truncate 0x0 0x375e2 0x0
	punch_hole 0x2f6d3 0x6ab5 0x375e2
	truncate 0x0 0x3a71f 0x375e2
	mapread 0xee05 0xcf12 0x3a71f
	write 0x2078e 0x5604 0x3a71f
	write 0x3ebdf 0x1421 0x3a71f *
	punch_hole 0x379d0 0x8630 0x40000 *
	mapread 0x2aaa2 0x85b 0x40000
	fallocate 0x1b401 0x9ada 0x40000
	read 0x15f2 0x7d32 0x40000
	read 0x32f37 0x7a3b 0x40000 *

The second "write" should extend the EOF to 0x40000, and the "punch_hole"
should operate inside of that - but that depends on whether the VM gets in
and writes back the data first.  If it doesn't, the file ends up 0x3a71f in
size, not 0x40000.

Fixes: 31742c5a33 ("enable fallocate punch hole ("fallocate -p") for SMB3")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-25 09:06:25 -05:00
Stefan Metzmacher
017d170174 smb/client: fix rdma usage in smb2_async_writev()
rqst.rq_iter needs to be truncated otherwise we'll
also send the bytes into the stream socket...

This is the logic behind rqst.rq_npages = 0, which was removed in
"cifs: Change the I/O paths to use an iterator rather than a page list"
(d08089f649).

Cc: stable@vger.kernel.org
Fixes: d08089f649 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-25 09:06:25 -05:00
Stefan Metzmacher
b608e2c318 smb/client: remove unused rq_iter_size from struct smb_rqst
Reviewed-by: David Howells <dhowells@redhat.com>
Fixes: d08089f649 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-25 09:06:25 -05:00
Stefan Metzmacher
c724b2ab6a smb/client: avoid dereferencing rdata=NULL in smb2_new_read_req()
This happens when called from SMB2_read() while using rdma
and reaching the rdma_readwrite_threshold.

Cc: stable@vger.kernel.org
Fixes: a6559cc1d3 ("cifs: split out smb3_use_rdma_offload() helper")
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-25 09:06:25 -05:00
Linus Torvalds
72bea05cb1 bcachefs fixes for 6.11-rc5, v2
- rhashtable conversion for vfs inodes
 - rcu_pending, btree key cache conversion
 + nocow deadlock fix
 + fix for new rebalance_work accounting
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbKB9wACgkQE6szbY3K
 bnYMtw/+LSGV/eqwLwdeuABggU5gehWjxqkWF/uGE7fPP8pP0dJQnvCLRVKtAro2
 0mJh6j+kM602fU5jH/W8WNn1h8J7dAdkyqI7P/D3ZgTdaCtxso+A0Nj95CYpNdWY
 ESX9CLUYSxtFatT/kfWvlRvqlSJBYo7WgNsV6tcnPdpC+Oki6Kwlq22iI+ma9Ty7
 uDbgd05/R9KCSxaaV+9iojCsEq6h/tuFH8Z3f3SevA8H29odh5mt0UWNn05pf3mt
 rAnDUJ5TQVYubMIcbS6MhjVoLZ3AxOefkk4pctdbdmGSPJcssDeXvATn/wHYl6Fp
 +et1ECRU3Sc3dqcmT0RaTm/yxYytdtKA4HVxS4ELKbsIM2xU0Pjq3JQwKzHRwXDd
 a3r0WXa+LqHkBP37g0HhuxhxAECnbpUM9bvDivgGssVDLyxfMKUkhDsuzegjrHAF
 v5H08myk5maKvLv+dD6e23t0l1i9eB/bSsw1iNGOgZP4k9gsUlESvppFGw/10F+Q
 1Y/qeSiNTG9kJyo9PQTOZ6rFVxrfaZ9NFP4EAXcWId81OsQHYY8XnE5XaJATxnwF
 MzCgNdmzuf67X6Q8fCeNCJtiZ5sCmbyENGd6hbyYFDg+R02p0NOM4ABVN6BBfXJ+
 eHPyu2bvusIZt8MD6c7fOxyGsGdgLxIv/SkqLayZdxEaY3VvS2g=
 =ejxu
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-08-24' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - assorted syzbot fixes

 - some upgrade fixes for old (pre 1.0) filesystems

 - fix for moving data off a device that was switched to durability=0
   after data had been written to it.

 - nocow deadlock fix

 - fix for new rebalance_work accounting

* tag 'bcachefs-2024-08-24' of git://evilpiepirate.org/bcachefs: (28 commits)
  bcachefs: Fix rebalance_work accounting
  bcachefs: Fix failure to flush moves before sleeping in copygc
  bcachefs: don't use rht_bucket() in btree_key_cache_scan()
  bcachefs: add missing inode_walker_exit()
  bcachefs: clear path->should_be_locked in bch2_btree_key_cache_drop()
  bcachefs: Fix double assignment in check_dirent_to_subvol()
  bcachefs: Fix refcounting in discard path
  bcachefs: Fix compat issue with old alloc_v4 keys
  bcachefs: Fix warning in bch2_fs_journal_stop()
  fs/super.c: improve get_tree() error message
  bcachefs: Fix missing validation in bch2_sb_journal_v2_validate()
  bcachefs: Fix replay_now_at() assert
  bcachefs: Fix locking in bch2_ioc_setlabel()
  bcachefs: fix failure to relock in btree_node_fill()
  bcachefs: fix failure to relock in bch2_btree_node_mem_alloc()
  bcachefs: unlock_long() before resort in journal replay
  bcachefs: fix missing bch2_err_str()
  bcachefs: fix time_stats_to_text()
  bcachefs: Fix bch2_bucket_gens_init()
  bcachefs: Fix bch2_trigger_alloc assert
  ...
2024-08-25 17:20:48 +12:00
Linus Torvalds
780bdc1ba7 five ksmbd server fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbJteoACgkQiiy9cAdy
 T1F+Pwv/RHXSnQD+jkFEfQCEgsZZOfWD0V74VZqm90N48gfB3giZw9mtV4I1jQzI
 0+UerZjN7lHIDC4f6qp48TSEodHpprAxLfsg5JJN/OxDE+0MSbctTjLeHlduVzw6
 iHEdaE3jWN0p4YZRdbyrUCaOoTEk9cKwiG7r2DjArNyQ8kClveeqrGfdZUDTHNkv
 IIs6CJ8PFo7dicpAIGPmMz1TGq5Lh2EFjZTYEweSSlyXUNKaWgz3BXBIXD4LwK6w
 mFjGPxGNBDorcvzHcOUZnrpfACB3WNOSPN/WK5sQL6LXGCx3sWtUvGxLFkxFwjSq
 D7gvo7qnBuycNyR03RfmWyXYx+2KzdYoAUGTNV114zMJskBC0QhIIF6JK+xZdPZX
 XHxbr4CRR7fsaZOur5MTWXEzVJxvC1irULKoBp7lvYpEoAV6yXpK3XegAHIASKUE
 /Cw9qikIvxrMg4BjWPP1JhbKRw92uL2ty4oO913hbnBsScS8jCystuNl6ataiXWq
 PN5rN4sy
 =bGOb
 -----END PGP SIGNATURE-----

Merge tag '6.11-rc5-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - query directory flex array fix

 - fix potential null ptr reference in open

 - fix error message in some open cases

 - two minor cleanups

* tag '6.11-rc5-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: update misguided comment of smb2_allocate_rsp_buf()
  smb/server: remove useless assignment of 'file_present' in smb2_open()
  smb/server: fix potential null-ptr-deref of lease_ctx_info in smb2_open()
  smb/server: fix return value of smb2_open()
  ksmbd: the buffer of smb2 query dir response has at least 1 byte
2024-08-25 12:15:04 +12:00
Kent Overstreet
49aa783039 bcachefs: Fix rebalance_work accounting
rebalance_work was keying off of the presence of rebelance_opts in the
extent - but that was incorrect, we keep those around after rebalance
for indirect extents since the inode's options are not directly
available

Fixes: 20ac515a9c ("bcachefs: bch_acct_rebalance_work")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-24 10:16:21 -04:00
Kent Overstreet
d3204616a6 bcachefs: Fix failure to flush moves before sleeping in copygc
This fixes an apparent deadlock - rebalance would get stuck trying to
take nocow locks because they weren't being released by copygc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-24 10:16:21 -04:00
David Howells
e00e99ba6c
netfs: Fix interaction of streaming writes with zero-point tracker
When a folio that is marked for streaming write (dirty, but not uptodate,
with partial content specified in the private data) is written back, the
folio is effectively switched to the blank state upon completion of the
write.  This means that if we want to read it in future, we need to reread
the whole folio.

However, if the folio is above the zero_point position, when it is read
back, it will just be cleared and the read skipped, leading to apparent
local corruption.

Fix this by increasing the zero_point to the end of the dirty data in the
folio when clearing the folio state after writeback.  This is analogous to
the folio having ->release_folio() called upon it.

This was causing the config.log generated by configuring a cpython tree on
a cifs share to get corrupted because the scripts involved were appending
text to the file in small pieces.

Fixes: 288ace2f57 ("netfs: New writeback implementation")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/563286.1724500613@warthog.procyon.org.uk
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:09:17 +02:00
David Howells
950b03d0f6
netfs: Fix missing iterator reset on retry of short read
Fix netfs_rreq_perform_resubmissions() to reset before retrying a short
read, otherwise the wrong part of the output buffer will be used.

Fixes: 92b6cc5d1e ("netfs: Add iov_iters to (sub)requests to describe various buffers")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20240823200819.532106-6-dhowells@redhat.com
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:09:17 +02:00
David Howells
cce6bfa6ca
netfs: Fix trimming of streaming-write folios in netfs_inval_folio()
When netfslib writes to a folio that it doesn't have data for, but that
data exists on the server, it will make a 'streaming write' whereby it
stores data in a folio that is marked dirty, but not uptodate.  When it
does this, it attaches a record to folio->private to track the dirty
region.

When truncate() or fallocate() wants to invalidate part of such a folio, it
will call into ->invalidate_folio(), specifying the part of the folio that
is to be invalidated.  netfs_invalidate_folio(), on behalf of the
filesystem, must then determine how to trim the streaming write record.  In
a couple of cases, however, it does this incorrectly (the reduce-length and
move-start cases are switched over and don't, in any case, calculate the
value correctly).

Fix this by making the logic tree more obvious and fixing the cases.

Fixes: 9ebff83e64 ("netfs: Prep to use folio->private for write grouping and streaming write")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20240823200819.532106-5-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: Pankaj Raghav <p.raghav@samsung.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:09:16 +02:00
David Howells
7dfc8f0c61
netfs: Fix netfs_release_folio() to say no if folio dirty
Fix netfs_release_folio() to say no (ie. return false) if the folio is
dirty (analogous with iomap's behaviour).  Without this, it will say yes to
the release of a dirty page by split_huge_page_to_list_to_order(), which
will result in the loss of untruncated data in the folio.

Without this, the generic/075 and generic/112 xfstests (both fsx-based
tests) fail with minimum folio size patches applied[1].

Fixes: c1ec4d7c2e ("netfs: Provide invalidate_folio and release_folio calls")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20240815090849.972355-1-kernel@pankajraghav.com/ [1]
Link: https://lore.kernel.org/r/20240823200819.532106-4-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: Pankaj Raghav <p.raghav@samsung.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:09:16 +02:00
David Howells
a74ee0e878
afs: Fix post-setattr file edit to do truncation correctly
At the end of an kAFS RPC operation, there is an "edit" phase (originally
intended for post-directory modification ops to edit the local image) that
the setattr VFS op uses to fix up the pagecache if the RPC that requested
truncation of a file was successful.

afs_setattr_edit_file() calls truncate_setsize() which sets i_size, expands
the pagecache if needed and truncates the pagecache.  The first two of
those, however, are redundant as they've already been done by
afs_setattr_success() under the io_lock and the first is also done under
the callback lock (cb_lock).

Fix afs_setattr_edit_file() to call truncate_pagecache() instead (which is
called by truncate_setsize(), thereby skipping the redundant parts.

Fixes: 100ccd18bb ("netfs: Optimise away reads above the point at which there can be no data")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20240823200819.532106-3-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: Pankaj Raghav <p.raghav@samsung.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:09:16 +02:00
Christian Brauner
d10771d51b
Merge patch series "ovl: simplify ovl_parse_param_lowerdir()"
Simplify and fix overlayfs layer parsing so the maximum of 500 layers
can be used.

* patches from https://lore.kernel.org/r/20240705011510.794025-1-chengzhihao1@huawei.com:
  ovl: ovl_parse_param_lowerdir: Add missed '\n' for pr_err
  ovl: fix wrong lowerdir number check for parameter Opt_lowerdir
  ovl: pass string to ovl_parse_layer()

Link: https://lore.kernel.org/r/20240705011510.794025-1-chengzhihao1@huawei.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-24 16:00:46 +02:00
Linus Torvalds
60f0560f53 NFS Client Bugfixes for Linux 6.11-rc
Bugfixes:
 * Fix rpcrdma refcounting in xa_alloc
 * Fix rpcrdma usage of XA_FLAGS_ALLOC
 * Fix requesting FATTR4_WORD2_OPEN_ARGUMENTS
 * Fix attribute bitmap decoder to handle a 3rd word
 * Add reschedule points when returning delegations to avoid soft lockups
 * Fix clearing layout segments in layoutreturn
 * Avoid unnecessary rescanning of the per-server delegation list
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmbIywQACgkQ18tUv7Cl
 QOtvHQ//VJO6iTh3kmONCru2ohxgAl7+qX+3HHbMR62S8AWCp0ujId0nir7CuxDa
 49c+R03s36lGXkTX5x0d3Idhbv12a5Jdy21oZuJU+RCm6Z7MdNXdDl9HN/gXXqdl
 0Z3Wk8r3Pi4/tgejau0i2zN2wXxVNKSQgovETnuI/BQLHupDvDy8Sd8lrIqqoiXY
 sffCiKSTbCFWg6JLEF1UWZZ1VtLUsDZRBQJD+67l1NbjSX/tiBsY0CquWcHjXAlY
 2VGDXdFCZwsQyYuqNdMVh1Cr95hcT0F1YZLOT+vn+6b6rA+UbtmPlURt8iR4gBFo
 Fadpp5pRziYb9wyg/DgFABihB6PzcboIg5Lm0rx870WEuzxSs8NQeQ9sw5hJC797
 At8C4I+cNLOaPU5nUEcG53+svEl9F2jDI2jFc8aa5zAW2hHAtpZhLVre5to0CDb/
 hu/H+h2yvjJyfSB7kCdVqlU93PJM96P7F1KEdVYmkuXQQMhkZknntVnu41w7KOst
 SKy0iU29idlU7SFHvKYyc4URC63kTKLWmTZZn3uDJouwDRYudCRPFQpiwnoDJOY3
 wRi3jRPhmZQXn7ArChQSPrHqjRVTu9Y0gUcrtAj4YODv+bkngxmZbG3J3IqZKw21
 AkMiDZESyriKVOTurX0Fzaj63zHSrIc+TwyTTXwFCGpCbVCf5/k=
 =OnQe
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

 - Fix rpcrdma refcounting in xa_alloc

 - Fix rpcrdma usage of XA_FLAGS_ALLOC

 - Fix requesting FATTR4_WORD2_OPEN_ARGUMENTS

 - Fix attribute bitmap decoder to handle a 3rd word

 - Add reschedule points when returning delegations to avoid soft lockups

 - Fix clearing layout segments in layoutreturn

 - Avoid unnecessary rescanning of the per-server delegation list

* tag 'nfs-for-6.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Avoid unnecessary rescanning of the per-server delegation list
  NFSv4: Fix clearing of layout segments in layoutreturn
  NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations
  nfs: fix bitmap decoder to handle a 3rd word
  nfs: fix the fetch of FATTR4_OPEN_ARGUMENTS
  rpcrdma: Trace connection registration and unregistration
  rpcrdma: Use XA_FLAGS_ALLOC instead of XA_FLAGS_ALLOC1
  rpcrdma: Device kref is over-incremented on error from xa_alloc
2024-08-24 09:03:25 +08:00
Linus Torvalds
66ace9a8f9 four cifs.ko client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbIqhgACgkQiiy9cAdy
 T1EAPgwAnW+vu15huT1zQn2BtFcn85zdBGXL/avjbbMLDwNHj5Lpae+PbbRa4gZ0
 VN6OQdq5Rt3Z2pJDfFZtFECKq4AN1Lxn1ur4wujBIzez3CxyFCXjDeS5/3lRP6c+
 0CiHVtRe7IgncGUnnhvwPhiG6/cjTNiXlImb6SgmFLP/0U7ZnWl5p3LmR7exfVY9
 Fubqq3HF0UpxMUD3thM055ftqT/xP6RdrITX2K2Led+BlJAJm1x+0E//4nApQ2IX
 C3VeBRZTvQtBC+pay754BqSnfAifgVObF8cfswDMS4U7ImV5gS+CxSx4vlg4bF7o
 2f32mZAXz9U3yMIBMjtBT/q/LbN28SRSjo1x35CJ9LCUK6IzARHiLZG/PVltK3Cj
 copuH3n5ZV0nGVdsv10Uheo3euFlrKKylPn8xAEhMsQzG7Q6ek/pT+avb+xl6MWf
 i8eOnMobCFiOEJtSk/uV23579wf8maVQM92M2rf2UO6K5eHIceOq0HGfSoeVV9dZ
 1rgZb1D6
 =8U5O
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc4-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - fix refcount leak (can cause rmmod fail)

 - fix byte range locking problem with cached reads

 - fix for mount failure if reparse point unrecognized

 - minor typo

* tag 'v6.11-rc4-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb/client: fix typo: GlobalMid_Sem -> GlobalMid_Lock
  smb: client: ignore unhandled reparse tags
  smb3: fix problem unloading module due to leaked refcount on shutdown
  smb3: fix broken cached reads when posix locks
2024-08-24 08:50:21 +08:00
Zhihao Cheng
441e36ef5b
ovl: ovl_parse_param_lowerdir: Add missed '\n' for pr_err
Add '\n' for pr_err in function ovl_parse_param_lowerdir(), which
ensures that error message is displayed at once.

Fixes: b36a5780cb ("ovl: modify layer parameter parsing")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20240705011510.794025-4-chengzhihao1@huawei.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23 19:58:59 +02:00
Zhihao Cheng
ca76ac36bb
ovl: fix wrong lowerdir number check for parameter Opt_lowerdir
The max count of lowerdir is OVL_MAX_STACK[500], which is broken by
commit 37f32f526438("ovl: fix memory leak in ovl_parse_param()") for
parameter Opt_lowerdir. Since commit 819829f0319a("ovl: refactor layer
parsing helpers") and commit 24e16e385f22("ovl: add support for
appending lowerdirs one by one") added check ovl_mount_dir_check() in
function ovl_parse_param_lowerdir(), the 'ctx->nr' should be smaller
than OVL_MAX_STACK, after commit 37f32f526438("ovl: fix memory leak in
ovl_parse_param()") is applied, the 'ctx->nr' is updated before the
check ovl_mount_dir_check(), which leads the max count of lowerdir
to become 499 for parameter Opt_lowerdir.
Fix it by replacing lower layers parsing code with the existing helper
function ovl_parse_layer().

Fixes: 37f32f5264 ("ovl: fix memory leak in ovl_parse_param()")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20240705011510.794025-3-chengzhihao1@huawei.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23 19:56:38 +02:00
Christian Brauner
7eff3453cb
ovl: pass string to ovl_parse_layer()
So it can be used for parsing the Opt_lowerdir.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Link: https://lore.kernel.org/r/20240705011510.794025-2-chengzhihao1@huawei.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23 19:56:38 +02:00
Olga Kornievskaia
a204501e17 nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open
Prior to commit 3f29cc82a8 ("nfsd: split sc_status out of
sc_type") states_show() relied on sc_type field to be of valid
type before calling into a subfunction to show content of a
particular stateid. From that commit, we split the validity of
the stateid into sc_status and no longer changed sc_type to 0
while unhashing the stateid. This resulted in kernel oopsing
for nfsv4.0 opens that stay around and in nfs4_show_open()
would derefence sc_file which was NULL.

Instead, for closed open stateids forgo displaying information
that relies of having a valid sc_file.

To reproduce: mount the server with 4.0, read and close
a file and then on the server cat /proc/fs/nfsd/clients/2/states

[  513.590804] Call trace:
[  513.590925]  _raw_spin_lock+0xcc/0x160
[  513.591119]  nfs4_show_open+0x78/0x2c0 [nfsd]
[  513.591412]  states_show+0x44c/0x488 [nfsd]
[  513.591681]  seq_read_iter+0x5d8/0x760
[  513.591896]  seq_read+0x188/0x208
[  513.592075]  vfs_read+0x148/0x470
[  513.592241]  ksys_read+0xcc/0x178

Fixes: 3f29cc82a8 ("nfsd: split sc_status out of sc_type")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-23 12:25:46 -04:00
Ed Tsai
996b37da1e
backing-file: convert to using fops->splice_write
Filesystems may define their own splice write. Therefore, use the file
fops instead of invoking iter_file_splice_write() directly.

Signed-off-by: Ed Tsai <ed.tsai@mediatek.com>
Link: https://lore.kernel.org/r/20240708072208.25244-1-ed.tsai@mediatek.com
Fixes: 5ca7346861 ("fuse: implement splice read/write passthrough")
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23 13:08:31 +02:00
Trond Myklebust
f92214e4c3 NFS: Avoid unnecessary rescanning of the per-server delegation list
If the call to nfs_delegation_grab_inode() fails, we will not have
dropped any locks that require us to rescan the list.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22 17:01:10 -04:00
Trond Myklebust
d72b796311 NFSv4: Fix clearing of layout segments in layoutreturn
Make sure that we clear the layout segments in cases where we see a
fatal error, and also in the case where the layout is invalid.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22 17:01:10 -04:00
Trond Myklebust
a017ad1313 NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations
We're seeing reports of soft lockups when iterating through the loops,
so let's add rescheduling points.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22 17:01:10 -04:00
Jeff Layton
95832998fb nfs: fix bitmap decoder to handle a 3rd word
It only decodes the first two words at this point. Have it decode the
third word as well. Without this, the client doesn't send delegated
timestamps in the CB_GETATTR response.

With this change we also need to expand the on-stack bitmap in
decode_recallany_args to 3 elements, in case the server sends a larger
bitmap than expected.

Fixes: 43df7110f4 ("NFSv4: Add CB_GETATTR support for delegated attributes")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22 17:01:10 -04:00
Jeff Layton
cb78f9b7d0 nfs: fix the fetch of FATTR4_OPEN_ARGUMENTS
The client doesn't properly request FATTR4_OPEN_ARGUMENTS in the initial
SERVER_CAPS getattr. Add FATTR4_WORD2_OPEN_ARGUMENTS to the initial
request.

Fixes: 707f13b3d0 (NFSv4: Add support for the FATTR4_OPEN_ARGUMENTS attribute)
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22 17:01:09 -04:00
ChenXiaoSong
5e51224d2a smb/client: fix typo: GlobalMid_Sem -> GlobalMid_Lock
The comments have typos, fix that to not confuse readers.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-22 15:44:19 -05:00
Jeff Layton
f58bab6fd4 nfsd: ensure that nfsd4_fattr_args.context is zeroed out
If nfsd4_encode_fattr4 ends up doing a "goto out" before we get to
checking for the security label, then args.context will be set to
uninitialized junk on the stack, which we'll then try to free.
Initialize it early.

Fixes: f59388a579 ("NFSD: Add nfsd4_encode_fattr4_sec_label()")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-08-22 14:49:10 -04:00
Paulo Alcantara
ec68680411 smb: client: ignore unhandled reparse tags
Just ignore reparse points that the client can't parse rather than
bailing out and not opening the file or directory.

Reported-by: Marc <1marc1@gmail.com>
Closes: https://lore.kernel.org/r/CAMHwNVv-B+Q6wa0FEXrAuzdchzcJRsPKDDRrNaYZJd6X-+iJzw@mail.gmail.com
Fixes: 539aad7f14 ("smb: client: introduce ->parse_reparse_point()")
Tested-by: Anthony Nandaa (Microsoft) <profnandaa@gmail.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-22 12:37:16 -05:00
Steve French
15179cf280 smb3: fix problem unloading module due to leaked refcount on shutdown
The shutdown ioctl can leak a refcount on the tlink which can
prevent rmmod (unloading the cifs.ko) module from working.

Found while debugging xfstest generic/043

Fixes: 69ca1f5755 ("smb3: add dynamic tracepoints for shutdown ioctl")
Reviewed-by: Meetakshi Setiya <msetiya@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-22 12:36:57 -05:00
ChenXiaoSong
2b7e0573a4 smb/server: update misguided comment of smb2_allocate_rsp_buf()
smb2_allocate_rsp_buf() will return other error code except -ENOMEM.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-22 09:52:00 -05:00