Commit Graph

93229 Commits

Author SHA1 Message Date
Pali Rohár
37408843f2 cifs: Update SFU comments about fifos and sockets
In SFU mode, activated by -o sfu mount option is now also support for
creating new fifos and sockets.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-16 20:10:37 -05:00
Pali Rohár
41d3f256c6 cifs: Add support for creating SFU symlinks
Linux cifs client can already detect SFU symlinks and reads it content
(target location). But currently is not able to create new symlink. So
implement this missing support.

When 'sfu' mount option is specified and 'mfsymlinks' is not specified then
create new symlinks in SFU-style. This will provide full SFU compatibility
of symlinks when mounting cifs share with 'sfu' option. 'mfsymlinks' option
override SFU for better Apple compatibility as explained in fs_context.c
file in smb3_update_mnt_flags() function.

Extend __cifs_sfu_make_node() function, which now can handle also S_IFLNK
type and refactor structures passed to sync_write() in this function, by
splitting SFU type and SFU data from original combined struct win_dev as
combined fixed-length struct cannot be used for variable-length symlinks.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-16 20:10:34 -05:00
Hongbo Li
21dcbc17eb smb: use LIST_HEAD() to simplify code
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). No functional impact.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Pali Rohár
2ba0d8947e cifs: Recognize SFU socket type
SFU since its (first) version 3.0 supports AF_LOCAL sockets and stores them
on filesytem as system file with one zero byte. Add support for detecting
this SFU socket type into cifs_sfu_type() function.

With this change cifs_sfu_type() would correctly detect all special file
types created by SFU: fifo, socket, symlink, block and char.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Pali Rohár
25f6bd0fb0 cifs: Show debug message when SFU Fifo type was detected
For debugging purposes it is a good idea to show detected SFU type also for
Fifo. Debug message is already print for all other special types.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Pali Rohár
bb68327053 cifs: Put explicit zero byte into SFU block/char types
SFU types IntxCHR and IntxBLK are 8 bytes with zero as last byte. Make it
explicit in memcpy and memset calls, so the zero byte is visible in the
code (and not hidden as string trailing nul byte).

It is important for reader to show the last byte for block and char types
because it differs from the last byte of symlink type (which has it 0x01).

Also it is important to show that the type is not nul-term string, but
rather 8 bytes (with some printable bytes).

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Pali Rohár
cf2ce67345 cifs: Add support for reading SFU symlink location
Currently when sfu mount option is specified then CIFS can recognize SFU
symlink, but is not able to read symlink target location. readlink()
syscall just returns that operation is not supported.

Implement this missing functionality in cifs_sfu_type() function. Read
target location of SFU-style symlink, parse it and fill into fattr's
cf_symlink_target member.

SFU-style symlink is file which has system attribute set and file content
is buffer "IntxLNK\1" (8th byte is 0x01) followed by the target location
encoded in little endian UCS-2/UTF-16. This format was introduced in
Interix 3.0 subsystem, as part of the Microsoft SFU 3.0 and is used also by
all later versions. Previous versions had no symlink support.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Pali Rohár
89c601ab7c cifs: Fix recognizing SFU symlinks
SFU symlinks have 8 byte prefix: "IntxLNK\1".
So check also the last 8th byte 0x01.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Qianqiang Liu
9b4af91346 smb: client: compress: fix an "illegal accesses" issue
Using uninitialized value "bkt" when calling "kfree"

Fixes: 13b68d44990d ("smb: client: compress: LZ77 code improvements cleanup")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Qianqiang Liu
590efcd3c7 smb: client: compress: fix a potential issue of freeing an invalid pointer
The dst pointer may not be initialized when calling kvfree(dst)

Fixes: 13b68d44990d9 ("smb: client: compress: LZ77 code improvements cleanup")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Enzo Matsumiya
94ae8c3fee smb: client: compress: LZ77 code improvements cleanup
- Check data compressibility with some heuristics (copied from
  btrfs):
  - should_compress() final decision is is_compressible(data)

- Cleanup compress/lz77.h leaving only lz77_compress() exposed:
  - Move parts to compress/lz77.c, while removing the rest of it
    because they were either unused, used only once, were
    implemented wrong (thanks to David Howells for the help)

- Updated the compression parameters (still compatible with
  Windows implementation) trading off ~20% compression ratio
  for ~40% performance:
  - min match len: 3 -> 4
  - max distance: 8KiB -> 1KiB
  - hash table type: u32 * -> u64 *

Known bugs:
This implementation currently works fine in general, but breaks with
some payloads used during testing.  Investigation ongoing, to be
fixed in a next commit.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Co-developed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:45 -05:00
Enzo Matsumiya
f046d71e84 smb: client: insert compression check/call on write requests
On smb2_async_writev(), set CIFS_COMPRESS_REQ on request flags if
should_compress() returns true.

On smb_send_rqst() check the flags, and compress and send the request to
the server.

(*) If the compression fails with -EMSGSIZE (i.e. compressed size is >=
uncompressed size), the original uncompressed request is sent instead.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
Steve French
d14bbfff25 smb3: mark compression as CONFIG_EXPERIMENTAL and fix missing compression operation
Move SMB3.1.1 compression code into experimental config option,
and fix the compress mount option. Implement unchained LZ77
"plain" compression algorithm as per MS-XCA specification
section "2.3 Plain LZ77 Compression Algorithm Details".

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
Gaosheng Cui
6795dab403 cifs: Remove obsoleted declaration for cifs_dir_open
The cifs_dir_open() have been removed since
commit 737b758c96 ("[PATCH] cifs: character mapping of special
characters (part 3 of 3)"), and now it is useless, so remove it.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
Shen Lichuan
25e68c37ca smb: client: Use min() macro
Use the min() macro to simplify the function and improve
its readability.

Signed-off-by: Shen Lichuan <shenlichuan@vivo.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
Yuesong Li
9290038be2 cifs: convert to use ERR_CAST()
Use ERR_CAST() as it is designed for casting an error pointer to
another type.

This macro uses the __force and __must_check modifiers, which are used
to tell the compiler to check for errors where this macro is used.

Signed-off-by: Yuesong Li <liyuesong@vivo.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
e2fcd3fa03 smb: add comment to STATUS_MCA_OCCURED
Explained why the typo was not corrected.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
78181a5504 smb: move SMB2 Status code to common header file
There are only 4 different definitions between the client and server:

  - STATUS_SERVER_UNAVAILABLE: from client/smb2status.h
  - STATUS_FILE_NOT_AVAILABLE: from client/smb2status.h
  - STATUS_NO_PREAUTH_INTEGRITY_HASH_OVERLAP: from server/smbstatus.h
  - STATUS_INVALID_LOCK_RANGE: from server/smbstatus.h

Rename client/smb2status.h to common/smb2status.h, and merge the
2 different definitions of server to common header file.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
b51174da74 smb: move some duplicate definitions to common/smbacl.h
In order to maintain the code more easily, move duplicate definitions
to new common header file.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
09bedafc1e smb/client: rename cifs_ace to smb_ace
Preparation for moving acl definitions to new common header file.

Use the following shell command to rename:

  find fs/smb/client -type f -exec sed -i \
    's/struct cifs_ace/struct smb_ace/g' {} +

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
251b93ae73 smb/client: rename cifs_acl to smb_acl
Preparation for moving acl definitions to new common header file.

Use the following shell command to rename:

  find fs/smb/client -type f -exec sed -i \
    's/struct cifs_acl/struct smb_acl/g' {} +

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
7f599d8fb3 smb/client: rename cifs_sid to smb_sid
Preparation for moving acl definitions to new common header file.

Use the following shell command to rename:

  find fs/smb/client -type f -exec sed -i \
    's/struct cifs_sid/struct smb_sid/g' {} +

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
ChenXiaoSong
3651487607 smb/client: rename cifs_ntsd to smb_ntsd
Preparation for moving acl definitions to new common header file.

Use the following shell command to rename:

  find fs/smb/client -type f -exec sed -i \
    's/struct cifs_ntsd/struct smb_ntsd/g' {} +

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-15 10:42:44 -05:00
Linus Torvalds
d9bc226584 fix for packet signing of write
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbk8HsACgkQiiy9cAdy
 T1Fklgv/ejyQXCCLVWCQoYFxjMtOEGQkITsq2XYjjbyFv8hQGmvFDPbm+/NmZ+SL
 9hER3CjrcDHPaidTWI/1mKCR5+vcjOLqmSC2GF/nsd9x2OdsxpIXazihKPt3nRh4
 4RSeAyNbmRAxi5auBQmW+gfBEquwkqdJbBkZyTbKmfcpWNyJWDZp6LpdKyTTLK6Z
 448QqFWLGNmO5cZDuYSl7R+J2DsCNmpe7vDKUX4dO/aU+fg0+OcCgRWxgYfEPk7b
 Mtizf1soBxVE72yDgRpr6ARr7YIBHRhtC2e5xLb62yvslqk8y5qrqL1u5WU7F1d/
 h0PYE/5YMSvjJxXRMSrfPFyq0yPwsGZJK4epkKxtLV0WC4hNnSWLrg0JNOPZVF5k
 OsofXUOSbtUQkwT/T3epp54Z83bzRUxgdbD80OfFM2SmglZY2zjTAcCUzxEwp56T
 q7ehhmZrl23FzrkFBkQcZbBP3PCKBJ4iPevTLnNt2xsLVjnAVQE9Ss5EIqCYh/Ge
 D3NsmmhS
 =ygm/
 -----END PGP SIGNATURE-----

Merge tag '6.11-rc7-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:
 "Fix for packet signing of write"

* tag '6.11-rc7-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix signature miscalculation
2024-09-14 11:43:24 +02:00
David Howells
5a20b7cb0d cifs: Fix signature miscalculation
Fix the calculation of packet signatures by adding the offset into a page
in the read or write data payload when hashing the pages from it.

Fixes: 39bc58203f ("cifs: Add a function to Hash the contents of an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-12 19:28:48 -05:00
Linus Torvalds
bc83b4d1f0 bcachefs fixes for 6.11-rc8
- fix ca->io_ref usage; analagous to previous patch doing that for main
   discard path
 - cond_resched() in __journal_keys_sort(), cutting down on "hung task"
   warnings when journal is big
 - rest of basic BCH_SB_MEMBER_INVALID support
 - and the critical one: don't delete open files in online fsck, this was
   causing the "dirent points to inode that doesn't point back"
   inconsistencies some users were seeing
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbe/BoACgkQE6szbY3K
 bnbGuRAApGK3NNIlvMQkldLmDgHgeN/vuYHT4dx7XBquwS/R1T84WffIJklVOahD
 KdrXhGsLH63kjqck3ngCe3DFDD7Fhirmx9syVHhLAkzGFkDEYGQuIzeDXIn+XOMe
 kUTMhNgtSJL/eEc8zGmvyPIjtTrwoih2V0EeC1aW7h4tWtIsMk3q45aMX3yTlyqI
 FnMmKFjzqnOIVjT8nBLuzDP97FG4w2foSuNiZWTYo7FLi8IhPrp95tr58NqM4jvd
 99U5I3aOFHe9WcCDT1vgr0P5dmmSEwIKBCfvIlA8fbVlnZzjJqEjSKh+C4878KFv
 FP51FOY/ZQVRE/+p8AQ82N1Zc3OTZ2488X6ajDt0Ir5EHMMmqiEXz5Zx9/7mmdta
 egmiVX5OAVHgWR61xzTa6LKnGIjT0XE/lYJT9kc8iox9BBduQEx+iZ8OeRDAeObW
 048K3jBhXST+hK91lbgj7/lvj3IWabPSyfPyzpe46aejS3N7b79bEvKanD7dH5Dy
 KhdGuCKv2PXvlYbxI3rLPGUeL3InIe8TjvYa2ryl5qICSKhHjk7+8tvLeGWIXI55
 rDglrYqw3s1IiGeg4QpKAB4YIeQfZn3g1WbfEs/H5GnoA7UDnQw9IkJb0/S39bEw
 8OVYh52+USafMceqhwxbI4dfX7RcI00JBcVCZO5hcVu77MQA8G4=
 =6PAk
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-09-09' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - fix ca->io_ref usage; analagous to previous patch doing that for main
   discard path

 - cond_resched() in __journal_keys_sort(), cutting down on "hung task"
   warnings when journal is big

 - rest of basic BCH_SB_MEMBER_INVALID support

 - and the critical one: don't delete open files in online fsck, this
   was causing the "dirent points to inode that doesn't point back"
   inconsistencies some users were seeing

* tag 'bcachefs-2024-09-09' of git://evilpiepirate.org/bcachefs:
  bcachefs: Don't delete open files in online fsck
  bcachefs: fix btree_key_cache sysfs knob
  bcachefs: More BCH_SB_MEMBER_INVALID support
  bcachefs: Simplify bch2_bkey_drop_ptrs()
  bcachefs: Add a cond_resched() to __journal_keys_sort()
  bcachefs: Fix ca->io_ref usage
2024-09-09 09:49:23 -07:00
Kent Overstreet
16005147cc bcachefs: Don't delete open files in online fsck
If a file is unlinked but still open, we don't want online fsck to
delete it - or fun inconsistencies will happen.

https://github.com/koverstreet/bcachefs/issues/727

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Kent Overstreet
2c377d8a71 bcachefs: fix btree_key_cache sysfs knob
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Kent Overstreet
52df04f039 bcachefs: More BCH_SB_MEMBER_INVALID support
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Kent Overstreet
df88febc20 bcachefs: Simplify bch2_bkey_drop_ptrs()
bch2_bkey_drop_ptrs() had a some complicated machinery for avoiding
O(n^2) when dropping multiple pointers - but when n is only going to be
~4, it's not worth it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Kent Overstreet
ec36573dcd bcachefs: Add a cond_resched() to __journal_keys_sort()
Without this, we'd potentially sort multiple times without a
cond_resched(), leading to hung task warnings on larger systems.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Kent Overstreet
5a6e43af1e bcachefs: Fix ca->io_ref usage
ca->io_ref does not protect against the filesystem going way,
c->write_ref does. Much like

0b50b7313e bcachefs: Fix refcounting in discard path

the other async paths need fixing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Linus Torvalds
a86b83f777 five smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbbVKkACgkQiiy9cAdy
 T1FTUgv8C/Qek0abESCC9AEvKUiAGwabOcdvKQnpCjI3eLQVmwGIHXXPdnkgxJmL
 gUQm4CBj6jWw5OfhBw2BTvnVz9YahQC8Xbg0XfLomaggD8NxVFnQyiWyyjPJtIiQ
 JRhOqV82Ko2NFMpouwfNTLPLMBpjNp6IrvkAY2bH5vUzPmoC/aU+eQMVXMqTFalD
 Q+vV2cFBcMsTTsRFCMG0er8114A1XvyG4IKr/95bTDjn/wnOVX9sUGrMbNXuoCsj
 yzMAkBoc60k2PjGoYMIQJsVDFryz7TpF7wyS2Oo5EkqzR/GKcIYGxTn0AznVhs83
 5mAPXgyqpxg3wAsIVAs+vj0Jo2/cfpWuLb9pR5kt3lNA5EH7D1DNzXcHSe8GPvC6
 iwrFI0RnR59HbDh1UGOSoVZv/W9cwmam6WG5HpS7YcRYocZqZyv+XjxUTlj2r+nV
 12v9nnAWkH2Ub6kf3WHPzeXS3L6mvucody8b01UUL+j8hqWKN67sbXzH0Y2Nv0tv
 KFgbJCSk
 =CntT
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc6-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - fix potential mount hang

 - fix retry problem in two types of compound operations

 - important netfs integration fix in SMB1 read paths

 - fix potential uninitialized zero point of inode

 - minor patch to improve debugging for potential crediting problems

* tag 'v6.11-rc6-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  netfs, cifs: Improve some debugging bits
  cifs: Fix SMB1 readv/writev callback in the same way as SMB2/3
  cifs: Fix zero_point init on inode initialisation
  smb: client: fix double put of @cfile in smb2_set_path_size()
  smb: client: fix double put of @cfile in smb2_rename_path()
  smb: client: fix hang in wait_for_response() for negproto
2024-09-06 17:30:33 -07:00
Christian Brauner
4e32c25b58 libfs: fix get_stashed_dentry()
get_stashed_dentry() tries to optimistically retrieve a stashed dentry
from a provided location.  It needs to ensure to hold rcu lock before it
dereference the stashed location to prevent UAF issues.  Use
rcu_dereference() instead of READ_ONCE() it's effectively equivalent
with some lockdep bells and whistles and it communicates clearly that
this expects rcu protection.

Link: https://lore.kernel.org/r/20240906-vfs-hotfix-5959800ffa68@brauner
Fixes: 07fd7c3298 ("libfs: add path_from_stashed()")
Reported-by: syzbot+f82b36bffae7ef78b6a7@syzkaller.appspotmail.com
Fixes: syzbot+f82b36bffae7ef78b6a7@syzkaller.appspotmail.com
Reported-by: syzbot+cbe4b96e1194b0e34db6@syzkaller.appspotmail.com
Fixes: syzbot+cbe4b96e1194b0e34db6@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-06 11:08:58 -07:00
Linus Torvalds
e4b42053b7 Tracing fixes for 6.11:
- Fix adding a new fgraph callback after function graph tracing has
   already started.
 
   If the new caller does not initialize its hash before registering the
   fgraph_ops, it can cause a NULL pointer dereference. Fix this by adding
   a new parameter to ftrace_graph_enable_direct() passing in the newly
   added gops directly and not rely on using the fgraph_array[], as entries
   in the fgraph_array[] must be initialized. Assign the new gops to the
   fgraph_array[] after it goes through ftrace_startup_subops() as that
   will properly initialize the gops->ops and initialize its hashes.
 
 - Fix a memory leak in fgraph storage memory test.
 
   If the "multiple fgraph storage on a function" boot up selftest
   fails in the registering of the function graph tracer, it will
   not free the memory it allocated for the filter. Break the loop
   up into two where it allocates the filters first and then registers
   the functions where any errors will do the appropriate clean ups.
 
 - Only clear the timerlat timers if it has an associated kthread.
 
   In the rtla tool that uses timerlat, if it was killed just as it
   was shutting down, the signals can free the kthread and the timer.
   But the closing of the timerlat files could cause the hrtimer_cancel()
   to be called on the already freed timer. As the kthread variable is
   is set to NULL when the kthreads are stopped and the timers are freed
   it can be used to know not to call hrtimer_cancel() on the timer if
   the kthread variable is NULL.
 
 - Use a cpumask to keep track of osnoise/timerlat kthreads
 
   The timerlat tracer can use user space threads for its analysis.
   With the killing of the rtla tool, the kernel can get confused
   between if it is using a user space thread to analyze or one of its
   own kernel threads. When this confusion happens, kthread_stop()
   can be called on a user space thread and bad things happen.
   As the kernel threads are per-cpu, a bitmask can be used to know
   when a kernel thread is used or when a user space thread is used.
 
 - Add missing interface_lock to osnoise/timerlat stop_kthread()
 
   The stop_kthread() function in osnoise/timerlat clears the
   osnoise kthread variable, and if it was a user space thread does
   a put_task on it. But this can race with the closing of the timerlat
   files that also does a put_task on the kthread, and if the race happens
   the task will have put_task called on it twice and oops.
 
 - Add cond_resched() to the tracing_iter_reset() loop.
 
   The latency tracers keep writing to the ring buffer without resetting
   when it issues a new "start" event (like interrupts being disabled).
   When reading the buffer with an iterator, the tracing_iter_reset()
   sets its pointer to that start event by walking through all the events
   in the buffer until it gets to the time stamp of the start event.
   In the case of a very large buffer, the loop that looks for the start
   event has been reported taking a very long time with a non preempt kernel
   that it can trigger a soft lock up warning. Add a cond_resched() into
   that loop to make sure that doesn't happen.
 
 - Use list_del_rcu() for eventfs ei->list variable
 
   It was reported that running loops of creating and deleting  kprobe events
   could cause a crash due to the eventfs list iteration hitting a LIST_POISON
   variable. This is because the list is protected by SRCU but when an item is
   deleted from the list, it was using list_del() which poisons the "next"
   pointer. This is what list_del_rcu() was to prevent.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZtohNBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtoNAQDQKjomYLCpLz2EqgHZ6VB81QVrHuqt
 cU7xuEfUJDzyyAEA/n0t6quIdjYRd6R2/KxGkP6By/805Coq4IZMTgNQmw0=
 =nZ7k
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix adding a new fgraph callback after function graph tracing has
   already started.

   If the new caller does not initialize its hash before registering the
   fgraph_ops, it can cause a NULL pointer dereference. Fix this by
   adding a new parameter to ftrace_graph_enable_direct() passing in the
   newly added gops directly and not rely on using the fgraph_array[],
   as entries in the fgraph_array[] must be initialized.

   Assign the new gops to the fgraph_array[] after it goes through
   ftrace_startup_subops() as that will properly initialize the
   gops->ops and initialize its hashes.

 - Fix a memory leak in fgraph storage memory test.

   If the "multiple fgraph storage on a function" boot up selftest fails
   in the registering of the function graph tracer, it will not free the
   memory it allocated for the filter. Break the loop up into two where
   it allocates the filters first and then registers the functions where
   any errors will do the appropriate clean ups.

 - Only clear the timerlat timers if it has an associated kthread.

   In the rtla tool that uses timerlat, if it was killed just as it was
   shutting down, the signals can free the kthread and the timer. But
   the closing of the timerlat files could cause the hrtimer_cancel() to
   be called on the already freed timer. As the kthread variable is is
   set to NULL when the kthreads are stopped and the timers are freed it
   can be used to know not to call hrtimer_cancel() on the timer if the
   kthread variable is NULL.

 - Use a cpumask to keep track of osnoise/timerlat kthreads

   The timerlat tracer can use user space threads for its analysis. With
   the killing of the rtla tool, the kernel can get confused between if
   it is using a user space thread to analyze or one of its own kernel
   threads. When this confusion happens, kthread_stop() can be called on
   a user space thread and bad things happen. As the kernel threads are
   per-cpu, a bitmask can be used to know when a kernel thread is used
   or when a user space thread is used.

 - Add missing interface_lock to osnoise/timerlat stop_kthread()

   The stop_kthread() function in osnoise/timerlat clears the osnoise
   kthread variable, and if it was a user space thread does a put_task
   on it. But this can race with the closing of the timerlat files that
   also does a put_task on the kthread, and if the race happens the task
   will have put_task called on it twice and oops.

 - Add cond_resched() to the tracing_iter_reset() loop.

   The latency tracers keep writing to the ring buffer without resetting
   when it issues a new "start" event (like interrupts being disabled).
   When reading the buffer with an iterator, the tracing_iter_reset()
   sets its pointer to that start event by walking through all the
   events in the buffer until it gets to the time stamp of the start
   event. In the case of a very large buffer, the loop that looks for
   the start event has been reported taking a very long time with a non
   preempt kernel that it can trigger a soft lock up warning. Add a
   cond_resched() into that loop to make sure that doesn't happen.

 - Use list_del_rcu() for eventfs ei->list variable

   It was reported that running loops of creating and deleting kprobe
   events could cause a crash due to the eventfs list iteration hitting
   a LIST_POISON variable. This is because the list is protected by SRCU
   but when an item is deleted from the list, it was using list_del()
   which poisons the "next" pointer. This is what list_del_rcu() was to
   prevent.

* tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
  tracing/timerlat: Only clear timer if a kthread exists
  tracing/osnoise: Use a cpumask to know what threads are kthreads
  eventfs: Use list_del_rcu() for SRCU protected list variable
  tracing: Avoid possible softlockup in tracing_iter_reset()
  tracing: Fix memory leak in fgraph storage selftest
  tracing: fgraph: Fix to add new fgraph_ops to array after ftrace_startup_subops()
2024-09-05 16:29:41 -07:00
Steven Rostedt
d2603279c7 eventfs: Use list_del_rcu() for SRCU protected list variable
Chi Zhiling reported:

  We found a null pointer accessing in tracefs[1], the reason is that the
  variable 'ei_child' is set to LIST_POISON1, that means the list was
  removed in eventfs_remove_rec. so when access the ei_child->is_freed, the
  panic triggered.

  by the way, the following script can reproduce this panic

  loop1 (){
      while true
      do
          echo "p:kp submit_bio" > /sys/kernel/debug/tracing/kprobe_events
          echo "" > /sys/kernel/debug/tracing/kprobe_events
      done
  }
  loop2 (){
      while true
      do
          tree /sys/kernel/debug/tracing/events/kprobes/
      done
  }
  loop1 &
  loop2

  [1]:
  [ 1147.959632][T17331] Unable to handle kernel paging request at virtual address dead000000000150
  [ 1147.968239][T17331] Mem abort info:
  [ 1147.971739][T17331]   ESR = 0x0000000096000004
  [ 1147.976172][T17331]   EC = 0x25: DABT (current EL), IL = 32 bits
  [ 1147.982171][T17331]   SET = 0, FnV = 0
  [ 1147.985906][T17331]   EA = 0, S1PTW = 0
  [ 1147.989734][T17331]   FSC = 0x04: level 0 translation fault
  [ 1147.995292][T17331] Data abort info:
  [ 1147.998858][T17331]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
  [ 1148.005023][T17331]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  [ 1148.010759][T17331]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  [ 1148.016752][T17331] [dead000000000150] address between user and kernel address ranges
  [ 1148.024571][T17331] Internal error: Oops: 0000000096000004 [#1] SMP
  [ 1148.030825][T17331] Modules linked in: team_mode_loadbalance team nlmon act_gact cls_flower sch_ingress bonding tls macvlan dummy ib_core bridge stp llc veth amdgpu amdxcp mfd_core gpu_sched drm_exec drm_buddy radeon crct10dif_ce video drm_suballoc_helper ghash_ce drm_ttm_helper sha2_ce ttm sha256_arm64 i2c_algo_bit sha1_ce sbsa_gwdt cp210x drm_display_helper cec sr_mod cdrom drm_kms_helper binfmt_misc sg loop fuse drm dm_mod nfnetlink ip_tables autofs4 [last unloaded: tls]
  [ 1148.072808][T17331] CPU: 3 PID: 17331 Comm: ls Tainted: G        W         ------- ----  6.6.43 #2
  [ 1148.081751][T17331] Source Version: 21b3b386e948bedd29369af66f3e98ab01b1c650
  [ 1148.088783][T17331] Hardware name: Greatwall GW-001M1A-FTF/GW-001M1A-FTF, BIOS KunLun BIOS V4.0 07/16/2020
  [ 1148.098419][T17331] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [ 1148.106060][T17331] pc : eventfs_iterate+0x2c0/0x398
  [ 1148.111017][T17331] lr : eventfs_iterate+0x2fc/0x398
  [ 1148.115969][T17331] sp : ffff80008d56bbd0
  [ 1148.119964][T17331] x29: ffff80008d56bbf0 x28: ffff001ff5be2600 x27: 0000000000000000
  [ 1148.127781][T17331] x26: ffff001ff52ca4e0 x25: 0000000000009977 x24: dead000000000100
  [ 1148.135598][T17331] x23: 0000000000000000 x22: 000000000000000b x21: ffff800082645f10
  [ 1148.143415][T17331] x20: ffff001fddf87c70 x19: ffff80008d56bc90 x18: 0000000000000000
  [ 1148.151231][T17331] x17: 0000000000000000 x16: 0000000000000000 x15: ffff001ff52ca4e0
  [ 1148.159048][T17331] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
  [ 1148.166864][T17331] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000804391d0
  [ 1148.174680][T17331] x8 : 0000000180000000 x7 : 0000000000000018 x6 : 0000aaab04b92862
  [ 1148.182498][T17331] x5 : 0000aaab04b92862 x4 : 0000000080000000 x3 : 0000000000000068
  [ 1148.190314][T17331] x2 : 000000000000000f x1 : 0000000000007ea8 x0 : 0000000000000001
  [ 1148.198131][T17331] Call trace:
  [ 1148.201259][T17331]  eventfs_iterate+0x2c0/0x398
  [ 1148.205864][T17331]  iterate_dir+0x98/0x188
  [ 1148.210036][T17331]  __arm64_sys_getdents64+0x78/0x160
  [ 1148.215161][T17331]  invoke_syscall+0x78/0x108
  [ 1148.219593][T17331]  el0_svc_common.constprop.0+0x48/0xf0
  [ 1148.224977][T17331]  do_el0_svc+0x24/0x38
  [ 1148.228974][T17331]  el0_svc+0x40/0x168
  [ 1148.232798][T17331]  el0t_64_sync_handler+0x120/0x130
  [ 1148.237836][T17331]  el0t_64_sync+0x1a4/0x1a8
  [ 1148.242182][T17331] Code: 54ffff6c f9400676 910006d6 f9000676 (b9405300)
  [ 1148.248955][T17331] ---[ end trace 0000000000000000 ]---

The issue is that list_del() is used on an SRCU protected list variable
before the synchronization occurs. This can poison the list pointers while
there is a reader iterating the list.

This is simply fixed by using list_del_rcu() that is specifically made for
this purpose.

Link: https://lore.kernel.org/linux-trace-kernel/20240829085025.3600021-1-chizhiling@163.com/

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20240904131605.640d42b1@gandalf.local.home
Fixes: 43aa6f97c2 ("eventfs: Get rid of dentry pointers without refcounts")
Reported-by: Chi Zhiling <chizhiling@kylinos.cn>
Tested-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-05 10:18:48 -04:00
Linus Torvalds
c763c43396 bcachefs fixes for 6.11-rc1
- Fix a typo in the rebalance accounting changes
 - BCH_SB_MEMBER_INVALID: small on disk format feature which will be
   needed for full erasure coding support; this is only the minimum so
   that 6.11 can handle future versions without barfing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbYsMQACgkQE6szbY3K
 bna2GQ/9Hbj2VecuEmuWkzS3fMAJbQSJ3AFKLJ2BWmi7Zvez57jhFIXVL1kdehi1
 K0tW9T8yLCtT8vUHTy42fb/MQzE1ARkLk9qubOnJj1M8+JGm0LL3WoDnNr2gM11i
 cSsxIk++8WqCEWw3+0a57vHc97zugzOSE3Np/J8zKLUuXEGLOrNtgFj/OHXRlYSz
 iSg0JwZp+MrpmdcUN9SpymNcTQp9VlpCKjcLvxV28aFR2PwJm1LnFrFf+RhsGl94
 NXEwHRYj9vqEm+8UI4u9owyBbeU7c+gtt3cKrayU4cGQoKk/la8biZvgEKDkGJwy
 9W+zO7GthRCD5tLVTxsnYYDTLyO5KOvDaHXm9iZrQzbe2wSayOx4HPVR55XkLDHj
 P/qN60rQvMactTrqhZVRerybvvOGS94280qkR2BPkm6gvdEu8eTYq+0uQgqpoHLi
 sIXRJuYDuTB+24Hx9wc42TjEYqkOHdZ7T3ZFuP4e9j3vjo+0znJOb/aY6SsqD/wR
 Wonw5/NFxW53gkXytX5MNctnizy1HrL5Kq5qIZZgLXWGqfCBcie3yT7MtItuqVFa
 sMENVGpZ0vxhx6GbL/5D2rgIAK9X6NQybpPRmGvUpg/BqahcG+/aNH+LXeJPBcUt
 2kkd1nqKXaJn14gTh1bmkYwKlQdLmWQQT8cJ9D29wDI7q7hvRDw=
 =ABhR
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-09-04' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - Fix a typo in the rebalance accounting changes

 - BCH_SB_MEMBER_INVALID: small on disk format feature which will be
   needed for full erasure coding support; this is only the minimum so
   that 6.11 can handle future versions without barfing.

* tag 'bcachefs-2024-09-04' of git://evilpiepirate.org/bcachefs:
  bcachefs: BCH_SB_MEMBER_INVALID
  bcachefs: fix rebalance accounting
2024-09-04 13:54:47 -07:00
Linus Torvalds
1263a7bf8a for-6.11-rc6-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmbYn2YACgkQxWXV+ddt
 WDum5Q//Topfw8yGOMpSUajZ7n4Iy81CknH6GnV2r0qj/0vK4XZ8a8PHpJPLn0gc
 neTGo62vfaQ1HstKPvXWMJkoew5cL+khXW6zaEnieVLvlrVGD9i5NgtmgiC/kK00
 Pwj8h2MFhdrXEJEXdk0g9IVaGRs78lruGuc0eI0sGESMbZdQ4OsLToU4zFCqgb6b
 LZrHENyTIoYjiqMPYrZh4X4TxDV9lVw3XTbebB9vZPsC1Bj0H8uZ3rMU5hS7VboH
 e/c7qmJWs/Gq0CNCGvQmguO2eK29NVE24XHoLgsTwpYFSXW1VOLNUlihgkP1aZsB
 Zh7ETuMah7M/yjwXNASdM2mJcO3yVRryUZXApJFCdHTRz12aIcCYfIRCZZ+GQuQg
 gZaRgEW4kpTOmdUY3weeJcmfgQiHem0+cOy4dC6ykvNpfCwj3HcOft3U5qaR3C6p
 c+Gd4lurnWn3CtPmYZRQ/7g9vvKth7jXvBMTkPoS4KyaTe5Kk+ph9h7uUtyHZpQP
 /zxaZlYNMX1C+4atVTpQhRTBqHEbiK9BLDErWkqG0Dv6x/NJv3iDSAX+S64WWJwK
 +LkHW7m+5HnCQi++8uxE+V1dWispczbgIcMEmPoyQhhEVKHg9dx9EItr8MEvNpyd
 YIV6qfGoQTWzTPGbApLxe94WOm4tpcaFUbyaWjTrXexsYK6lo2I=
 =LHQV
 -----END PGP SIGNATURE-----

Merge tag 'for-6.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - followup fix for direct io and fsync under some conditions, reported
   by QEMU users

 - fix a potential leak when disabling quotas while some extent tracking
   work can still happen

 - in zoned mode handle unexpected change of zone write pointer in
   RAID1-like block groups, turn the zones to read-only

* tag 'for-6.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix race between direct IO write and fsync when using same fd
  btrfs: zoned: handle broken write pointer on zones
  btrfs: qgroup: don't use extent changeset when not needed
2024-09-04 11:53:47 -07:00
Linus Torvalds
d8abb73f58 three smb3 server fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbX21YACgkQiiy9cAdy
 T1Hp9gv/dX8tAaYOAE6h5FpzI7kYWsOD0AqEEboZm17rP1M0ihqWhj+tXTjqa5Tb
 T31Kyl/yZ0lRLe6B9cuAWVJCo+1cFnM1sdnL99yE/WlxZzZ3C3exntNlOkcUanCM
 FeyFnVaxWDhZ53mroOX1KBJ1r9LOkGL7czjBwgyhpDu4Q63H4ZsgXJDIu/TJVf4t
 TZkreFoBvn/WocpPl1VXxapILqcW7v5hzfof4MEvAPsHJwP3ZlN0LJuHe6YaBfff
 p8jMZeFfdQc02jjAgL+7KZxlppvRzrZsm+5DZ6C9HyLLJmMJpvGODFG9hVNA8wHT
 xLdekOCgekVx0UlSOzkivSu5FW4XJHPuycr4ak+XI0n20LglGbyA8bT0X5kuslSt
 ejjZbx+uSlT4jjTSJsateTd8B14UO0iIrAaPumOwvBGGtcDenH0/cQ8ktWY79x97
 Pc19JEPSAK2usViFonD4WUEwlg1sFFpV1TCu/HM8VJv6XOb0QzCyZgF7k7o78ztz
 Fp51C0LQ
 =yxks
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc6-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - Fix crash in session setup

 - Fix locking bug

 - Improve access bounds checking

* tag 'v6.11-rc6-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: Unlock on in ksmbd_tcp_set_interfaces()
  ksmbd: unset the binding mark of a reused connection
  smb: Annotate struct xattr_smb_acl with __counted_by()
2024-09-04 09:41:51 -07:00
Linus Torvalds
4356ab331c vfs-6.11-rc7.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZtQmqAAKCRCRxhvAZXjc
 os+mAP47NBhOecERCJSmS0RFMuRvc0ijxz1642emEthZhtf8qQD/cy56WmGZqEFZ
 bfj5v6tGmsxGt4xMDUDNG0pvqba8hwA=
 =JBA5
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.11-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "Two netfs fixes for this merge window:

   - Ensure that fscache_cookie_lru_time is deleted when the fscache
     module is removed to prevent UAF

   - Fix filemap_invalidate_inode() to use invalidate_inode_pages2_range()

     Before it used truncate_inode_pages_partial() which causes
     copy_file_range() to fail on cifs"

* tag 'vfs-6.11-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fscache: delete fscache_cookie_lru_timer when fscache exits to avoid UAF
  mm: Fix filemap_invalidate_inode() to use invalidate_inode_pages2_range()
2024-09-04 09:33:57 -07:00
Linus Torvalds
76c0f27d06 17 hotfixes, 15 of which are cc:stable.
Mostly MM, no identifiable theme.  And a few nilfs2 fixups.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZtfR/wAKCRDdBJ7gKXxA
 jofjAP9rUlliIcn8zcy7vmBTuMaH4SkoULB64QWAUddaWV+SCAEA+q0sntLPnTIZ
 My3sfihR6mbvhkgKbvIHm6YYQI56NAc=
 =b4Lr
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-09-03-20-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "17 hotfixes, 15 of which are cc:stable.

  Mostly MM, no identifiable theme.  And a few nilfs2 fixups"

* tag 'mm-hotfixes-stable-2024-09-03-20-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  alloc_tag: fix allocation tag reporting when CONFIG_MODULES=n
  mm: vmalloc: optimize vmap_lazy_nr arithmetic when purging each vmap_area
  mailmap: update entry for Jan Kuliga
  codetag: debug: mark codetags for poisoned page as empty
  mm/memcontrol: respect zswap.writeback setting from parent cg too
  scripts: fix gfp-translate after ___GFP_*_BITS conversion to an enum
  Revert "mm: skip CMA pages when they are not available"
  maple_tree: remove rcu_read_lock() from mt_validate()
  kexec_file: fix elfcorehdr digest exclusion when CONFIG_CRASH_HOTPLUG=y
  mm/slub: add check for s->flags in the alloc_tagging_slab_free_hook
  nilfs2: fix state management in error path of log writing function
  nilfs2: fix missing cleanup on rollforward recovery error
  nilfs2: protect references to superblock parameters exposed in sysfs
  userfaultfd: don't BUG_ON() if khugepaged yanks our page table
  userfaultfd: fix checks for huge PMDs
  mm: vmalloc: ensure vmap_block is initialised before adding to queue
  selftests: mm: fix build errors on armhf
2024-09-04 08:37:33 -07:00
Kent Overstreet
53f6619554 bcachefs: BCH_SB_MEMBER_INVALID
Create a sentinal value for "invalid device".

This is needed for removing devices that have stripes on them (force
removing, without evacuating); we need a sentinal value for the stripe
pointers to the device being removed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-03 20:43:14 -04:00
Linus Torvalds
88fac17500 fuse fixes for 6.11-rc7
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZtbV4AAKCRDh3BK/laaZ
 PC33AP9XvLpQii0mLo12hTSP11TYpaatdhUvyFFKERle1yWkUgEAvtVutUJryTD2
 sz7x5jj4GD9tCWyMlp8Xs5h1Dr4U6wc=
 =XdIb
 -----END PGP SIGNATURE-----

Merge tag 'fuse-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:

 - Fix EIO if splice and page stealing are enabled on the fuse device

 - Disable problematic combination of passthrough and writeback-cache

 - Other bug fixes found by code review

* tag 'fuse-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: disable the combination of passthrough and writeback cache
  fuse: update stats for pages in dropped aux writeback list
  fuse: clear PG_uptodate when using a stolen page
  fuse: fix memory leak in fuse_create_open
  fuse: check aborted connection before adding requests to pending list for resending
  fuse: use unsigned type for getxattr/listxattr size truncation
2024-09-03 12:32:00 -07:00
Filipe Manana
cd9253c23a btrfs: fix race between direct IO write and fsync when using same fd
If we have 2 threads that are using the same file descriptor and one of
them is doing direct IO writes while the other is doing fsync, we have a
race where we can end up either:

1) Attempt a fsync without holding the inode's lock, triggering an
   assertion failures when assertions are enabled;

2) Do an invalid memory access from the fsync task because the file private
   points to memory allocated on stack by the direct IO task and it may be
   used by the fsync task after the stack was destroyed.

The race happens like this:

1) A user space program opens a file descriptor with O_DIRECT;

2) The program spawns 2 threads using libpthread for example;

3) One of the threads uses the file descriptor to do direct IO writes,
   while the other calls fsync using the same file descriptor.

4) Call task A the thread doing direct IO writes and task B the thread
   doing fsyncs;

5) Task A does a direct IO write, and at btrfs_direct_write() sets the
   file's private to an on stack allocated private with the member
   'fsync_skip_inode_lock' set to true;

6) Task B enters btrfs_sync_file() and sees that there's a private
   structure associated to the file which has 'fsync_skip_inode_lock' set
   to true, so it skips locking the inode's VFS lock;

7) Task A completes the direct IO write, and resets the file's private to
   NULL since it had no prior private and our private was stack allocated.
   Then it unlocks the inode's VFS lock;

8) Task B enters btrfs_get_ordered_extents_for_logging(), then the
   assertion that checks the inode's VFS lock is held fails, since task B
   never locked it and task A has already unlocked it.

The stack trace produced is the following:

   assertion failed: inode_is_locked(&inode->vfs_inode), in fs/btrfs/ordered-data.c:983
   ------------[ cut here ]------------
   kernel BUG at fs/btrfs/ordered-data.c:983!
   Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
   CPU: 9 PID: 5072 Comm: worker Tainted: G     U     OE      6.10.5-1-default #1 openSUSE Tumbleweed 69f48d427608e1c09e60ea24c6c55e2ca1b049e8
   Hardware name: Acer Predator PH315-52/Covini_CFS, BIOS V1.12 07/28/2020
   RIP: 0010:btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs]
   Code: 50 d6 86 c0 e8 (...)
   RSP: 0018:ffff9e4a03dcfc78 EFLAGS: 00010246
   RAX: 0000000000000054 RBX: ffff9078a9868e98 RCX: 0000000000000000
   RDX: 0000000000000000 RSI: ffff907dce4a7800 RDI: ffff907dce4a7800
   RBP: ffff907805518800 R08: 0000000000000000 R09: ffff9e4a03dcfb38
   R10: ffff9e4a03dcfb30 R11: 0000000000000003 R12: ffff907684ae7800
   R13: 0000000000000001 R14: ffff90774646b600 R15: 0000000000000000
   FS:  00007f04b96006c0(0000) GS:ffff907dce480000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 00007f32acbfc000 CR3: 00000001fd4fa005 CR4: 00000000003726f0
   Call Trace:
    <TASK>
    ? __die_body.cold+0x14/0x24
    ? die+0x2e/0x50
    ? do_trap+0xca/0x110
    ? do_error_trap+0x6a/0x90
    ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a]
    ? exc_invalid_op+0x50/0x70
    ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a]
    ? asm_exc_invalid_op+0x1a/0x20
    ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a]
    ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a]
    btrfs_sync_file+0x21a/0x4d0 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a]
    ? __seccomp_filter+0x31d/0x4f0
    __x64_sys_fdatasync+0x4f/0x90
    do_syscall_64+0x82/0x160
    ? do_futex+0xcb/0x190
    ? __x64_sys_futex+0x10e/0x1d0
    ? switch_fpu_return+0x4f/0xd0
    ? syscall_exit_to_user_mode+0x72/0x220
    ? do_syscall_64+0x8e/0x160
    ? syscall_exit_to_user_mode+0x72/0x220
    ? do_syscall_64+0x8e/0x160
    ? syscall_exit_to_user_mode+0x72/0x220
    ? do_syscall_64+0x8e/0x160
    ? syscall_exit_to_user_mode+0x72/0x220
    ? do_syscall_64+0x8e/0x160
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Another problem here is if task B grabs the private pointer and then uses
it after task A has finished, since the private was allocated in the stack
of task A, it results in some invalid memory access with a hard to predict
result.

This issue, triggering the assertion, was observed with QEMU workloads by
two users in the Link tags below.

Fix this by not relying on a file's private to pass information to fsync
that it should skip locking the inode and instead pass this information
through a special value stored in current->journal_info. This is safe
because in the relevant section of the direct IO write path we are not
holding a transaction handle, so current->journal_info is NULL.

The following C program triggers the issue:

   $ cat repro.c
   /* Get the O_DIRECT definition. */
   #ifndef _GNU_SOURCE
   #define _GNU_SOURCE
   #endif

   #include <stdio.h>
   #include <stdlib.h>
   #include <unistd.h>
   #include <stdint.h>
   #include <fcntl.h>
   #include <errno.h>
   #include <string.h>
   #include <pthread.h>

   static int fd;

   static ssize_t do_write(int fd, const void *buf, size_t count, off_t offset)
   {
       while (count > 0) {
           ssize_t ret;

           ret = pwrite(fd, buf, count, offset);
           if (ret < 0) {
               if (errno == EINTR)
                   continue;
               return ret;
           }
           count -= ret;
           buf += ret;
       }
       return 0;
   }

   static void *fsync_loop(void *arg)
   {
       while (1) {
           int ret;

           ret = fsync(fd);
           if (ret != 0) {
               perror("Fsync failed");
               exit(6);
           }
       }
   }

   int main(int argc, char *argv[])
   {
       long pagesize;
       void *write_buf;
       pthread_t fsyncer;
       int ret;

       if (argc != 2) {
           fprintf(stderr, "Use: %s <file path>\n", argv[0]);
           return 1;
       }

       fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0666);
       if (fd == -1) {
           perror("Failed to open/create file");
           return 1;
       }

       pagesize = sysconf(_SC_PAGE_SIZE);
       if (pagesize == -1) {
           perror("Failed to get page size");
           return 2;
       }

       ret = posix_memalign(&write_buf, pagesize, pagesize);
       if (ret) {
           perror("Failed to allocate buffer");
           return 3;
       }

       ret = pthread_create(&fsyncer, NULL, fsync_loop, NULL);
       if (ret != 0) {
           fprintf(stderr, "Failed to create writer thread: %d\n", ret);
           return 4;
       }

       while (1) {
           ret = do_write(fd, write_buf, pagesize, 0);
           if (ret != 0) {
               perror("Write failed");
               exit(5);
           }
       }

       return 0;
   }

   $ mkfs.btrfs -f /dev/sdi
   $ mount /dev/sdi /mnt/sdi
   $ timeout 10 ./repro /mnt/sdi/foo

Usually the race is triggered within less than 1 second. A test case for
fstests will follow soon.

Reported-by: Paulo Dias <paulo.miguel.dias@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219187
Reported-by: Andreas Jahn <jahn-andi@web.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219199
Reported-by: syzbot+4704b3cc972bd76024f1@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/00000000000044ff540620d7dee2@google.com/
Fixes: 939b656bc8 ("btrfs: fix corruption after buffer fault in during direct IO append write")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-03 20:29:55 +02:00
David Howells
ab85218910 netfs, cifs: Improve some debugging bits
Improve some debugging bits:

 (1) The netfslib _debug() macro doesn't need a newline in its format
     string.

 (2) Display the request debug ID and subrequest index in messages emitted
     in smb2_adjust_credits() to make it easier to reference in traces.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-03 10:17:51 -05:00
David Howells
a68c74865f cifs: Fix SMB1 readv/writev callback in the same way as SMB2/3
Port a number of SMB2/3 async readv/writev fixes to the SMB1 transport:

    commit a88d609036
    cifs: Don't advance the I/O iterator before terminating subrequest

    commit ce5291e560
    cifs: Defer read completion

    commit 1da29f2c39
    netfs, cifs: Fix handling of short DIO read

Fixes: 3ee1a1fc39 ("cifs: Cut over to using netfslib")
Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-03 10:17:03 -05:00
David Howells
517b58c1f9 cifs: Fix zero_point init on inode initialisation
Fix cifs_fattr_to_inode() such that the ->zero_point tracking variable
is initialised when the inode is initialised.

Fixes: 3ee1a1fc39 ("cifs: Cut over to using netfslib")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-03 10:16:05 -05:00
Paulo Alcantara
f9c169b51b smb: client: fix double put of @cfile in smb2_set_path_size()
If smb2_compound_op() is called with a valid @cfile and returned
-EINVAL, we need to call cifs_get_writable_path() before retrying it
as the reference of @cfile was already dropped by previous call.

This fixes the following KASAN splat when running fstests generic/013
against Windows Server 2022:

  CIFS: Attempting to mount //w22-fs0/scratch
  run fstests generic/013 at 2024-09-02 19:48:59
  ==================================================================
  BUG: KASAN: slab-use-after-free in detach_if_pending+0xab/0x200
  Write of size 8 at addr ffff88811f1a3730 by task kworker/3:2/176

  CPU: 3 UID: 0 PID: 176 Comm: kworker/3:2 Not tainted 6.11.0-rc6 #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40
  04/01/2014
  Workqueue: cifsoplockd cifs_oplock_break [cifs]
  Call Trace:
   <TASK>
   dump_stack_lvl+0x5d/0x80
   ? detach_if_pending+0xab/0x200
   print_report+0x156/0x4d9
   ? detach_if_pending+0xab/0x200
   ? __virt_addr_valid+0x145/0x300
   ? __phys_addr+0x46/0x90
   ? detach_if_pending+0xab/0x200
   kasan_report+0xda/0x110
   ? detach_if_pending+0xab/0x200
   detach_if_pending+0xab/0x200
   timer_delete+0x96/0xe0
   ? __pfx_timer_delete+0x10/0x10
   ? rcu_is_watching+0x20/0x50
   try_to_grab_pending+0x46/0x3b0
   __cancel_work+0x89/0x1b0
   ? __pfx___cancel_work+0x10/0x10
   ? kasan_save_track+0x14/0x30
   cifs_close_deferred_file+0x110/0x2c0 [cifs]
   ? __pfx_cifs_close_deferred_file+0x10/0x10 [cifs]
   ? __pfx_down_read+0x10/0x10
   cifs_oplock_break+0x4c1/0xa50 [cifs]
   ? __pfx_cifs_oplock_break+0x10/0x10 [cifs]
   ? lock_is_held_type+0x85/0xf0
   ? mark_held_locks+0x1a/0x90
   process_one_work+0x4c6/0x9f0
   ? find_held_lock+0x8a/0xa0
   ? __pfx_process_one_work+0x10/0x10
   ? lock_acquired+0x220/0x550
   ? __list_add_valid_or_report+0x37/0x100
   worker_thread+0x2e4/0x570
   ? __kthread_parkme+0xd1/0xf0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0x17f/0x1c0
   ? kthread+0xda/0x1c0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x31/0x60
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

  Allocated by task 1118:
   kasan_save_stack+0x30/0x50
   kasan_save_track+0x14/0x30
   __kasan_kmalloc+0xaa/0xb0
   cifs_new_fileinfo+0xc8/0x9d0 [cifs]
   cifs_atomic_open+0x467/0x770 [cifs]
   lookup_open.isra.0+0x665/0x8b0
   path_openat+0x4c3/0x1380
   do_filp_open+0x167/0x270
   do_sys_openat2+0x129/0x160
   __x64_sys_creat+0xad/0xe0
   do_syscall_64+0xbb/0x1d0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

  Freed by task 83:
   kasan_save_stack+0x30/0x50
   kasan_save_track+0x14/0x30
   kasan_save_free_info+0x3b/0x70
   poison_slab_object+0xe9/0x160
   __kasan_slab_free+0x32/0x50
   kfree+0xf2/0x300
   process_one_work+0x4c6/0x9f0
   worker_thread+0x2e4/0x570
   kthread+0x17f/0x1c0
   ret_from_fork+0x31/0x60
   ret_from_fork_asm+0x1a/0x30

  Last potentially related work creation:
   kasan_save_stack+0x30/0x50
   __kasan_record_aux_stack+0xad/0xc0
   insert_work+0x29/0xe0
   __queue_work+0x5ea/0x760
   queue_work_on+0x6d/0x90
   _cifsFileInfo_put+0x3f6/0x770 [cifs]
   smb2_compound_op+0x911/0x3940 [cifs]
   smb2_set_path_size+0x228/0x270 [cifs]
   cifs_set_file_size+0x197/0x460 [cifs]
   cifs_setattr+0xd9c/0x14b0 [cifs]
   notify_change+0x4e3/0x740
   do_truncate+0xfa/0x180
   vfs_truncate+0x195/0x200
   __x64_sys_truncate+0x109/0x150
   do_syscall_64+0xbb/0x1d0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 71f15c90e7 ("smb: client: retry compound request without reusing lease")
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-03 10:06:48 -05:00
Paulo Alcantara
3523a3df03 smb: client: fix double put of @cfile in smb2_rename_path()
If smb2_set_path_attr() is called with a valid @cfile and returned
-EINVAL, we need to call cifs_get_writable_path() again as the
reference of @cfile was already dropped by previous smb2_compound_op()
call.

Fixes: 71f15c90e7 ("smb: client: retry compound request without reusing lease")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-03 09:48:50 -05:00
Paulo Alcantara
7ccc146546 smb: client: fix hang in wait_for_response() for negproto
Call cifs_reconnect() to wake up processes waiting on negotiate
protocol to handle the case where server abruptly shut down and had no
chance to properly close the socket.

Simple reproducer:

  ssh 192.168.2.100 pkill -STOP smbd
  mount.cifs //192.168.2.100/test /mnt -o ... [never returns]

Cc: Rickard Andersson <rickaran@axis.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-02 20:00:04 -05:00