-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmP2kaAACgkQiiy9cAdy
T1Eergv9FHVs7hS0anJF0xgRghR4+g0m5UUo08iJazgJdDgcS5JY+ZasIpYpEsG3
QmsIT33XVYZypXoOzjMSsPlwo6esTCJQScVLz85e4ebedCbCBDks+wVQcbfTzD5/
KrwmUoTBLU0L/ppFhqRk9k53nrSf1SXCWPthjdfWa3mTHdIVM4kQJruTWwUDiJXp
mdYwTx6FnTNer3QWetNzYOwdUgLu3rk0zLcBwQNCo6g5LOpA44iFfEAO4zeiOuZT
LMDPbDj0nWQyWPLLdcbtsn2laYyEBDBLZevLirSaqPQ/KCtGcw0mBt6dCAzg8/CM
ONqHHxdEpvPON8Sxujcn4CxpXhl0nCLwwtKtWU4rt7IevI9U+PynNl57TtJJ16/s
b3XD2QVbFjlcdAMTmArvqnogdzoC3mZu1R1IRs+jukhLAOqZiLN6o/E2HAllt47i
krzXeXIzQr10w9fnJ7LtIc/7IUFgtUfrOkg4TKyNcnRVHQaSSxv+JLRgqMPOr/M0
I7zt0G0j
=4hIT
-----END PGP SIGNATURE-----
Merge tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client updates from Steve French:
"The largest subset of this is from David Howells et al: making the
cifs/smb3 driver pass iov_iters down to the lowest layers, directly to
the network transport rather than passing lists of pages around,
helping multiple areas:
- Pin user pages, thereby fixing the race between concurrent DIO read
and fork, where the pages containing the DIO read buffer may end up
belonging to the child process and not the parent - with the result
that the parent might not see the retrieved data.
- cifs shouldn't take refs on pages extracted from non-user-backed
iterators (eg. KVEC). With these changes, cifs will apply the
appropriate cleanup.
- Making it easier to transition to using folios in cifs rather than
pages by dealing with them through BVEC and XARRAY iterators.
- Allowing cifs to use the new splice function
The remainder are:
- fixes for stable, including various fixes for uninitialized memory,
wrong length field causing mount issue to very old servers,
important directory lease fixes and reconnect fixes
- cleanups (unused code removal, change one element array usage, and
a change form strtobool to kstrtobool, and Kconfig cleanups)
- SMBDIRECT (RDMA) fixes including iov_iter integration and UAF fixes
- reconnect fixes
- multichannel fixes, including improving channel allocation (to
least used channel)
- remove the last use of lock_page_killable by moving to
folio_lock_killable"
* tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (46 commits)
update internal module version number for cifs.ko
cifs: update ip_addr for ses only for primary chan setup
cifs: use tcon allocation functions even for dummy tcon
cifs: use the least loaded channel for sending requests
cifs: DIO to/from KVEC-type iterators should now work
cifs: Remove unused code
cifs: Build the RDMA SGE list directly from an iterator
cifs: Change the I/O paths to use an iterator rather than a page list
cifs: Add a function to read into an iter from a socket
cifs: Add some helper functions
cifs: Add a function to Hash the contents of an iterator
cifs: Add a function to build an RDMA SGE list from an iterator
netfs: Add a function to extract an iterator into a scatterlist
netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE
splice: Export filemap/direct_splice_read()
iov_iter: Add a function to extract a page list from an iterator
iov_iter: Define flags to qualify page extraction.
splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE
splice: Add a func to do a splice from a buffered file without ITER_PIPE
...
Use a struct assignment with implicit member initialization
Signed-off-by: Volker Lendecke <vl@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
To work around some Window servers that return
STATUS_OBJECT_NAME_INVALID on query infos under DFS namespaces that
contain non-ASCII characters, we started checking for -ENOENT on every
file open, and if so, then send additional requests to figure out
whether it is a DFS link or not. It means that all those requests
will be sent to every non-existing file.
So, in order to reduce the number of roundtrips, check earlier whether
status code is STATUS_OBJECT_NAME_INVALID and tcon supports dfs, and
if so, then map -ENOENT to -EREMOTE so mount or automount will take
care of chasing the DFS link -- if it isn't an DFS link, then -ENOENT
will be returned appropriately.
Before patch
SMB2 438 Create Request File: ada.test\dfs\foo;GetInfo Request...
SMB2 310 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;...
SMB2 228 Ioctl Request FSCTL_DFS_GET_REFERRALS, File: \ada.test\dfs\foo
SMB2 143 Ioctl Response, Error: STATUS_OBJECT_PATH_NOT_FOUND
SMB2 438 Create Request File: ada.test\dfs\foo;GetInfo Request...
SMB2 310 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;...
SMB2 228 Ioctl Request FSCTL_DFS_GET_REFERRALS, File: \ada.test\dfs\foo
SMB2 143 Ioctl Response, Error: STATUS_OBJECT_PATH_NOT_FOUND
After patch
SMB2 438 Create Request File: ada.test\dfs\foo;GetInfo Request...
SMB2 310 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;...
SMB2 438 Create Request File: ada.test\dfs\foo;GetInfo Request...
SMB2 310 Create Response, Error: STATUS_OBJECT_NAME_NOT_FOUND;...
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
stat was returning default owner and group (unlike readdir)
for SMB3.1.1 POSIX extensions
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
xfstests generic/011 reported use-after-free bug as follows:
BUG: KASAN: use-after-free in __d_alloc+0x269/0x859
Read of size 15 at addr ffff8880078933a0 by task dirstress/952
CPU: 1 PID: 952 Comm: dirstress Not tainted 6.1.0-rc3+ #77
Call Trace:
__dump_stack+0x23/0x29
dump_stack_lvl+0x51/0x73
print_address_description+0x67/0x27f
print_report+0x3e/0x5c
kasan_report+0x7b/0xa8
kasan_check_range+0x1b2/0x1c1
memcpy+0x22/0x5d
__d_alloc+0x269/0x859
d_alloc+0x45/0x20c
d_alloc_parallel+0xb2/0x8b2
lookup_open+0x3b8/0x9f9
open_last_lookups+0x63d/0xc26
path_openat+0x11a/0x261
do_filp_open+0xcc/0x168
do_sys_openat2+0x13b/0x3f7
do_sys_open+0x10f/0x146
__se_sys_creat+0x27/0x2e
__x64_sys_creat+0x55/0x6a
do_syscall_64+0x40/0x96
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Allocated by task 952:
kasan_save_stack+0x1f/0x42
kasan_set_track+0x21/0x2a
kasan_save_alloc_info+0x17/0x1d
__kasan_kmalloc+0x7e/0x87
__kmalloc_node_track_caller+0x59/0x155
kstrndup+0x60/0xe6
parse_mf_symlink+0x215/0x30b
check_mf_symlink+0x260/0x36a
cifs_get_inode_info+0x14e1/0x1690
cifs_revalidate_dentry_attr+0x70d/0x964
cifs_revalidate_dentry+0x36/0x62
cifs_d_revalidate+0x162/0x446
lookup_open+0x36f/0x9f9
open_last_lookups+0x63d/0xc26
path_openat+0x11a/0x261
do_filp_open+0xcc/0x168
do_sys_openat2+0x13b/0x3f7
do_sys_open+0x10f/0x146
__se_sys_creat+0x27/0x2e
__x64_sys_creat+0x55/0x6a
do_syscall_64+0x40/0x96
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 950:
kasan_save_stack+0x1f/0x42
kasan_set_track+0x21/0x2a
kasan_save_free_info+0x1c/0x34
____kasan_slab_free+0x1c1/0x1d5
__kasan_slab_free+0xe/0x13
__kmem_cache_free+0x29a/0x387
kfree+0xd3/0x10e
cifs_fattr_to_inode+0xb6a/0xc8c
cifs_get_inode_info+0x3cb/0x1690
cifs_revalidate_dentry_attr+0x70d/0x964
cifs_revalidate_dentry+0x36/0x62
cifs_d_revalidate+0x162/0x446
lookup_open+0x36f/0x9f9
open_last_lookups+0x63d/0xc26
path_openat+0x11a/0x261
do_filp_open+0xcc/0x168
do_sys_openat2+0x13b/0x3f7
do_sys_open+0x10f/0x146
__se_sys_creat+0x27/0x2e
__x64_sys_creat+0x55/0x6a
do_syscall_64+0x40/0x96
entry_SYSCALL_64_after_hwframe+0x63/0xcd
When opened a symlink, link name is from 'inode->i_link', but it may be
reset to a new value when revalidate the dentry. If some processes get the
link name on the race scenario, then UAF will happen on link name.
Fix this by implementing 'get_link' interface to duplicate the link name.
Fixes: 76894f3e2f ("cifs: improve symlink handling for smb2+")
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
If stardup the symlink target failed, should free the xid,
otherwise the xid will be leaked.
Fixes: 76894f3e2f ("cifs: improve symlink handling for smb2+")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
This expands the directory caching to now cache an open handle for all
directories (up to a maximum) and not just the root directory.
In this patch, locking and refcounting is intended to work as so:
The main function to get a reference to a cached handle is
find_or_create_cached_dir() called from open_cached_dir()
These functions are protected under the cfid_list_lock spin-lock
to make sure we do not race creating new references for cached dirs
with deletion of expired ones.
An successful open_cached_dir() will take out 2 references to the cfid if
this was the very first and successful call to open the directory and
it acquired a lease from the server.
One reference is for the lease and the other is for the cfid that we
return. The is lease reference is tracked by cfid->has_lease.
If the directory already has a handle with an active lease, then we just
take out one new reference for the cfid and return it.
It can happen that we have a thread that tries to open a cached directory
where we have a cfid already but we do not, yet, have a working lease. In
this case we will just return NULL, and this the caller will fall back to
the case when no handle was available.
In this model the total number of references we have on a cfid is
1 for while the handle is open and we have a lease, and one additional
reference for each open instance of a cfid.
Once we get a lease break (cached_dir_lease_break()) we remove the
cfid from the list under the spinlock. This prevents any new threads to
use it, and we also call smb2_cached_lease_break() via the work_queue
in order to drop the reference we got for the lease (we drop it outside
of the spin-lock.)
Anytime a thread calls close_cached_dir() we also drop a reference to the
cfid.
When the last reference to the cfid is released smb2_close_cached_fid()
will be invoked which will drop the reference ot the dentry we held for
this cfid and it will also, if we the handle is open/has a lease
also call SMB2_close() to close the handle on the server.
Two events require special handling:
invalidate_all_cached_dirs() this function is called from SMB2_tdis()
and cifs_mark_open_files_invalid().
In both cases the tcon is either gone already or will be shortly so
we do not need to actually close the handles. They will be dropped
server side as part of the tcon dropping.
But we have to be careful about a potential race with a concurrent
lease break so we need to take out additional refences to avoid the
cfid from being freed while we are still referencing it.
free_cached_dirs() which is called from tconInfoFree().
This is called quite late in the umount process so there should no longer
be any open handles or files and we can just free all the remaining data.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When creating inode for symlink, the client used to send below
requests to fill it in:
* create+query_info+close (STATUS_STOPPED_ON_SYMLINK)
* create(+reparse_flag)+query_info+close (set file attrs)
* create+ioctl(get_reparse)+close (query reparse tag)
and then for every access to the symlink dentry, the ->link() method
would send another:
* create+ioctl(get_reparse)+close (parse symlink)
So, in order to improve:
(i) Get rid of unnecessary roundtrips and then resolve symlinks as
follows:
* create+query_info+close (STATUS_STOPPED_ON_SYMLINK +
parse symlink + get reparse tag)
* create(+reparse_flag)+query_info+close (set file attrs)
(ii) Set the resolved symlink target directly in inode->i_link and
use simple_get_link() for ->link() to simply return it.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmNCWv0ACgkQiiy9cAdy
T1EaIAv/bkcILC2M6pFdAp4i5hDuJoJvYnuh3Yc9P19uUsFWhHlZsG0rZLwRmQ2e
kGzvO3fmS1V8iE72ZXYqj1ARAahMm2T6HsxcpLQ39bH645WsAAFUoapqUQZcENif
u+faCJXK2ALTUoB3hSqxfZ4Ck9ILxXBeQTi4G2XQ0yf8jlXePM38Q8mo+qSN5pSQ
8XVui3vHIi8u7JZnSuhcxRm8WCLBj2Dbuod5uJALKwl90WVD/HvHZUdiMiz4OnG2
s3+MkXvAdg7QmaF+3jhpSkzeQl3EDToMg6zqI+mi0K4CHRpeQbm+kRu3lQYwh7nR
gbAyQH/tgz2QlqkLbW5FpsOC3v3KtsrgqfYWPA8o6pMh2zs8p1EBJ/hX4e5EbzaF
bOgQVi/sY1tfILB1q7B1JfB/0HFUy+vnGiNowJleoKF5Lw/fJSwgb5aY/YDTqwA0
B6tXAHo0srPz4h8PMdZZqJQeue13zhKKpMxW/3i2o/fkOXJvTjp6Y2k6IzRYubtv
28sgtq6x
=8GDQ
-----END PGP SIGNATURE-----
Merge tag '6.1-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs updates from Steve French:
- data corruption fix when cache disabled
- four RDMA (smbdirect) improvements, including enabling support for
SoftiWARP
- four signing improvements
- three directory lease improvements
- four cleanup fixes
- minor security fix
- two debugging improvements
* tag '6.1-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (21 commits)
smb3: fix oops in calculating shash_setkey
cifs: secmech: use shash_desc directly, remove sdesc
smb3: rename encryption/decryption TFMs
cifs: replace kfree() with kfree_sensitive() for sensitive data
cifs: remove initialization value
cifs: Replace a couple of one-element arrays with flexible-array members
smb3: do not log confusing message when server returns no network interfaces
smb3: define missing create contexts
cifs: store a pointer to a fid in the cfid structure instead of the struct
cifs: improve handlecaching
cifs: Make tcon contain a wrapper structure cached_fids instead of cached_fid
smb3: add dynamic trace points for tree disconnect
Fix formatting of client smbdirect RDMA logging
Handle variable number of SGEs in client smbdirect send.
Reduce client smbdirect max receive segment size
Decrease the number of SMB3 smbdirect client SGEs
cifs: Fix the error length of VALIDATE_NEGOTIATE_INFO message
cifs: destage dirty pages before re-reading them for cache=none
cifs: return correct error in ->calc_signature()
MAINTAINERS: Add Tom Talpey as cifs.ko reviewer
...
- Debuggability:
- Change most occurances of BUG_ON() to WARN_ON_ONCE()
- Reorganize & fix TASK_ state comparisons, turn it into a bitmap
- Update/fix misc scheduler debugging facilities
- Load-balancing & regular scheduling:
- Improve the behavior of the scheduler in presence of lot of
SCHED_IDLE tasks - in particular they should not impact other
scheduling classes.
- Optimize task load tracking, cleanups & fixes
- Clean up & simplify misc load-balancing code
- Freezer:
- Rewrite the core freezer to behave better wrt thawing and be simpler
in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting
all the fallout.
- Deadline scheduler:
- Fix the DL capacity-aware code
- Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period()
- Relax/optimize locking in task_non_contending()
- Cleanups:
- Factor out the update_current_exec_runtime() helper
- Various cleanups, simplifications
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/01cRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1geZA/+PB4KC1T9aVxzaTHI36R03YgJYZmIdtxw
wTf02MixePmz+gQCbepJbempGOh5ST28aOcI0xhdYOql5B63MaUBBMlB0HvGUyDG
IU3zETqLMRtAbnSTdQFv8m++ECUtZYp8/x1FCel4WO7ya4ETkRu1NRfCoUepEhpZ
aVAlae9LH3NBaF9t7s0PT2lTjf3pIzMFRkddJ0ywJhbFR3VnWat05fAK+J6fGY8+
LS54coefNlJD4oDh5TY8uniL1j5SmWmmwbk9Cdj7bLU5P3dFSS0/+5FJNHJPVGDE
srGT7wstRUcDrN0CnZo48VIUBiApJCCDqTfJYi9wNYd0NAHvwY6MIJJgEIY8mKsI
L/qH26H81Wt+ezSZ/5JIlGlZ/LIeNaa6OO/fbWEYABBQogvvx3nxsRNUYKSQzumH
CnSBasBjLnjWyLlK4qARM9cI7NFSEK6NUigrEx/7h8JFu/8T4DlSy6LsF1HUyKgq
4+FJLAqG6cL0tcwB/fHYd0oRESN8dStnQhGxSojgufwLc7dlFULvCYF5JM/dX+/V
IKwbOfIOeOn6ViMtSOXAEGdII+IQ2/ZFPwr+8Z5JC7NzvTVL6xlu/3JXkLZR3L7o
yaXTSaz06h1vil7Z+GRf7RHc+wUeGkEpXh5vnarGZKXivhFdWsBdROIJANK+xR0i
TeSLCxQxXlU=
=KjMD
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Debuggability:
- Change most occurances of BUG_ON() to WARN_ON_ONCE()
- Reorganize & fix TASK_ state comparisons, turn it into a bitmap
- Update/fix misc scheduler debugging facilities
Load-balancing & regular scheduling:
- Improve the behavior of the scheduler in presence of lot of
SCHED_IDLE tasks - in particular they should not impact other
scheduling classes.
- Optimize task load tracking, cleanups & fixes
- Clean up & simplify misc load-balancing code
Freezer:
- Rewrite the core freezer to behave better wrt thawing and be
simpler in general, by replacing PF_FROZEN with TASK_FROZEN &
fixing/adjusting all the fallout.
Deadline scheduler:
- Fix the DL capacity-aware code
- Factor out dl_task_is_earliest_deadline() &
replenish_dl_new_period()
- Relax/optimize locking in task_non_contending()
Cleanups:
- Factor out the update_current_exec_runtime() helper
- Various cleanups, simplifications"
* tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
sched: Fix more TASK_state comparisons
sched: Fix TASK_state comparisons
sched/fair: Move call to list_last_entry() in detach_tasks
sched/fair: Cleanup loop_max and loop_break
sched/fair: Make sure to try to detach at least one movable task
sched: Show PF_flag holes
freezer,sched: Rewrite core freezer logic
sched: Widen TAKS_state literals
sched/wait: Add wait_event_state()
sched/completion: Add wait_for_completion_state()
sched: Add TASK_ANY for wait_task_inactive()
sched: Change wait_task_inactive()s match_state
freezer,umh: Clean up freezer/initrd interaction
freezer: Have {,un}lock_system_sleep() save/restore flags
sched: Rename task_running() to task_on_cpu()
sched/fair: Cleanup for SIS_PROP
sched/fair: Default to false in test_idle_cores()
sched/fair: Remove useless check in select_idle_core()
sched/fair: Avoid double search on same cpu
sched/fair: Remove redundant check in select_idle_smt()
...
Needed this for debugging a failing xfstest.
Also change camel case for "treeName" to "tree_name" in tcon struct.
Example trace output (from "trace-cmd record -e smb3_tdis*"):
umount-9718 [006] ..... 5909.780244: smb3_tdis_enter: xid=206 sid=0xcf38894e tid=0x3d0b8cf8 path=\\localhost\test
umount-9718 [007] ..... 5909.780878: smb3_tdis_done: xid=206 sid=0xcf38894e tid=0x3d0b8cf8
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Rewrite the core freezer to behave better wrt thawing and be simpler
in general.
By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.
As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).
Specifically; the current scheme works a little like:
freezer_do_not_count();
schedule();
freezer_count();
And either the task is blocked, or it lands in try_to_freezer()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.
However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.
That is, thawing tries to thaw things in explicit order; kernel
threads and workqueues before doing bringing SMP back before userspace
etc.. However due to the above mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.
This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.
As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:
TASK_FREEZABLE -> TASK_FROZEN
__TASK_STOPPED -> TASK_FROZEN
__TASK_TRACED -> TASK_FROZEN
The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).
The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.
With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
Also rename crfid to cfid to have consistent naming for this variable.
This commit does not change any logic.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Currently much of the smb1 code is built even when
CONFIG_CIFS_ALLOW_INSECURE_LEGACY is disabled.
Move cifssmb.c to only be compiled when insecure legacy is disabled,
and move various SMB1/CIFS helper functions to that ifdef. Some
functions that were not SMB1/CIFS specific needed to be moved out of
cifssmb.c
This shrinks cifs.ko by more than 10% which is good - but also will
help with the eventual movement of the legacy code to a distinct
module. Follow on patches can shrink the number of ifdefs by
code restructuring where smb1 code is wedged in functions that
should be calling dialect specific helper functions instead,
and also by moving some functions from file.c/dir.c/inode.c into
smb1 specific c files.
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
While randstruct was satisfied with using an open-coded "void *" offset
cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
used by FORTIFY_SOURCE was not as easily fooled. This was causing the
following complaint[1] from gcc v12:
In file included from include/linux/string.h:253,
from include/linux/ceph/ceph_debug.h:7,
from fs/ceph/inode.c:2:
In function 'fortify_memset_chk',
inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
242 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by embedding a struct inode into struct netfs_i_context (which
should perhaps be renamed to struct netfs_inode). The struct inode
vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
structs and vfs_inode is then simply changed to "netfs.inode" in those
filesystems.
Further, rename netfs_i_context to netfs_inode, get rid of the
netfs_inode() function that converted a netfs_i_context pointer to an
inode pointer (that can now be done with &ctx->inode) and rename the
netfs_i_context() function to netfs_inode() (which is now a wrapper
around container_of()).
Most of the changes were done with:
perl -p -i -e 's/vfs_inode/netfs.inode/'g \
`git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
Kees suggested doing it with a pair structure[2] and a special
declarator to insert that into the network filesystem's inode
wrapper[3], but I think it's cleaner to embed it - and then it doesn't
matter if struct randomisation reorders things.
Dave Chinner suggested using a filesystem-specific VFS_I() function in
each filesystem to convert that filesystem's own inode wrapper struct
into the VFS inode struct[4].
Version #2:
- Fix a couple of missed name changes due to a disabled cifs option.
- Rename nfs_i_context to nfs_inode
- Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
structs.
[ This also undoes commit 507160f46c ("netfs: gcc-12: temporarily
disable '-Wattribute-warning' for now") that is no longer needed ]
Fixes: bc899ee1c8 ("netfs: Add a netfs inode context")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
cc: Jonathan Corbet <corbet@lwn.net>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: William Kucharski <william.kucharski@oracle.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: Dave Chinner <david@fromorbit.com>
cc: linux-doc@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: samba-technical@lists.samba.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All filesystems have now been converted to use ->readahead, so
remove the ->readpages operation and fix all the comments that
used to refer to it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
For example if mtime or size has changed.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Change the cifs filesystem to take account of the changes to fscache's
indexing rewrite and reenable caching in cifs.
The following changes have been made:
(1) The fscache_netfs struct is no more, and there's no need to register
the filesystem as a whole.
(2) The session cookie is now an fscache_volume cookie, allocated with
fscache_acquire_volume(). That takes three parameters: a string
representing the "volume" in the index, a string naming the cache to
use (or NULL) and a u64 that conveys coherency metadata for the
volume.
For cifs, I've made it render the volume name string as:
"cifs,<ipaddress>,<sharename>"
where the sharename has '/' characters replaced with ';'.
This probably needs rethinking a bit as the total name could exceed
the maximum filename component length.
Further, the coherency data is currently just set to 0. It needs
something else doing with it - I wonder if it would suffice simply to
sum the resource_id, vol_create_time and vol_serial_number or maybe
hash them.
(3) The fscache_cookie_def is no more and needed information is passed
directly to fscache_acquire_cookie(). The cache no longer calls back
into the filesystem, but rather metadata changes are indicated at
other times.
fscache_acquire_cookie() is passed the same keying and coherency
information as before.
(4) The functions to set/reset cookies are removed and
fscache_use_cookie() and fscache_unuse_cookie() are used instead.
fscache_use_cookie() is passed a flag to indicate if the cookie is
opened for writing. fscache_unuse_cookie() is passed updates for the
metadata if we changed it (ie. if the file was opened for writing).
These are called when the file is opened or closed.
(5) cifs_setattr_*() are made to call fscache_resize() to change the size
of the cache object.
(6) The functions to read and write data are stubbed out pending a
conversion to use netfslib.
Changes
=======
ver #8:
- Abstract cache invalidation into a helper function.
- Fix some checkpatch warnings[3].
ver #7:
- Removed the accidentally added-back call to get the super cookie in
cifs_root_iget().
- Fixed the right call to cifs_fscache_get_super_cookie() to take account
of the "-o fsc" mount flag.
ver #6:
- Moved the change of gfpflags_allow_blocking() to current_is_kswapd() for
cifs here.
- Fixed one of the error paths in cifs_atomic_open() to jump around the
call to use the cookie.
- Fixed an additional successful return in the middle of cifs_open() to
use the cookie on the way out.
- Only get a volume cookie (and thus inode cookies) when "-o fsc" is
supplied to mount.
ver #5:
- Fixed a couple of bits of cookie handling[2]:
- The cookie should be released in cifs_evict_inode(), not
cifsFileInfo_put_final(). The cookie needs to persist beyond file
closure so that writepages will be able to write to it.
- fscache_use_cookie() needs to be called in cifs_atomic_open() as it is
for cifs_open().
ver #4:
- Fixed the use of sizeof with memset.
- tcon->vol_create_time is __le64 so doesn't need cpu_to_le64().
ver #3:
- Canonicalise the cifs coherency data to make the cache portable.
- Set volume coherency data.
ver #2:
- Use gfpflags_allow_blocking() rather than using flag directly.
- Upgraded to -rc4 to allow for upstream changes[1].
- fscache_acquire_volume() now returns errors.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23b55d673d7527b093cd97b7c217c82e70cd1af0 [1]
Link: https://lore.kernel.org/r/3419813.1641592362@warthog.procyon.org.uk/ [2]
Link: https://lore.kernel.org/r/CAH2r5muTanw9pJqzAHd01d9A8keeChkzGsCEH6=0rHutVLAF-A@mail.gmail.com/ [3]
Link: https://lore.kernel.org/r/163819671009.215744.11230627184193298714.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906982979.143852.10672081929614953210.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967187187.1823006.247415138444991444.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021579335.640689.2681324337038770579.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/3462849.1641593783@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/1318953.1642024578@warthog.procyon.org.uk/ # v6
Signed-off-by: Steve French <stfrench@microsoft.com>
Windows SMB server responds with STATUS_OBJECT_NAME_INVALID code to
SMB2 QUERY_INFO request for "\<server>\<dfsname>\<linkpath>" DFS reference,
where <dfsname> contains non-ASCII unicode symbols.
Check such DFS reference and emulate -EREMOTE if it is actual.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215440
Signed-off-by: Eugene Korenevsky <ekorenevsky@astralinux.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
We have a cyclic dependency between fscache super cookie
and root inode cookie. The super cookie relies on
tcon->resource_id, which gets populated from the root inode
number. However, fetching the root inode initializes inode
cookie as a child of super cookie, which is yet to be populated.
resource_id is only used as auxdata to check the validity of
super cookie. We can completely avoid setting resource_id to
remove the circular dependency. Since vol creation time and
vol serial numbers are used for auxdata, we should be fine.
Additionally, there will be auxiliary data check for each
inode cookie as well.
Fixes: 5bf91ef03d ("cifs: wait for tcon resource_id before getting fscache super")
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The logic for initializing tcon->resource_id is done inside
cifs_root_iget. fscache super cookie relies on this for aux
data. So we need to push the fscache initialization to this
later point during mount.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
During unlink/rename instead of closing all the deferred handles
under tcon, close only handles under the requested dentry.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
checkpatch complains about source files with filenames (e.g. in
these cases just below the SPDX header in comments at the top of
various files in fs/cifs). It also is helpful to change this now
so will be less confusing when the parent directory is renamed
e.g. from fs/cifs to fs/smb_client (or fs/smbfs)
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
So far, the fscache implementation we had supports only
a small set of use cases. Particularly for files opened
with O_RDONLY.
This commit enables it even for rw based file opens. It
also enables the reuse of cached data in case of mount
option (cache=singleclient) where it is guaranteed that
this is the only client (and server) which operates on
the files. There's also a single line change in fscache.c
to get around a bug seen in fscache.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When rename is executed on directory which has files for which
close is deferred, then rename will fail with EACCES.
This patch will try to close all deferred files when EACCES is received
and retry rename on a directory.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Cc: stable@vger.kernel.org # 5.13
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
We were trying to fill in uninitialized file attributes in the error case.
Addresses-Coverity: 139689 ("Uninitialized variables")
Signed-off-by: Steve French <stfrench@microsoft.com>
Add SPDX license identifier and replace license boilerplate.
Corrects various checkpatch errors with the older format for
noting the LGPL license.
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmCSKIIACgkQiiy9cAdy
T1F+Hgv+L3NkOFwMvBgGjHP9b+Lkv/YWGKeJkLwQW1xqoHIUHn0/+C5+9ScJGBZc
WVuzp4pqEIgv4my4UQiyqVwzcmz4BqY2KTDJzBYtqANt6pVp1w6YtC2GplgJE3J2
qoQh1RwZaqSXfjcoPSRnv5EiSF6DbHlBUhPMd53qOE9pwaf/38i9/M3d9G7EIB8h
rRNmpGtFuzBHdtGQ2b+4+8ftCIpEBDu/OXcA6QXMUMcvKGaruU39NOxBuW6a/VO5
9P47Nsof3dlN758uesoQT2VMEc0pcpwAs9BwOkinfXWGUyNqJmbPNvddIOlaP/dv
vG58n/+JqvWUKgEnrNk5h+wD7wmXpgxpQ523sD5k6bID1hc+vh4lXf+O+iltbYtc
1ce9ITglSVxA7z4qwFWhtawBy1j1YyvltTAGvhnzdtKZLRk6e5AYIFOUn9O+AMJw
Eofk4lD0kNTdXyMkveGluRMBXrOzKMdmfw5FW/9hObYgebEGpTQkGyIMpaStraZM
8hDNAGTk
=BFOj
-----END PGP SIGNATURE-----
Merge tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs updates from Steve French:
"Ten CIFS/SMB3 changes - including two marked for stable - including
some important multichannel fixes, as well as support for handle
leases (deferred close) and shutdown support:
- some important multichannel fixes
- support for handle leases (deferred close)
- shutdown support (which is also helpful since it enables multiple
xfstests)
- enable negotiating stronger encryption by default (GCM256)
- improve wireshark debugging by allowing more options for root to
dump decryption keys
SambaXP and the SMB3 Plugfest test event are going on now so I am
expecting more patches over the next few days due to extra testing
(including more multichannel fixes)"
* tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6:
fs/cifs: Fix resource leak
Cifs: Fix kernel oops caused by deferred close for files.
cifs: fix regression when mounting shares with prefix paths
cifs: use echo_interval even when connection not ready.
cifs: detect dead connections only when echoes are enabled.
smb3.1.1: allow dumping keys for multiuser mounts
smb3.1.1: allow dumping GCM256 keys to improve debugging of encrypted shares
cifs: add shutdown support
cifs: Deferred close for files
smb3.1.1: enable negotiating stronger encryption by default
Fix regression issue caused by deferred close for files.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Various filesystem support the shutdown ioctl which is used by various
xfstests. The shutdown ioctl sets a flag on the superblock which
prevents open, unlink, symlink, hardlink, rmdir, create etc.
on the file system until unmount and remounted. The two flags supported
in this patch are:
FSOP_GOING_FLAGS_LOGFLUSH and FSOP_GOING_FLAGS_NOLOGFLUSH
which require very little other than blocking new operations (since
we do not cache writes to metadata on the client with cifs.ko).
FSOP_GOING_FLAGS_DEFAULT is not supported yet, but could be added in
the future but would need to call syncfs or equivalent to write out
pending data on the mount.
With this patch various xfstests now work including tests 043 through
046 for example.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
When file is closed, SMB2 close request is not sent to server
immediately and is deferred for acregmax defined interval. When file is
reopened by same process for read or write, the file handle
is reused if an oplock is held.
When client receives a oplock/lease break, file is closed immediately
if reference count is zero, else oplock is downgraded.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull vfs inode type handling updates from Al Viro:
"We should never change the type bits of ->i_mode or the method tables
(->i_op and ->i_fop) of a live inode.
Unfortunately, not all filesystems took care to prevent that"
* 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
spufs: fix bogosity in S_ISGID handling
9p: missing chunk of "fs/9p: Don't update file type when updating file attributes"
openpromfs: don't do unlock_new_inode() until the new inode is set up
hostfs_mknod(): don't bother with init_special_inode()
cifs: have cifs_fattr_to_inode() refuse to change type on live inode
cifs: have ->mkdir() handle race with another client sanely
do_cifs_create(): don't set ->i_mode of something we had not created
gfs2: be careful with inode refresh
ocfs2_inode_lock_update(): make sure we don't change the type bits of i_mode
orangefs_inode_is_stale(): i_mode type bits do *not* form a bitmap...
vboxsf: don't allow to change the inode type
afs: Fix updating of i_mode due to 3rd party change
ceph: don't allow type or device number to change on non-I_NEW inodes
ceph: fix up error handling with snapdirs
new helper: inode_wrong_type()
Needed for the final patch in the directory caching series
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
build_path_from_dentry() open-codes dentry_path_raw(). The reason
we can't use dentry_path_raw() in there (and postprocess the
result as needed) is that the callers of build_path_from_dentry()
expect that the object to be freed on cleanup and the string to
be used are at the same address. That's painful, since the path
is naturally built end-to-beginning - we start at the leaf and
go through the ancestors, accumulating the pathname.
Life would be easier if we left the buffer allocation to callers.
It wouldn't be exact-sized buffer, but none of the callers keep
the result for long - it's always freed before the caller returns.
So there's no need to do exact-sized allocation; better use
__getname()/__putname(), same as we do for pathname arguments
of syscalls. What's more, there's no need to do allocation under
spinlocks, so GFP_ATOMIC is not needed.
Next patch will replace the open-coded dentry_path_raw() (in
build_path_from_dentry_optional_prefix()) with calling the real
thing. This patch only introduces wrappers for allocating/freeing
the buffers and switches to new calling conventions:
build_path_from_dentry(dentry, buf)
expects buf to be address of a page-sized object or NULL,
return value is a pathname built inside that buffer on success,
ERR_PTR(-ENOMEM) if buf is NULL and ERR_PTR(-ENAMETOOLONG) if
the pathname won't fit into page. Note that we don't need to
check for failure when allocating the buffer in the caller -
build_path_from_dentry() will do the right thing.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steve French <stfrench@microsoft.com>
While reviewing a patch clarifying locks and locking hierarchy I
realized some locks were unused.
This commit removes old data and code that isn't actually used
anywhere, or hidden in ifdefs which cannot be enabled from the kernel
config.
* The uid/gid trees and associated locks are left-overs from when
uid/sid mapping had an extra caching layer on top of the keyring and
are now unused.
See commit faa65f07d2 ("cifs: simplify id_to_sid and sid_to_id mapping code")
from 2012.
* cifs_oplock_break_ops is a left-over from when slow_work was remplaced
by regular workqueue and is now unused.
See commit 9b64697246 ("cifs: use workqueue instead of slow-work")
from 2010.
* CIFSSMBSetAttrLegacy is SMB1 cruft dealing with some legacy
NT4/Win9x behaviour.
* Remove CONFIG_CIFS_DNOTIFY_EXPERIMENTAL left-overs. This was already
partially removed in 392e1c5dc9 ("cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESP")
from 2019. Kill it completely.
* Another candidate that was considered but spared is
CONFIG_CIFS_NFSD_EXPORT which has an empty implementation and cannot
be enabled by a config option (although it is listed but disabled with
"BROKEN" as a dep). It's unclear whether this could even function
today in its current form but it has it's own .c file and Kconfig
entry which is a bit more involved to remove and might make a come
back?
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Applications that create and extend and write to a file do not
expect to see 0 allocation size. When file is extended,
set its allocation size to a plausible value until we have a
chance to query the server for it. When the file is cached
this will prevent showing an impossible number of allocated
blocks (like 0). This fixes e.g. xfstests 614 which does
1) create a file and set its size to 64K
2) mmap write 64K to the file
3) stat -c %b for the file (to query the number of allocated blocks)
It was failing because we returned 0 blocks. Even though we would
return the correct cached file size, we returned an impossible
allocation size.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
if we have mkdir request reported successful *and* simulating lookup
gets us a non-directory (which is possible if another client has
managed to get rmdir and create in between), the sane action is not
to mangle ->i_mode of non-directory inode to S_IFDIR | mode, it's
"report success and return with dentry negative unhashed" - that
way the next lookup will do the right thing.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
inode_wrong_type(inode, mode) returns true if setting inode->i_mode
to given value would've changed the inode type. We have enough of
those checks open-coded to make a helper worthwhile.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmA4bocACgkQiiy9cAdy
T1HuMwv/bmZ53ilDwhCph/UKoLJm/bRyvCp+GBECtLLS/C/4qz5IBLpPPr2yhOyH
gmkeCZZWhj0nzGAYxhVDAdBz9IPEA7bae503IaOuk5uXaCXC8htsq/Cd7qpJmHlf
5vCdBTmHBiUt02dUcZ9A3bm855xJLEINHH9YdJM157ysqLgttIibLQB0F/gJ49DR
QIIdq7sZNJXcTgRsUzJZNnrWLDi2oVIoUlq5M6d8ypmZC0ArPNfrSafjW6h5rqpj
UYBwtUDNwQiS0lgwR4mji4PCen0GGwMFtyVDOpdJLJq3fO995yse2BRk0BFHVH1i
xfAskQjkxAHEcfzQC1cM4ouT/WYu8nHaLK1vp/1lVr93mo8KqSX+SW/bLsXfpQkm
w//xMy94HdM2pgyM6J1pYnKPb7s/DG19RYPktQ5oYn0fR5qYlqALAmd02JRjO3xV
cbjbmWXXzFFsFJc5MJmM6wVLJzRb4a50SN1W37aHNXWi8ktpsaNzz33LGw1pF2OT
K6P2DqJe
=RQo/
-----END PGP SIGNATURE-----
Merge tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs updates from Steve French:
- improvements to mode bit conversion, chmod and chown when using
cifsacl mount option
- two new mount options for controlling attribute caching
- improvements to crediting and reconnect, improved debugging
- reconnect fix
- add SMB3.1.1 dialect to default dialects for vers=3
* tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6: (27 commits)
cifs: update internal version number
cifs: use discard iterator to discard unneeded network data more efficiently
cifs: introduce helper for finding referral server to improve DFS target resolution
cifs: check all path components in resolved dfs target
cifs: fix DFS failover
cifs: fix nodfs mount option
cifs: fix handling of escaped ',' in the password mount argument
cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout
cifs: convert revalidate of directories to using directory metadata cache timeout
cifs: Add new mount parameter "acdirmax" to allow caching directory metadata
cifs: If a corrupted DACL is returned by the server, bail out.
cifs: minor simplification to smb2_is_network_name_deleted
TCON Reconnect during STATUS_NETWORK_NAME_DELETED
cifs: cleanup a few le16 vs. le32 uses in cifsacl.c
cifs: Change SIDs in ACEs while transferring file ownership.
cifs: Retain old ACEs when converting between mode bits and ACL.
cifs: Fix cifsacl ACE mask for group and others.
cifs: clarify hostname vs ip address in /proc/fs/cifs/DebugData
cifs: change confusing field serverName (to ip_addr)
cifs: Fix inconsistent IS_ERR and PTR_ERR
...
The new optional mount parameter "acregmax" allows a different
timeout for file metadata ("acdirmax" now allows controlling timeout
for directory metadata). Setting "actimeo" still works as before,
and changes timeout for both files and directories, but
specifying "acregmax" or "acdirmax" allows overriding the
default more granularly which can be a big performance benefit
on some workloads. "acregmax" is already used by NFS as a mount
parameter (albeit with a larger default and thus looser caching).
Suggested-by: Tom Talpey <tom@talpey.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The new optional mount parm, "acdirmax" allows caching the metadata
for a directory longer than file metadata, which can be very helpful
for performance. Convert cifs_inode_needs_reval to check acdirmax
for revalidating directory metadata.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>