61844 Commits

Author SHA1 Message Date
Jan Kara
a4a8b99ec8 udf: Fix free space reporting for metadata and virtual partitions
Free space on filesystems with metadata or virtual partition maps
currently gets misreported. This is because these partitions are just
remapped onto underlying real partitions, which keep track of free
blocks. Take this remapping into account when counting free blocks as
well.

Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reported-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-09 18:38:18 +01:00
Pali Rohár
6146446763 udf: Update header files to UDF 2.60
This change synchronizes header files ecma_167.h and osta_udf.h with
udftools 2.2 project which already has definitions for UDF 2.60 revision.

Link: https://lore.kernel.org/r/20200107212904.30471-3-pali.rohar@gmail.com
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-08 11:12:20 +01:00
Pali Rohár
871b9b14c6 udf: Move OSTA Identifier Suffix macros from ecma_167.h to osta_udf.h
Rename structure name and its members to match naming convention and fix
endianity type for UDFRevision member. Also remove duplicate definition of
UDF_ID_COMPLIANT which is already in osta_udf.h.

Link: https://lore.kernel.org/r/20200107212904.30471-2-pali.rohar@gmail.com
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-08 11:12:18 +01:00
Pali Rohár
800552ceec udf: Fix spelling in EXT_NEXT_EXTENT_ALLOCDESCS
Change EXT_NEXT_EXTENT_ALLOCDECS to proper spelling
EXT_NEXT_EXTENT_ALLOCDESCS.

Link: https://lore.kernel.org/r/20200107212904.30471-1-pali.rohar@gmail.com
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-08 11:11:46 +01:00
Nathan Chancellor
d9e9866803 ext2: Adjust indentation in ext2_fill_super
Clang warns:

../fs/ext2/super.c:1076:3: warning: misleading indentation; statement is
not part of the previous 'if' [-Wmisleading-indentation]
        sbi->s_groups_count = ((le32_to_cpu(es->s_blocks_count) -
        ^
../fs/ext2/super.c:1074:2: note: previous statement is here
        if (EXT2_BLOCKS_PER_GROUP(sb) == 0)
        ^
1 warning generated.

This warning occurs because there is a space before the tab on this
line. Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.

Fixes: 41f04d852e ("[PATCH] ext2: fix mounts at 16T")
Link: https://github.com/ClangBuiltLinux/linux/issues/827
Link: https://lore.kernel.org/r/20191218031930.31393-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-06 10:09:35 +01:00
Arnd Bergmann
1ead083ae1 quota: avoid time_t in v1_disk_dqblk definition
The time_t type is part of the user interface and not always the
same, with the move to 64-bit timestamps and the difference between
architectures.

Make the quota format definition independent of this type and use
a basic type of the same length. Make it unsigned in the process
to keep the v1 format working until year 2106 instead of 2038
on 32-bit architectures.

Hopefully, everybody has already moved to a newer format long
ago (v2 was introduced with linux-2.4), but it's hard to be sure.

Link: https://lore.kernel.org/r/20191213205221.3787308-6-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16 14:15:30 +01:00
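
The 2038 versus 2106 limits mentioned in the quota commit above follow from simple arithmetic on 32-bit second counters. A minimal sketch, assuming an average Gregorian year length; the struct and function names here are hypothetical and are not the kernel's v1_disk_dqblk definition:

```c
#include <stdint.h>

/* Average Gregorian year (365.2425 days) in seconds; good enough for a year estimate. */
#define SECONDS_PER_YEAR 31556952ULL

/* Hypothetical on-disk record: fixed-width unsigned fields, independent of time_t. */
struct sketch_disk_dqblk {
    uint32_t btime;
    uint32_t itime;
};

/* Last calendar year representable when counting seconds from 1970. */
unsigned int sketch_last_year(uint64_t max_seconds)
{
    return 1970u + (unsigned int)(max_seconds / SECONDS_PER_YEAR);
}
```

Calling sketch_last_year() with INT32_MAX and UINT32_MAX gives the signed and unsigned limits respectively, which is why switching to an unsigned field extends the usable range without changing the record size.
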
Jan Kara
4d5c1adaf8 reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling
When we fail to allocate string for journal device name we jump to
'error' label which tries to unlock reiserfs write lock which is not
held. Jump to 'error_unlocked' instead.

Fixes: f32485be83 ("reiserfs: delay reiserfs lock until journal initialization")
Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16 12:59:32 +01:00
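
The error-label pattern described in the reiserfs commit above can be sketched in userspace C. This is only an illustration under assumptions: the toy lock counter, the struct, and the failure switches are all hypothetical, and only the idea of separate 'error' and 'error_unlocked' labels comes from the commit message.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Toy superblock: lock_depth must never drop below zero. */
struct sketch_sb {
    int   lock_depth;
    char *jdev_name;
};

void sketch_lock(struct sketch_sb *sb)   { sb->lock_depth++; }
void sketch_unlock(struct sketch_sb *sb) { sb->lock_depth--; }

/* fail_alloc and fail_later simulate failures before and after taking the lock. */
int sketch_fill_super(struct sketch_sb *sb, const char *jdev,
                      int fail_alloc, int fail_later)
{
    int err;

    sb->jdev_name = fail_alloc ? NULL : malloc(strlen(jdev) + 1);
    if (!sb->jdev_name) {
        err = -ENOMEM;
        goto error_unlocked;        /* lock not taken yet: must not unlock */
    }
    strcpy(sb->jdev_name, jdev);

    sketch_lock(sb);
    if (fail_later) {
        err = -EINVAL;
        goto error;                 /* lock is held here: unlock on the way out */
    }
    sketch_unlock(sb);
    return 0;

error:
    sketch_unlock(sb);
error_unlocked:
    free(sb->jdev_name);
    sb->jdev_name = NULL;
    return err;
}
```

Jumping to 'error' from the allocation failure would drive lock_depth to -1 in this sketch, which is the userspace analogue of the spurious unlock.
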
Jan Kara
5474ca7da6 reiserfs: Fix memory leak of journal device string
When a filesystem is mounted with jdev mount option, we store the
journal device name in an allocated string in superblock. However we
fail to ever free that string. Fix it.

Reported-by: syzbot+1c6756baf4b16b94d2a6@syzkaller.appspotmail.com
Fixes: c3aa077648 ("reiserfs: Properly display mount options in /proc/mounts")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16 12:59:32 +01:00
Chengguang Xu
34e92542da ext2: set proper errno in error case of ext2_fill_super()
Set proper errno in the case of failure of
initializing percpu variables.

Link: https://lore.kernel.org/r/20191129013636.7624-1-cgxu519@mykernel.net
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-12-16 12:59:14 +01:00
Linus Torvalds
2e6d304515 Merge branch 'remove-ksys-mount-dup' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux
Pull ksys_mount() and ksys_dup() removal from Dominik Brodowski:
 "This small series replaces all in-kernel calls to the
  userspace-focused ksys_mount() and ksys_dup() with calls to
  kernel-centric functions:

  For each replacement of ksys_mount() with do_mount(), one needs to
  verify that the first and third parameter (char *dev_name, char *type)
  are strings allocated in kernelspace and that the fifth parameter
  (void *data) is either NULL or refers to a full page (only occurrence
  in init/do_mounts.c::do_mount_root()). The second and fourth
  parameters (char *dir_name, unsigned long flags) are passed by
  ksys_mount() to do_mount() unchanged, and therefore do not require
  particular care.

  Moreover, instead of pretending to be userspace, the opening of
  /dev/console as stdin/stdout/stderr can be implemented using in-kernel
  functions as well. Thereby, ksys_dup() can be removed for good"

[ This doesn't get rid of the special "kernel init runs with KERNEL_DS"
  case, but it at least removes _some_ of the users of "treat kernel
  pointers as user pointers for our magical init sequence".

  One day we'll hopefully be rid of it all, and can initialize our
  init_thread addr_limit to USER_DS.    - Linus ]

* 'remove-ksys-mount-dup' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  fs: remove ksys_dup()
  init: unify opening /dev/console as stdin/stdout/stderr
  init: use do_mount() instead of ksys_mount()
  initrd: use do_mount() instead of ksys_mount()
  devtmpfs: use do_mount() instead of ksys_mount()
2019-12-15 11:36:12 -08:00
Linus Torvalds
103a022d6b 3 small smb3 fixes

Merge tag '5.5-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three small smb3 fixes: this addresses two recent issues reported in
  additional testing during rc1, a refcount underflow and a problem with
  an intermittent crash in SMB2_open_init"

* tag '5.5-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Close cached root handle only if it has a lease
  SMB3: Fix crash in SMB2_open_init due to uninitialized field in compounding path
  smb3: fix refcount underflow warning on unmount when no directory leases
2019-12-14 11:44:14 -08:00
Linus Torvalds
81c64b0bd0 overlayfs fixes for 5.5-rc2

Merge tag 'ovl-fixes-5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Fix some bugs and documentation"

* tag 'ovl-fixes-5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  docs: filesystems: overlayfs: Fix restview warnings
  docs: filesystems: overlayfs: Rename overlayfs.txt to .rst
  ovl: relax WARN_ON() on rename to self
  ovl: fix corner case of non-unique st_dev;st_ino
  ovl: don't use a temp buf for encoding real fh
  ovl: make sure that real fid is 32bit aligned in memory
  ovl: fix lookup failure on multi lower squashfs
2019-12-14 11:13:54 -08:00
Linus Torvalds
5bd831a469 io_uring-5.5-20191212

Merge tag 'io_uring-5.5-20191212' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - A tweak to IOSQE_IO_LINK (also marked for stable) to allow links that
   don't sever if the result is < 0.

   This is mostly for linked timeouts, where if we ask for a pure
   timeout we always get -ETIME. This makes links useless for that case,
   hence allow a case where it works.

 - Five minor optimizations to fix and improve cases that regressed
   since v5.4.

 - An SQTHREAD locking fix.

 - A sendmsg/recvmsg iov assignment fix.

 - Net fix where read_iter/write_iter don't honor IOCB_NOWAIT, and
   subsequently ensuring that works for io_uring.

 - Fix a case where for an invalid opcode we might return -EBADF instead
   of -EINVAL, if the ->fd of that sqe was set to an invalid fd value.

* tag 'io_uring-5.5-20191212' of git://git.kernel.dk/linux-block:
  io_uring: ensure we return -EINVAL on unknown opcode
  io_uring: add sockets to list of files that support non-blocking issue
  net: make socket read/write_iter() honor IOCB_NOWAIT
  io_uring: only hash regular files for async work execution
  io_uring: run next sqe inline if possible
  io_uring: don't dynamically allocate poll data
  io_uring: deferred send/recvmsg should assign iov
  io_uring: sqthread should grab ctx->uring_lock for submissions
  io-wq: briefly spin for new work after finishing work
  io-wq: remove worker->wait waitqueue
  io_uring: allow unbreakable links
2019-12-13 14:24:54 -08:00
Linus Torvalds
22ff311af9 treewide conversion from FIELD_SIZEOF() to sizeof_field()

Merge tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull FIELD_SIZEOF conversion from Kees Cook:
 "A mostly mechanical treewide conversion from FIELD_SIZEOF() to
  sizeof_field(). This avoids the redundancy of having 2 macros
  (actually 3) doing the same thing, and consolidates on sizeof_field().
  While "field" is not an accurate name, it is the common name used in
  the kernel, and doesn't result in any unintended innuendo.

  As there are still users of FIELD_SIZEOF() in -next, I will clean up
  those during this coming development cycle and send the final old
  macro removal patch at that time"

* tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  treewide: Use sizeof_field() macro
  MIPS: OCTEON: Replace SIZEOF_FIELD() macro
2019-12-13 14:02:12 -08:00
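
For readers unfamiliar with what a sizeof_field()-style macro does, here is a minimal sketch. The macro body below is the conventional way to write such a helper and is an assumption rather than a quote from the kernel tree; the struct is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of a struct member, computed without needing an instance of the struct.
 * sizeof does not evaluate its operand, so the null pointer is never dereferenced. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical struct, for illustration only. */
struct sketch_hdr {
    uint8_t  tag;
    uint16_t len;
    char     name[12];
};

/* Example use: size a buffer after a member rather than repeating a magic number. */
size_t sketch_name_size(void)
{
    return sizeof_field(struct sketch_hdr, name);
}
```

Having a single macro for this avoids several spellings of the same idea drifting apart across a large tree.
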
Pavel Shilovsky
d919131935 CIFS: Close cached root handle only if it has a lease
SMB2_tdis() checks if a root handle is valid in order to decide
whether it needs to close the handle or not. However, if another
thread holds a reference to the handle, the reference may end up
being put twice. The extra reference that we want to put
during the tree disconnect is the reference that has a directory
lease. So, track the fact that we have a directory lease and
close the handle only in that case.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-12-13 00:49:57 -06:00
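
The idea in the CIFS commit above, dropping the extra reference only when a directory lease is held, can be sketched as follows. All names here are hypothetical and the refcounting is deliberately simplified; it is not the cifs code.

```c
#include <stdbool.h>

/* Hypothetical cached-handle state; field names are illustrative. */
struct sketch_cached_fid {
    int  refcount;
    bool is_valid;
    bool has_lease;   /* true while the lease owns one reference */
};

void sketch_put(struct sketch_cached_fid *cfid)
{
    if (cfid->refcount > 0 && --cfid->refcount == 0)
        cfid->is_valid = false;
}

/* Tree disconnect: put only the reference owned by the lease, and only once. */
void sketch_tree_disconnect(struct sketch_cached_fid *cfid)
{
    if (cfid->is_valid && cfid->has_lease) {
        cfid->has_lease = false;
        sketch_put(cfid);
    }
}
```

Checking validity alone would put a reference on every call; tying the put to has_lease makes a repeated disconnect a no-op.
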
Steve French
e0fc5b1153 SMB3: Fix crash in SMB2_open_init due to uninitialized field in compounding path
Ran into an intermittent crash in
	SMB2_open_init+0x2f6/0x970
due to oparms.cifs_sb not being initialized when called from:
	smb2_compound_op+0x45d/0x1690
Zero the whole oparms struct in the compounding path before setting up the
oparms so we don't risk any uninitialized fields.

Fixes: fdef665ba4 ("smb3: fix mode passed in on create for modetosid mount option")

Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-12-13 00:49:38 -06:00
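
The fix described in the SMB3 commit above is the familiar "zero the whole struct, then fill in what you know" pattern. A minimal sketch, assuming a hypothetical parameter struct; only the field name cifs_sb is taken from the commit message:

```c
#include <string.h>

/* Hypothetical open-parameters struct; not the real cifs definition. */
struct sketch_open_parms {
    void *tcon;
    void *cifs_sb;
    int   desired_access;
    int   create_options;
};

/* Zero everything first so fields this path never sets are well defined. */
void sketch_init_open_parms(struct sketch_open_parms *oparms,
                            void *tcon, int access)
{
    memset(oparms, 0, sizeof(*oparms));
    oparms->tcon = tcon;
    oparms->desired_access = access;
}
```

With the memset in place, a field added to the struct later starts out as zero in this path instead of holding stack garbage.
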
Linus Torvalds
37d4e84f76 A fix to avoid a corner case when scheduling cap reclaim in batches
from Xiubo, a patch to add some observability into cap waiters from
 Jeff and a couple of cleanups.

Merge tag 'ceph-for-5.5-rc2' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A fix to avoid a corner case when scheduling cap reclaim in batches
  from Xiubo, a patch to add some observability into cap waiters from
  Jeff and a couple of cleanups"

* tag 'ceph-for-5.5-rc2' of git://github.com/ceph/ceph-client:
  ceph: add more debug info when decoding mdsmap
  ceph: switch to global cap helper
  ceph: trigger the reclaim work once there has enough pending caps
  ceph: show tasks waiting on caps in debugfs caps file
  ceph: convert int fields in ceph_mount_options to unsigned int
2019-12-12 10:56:37 -08:00
Dominik Brodowski
8243186f0c fs: remove ksys_dup()
ksys_dup() is used only at one place in the kernel, namely to duplicate
fd 0 of /dev/console to stdout and stderr. The same functionality can be
achieved by using functions already available within the kernel namespace.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2019-12-12 19:00:36 +01:00
Dominik Brodowski
cccaa5e335 init: use do_mount() instead of ksys_mount()
In prepare_namespace(), do_mount() can be used instead of ksys_mount()
as the first and third argument are const strings in the kernel, the
second and fourth argument are passed through anyway, and the fifth
argument is NULL.

In do_mount_root(), ksys_mount() is called with the first and third
argument being already kernelspace strings, which do not need to be
copied over from userspace to kernelspace (again). The second and
fourth arguments are passed through to do_mount() anyway. The fifth
argument, while already residing in kernelspace, needs to be put into
a page of its own. Then, do_mount() can be used instead of
ksys_mount().

Once this is done, there are no in-kernel users to ksys_mount() left,
which can therefore be removed.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2019-12-12 14:50:05 +01:00
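
The requirement above that the fifth argument be "put into a page of its own" can be illustrated in userspace. This is a sketch under assumptions: the page size constant and the helper name are hypothetical, and the kernel would use its own page allocator rather than calloc().

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: real page size is arch-dependent */

/* Returns a zero-filled, page-sized buffer holding a copy of data (truncated
 * to fit, always NUL-terminated), or NULL on allocation failure. Caller frees. */
char *sketch_data_to_page(const char *data)
{
    char *page = calloc(1, SKETCH_PAGE_SIZE);

    if (!page)
        return NULL;
    if (data) {
        size_t n = strlen(data);

        if (n >= SKETCH_PAGE_SIZE)
            n = SKETCH_PAGE_SIZE - 1;
        memcpy(page, data, n);
    }
    return page;
}
```

A callee that reads a full page of options can then do so safely, because the bytes after the string are known to be zero and to belong to the buffer.
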
Linus Torvalds
ae4b064e2a AFS fixes

Merge tag 'afs-fixes-20191211' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:
 "Fixes for AFS plus one patch to make debugging easier:

   - Fix how addresses are matched to server records. This is currently
     incorrect which means cache invalidation callbacks from the server
     don't necessarily get delivered correctly. This causes stale data
     and metadata to be seen under some circumstances.

   - Make the dynamic root superblock R/W so that rpm/dnf can reapply
     the SELinux label to it when upgrading the Fedora filesystem-afs
     package. If the filesystem is R/O, this fails and the upgrade
     fails.

     It might be better in future to allow setxattr from an LSM to
     bypass the R/O protections, if only for pseudo-filesystems.

   - Fix the parsing of mountpoint strings. The mountpoint object has to
     have a terminal dot, whereas the source/device string passed to
     mount should not. This confuses type-forcing suffix detection
     leading to the wrong volume variant being mounted.

   - Make lookups in the dynamic root superblock for creation events
     (such as mkdir) fail with EOPNOTSUPP rather than something like
     EEXIST. The dynamic root only allows implicit creation by the
     ->lookup() method - and only if the target cell exists.

   - Fix the looking up of an AFS superblock to include the cell in the
     matching key - otherwise all volumes with the same ID number are
     treated as the same thing, irrespective of which cell they're in.

   - Show the volume name of each volume in the volume records displayed
     in /proc/net/afs/<cell>/volumes. This proved useful in debugging as
     it provides a way to map the volume IDs to names, where the names
     are what appear in /proc/mounts"

* tag 'afs-fixes-20191211' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Show volume name in /proc/net/afs/<cell>/volumes
  afs: Fix missing cell comparison in afs_test_super()
  afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP
  afs: Fix mountpoint parsing
  afs: Fix SELinux setting security label on /afs
  afs: Fix afs_find_server lookups for ipv4 peers
2019-12-11 18:10:40 -08:00
Jens Axboe
9e3aa61ae3 io_uring: ensure we return -EINVAL on unknown opcode
If we submit an unknown opcode and have fd == -1, io_op_needs_file()
will return true as we default to needing a file. Then when we go and
assign the file, we find the 'fd' invalid and return -EBADF. We really
should be returning -EINVAL for that case, as we normally do for
unsupported opcodes.

Change io_op_needs_file() to have the following return values:

0   - does not need a file
1   - does need a file
< 0 - error value

and use this to pass back the right value for this invalid case.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-11 16:02:32 -07:00
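
The three-way return convention described in the io_uring commit above (0, 1, or a negative error) can be sketched like this. The opcode enum and helper names are hypothetical stand-ins, not the io_uring definitions:

```c
#include <errno.h>

/* Hypothetical opcodes, for illustration only. */
enum sketch_opcode {
    SKETCH_OP_NOP,
    SKETCH_OP_TIMEOUT,
    SKETCH_OP_READV,
    SKETCH_OP_WRITEV
};

/* Returns 0 if no file is needed, 1 if one is, < 0 on an unknown opcode. */
int sketch_op_needs_file(unsigned int opcode)
{
    switch (opcode) {
    case SKETCH_OP_NOP:
    case SKETCH_OP_TIMEOUT:
        return 0;
    case SKETCH_OP_READV:
    case SKETCH_OP_WRITEV:
        return 1;
    default:
        return -EINVAL;
    }
}

/* The opcode check runs before the fd is examined, so an unknown opcode
 * reports -EINVAL even when the fd is also invalid. */
int sketch_prep_file(unsigned int opcode, int fd)
{
    int ret = sketch_op_needs_file(opcode);

    if (ret <= 0)
        return ret;
    if (fd < 0)
        return -EBADF;
    return 0;
}
```

With a boolean helper that defaults to "needs a file", the unknown opcode would fall through to the fd check and surface as -EBADF instead.
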
Linus Torvalds
687dec9b94 Changes since last update:
- Fix improper return value of listxattr() with no xattr;
 
 - Keep up documentation with latest code.

Merge tag 'erofs-for-5.5-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "Mainly address a regression reported by David recently observed
  together with overlayfs due to the improper return value of
  listxattr() without xattr. Update outdated expressions in document as
  well.

  Summary:

   - Fix improper return value of listxattr() with no xattr

   - Keep up documentation with latest code"

* tag 'erofs-for-5.5-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: update documentation
  erofs: zero out when listxattr is called with no xattr
2019-12-11 12:25:32 -08:00
Linus Torvalds
d1c6a2aa02 pipe: simplify signal handling in pipe_read() and add comments
There's no need to separately check for signals while inside the locked
region, since we're going to do "wait_event_interruptible()" right
afterwards anyway, and the error handling is much simpler there.

The check for whether we had already read anything was also redundant,
since we no longer do the odd merging of reads when there are pending
writers.

But perhaps more importantly, this adds commentary about why we still
need to wake up possible writers even though we didn't read any data,
and why we can skip all the finishing touches now if we get a signal (or
had a signal pending) while waiting for more data.

[ This is a split-out cleanup from my "make pipe IO use exclusive wait
  queues" thing, which I can't apply because it triggers a nasty bug in
  the GNU make jobserver   - Linus ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-11 11:46:19 -08:00
David Howells
50559800b7 afs: Show volume name in /proc/net/afs/<cell>/volumes
Show the name of each volume in /proc/net/afs/<cell>/volumes to make it
easier to work out the name corresponding to a volume ID.  This makes it
easier to work out which mounts in /proc/mounts correspond to which volume
ID.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
2019-12-11 17:48:20 +00:00
David Howells
106bc79843 afs: Fix missing cell comparison in afs_test_super()
Fix missing cell comparison in afs_test_super().  Without this, any pair of
volumes that have the same volume ID will share a superblock, no matter the
cell, unless they're in different network namespaces.

Normally, most users will only deal with a single cell and so they won't
see this.  Even if they do look into a second cell, they won't see a
problem unless they happen to hit a volume with the same ID as one they've
already got mounted.

Before the patch:

    # ls /afs/grand.central.org/archive
    linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
    # ls /afs/kth.se/
    linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
    # cat /proc/mounts | grep afs
    none /afs afs rw,relatime,dyn,autocell 0 0
    #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0
    #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0
    #grand.central.org:root.archive /afs/kth.se afs ro,relatime 0 0

After the patch:

    # ls /afs/grand.central.org/archive
    linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
    # ls /afs/kth.se/
    admin/        common/  install/  OldFiles/  service/  system/
    bakrestores/  home/    misc/     pkg/       src/      wsadmin/
    # cat /proc/mounts | grep afs
    none /afs afs rw,relatime,dyn,autocell 0 0
    #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0
    #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0
    #kth.se:root.cell /afs/kth.se afs ro,relatime 0 0

Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Carsten Jacobi <jacobi@de.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
cc: Todd DeSantis <atd@us.ibm.com>
2019-12-11 17:47:51 +00:00
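
The matching rule described in the afs_test_super() commit above amounts to adding the cell to the comparison key. A minimal sketch with a hypothetical struct (the real superblock info carries more state):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key for deciding whether two mounts may share a superblock. */
struct sketch_afs_super {
    const void *net_ns;   /* network namespace */
    const void *cell;     /* cell the volume lives in */
    uint64_t    vid;      /* volume ID, unique only within a cell */
};

/* Two mounts share a superblock only if namespace, cell and volume ID all match. */
bool sketch_test_super(const struct sketch_afs_super *a,
                       const struct sketch_afs_super *b)
{
    return a->net_ns == b->net_ns &&
           a->cell == b->cell &&
           a->vid == b->vid;
}
```

Dropping the cell comparison from this function reproduces the "before the patch" behavior: two volumes with the same ID in different cells compare equal.
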
David Howells
1da4bd9f9d afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP
Fix the lookup method on the dynamic root directory such that creation
calls, such as mkdir, open(O_CREAT), symlink, etc. fail with EOPNOTSUPP
rather than failing with some odd error (such as EEXIST).

lookup() itself tries to create automount directories when it is invoked.
These are cached locally in RAM and not committed to storage.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
2019-12-11 17:47:51 +00:00
David Howells
158d583353 afs: Fix mountpoint parsing
Each AFS mountpoint has strings that define the target to be mounted.  This
is required to end in a dot that is supposed to be stripped off.  The
string can include suffixes of ".readonly" or ".backup" - which are
supposed to come before the terminal dot.  To add to the confusion, the "fs
lsmount" afs utility does not show the terminal dot when displaying the
string.

The kernel mount source string parser, however, assumes that the terminal
dot marks the suffix and that the suffix is always "" and is thus ignored.
In most cases, there is no suffix and this is not a problem - but if there
is a suffix, it is lost and this affects the ability to mount the correct
volume.

The command line mount command, on the other hand, is expected not to
include a terminal dot - so the problem doesn't arise there.

Fix this by making sure that the dot exists and then stripping it when
passing the string to the mount configuration.

Fixes: bec5eb6141 ("AFS: Implement an autocell mount capability [ver #2]")
Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
2019-12-11 16:56:54 +00:00
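
The fix described in the mountpoint parsing commit above, checking that the terminal dot exists and then stripping it, can be sketched as a small string helper. The function name, error codes and example strings are assumptions for illustration:

```c
#include <errno.h>
#include <string.h>

/* Copies a mountpoint target into out without its terminal dot.
 * Returns 0 on success, -EINVAL if the dot is missing, -ERANGE if out is too small. */
int sketch_strip_terminal_dot(const char *target, char *out, size_t out_len)
{
    size_t n = strlen(target);

    if (n < 2 || target[n - 1] != '.')
        return -EINVAL;
    if (n > out_len)              /* n - 1 characters plus the NUL need n bytes */
        return -ERANGE;
    memcpy(out, target, n - 1);
    out[n - 1] = '\0';
    return 0;
}
```

Because only the final dot is removed, a suffix such as ".readonly" survives and remains available to whatever parses the volume variant afterwards.
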
Jens Axboe
10d5934557 io_uring: add sockets to list of files that support non-blocking issue
In chasing a performance issue between using IORING_OP_RECVMSG and
IORING_OP_READV on sockets, tracing showed that we always punt the
socket reads to async offload. This is due to io_file_supports_async()
not checking for S_ISSOCK on the inode. Since sockets support the
O_NONBLOCK (or MSG_DONTWAIT) flag just fine, add sockets to the list
of file types that we can do a non-blocking issue to.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
53108d476a io_uring: only hash regular files for async work execution
We hash regular files to avoid having multiple threads hammer on the
inode mutex, but it should not be needed on other types of files
(like sockets).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
4a0a7a1874 io_uring: run next sqe inline if possible
One major use case of linked commands is the ability to run the next
link inline, if at all possible. This is done correctly for async
offload, but somewhere along the line we lost the ability to do so when
we were able to complete a request without having to punt it. Ensure
that we do so correctly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
392edb45b2 io_uring: don't dynamically allocate poll data
This essentially reverts commit e944475e69. For high poll ops
workloads, like TAO, the dynamic allocation of the wait_queue
entry for IORING_OP_POLL_ADD adds considerable extra overhead.
Go back to embedding the wait_queue_entry, but keep the usage of
wait->private for the pointer stashing.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
d96885658d io_uring: deferred send/recvmsg should assign iov
Don't just assign it from the main call path, that can miss the case
when we're called from issue deferral.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
8a4955ff1c io_uring: sqthread should grab ctx->uring_lock for submissions
We use the mutex to guard against registered file updates, for instance.
Ensure we're safe in accessing that state against concurrent updates.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:23 -07:00
Jens Axboe
e995d5123e io-wq: briefly spin for new work after finishing work
To avoid going to sleep only to get woken shortly thereafter, spin
briefly for new work upon completion of work.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:22 -07:00
Jens Axboe
506d95ff5d io-wq: remove worker->wait waitqueue
We only have one case of using the waitqueue to wake the worker, the
rest are using wake_up_process(). Since we can save some cycles not
fiddling with the waitqueue in io_wqe_worker(), switch the work activation
to task wakeup and get rid of the now unused wait_queue_head_t in
struct io_worker.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:22 -07:00
Jens Axboe
4e88d6e779 io_uring: allow unbreakable links
Some commands will invariably end in a failure in the sense that the
completion result will be less than zero. One such example is timeouts
that don't have a completion count set, they will always complete with
-ETIME unless cancelled.

For linked commands, we sever links and fail the rest of the chain if
the result is less than zero. Since we have commands where we know that
will happen, add IOSQE_IO_HARDLINK as a stronger link that doesn't sever
regardless of the completion result. Note that the link will still sever
if we fail submitting the parent request, hard links are only resilient
in the presence of completion results for requests that did submit
correctly.

Cc: stable@vger.kernel.org # v5.4
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-10 16:33:06 -07:00
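
The difference between a normal link and the hard link described in the commit above can be modeled with a toy chain runner. Everything here is a simplification under assumptions: the flag names and struct are hypothetical, all requests are treated as one chain, and submission failures are not modeled.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_LINK     (1u << 0)   /* sever the chain if this request fails */
#define SKETCH_HARDLINK (1u << 1)   /* keep the chain going regardless of result */

struct sketch_req {
    unsigned int flags;
    int result;      /* completion result this request would produce */
    int completed;   /* out: reported result, -ECANCELED if never run */
};

/* Runs the requests in order; returns how many actually ran. */
size_t sketch_run_chain(struct sketch_req *reqs, size_t n)
{
    size_t ran = 0, i;
    int severed = 0;

    for (i = 0; i < n; i++) {
        if (severed) {
            reqs[i].completed = -ECANCELED;
            continue;
        }
        reqs[i].completed = reqs[i].result;
        ran++;
        if (reqs[i].result < 0 && !(reqs[i].flags & SKETCH_HARDLINK))
            severed = 1;
    }
    return ran;
}
```

A pure timeout that always completes with -ETIME severs a normal link every time; marking it as a hard link lets the request behind it run.
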
Amir Goldstein
6889ee5a53 ovl: relax WARN_ON() on rename to self
In ovl_rename(), if new upper is hardlinked to old upper underneath
overlayfs before upper dirs are locked, user will get an ESTALE error
and a WARN_ON will be printed.

Changes to underlying layers while overlayfs is mounted may result in
unexpected behavior, but it shouldn't crash the kernel and it shouldn't
trigger WARN_ON() either, so relax this WARN_ON().

Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com
Fixes: 804032fabb ("ovl: don't check rename to self")
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-12-10 16:00:55 +01:00
Amir Goldstein
9c6d8f13e9 ovl: fix corner case of non-unique st_dev;st_ino
On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs and pure upper inodes use the
real upper st_dev.

It is fine for an overlay pure upper inode to use the same st_dev;st_ino
values as the real upper inode, because the content of those two different
filesystem objects is always the same.

In this case, however:
 - two filesystems, A and B
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

A non pure upper overlay inode whose origin is in layer 1 will have the same
st_dev;st_ino values as the real lower inode. This may produce false
positive results from 'diff' between the real lower and copied up overlay
inode.

Fix this by using the upper st_dev;st_ino values in this case.  This breaks
the property of constant st_dev;st_ino across copy up in this case. This
breakage will be fixed by a later patch.

Fixes: 5148626b80 ("ovl: allocate anon bdev per unique lower fs")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-12-10 16:00:55 +01:00
Amir Goldstein
ec7bbb53d3 ovl: don't use a temp buf for encoding real fh
We can allocate maximum fh size and encode into it directly.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-12-10 16:00:55 +01:00
Amir Goldstein
cbe7fba8ed ovl: make sure that real fid is 32bit aligned in memory
Separate on-disk encoding from in-memory and on-wire representation
of overlay file handle.

In-memory and on-wire we only ever pass around pointers to struct
ovl_fh, which encapsulates at offset 3 the on-disk format struct
ovl_fb. struct ovl_fb encapsulates at offset 21 the real file handle.
That makes sure that the real file handle is always 32bit aligned
in-memory when passed down to the underlying filesystem.
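
The alignment claim follows from adding the two offsets. The arithmetic can
be checked with a minimal sketch; the constant and function names are
hypothetical, and it assumes that struct ovl_fh itself starts at a 32bit
aligned address.

```python
# Offsets as stated in the text above.
OVL_FB_OFFSET_IN_FH = 3     # struct ovl_fb inside struct ovl_fh
REAL_FH_OFFSET_IN_FB = 21   # real file handle inside struct ovl_fb


def real_fh_offset():
    """Offset of the real file handle from the start of struct ovl_fh."""
    return OVL_FB_OFFSET_IN_FH + REAL_FH_OFFSET_IN_FB


def is_aligned(offset, alignment=4):
    """True if offset is a multiple of alignment (4 bytes = 32 bits)."""
    return offset % alignment == 0
```

Here real_fh_offset() is 3 + 21 = 24, a multiple of 4, whereas the offset of
21 inside the bare on-disk struct ovl_fb is not.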

On-disk format remains the same and store/load are done into a
correctly aligned buffer.

New nfs exported file handles are exported with aligned real fid.
Old nfs file handles are copied to an aligned buffer before being
decoded.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-12-10 16:00:55 +01:00
Amir Goldstein
7e63c87fc2 ovl: fix lookup failure on multi lower squashfs
In the past, overlayfs required that lower fs have non null uuid in
order to support nfs export and decode copy up origin file handles.

Commit 9df085f3c9 ("ovl: relax requirement for non null uuid of
lower fs") relaxed this requirement for nfs export support, as long
as uuid (even if null) is unique among all lower fs.

However, said commit unintentionally also relaxed the non null uuid
requirement for decoding copy up origin file handles, regardless of
the unique uuid requirement.

Amend this mistake by disabling decoding of copy up origin file handle
from lower fs with a conflicting uuid.

We still encode copy up origin file handles from those fs, because
file handles like those already exist in the wild and because they
might provide useful information in the future.

There is an unhandled corner case described by Miklos this way:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

In this case bad_uuid won't be set for B, because the check only
involves the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.
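
The corner case can be reproduced with a toy model of the uuid check. This
is a sketch only: find_bad_uuid() is a hypothetical helper, it marks every
conflicting entry rather than following the kernel's exact bookkeeping, and
it assumes that a lower layer living on the upper filesystem (A here) is not
entered into the list of lower fs.

```python
from collections import Counter

NULL_UUID = bytes(16)


def find_bad_uuid(lower_fs_uuids):
    """Return the names of lower fs whose uuid is shared with another lower fs.

    lower_fs_uuids maps a filesystem name to its uuid and contains only the
    filesystems tracked in the lower fs list.
    """
    counts = Counter(lower_fs_uuids.values())
    return {name for name, uuid in lower_fs_uuids.items() if counts[uuid] > 1}


# Upper and lower layer 1 are on A, so only B is in the list of lower fs.
corner_case = find_bad_uuid({"B": NULL_UUID})

# Two distinct lower filesystems with null uuid are detected as conflicting.
detected = find_bad_uuid({"B": NULL_UUID, "C": NULL_UUID})
```

In the corner case the result is empty even though A and B share the null
uuid, which is why a layer 2 origin may be decoded against layer 1.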

We will deal with this corner case later.

Reported-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/
Fixes: 9df085f3c9 ("ovl: relax requirement for non null uuid ...")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-12-10 16:00:55 +01:00
Steve French
281393894a smb3: fix refcount underflow warning on unmount when no directory leases
Fix refcount underflow warning when unmounting to servers which didn't grant
directory leases.

[  301.680095] refcount_t: underflow; use-after-free.
[  301.680192] WARNING: CPU: 1 PID: 3569 at lib/refcount.c:28
refcount_warn_saturate+0xb4/0xf3
...
[  301.682139] Call Trace:
[  301.682240]  close_shroot+0x97/0xda [cifs]
[  301.682351]  SMB2_tdis+0x7c/0x176 [cifs]
[  301.682456]  ? _get_xid+0x58/0x91 [cifs]
[  301.682563]  cifs_put_tcon.part.0+0x99/0x202 [cifs]
[  301.682637]  ? ida_free+0x99/0x10a
[  301.682727]  ? cifs_umount+0x3d/0x9d [cifs]
[  301.682829]  cifs_put_tlink+0x3a/0x50 [cifs]
[  301.682929]  cifs_umount+0x44/0x9d [cifs]

Fixes: 72e73c78c4 ("cifs: close the shared root handle on tree disconnect")

Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reported-and-tested-by: Arthur Marsh <arthur.marsh@internode.on.net>
2019-12-09 19:47:10 -06:00
Xiubo Li
da08e1e1d7 ceph: add more debug info when decoding mdsmap
Show the laggy state.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-12-09 20:55:10 +01:00
Xiubo Li
bd84fbcb31 ceph: switch to global cap helper
__ceph_is_any_caps is a duplicate helper.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-12-09 20:55:10 +01:00
Xiubo Li
bba1560bd4 ceph: trigger the reclaim work once there has enough pending caps
The nr in ceph_reclaim_caps_nr() may well be larger than 1, so the
check can be missed and the reclaim work might not be triggered as expected.
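
The commit message does not show the check itself, so the following is an
assumption about how such a miss can arise: a counter that is compared for
an exact multiple of a batch size skips the trigger when it is advanced by
more than 1. CAPS_PER_RELEASE and both function names are hypothetical.

```python
CAPS_PER_RELEASE = 64  # hypothetical batch size


def triggers_exact(pending, nr, batch=CAPS_PER_RELEASE):
    """Trigger only if the new count lands exactly on a multiple of batch."""
    return (pending + nr) % batch == 0


def triggers_crossing(pending, nr, batch=CAPS_PER_RELEASE):
    """Trigger if adding nr reaches or crosses a multiple of batch."""
    return nr > 0 and (pending + nr) % batch < nr
```

With 62 pending caps and nr = 3 the count jumps from 62 to 65: the exact
check never sees 64 and stays silent, while the crossing check fires.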

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-12-09 20:55:10 +01:00
Jeff Layton
3a3430affc ceph: show tasks waiting on caps in debugfs caps file
Add some visibility of tasks that are waiting for caps to the "caps"
debugfs file. Display the tgid of the waiting task, inode number, and
the caps the task needs and wants.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-12-09 20:55:10 +01:00
Jeff Layton
ad8c28a9eb ceph: convert int fields in ceph_mount_options to unsigned int
Most of these values should never be negative, so convert them to
unsigned values. Add some sanity checking to the parsed values, and
clean up some unneeded casts.

Note that while caps_max should never be negative, this patch leaves
it signed, since this value ends up later being compared to a signed
counter. Just ensure that userland never passes in a negative value
for caps_max.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-12-09 20:55:10 +01:00
Pankaj Bharadiya
c593642c8b treewide: Use sizeof_field() macro
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using the following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

	if [[ "$file" =~ $EXCLUDE_FILES ]]; then
		continue
	fi
	sed -i  -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done
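
The effect of the script on a single file can be sketched in Python. This is
an illustration, not part of the patch: rewrite() is a hypothetical helper
that takes a path and the file contents, skips paths matching the exclude
pattern, and applies the same word-boundary substitution as the sed
expression.

```python
import re

# Same unanchored pattern as EXCLUDE_FILES in the script.
EXCLUDE_FILES = re.compile(r"include/linux/stddef.h|include/linux/kernel.h")
FIELD_SIZEOF = re.compile(r"\bFIELD_SIZEOF\b")


def rewrite(path, text):
    """Return text with FIELD_SIZEOF renamed, unless path is excluded."""
    if EXCLUDE_FILES.search(path):
        return text
    return FIELD_SIZEOF.sub("sizeof_field", text)
```

The \b anchors restrict the rename to the whole identifier, so a longer name
such as MY_FIELD_SIZEOF is left untouched.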

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
2019-12-09 10:36:44 -08:00
David Sterba
78f926f72e btrfs: add Kconfig dependency for BLAKE2B
Because the BLAKE2B code went through a different tree, it was not
available at the time the btrfs part was merged. Now that the Kconfig
symbol exists, add it to the list.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-12-09 17:56:06 +01:00
David Howells
bcbccaf2ed afs: Fix SELinux setting security label on /afs
Make the AFS dynamic root superblock R/W so that SELinux can set the
security label on it.  Without this, upgrades to, say, the Fedora
filesystem-afs RPM fail if afs is mounted on it because the SELinux label
can't be (re-)applied.

It might be better to make it possible to bypass the R/O check for LSM
label application through setxattr.

Fixes: 4d673da145 ("afs: Support the AFS dynamic root")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: selinux@vger.kernel.org
cc: linux-security-module@vger.kernel.org
2019-12-09 16:37:36 +00:00