Commit Graph

66310 Commits

Author SHA1 Message Date
Mike Snitzer
1471308fb5 Merge remote-tracking branch 'jens/for-5.10/block' into dm-5.10
DM depends on these block 5.10 commits:

22ada802ed block: use lcm_not_zero() when stacking chunk_sectors
07d098e6bb block: allow 'chunk_sectors' to be non-power-of-2
021a24460d block: add QUEUE_FLAG_NOWAIT
6abc49468e dm: add support for REQ_NOWAIT and enable it for linear target

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-09-29 16:31:35 -04:00
Christoph Hellwig
fa01b1e973 block: add a bdev_is_partition helper
Add a littler helper to make the somewhat arcane bd_contains checks a
little more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:18:57 -06:00
Christoph Hellwig
f56753ac2a bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag
Replace the two negative flags that are always used together with a
single positive flag that indicates the writeback capability instead
of two related non-capabilities.  Also remove the pointless wrappers
to just check the flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:39 -06:00
Christoph Hellwig
823423ef55 bdi: invert BDI_CAP_NO_ACCT_WB
Replace BDI_CAP_NO_ACCT_WB with a positive BDI_CAP_WRITEBACK_ACCT to
make the checks more obvious.  Also remove the pointless
bdi_cap_account_writeback wrapper that just obsfucates the check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:39 -06:00
Christoph Hellwig
1cb039f3dc bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the
backing_dev_info shared between the block drivers and the writeback code.
To help untangling the dependency replace it with a queue flag and a
superblock flag derived from it.  This also helps with the case of e.g.
a file system requiring stable writes due to its own checksumming, but
not forcing it on other users of the block device like the swap code.

One downside is that we an't support the stable_pages_required bdi
attribute in sysfs anymore.  It is replaced with a queue attribute which
also is writable for easier testing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:39 -06:00
Christoph Hellwig
ed7b6b4f6e bdi: remove BDI_CAP_CGROUP_WRITEBACK
Just checking SB_I_CGROUPWB for cgroup writeback support is enough.
Either the file system allocates its own bdi (e.g. btrfs), in which case
it is known to support cgroup writeback, or the bdi comes from the block
layer, which always supports cgroup writeback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:39 -06:00
Christoph Hellwig
55b2598e84 bdi: initialize ->ra_pages and ->io_pages in bdi_init
Set up a readahead size by default, as very few users have a good
reason to change it.  This means code, ecryptfs, and orangefs now
set up the values while they were previously missing it, while ubifs,
mtd and vboxsf manually set it to 0 to avoid readahead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:39 -06:00
Christoph Hellwig
402dd2cf46 fs: remove the unused SB_I_MULTIROOT flag
The last user of SB_I_MULTIROOT is disappeared with commit f2aedb713c
("NFS: Add fs_context support.")

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24 13:43:38 -06:00
Christoph Hellwig
1fb1a2ad75 block: mark blkdev_get static
There are no users outside the core block code left now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23 10:43:19 -06:00
Christoph Hellwig
bb3247a399 PM: rewrite is_hibernate_resume_dev to not require an inode
Just check the dev_t to help simplifying the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23 10:43:19 -06:00
Christoph Hellwig
e455ed2290 ocfs2: cleanup o2hb_region_dev_store
Use blkdev_get_by_dev instead of igrab (aka open coded bdgrab) +
blkdev_get.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23 10:43:19 -06:00
Christoph Hellwig
38430f0876 block: move the NEED_PART_SCAN flag to struct gendisk
We can only scan for partitions on the whole disk, so move the flag
from struct block_device to struct gendisk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23 10:43:18 -06:00
Tobias Klauser
9ca48e20ec fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer.  Adjust the
definition of dirtytime_interval_handler to match its prototype in
linux/writeback.h which fixes the following sparse error/warning:

fs/fs-writeback.c:2189:50: warning: incorrect type in argument 3 (different address spaces)
fs/fs-writeback.c:2189:50:    expected void *
fs/fs-writeback.c:2189:50:    got void [noderef] __user *buffer
fs/fs-writeback.c:2184:5: error: symbol 'dirtytime_interval_handler' redeclared with different type (incompatible argument 3 (different address spaces)):
fs/fs-writeback.c:2184:5:    int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )
fs/fs-writeback.c: note: in included file:
./include/linux/writeback.h:374:5: note: previously declared as:
./include/linux/writeback.h:374:5:    int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/20200907093140.13434-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-19 13:13:39 -07:00
Linus Torvalds
fc4f28bb3d for-5.9-rc5-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl9fju4ACgkQxWXV+ddt
 WDukWw//d3D33s1SnHI2ooub6VllM5xC/VCB5hfpHdeuLfhIR4iFrXlKCwFNuyuF
 A1CgKu7W4eMkQ9E6dKIvYhRQj9ELcHWxS1LRUkuXVGojvPVQAbGhUibjcmTy6xoi
 t4KkvdI0t/puEwRTrF2ooInWzdbiJvzCuCHLWkX4wK0IAkaVt+IK2ZJuSTCZhUEv
 Gx5CSLqP6zq/XD/03yWQvILE3MoM9BOc28EQ345dhEu+SWWpdhzy4W825A8xUe3Q
 bThcK8vJgeLUo/KVtUJNLoFTl+2BpxzaiHWqh+3plGjQ1qtZyDgFhZOoP0JEJb/f
 qEnyN2XbY26p6nj2J+34Euz5OX65ot6bePx6v/h5jY+UjNa8mpZD6d4Y+OII0XW0
 kCLPyaNc8P6IkmAYzSLQPmrjCoRgR/2we8HRBylTWaeEcvpduZs/5YlJNO9q59Et
 voclNx6kk4AdFjSxXHwERlSA3v4+pkq41bjnbpv7a9FGRBUJGVfJFCNukuhCx5Oi
 nUKg7jBJEZNhE2DOA0MKnrZTBnufkLCiH+f/7pn+ypPdw1dna/ix9Suwbz4+UElx
 TtwGLF99CQCX09ieJBVWB90a3xgQBIZLaqu1HtZ1u7dzKkutl+uKMcnfEEtLSfJy
 qMLHHsRQj/iwytQHSdWboL6xQCUSkSefPTmmNfstaKSxeq6w0Q0=
 =pdp0
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
 "One of the recent lockdep fixes introduced a bug that breaks the
  search ioctl, which is used by some applications (bees, compsize). The
  patch made it to stable trees so we need this fixup to make it work
  again"

* tag 'for-5.9-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix wrong address when faulting in pages in the search ioctl
2020-09-14 15:41:58 -07:00
Filipe Manana
1c78544eaa btrfs: fix wrong address when faulting in pages in the search ioctl
When faulting in the pages for the user supplied buffer for the search
ioctl, we are passing only the base address of the buffer to the function
fault_in_pages_writeable(). This means that after the first iteration of
the while loop that searches for leaves, when we have a non-zero offset,
stored in 'sk_offset', we try to fault in a wrong page range.

So fix this by adding the offset in 'sk_offset' to the base address of the
user supplied buffer when calling fault_in_pages_writeable().

Several users have reported that the applications compsize and bees have
started to operate incorrectly since commit a48b73eca4 ("btrfs: fix
potential deadlock in the search ioctl") was added to stable trees, and
these applications make heavy use of the search ioctls. This fixes their
issues.

Link: https://lore.kernel.org/linux-btrfs/632b888d-a3c3-b085-cdf5-f9bb61017d92@lechevalier.se/
Link: https://github.com/kilobyte/compsize/issues/34
Fixes: a48b73eca4 ("btrfs: fix potential deadlock in the search ioctl")
CC: stable@vger.kernel.org # 4.4+
Tested-by: A L <mail@lechevalier.se>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-14 17:27:16 +02:00
Linus Torvalds
20a7b6be05 Driver core fixes for 5.9-rc5
Here are some small driver core and debugfs fixes for 5.9-rc5
 
 Included in here are:
 	- firmware loader memory leak fix
 	- firmware loader testing fixes for non-EFI systems
 	- device link locking fixes found by lockdep
 	- kobject_del() bugfix that has been affecting some callers
 	- debugfs minor fix
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX13Zhw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylfNQCfX4Bx9J1aLGr0/MOBwlXXEycChE0AmwQ9rXa7
 u5Przdz+fMr1mLyNBaY5
 =4kmo
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are some small driver core and debugfs fixes for 5.9-rc5

  Included in here are:

   - firmware loader memory leak fix

   - firmware loader testing fixes for non-EFI systems

   - device link locking fixes found by lockdep

   - kobject_del() bugfix that has been affecting some callers

   - debugfs minor fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  test_firmware: Test platform fw loading on non-EFI systems
  PM: <linux/device.h>: fix @em_pd kernel-doc warning
  kobject: Drop unneeded conditional in __kobject_del()
  driver core: Fix device_pm_lock() locking for device links
  MAINTAINERS: Add the security document to SECURITY CONTACT
  driver code: print symbolic error code
  debugfs: Fix module state check condition
  kobject: Restore old behaviour of kobject_del(NULL)
  firmware_loader: fix memory leak for paged buffer
2020-09-13 09:02:59 -07:00
Linus Torvalds
edf6b0e1e4 for-5.9-rc4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl9bj/UACgkQxWXV+ddt
 WDs1HhAAgAvJVM2WJuMCjQhQiKljFjRT1a0Kbsp+9ayw5Q225t5S5kCMWrsA6mXF
 /9bGRmmELm/Nr5pSH9hp5Bhbke0vNV+Y9XiRQXpegla4LLMF4MulVgADRIL3WoxO
 ZAtNmZUokkjvB0CkzDuI7PqrF67TXLqV2hlctZo0p5SAFFgLaELyIYC6uAaO9Qo/
 +EAAK+7oJyzWcUp44APu90wBbF79umwNVKEEkDfc6bwiA2Cut1JGzvPWgGvvQnta
 fAd114LFViKg05GXcbnx4NxHYtf9tKHjDk9yYWssR+uV6vo/pWwAkDwYxXm/LzA4
 Zv8QK5uvng1fW4eq9QkN3KflIDn+YhaH1jgwNcgyS+ZCdqZR1Mi949f+6Nj1fXt2
 NeXOx3nhtqgNthKQNvHSMVJZrPjV3bdzOz+bULA+hMvTkr5gJy+ToAs30SLxGF5Y
 BCJEE6b5M5Jnb+UHEBMuoxubBfmPHkY8LxfDzVWDLESsKcW2eYyeJyJXx4DNe/v9
 O7Z5pcku+7R9LOlYQEzKeSuiYMqYLtmQtcNXyFBysksikjFJBWNgENna1LmgvmRH
 j6fC5S9h4sIxzyKQkJgihIDt/a3f9WnhsoHw8EIn62tfdOIvMcT/xWq9YYgWaOjZ
 H9040WXvEAFVcDn4cQ22DNgV+toJMpe0pLg6UXe7VtESUtbwMFM=
 =JTfF
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes:

   - regression fix for a crash after failed snapshot creation

   - one more lockep fix: use nofs allocation when allocating missing
     device

   - fix reloc tree leak on degraded mount

   - make some extent buffer alignment checks less strict to mount
     filesystems created by btrfs-convert"

* tag 'for-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix NULL pointer dereference after failure to create snapshot
  btrfs: free data reloc tree on failed mount
  btrfs: require only sector size alignment for parent eb bytenr
  btrfs: fix lockdep splat in add_missing_dev
2020-09-12 12:28:39 -07:00
Linus Torvalds
5a3c558a9f A fix for lookup on DFS link when cifsacl or modefromsid used
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl9c/IoACgkQiiy9cAdy
 T1GB8Av/XCQ1NqifphSf9/AJ9GJRIJ9fl5pdoPDDPPpY88XGUBJ21frXJCm5Uv/5
 MXmNltZnIPMqBa+K6QvMR2WjmGlE28zVaapiIXJK+ckaMClukvn09FOCKWmhVkgi
 0m1LwLOCL92TslRxkmW7wI1IFdNFCDdcehQ9xvjkaCCkpkiZ6ec4cV68fvF9nQ0P
 CD5I7E4m5KZiMDIyF1DGZvPEp/fVv5Wu8Fmvi/4DdUCMF8EXmBYrbCax6+SfqMEs
 3ut3pJEiUPdaojZQKb2s0u4vf5zqLYI7azdplqeyEqACDvSmDDhiXyIddbLrT/It
 OkfiffaRLXJADLEQrH1E1uVbABa7pY44LhorxV1J0+4T/v33kItbRcwLfi9LeOpn
 jIIBXYzRP7Pmj2ymH9irf1F18z8O97gcQf8HiR8e6WaSC1Mf8GJLTcwZRsiR4J5H
 4no9vDkL+T+lXLUGwnW8aRe10w2gRoIZeQxhkkWUZJZnf2x0NgecljckFOZaMQff
 ycRefbVh
 =JaQw
 -----END PGP SIGNATURE-----

Merge tag '5.9-rc4-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fix from Steve French:
 "A fix for lookup on DFS link when cifsacl or modefromsid is used"

* tag '5.9-rc4-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix DFS mount with cifsacl/modefromsid
2020-09-12 11:48:04 -07:00
Linus Torvalds
581cb3a26b f2fs-for-5.9-rc5
This introduces some bug fixes including 1) SMR drive fix, 2) infinite loop
 when building free node ids, 3) EOF at DIO read.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl9Y0skACgkQQBSofoJI
 UNKggA//a8cCkIx0wp1BEabPQ8x8TrHkiOVZX9zTbiaLkI9MLYweF0YFjbCx/i/t
 41r6JbTxWLiM+GRI+u9kOrkBOUgTHAZThIBcWbfYKR5Jz7XdijhsuNYqRf8n33ya
 u4PtT2AiMW3gt8gJML08eI/U+/rBBpI6fkc8Sza03iTCLjJiWwvZ99h2rPSfgmIw
 UdyRuof3mPgUz12fQ8BoO/oQ5KhKgA7kxMsCc4UJE9VX/hBYamTfIEF1AmnxlyKn
 HYhkXyuAmhEFEenBA+ts9YZXMubWHwcj4GlTkWTP6Ib+A3cIinHIqpzEaJnPQdWY
 JPG72L2TNg0Mtl8w+46zJa1mr6KvXTA5GNowX4OTTf2oHxeHJ4sxax3cmvrDOIfe
 7Ga8kGyi5OON1Hymwuyyz4KuKM9zvyUouX0i4wsiamm9kFQoxzNSbfsM5L0weCwM
 MWVOFhtOUhrQdJ+zpbtJpF8Ijj3nuOoUpw3/hUhl3GM+J4bzE8MdzRmzB3JjkVCr
 4mfYuXCITPOH31Fu3obXEFI/9kgyZdxk6OHsXLZCCfRCvgRTDfk5KFQx1OIY3jk0
 I/yRjmvnOyD37GVXHyGZxjRplDCgCpTOL6HFk1GOLFUvuoQeOOqwwL0oa/J94ED0
 nE80dqs4e8c3/YmUqcgiOm4d4EkdwVzhE9RxdUq7s2SJgh5Eu/o=
 =gbal
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:
 "Small bug fixes for:

   - SMR drive fix

   - infinite loop when building free node ids

   - EOF at DIO read"

* tag 'f2fs-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: Return EOF on unaligned end of file DIO read
  f2fs: fix indefinite loop scanning for free nid
  f2fs: Fix type of section block count variables
2020-09-10 13:12:46 -07:00
Christoph Hellwig
b92b53079a block: remove check_disk_change
Remove the now unused check_disk_change helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-10 09:32:31 -06:00
Christoph Hellwig
95f6f3a46f block: add a bdev_check_media_change helper
Like check_disk_changed, except that it does not call ->revalidate_disk
but leaves that to the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-10 09:32:30 -06:00
Linus Torvalds
ab29a807a7 NFS client bugfixes for Linux 5.9
Highlights include:
 
 Bugfixes:
 - Fix an NFS/RDMA resource leak
 - Fix the error handling during delegation recall
 - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR
 - Stop printk reading past end of string
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAl9ZFYAACgkQZwvnipYK
 APLg+RAArQ0J54M4vTg7avKhUEwIrAlPCFjHvZ5jtlXiY8JDT7Cy2lEo9W/pC9x2
 BiV02H6seKXq6vKUHIBgzVq0BdZBKeWQcOpoO/dfvWSPs9u+lxKlOEwcdsaXwdXz
 31u5HS4xHYg2SlYj+BcKGfVexcWVEVyPqqPvflGBZIlKfzQLHo9YY390deUHMC6o
 HrRXWADvpYXC1sJb3mtNtCojqr9a5A8Ty4clT19YvdwQL7cUt3HjjsOvJfbmB9S+
 fW5/u3sdWJ1nYoz8AxC+utIMNmtXFBUhW0Sg+TPWMJj8yG9rclAgTxbobhXyzGph
 j2ZamPhUtpcSYXBlwiQCm7GbUIItnzHgU6MSCs/nq8AeDc3WEx4qVONVqNvNr/sY
 1T3znylZpXCHvxLmDWzDGsW8XvZT1r86Lm6zrJCmjWm+eoSKBzeoENcXGsGGYuJu
 6NGz7pgQbYMb9t7VfOEFSxxt5w0wt7nRyhV1R7taBhm5B9XjF+BOmJBI0epQ1S7i
 XRIr7WqxT00wijWyunNCQZxi1aDMHVYZXPwaqkEHTwJqeDzCtmir+ajAnZQUgUId
 1MNiv8BDoN5YlPmj/gt+E3kbyj0Pu7M+09NvVEKqG7j8W80ltf6eb85XGrq+vp1E
 Y0lmDXElBdNo3AA+dBOmk+peoVv4bfoog5PymElaRiwRM25VCOM=
 =3fw2
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix an NFS/RDMA resource leak

 - Fix the error handling during delegation recall

 - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR

 - Stop printk reading past end of string

* tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: stop printk reading past end of string
  NFS: Zero-stateid SETATTR should first return delegation
  NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
  xprtrdma: Release in-flight MRs on disconnect
2020-09-09 11:14:20 -07:00
Gabriel Krisman Bertazi
20d0a107fb f2fs: Return EOF on unaligned end of file DIO read
Reading past end of file returns EOF for aligned reads but -EINVAL for
unaligned reads on f2fs.  While documentation is not strict about this
corner case, most filesystem returns EOF on this case, like iomap
filesystems.  This patch consolidates the behavior for f2fs, by making
it return EOF(0).

it can be verified by a read loop on a file that does a partial read
before EOF (A file that doesn't end at an aligned address).  The
following code fails on an unaligned file on f2fs, but not on
btrfs, ext4, and xfs.

  while (done < total) {
    ssize_t delta = pread(fd, buf + done, total - done, off + done);
    if (!delta)
      break;
    ...
  }

It is arguable whether filesystems should actually return EOF or
-EINVAL, but since iomap filesystems support it, and so does the
original DIO code, it seems reasonable to consolidate on that.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-08 20:31:33 -07:00
Sahitya Tummala
e2cab031ba f2fs: fix indefinite loop scanning for free nid
If the sbi->ckpt->next_free_nid is not NAT block aligned and if there
are free nids in that NAT block between the start of the block and
next_free_nid, then those free nids will not be scanned in scan_nat_page().
This results into mismatch between nm_i->available_nids and the sum of
nm_i->free_nid_count of all NAT blocks scanned. And nm_i->available_nids
will always be greater than the sum of free nids in all the blocks.
Under this condition, if we use all the currently scanned free nids,
then it will loop forever in f2fs_alloc_nid() as nm_i->available_nids
is still not zero but nm_i->free_nid_count of that partially scanned
NAT block is zero.

Fix this to align the nm_i->next_scan_nid to the first nid of the
corresponding NAT block.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-08 20:31:33 -07:00
Shin'ichiro Kawasaki
123aaf774f f2fs: Fix type of section block count variables
Commit da52f8ade4 ("f2fs: get the right gc victim section when section
has several segments") added code to count blocks of each section using
variables with type 'unsigned short', which has 2 bytes size in many
systems. However, the counts can be larger than the 2 bytes range and
type conversion results in wrong values. Especially when the f2fs
sections have blocks as many as USHRT_MAX + 1, the count is handled as 0.
This triggers eternal loop in init_dirty_segmap() at mount system call.
Fix this by changing the type of the variables to block_t.

Fixes: da52f8ade4 ("f2fs: get the right gc victim section when section has several segments")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-08 20:31:33 -07:00
Jan Kara
384d87ef2c block: Do not discard buffers under a mounted filesystem
Discarding blocks and buffers under a mounted filesystem is hardly
anything admin wants to do. Usually it will confuse the filesystem and
sometimes the loss of buffer_head state (including b_private field) can
even cause crashes like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G O     --------- -  - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1
Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015
RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2]
...
Call Trace:
 __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2]
 jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2]
 kjournald2+0xbd/0x270 [jbd2]

So if we don't have block device open with O_EXCL already, claim the
block device while we truncate buffer cache. This makes sure any
exclusive block device user (such as filesystem) cannot operate on the
device while we are discarding buffer cache.

Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[axboe: fix !CONFIG_BLOCK error in truncate_bdev_range()]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-07 20:10:55 -06:00
Filipe Manana
2d892ccdc1 btrfs: fix NULL pointer dereference after failure to create snapshot
When trying to get a new fs root for a snapshot during the transaction
at transaction.c:create_pending_snapshot(), if btrfs_get_new_fs_root()
fails we leave "pending->snap" pointing to an error pointer, and then
later at ioctl.c:create_snapshot() we dereference that pointer, resulting
in a crash:

  [12264.614689] BUG: kernel NULL pointer dereference, address: 00000000000007c4
  [12264.615650] #PF: supervisor write access in kernel mode
  [12264.616487] #PF: error_code(0x0002) - not-present page
  [12264.617436] PGD 0 P4D 0
  [12264.618328] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  [12264.619150] CPU: 0 PID: 2310635 Comm: fsstress Tainted: G        W         5.9.0-rc3-btrfs-next-67 #1
  [12264.619960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  [12264.621769] RIP: 0010:btrfs_mksubvol+0x438/0x4a0 [btrfs]
  [12264.622528] Code: bc ef ff ff (...)
  [12264.624092] RSP: 0018:ffffaa6fc7277cd8 EFLAGS: 00010282
  [12264.624669] RAX: 00000000fffffff4 RBX: ffff9d3e8f151a60 RCX: 0000000000000000
  [12264.625249] RDX: 0000000000000001 RSI: ffffffff9d56c9be RDI: fffffffffffffff4
  [12264.625830] RBP: ffff9d3e8f151b48 R08: 0000000000000000 R09: 0000000000000000
  [12264.626413] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffff4
  [12264.626994] R13: ffff9d3ede380538 R14: ffff9d3ede380500 R15: ffff9d3f61b2eeb8
  [12264.627582] FS:  00007f140d5d8200(0000) GS:ffff9d3fb5e00000(0000) knlGS:0000000000000000
  [12264.628176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [12264.628773] CR2: 00000000000007c4 CR3: 000000020f8e8004 CR4: 00000000003706f0
  [12264.629379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [12264.629994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [12264.630594] Call Trace:
  [12264.631227]  btrfs_mksnapshot+0x7b/0xb0 [btrfs]
  [12264.631840]  __btrfs_ioctl_snap_create+0x16f/0x1a0 [btrfs]
  [12264.632458]  btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs]
  [12264.633078]  btrfs_ioctl+0x1864/0x3130 [btrfs]
  [12264.633689]  ? do_sys_openat2+0x1a7/0x2d0
  [12264.634295]  ? kmem_cache_free+0x147/0x3a0
  [12264.634899]  ? __x64_sys_ioctl+0x83/0xb0
  [12264.635488]  __x64_sys_ioctl+0x83/0xb0
  [12264.636058]  do_syscall_64+0x33/0x80
  [12264.636616]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

  (gdb) list *(btrfs_mksubvol+0x438)
  0x7c7b8 is in btrfs_mksubvol (fs/btrfs/ioctl.c:858).
  853		ret = 0;
  854		pending_snapshot->anon_dev = 0;
  855	fail:
  856		/* Prevent double freeing of anon_dev */
  857		if (ret && pending_snapshot->snap)
  858			pending_snapshot->snap->anon_dev = 0;
  859		btrfs_put_root(pending_snapshot->snap);
  860		btrfs_subvolume_release_metadata(root, &pending_snapshot->block_rsv);
  861	free_pending:
  862		if (pending_snapshot->anon_dev)

So fix this by setting "pending->snap" to NULL if we get an error from the
call to btrfs_get_new_fs_root() at transaction.c:create_pending_snapshot().

Fixes: 2dfb1e43f5 ("btrfs: preallocate anon block device at first phase of snapshot creation")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-07 21:18:35 +02:00
Jan Kara
6dbf7bb555 fs: Don't invalidate page buffers in block_write_full_page()
If block_write_full_page() is called for a page that is beyond current
inode size, it will truncate page buffers for the page and return 0.
This logic has been added in 2.5.62 in commit 81eb69062588 ("fix ext3
BUG due to race with truncate") in history.git tree to fix a problem
with ext3 in data=ordered mode. This particular problem doesn't exist
anymore because ext3 is long gone and ext4 handles ordered data
differently. Also normally buffers are invalidated by truncate code and
there's no need to specially handle this in ->writepage() code.

This invalidation of page buffers in block_write_full_page() is causing
issues to filesystems (e.g. ext4 or ocfs2) when block device is shrunk
under filesystem's hands and metadata buffers get discarded while being
tracked by the journalling layer. Although it is obviously "not
supported" it can cause kernel crashes like:

[ 7986.689400] BUG: unable to handle kernel NULL pointer dereference at
+0000000000000008
[ 7986.697197] PGD 0 P4D 0
[ 7986.699724] Oops: 0002 [#1] SMP PTI
[ 7986.703200] CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G
+O     --------- -  - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1
[ 7986.716438] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015
[ 7986.723462] RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2]
...
[ 7986.810150] Call Trace:
[ 7986.812595]  __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2]
[ 7986.818408]  jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2]
[ 7986.836467]  kjournald2+0xbd/0x270 [jbd2]

which is not great. The crash happens because bh->b_private is suddently
NULL although BH_JBD flag is still set (this is because
block_invalidatepage() cleared BH_Mapped flag and subsequent bh lookup
found buffer without BH_Mapped set, called init_page_buffers() which has
rewritten bh->b_private). So just remove the invalidation in
block_write_full_page().

Note that the buffer cache invalidation when block device changes size
is already careful to avoid similar problems by using
invalidate_mapping_pages() which skips busy buffers so it was only this
odd block_write_full_page() behavior that could tear down bdev buffers
under filesystem's hands.

Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-07 10:23:07 -06:00
Josef Bacik
9e3aa80544 btrfs: free data reloc tree on failed mount
While testing a weird problem with -o degraded, I noticed I was getting
leaked root errors

  BTRFS warning (device loop0): writable mount is not allowed due to too many missing devices
  BTRFS error (device loop0): open_ctree failed
  BTRFS error (device loop0): leaked root -9-0 refcount 1

This is the DATA_RELOC root, which gets read before the other fs roots,
but is included in the fs roots radix tree.  Handle this by adding a
btrfs_drop_and_free_fs_root() on the data reloc root if it exists.  This
is ok to do here if we fail further up because we will only drop the ref
if we delete the root from the radix tree, and all other cleanup won't
be duplicated.

CC: stable@vger.kernel.org # 5.8+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-07 14:51:15 +02:00
Qu Wenruo
ea57788eb7 btrfs: require only sector size alignment for parent eb bytenr
[BUG]
A completely sane converted fs will cause kernel warning at balance
time:

  [ 1557.188633] BTRFS info (device sda7): relocating block group 8162107392 flags data
  [ 1563.358078] BTRFS info (device sda7): found 11722 extents
  [ 1563.358277] BTRFS info (device sda7): leaf 7989321728 gen 95 total ptrs 213 free space 3458 owner 2
  [ 1563.358280] 	item 0 key (7984947200 169 0) itemoff 16250 itemsize 33
  [ 1563.358281] 		extent refs 1 gen 90 flags 2
  [ 1563.358282] 		ref#0: tree block backref root 4
  [ 1563.358285] 	item 1 key (7985602560 169 0) itemoff 16217 itemsize 33
  [ 1563.358286] 		extent refs 1 gen 93 flags 258
  [ 1563.358287] 		ref#0: shared block backref parent 7985602560
  [ 1563.358288] 			(parent 7985602560 is NOT ALIGNED to nodesize 16384)
  [ 1563.358290] 	item 2 key (7985635328 169 0) itemoff 16184 itemsize 33
  ...
  [ 1563.358995] BTRFS error (device sda7): eb 7989321728 invalid extent inline ref type 182
  [ 1563.358996] ------------[ cut here ]------------
  [ 1563.359005] WARNING: CPU: 14 PID: 2930 at 0xffffffff9f231766

Then with transaction abort, and obviously failed to balance the fs.

[CAUSE]
That mentioned inline ref type 182 is completely sane, it's
BTRFS_SHARED_BLOCK_REF_KEY, it's some extra check making kernel to
believe it's invalid.

Commit 64ecdb647d ("Btrfs: add one more sanity check for shared ref
type") introduced extra checks for backref type.

One of the requirement is, parent bytenr must be aligned to node size,
which is not correct.

One example is like this:

0	1G  1G+4K		2G 2G+4K
	|   |///////////////////|//|  <- A chunk starts at 1G+4K
            |   |	<- A tree block get reserved at bytenr 1G+4K

Then we have a valid tree block at bytenr 1G+4K, but not aligned to
nodesize (16K).

Such chunk is not ideal, but current kernel can handle it pretty well.
We may warn about such tree block in the future, but should not reject
them.

[FIX]
Change the alignment requirement from node size alignment to sector size
alignment.

Also, to make our lives a little easier, also output @iref when
btrfs_get_extent_inline_ref_type() failed, so we can locate the item
easier.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205475
Fixes: 64ecdb647d ("Btrfs: add one more sanity check for shared ref type")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update comments and messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-07 14:51:05 +02:00
Josef Bacik
fccc0007b8 btrfs: fix lockdep splat in add_missing_dev
Nikolay reported a lockdep splat in generic/476 that I could reproduce
with btrfs/187.

  ======================================================
  WARNING: possible circular locking dependency detected
  5.9.0-rc2+ #1 Tainted: G        W
  ------------------------------------------------------
  kswapd0/100 is trying to acquire lock:
  ffff9e8ef38b6268 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330

  but task is already holding lock:
  ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (fs_reclaim){+.+.}-{0:0}:
	 fs_reclaim_acquire+0x65/0x80
	 slab_pre_alloc_hook.constprop.0+0x20/0x200
	 kmem_cache_alloc_trace+0x3a/0x1a0
	 btrfs_alloc_device+0x43/0x210
	 add_missing_dev+0x20/0x90
	 read_one_chunk+0x301/0x430
	 btrfs_read_sys_array+0x17b/0x1b0
	 open_ctree+0xa62/0x1896
	 btrfs_mount_root.cold+0x12/0xea
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 vfs_kern_mount.part.0+0x71/0xb0
	 btrfs_mount+0x10d/0x379
	 legacy_get_tree+0x30/0x50
	 vfs_get_tree+0x28/0xc0
	 path_mount+0x434/0xc00
	 __x64_sys_mount+0xe3/0x120
	 do_syscall_64+0x33/0x40
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x7e/0x7e0
	 btrfs_chunk_alloc+0x125/0x3a0
	 find_free_extent+0xdf6/0x1210
	 btrfs_reserve_extent+0xb3/0x1b0
	 btrfs_alloc_tree_block+0xb0/0x310
	 alloc_tree_block_no_bg_flush+0x4a/0x60
	 __btrfs_cow_block+0x11a/0x530
	 btrfs_cow_block+0x104/0x220
	 btrfs_search_slot+0x52e/0x9d0
	 btrfs_lookup_inode+0x2a/0x8f
	 __btrfs_update_delayed_inode+0x80/0x240
	 btrfs_commit_inode_delayed_inode+0x119/0x120
	 btrfs_evict_inode+0x357/0x500
	 evict+0xcf/0x1f0
	 vfs_rmdir.part.0+0x149/0x160
	 do_rmdir+0x136/0x1a0
	 do_syscall_64+0x33/0x40
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&delayed_node->mutex){+.+.}-{3:3}:
	 __lock_acquire+0x1184/0x1fa0
	 lock_acquire+0xa4/0x3d0
	 __mutex_lock+0x7e/0x7e0
	 __btrfs_release_delayed_node.part.0+0x3f/0x330
	 btrfs_evict_inode+0x24c/0x500
	 evict+0xcf/0x1f0
	 dispose_list+0x48/0x70
	 prune_icache_sb+0x44/0x50
	 super_cache_scan+0x161/0x1e0
	 do_shrink_slab+0x178/0x3c0
	 shrink_slab+0x17c/0x290
	 shrink_node+0x2b2/0x6d0
	 balance_pgdat+0x30a/0x670
	 kswapd+0x213/0x4c0
	 kthread+0x138/0x160
	 ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Chain exists of:
    &delayed_node->mutex --> &fs_info->chunk_mutex --> fs_reclaim

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(fs_reclaim);
				 lock(&fs_info->chunk_mutex);
				 lock(fs_reclaim);
    lock(&delayed_node->mutex);

   *** DEADLOCK ***

  3 locks held by kswapd0/100:
   #0: ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
   #1: ffffffffa9d65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290
   #2: ffff9e8e9da260e0 (&type->s_umount_key#48){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0

  stack backtrace:
  CPU: 1 PID: 100 Comm: kswapd0 Tainted: G        W         5.9.0-rc2+ #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   dump_stack+0x92/0xc8
   check_noncircular+0x12d/0x150
   __lock_acquire+0x1184/0x1fa0
   lock_acquire+0xa4/0x3d0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   __mutex_lock+0x7e/0x7e0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? lock_acquire+0xa4/0x3d0
   ? btrfs_evict_inode+0x11e/0x500
   ? find_held_lock+0x2b/0x80
   __btrfs_release_delayed_node.part.0+0x3f/0x330
   btrfs_evict_inode+0x24c/0x500
   evict+0xcf/0x1f0
   dispose_list+0x48/0x70
   prune_icache_sb+0x44/0x50
   super_cache_scan+0x161/0x1e0
   do_shrink_slab+0x178/0x3c0
   shrink_slab+0x17c/0x290
   shrink_node+0x2b2/0x6d0
   balance_pgdat+0x30a/0x670
   kswapd+0x213/0x4c0
   ? _raw_spin_unlock_irqrestore+0x46/0x60
   ? add_wait_queue_exclusive+0x70/0x70
   ? balance_pgdat+0x670/0x670
   kthread+0x138/0x160
   ? kthread_create_worker_on_cpu+0x40/0x40
   ret_from_fork+0x1f/0x30

This is because we are holding the chunk_mutex when we call
btrfs_alloc_device, which does a GFP_KERNEL allocation.  We don't want
to switch that to a GFP_NOFS lock because this is the only place where
it matters.  So instead use memalloc_nofs_save() around the allocation
in order to avoid the lockdep splat.

Reported-by: Nikolay Borisov <nborisov@suse.com>
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-07 14:50:57 +02:00
Ronnie Sahlberg
01ec372cef cifs: fix DFS mount with cifsacl/modefromsid
RHBZ: 1871246

If during cifs_lookup()/get_inode_info() we encounter a DFS link
and we use the cifsacl or modefromsid mount options we must suppress
any -EREMOTE errors that triggers or else we will not be able to follow
the DFS link and automount the target.

This fixes an issue with modefromsid/cifsacl where these mountoptions
would break DFS and we would no longer be able to access the share.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-09-06 23:59:53 -05:00
Linus Torvalds
a8205e3100 io_uring-5.9-2020-09-06
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl9U/MMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpg4BEAC6TQ6ctc1yNTTWwiz4UrdIKMAWP7S7wepu
 k00A9+JjLLJBVdkz9rZ2Y/SyGe12qBM+riiRSn/gNTbkd4qq2rCY2d8U1vXXKyP5
 VDXPo12zsUD5WpqdEMfXUB4+DOQs0a3DAyDLT0+K/Qw0rpOQVZA26Ovn6GOh9+Kq
 KJCllynYkwUQ/7CXsdI13ktMI1HADFOx3149SlGkbuPOggwNQrGiLJAxyUOn/i+E
 uiHy8b8o3B2nun61+Y98q+MDISLf0xXYEbHeAsvETEy52ya2iadnMo1lwS5zHzM+
 p6jOBybHaM/wz5t1V44VTBvfog1KAUtp8K0gsxcB6Ezf7LhYfTVjtvfZqpwVz1sl
 txkxhnGEBURHpdr0aCtC15cIpbDGM85ymjP4RD0YlV7oyT0+Ufx7r9jJjLf7IhZO
 FMyEFmVwSPD/NdJE9ZNx0I2v/qdIzYwfeAix4Z4bXoe51BtPrf1uBgsGWVuqNVz/
 dKVf1vK1tPF8PgIiIW/o8GI4iF3RRVQcLwJGiAWMzBS11iniJLf9mUPRVl5Bxpeb
 YRd2chm+ppHC/IgtK44x7Ce6415hnbKehE2KUr43PHB7nIMNWcRsurxLizM7ZGei
 Gv2/K9PM3+4O/b1k20xLakz4Vw3Isk4W6/Flj9LoxheBo+CUG4Rsx5UyzFEB6apA
 uXxiF6ORLg==
 =v5QQ
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block

Pull more io_uring fixes from Jens Axboe:
 "Two followup fixes. One is fixing a regression from this merge window,
  the other is two commits fixing cancelation of deferred requests.

  Both have gone through full testing, and both spawned a few new
  regression test additions to liburing.

   - Don't play games with const, properly store the output iovec and
     assign it as needed.

   - Deferred request cancelation fix (Pavel)"

* tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
  io_uring: fix linked deferred ->files cancellation
  io_uring: fix cancel of deferred reqs with ->files
  io_uring: fix explicit async read/write mapping for large segments
2020-09-06 12:10:27 -07:00
Pavel Begunkov
c127a2a1b7 io_uring: fix linked deferred ->files cancellation
While looking for ->files in ->defer_list, consider that requests there
may actually be links.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 16:02:42 -06:00
Pavel Begunkov
b7ddce3cbf io_uring: fix cancel of deferred reqs with ->files
While trying to cancel requests with ->files, it also should look for
requests in ->defer_list, otherwise it might end up hanging a thread.

Cancel all requests in ->defer_list up to the last request there with
matching ->files, that's needed to follow drain ordering semantics.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 15:59:51 -06:00
Linus Torvalds
9322c47b21 Fixes (2) for 5.9:
- Fix a broken metadata verifier that would incorrectly validate attr
 fork extents of a realtime file against the realtime volume.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl9RDTMACgkQ+H93GTRK
 tOseqA/8DTnwKRKuTXc6UboIXbgMwjGTJBv7xTD3KIDF3ls+Gmp2/RWHs7lfGRVY
 LVdvlgAwDLEe1al8aLaibhrsRCYX8MLvMp8dX6xoqMWCY0KF80VCpXXy+FXbwBmZ
 GQRqrThyjepseERJj15UC249p9ztBj4DAHQD0r2XD7JMKHBHo6cQ3eYY2NTzgqc3
 q0kYcP5xPZZ4R6UFUNkJpkeV8PKpkzWTXyMod0+e8h4njZoRAmHimj7Id9DMQ6HB
 ciHjZjNCd04Nu6JIpNBwlQE4epCPh79zX8hTQZf5nGNAh13CB9Wc2nHDvwCFBLxw
 HUdU5BpxMNWGujJ+T5X+RVbzA4VXXaL3iypag9EjXadg+vOV6XmPQv6Fa3WqkfBT
 GjEXB9+TG9ZuyxAObjP6yn1PRvob0l0iccAbtfnX5bE/mwqx1aDxN1QTfJrjG2Fv
 2EiUnqI+jxro66HdS5QU+W8ko7dG0tiQPMF3DCv7nfb3ZxEgfXiDrgbHyYbzdLJq
 pi5LqXBAfRHkDwRUzRU6G6pT7mplW31iTwh2AuiQogXnpPTWylb0XsUEVfYKpK9q
 Z0HzwRfHmwQSkIEDZ0rZxP3wH5ssniHyLdXh5FtfFcPSBTyDvflMqFpLmYmvDV/6
 MPfwCQAnMTrGv2GRJ8hbFiWd987bYTTfOKwJBEsFKDH01ZKYcV8=
 =5U8o
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Fix a broken metadata verifier that would incorrectly validate attr
  fork extents of a realtime file against the realtime volume"

* tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files
2020-09-05 10:04:53 -07:00
Mikulas Patocka
b17164e258 xfs: don't update mtime on COW faults
When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

 1. extract the Linux kernel tree on dax-mounted xfs filesystem
 2. run make clean
 3. run make -j12
 4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 10:00:05 -07:00
Mikulas Patocka
1ef6ea0efe ext2: don't update mtime on COW faults
When running in a dax mode, if the user maps a page with MAP_PRIVATE and
PROT_WRITE, the ext2 filesystem would incorrectly update ctime and mtime
when the user hits a COW fault.

This breaks building of the Linux kernel.  How to reproduce:

 1. extract the Linux kernel tree on dax-mounted ext2 filesystem
 2. run make clean
 3. run make -j12
 4. run make -j12

at step 4, make would incorrectly rebuild the whole kernel (although it
was already built in step 3).

The reason for the breakage is that almost all object files depend on
objtool.  When we run objtool, it takes COW page fault on its .data
section, and these faults will incorrectly update the timestamp of the
objtool binary.  The updated timestamp causes make to rebuild the whole
tree.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-05 10:00:05 -07:00
Jens Axboe
c183edff33 io_uring: fix explicit async read/write mapping for large segments
If we exceed UIO_FASTIOV, we don't handle the transition correctly
between an allocated vec for requests that are queued with IOSQE_ASYNC.
Store the iovec appropriately and re-set it in the iter iov in case
it changed.

Fixes: ff6165b2d7 ("io_uring: retain iov_iter state over io_read/io_write calls")
Reported-by: Nick Hill <nick@nickhill.org>
Tested-by: Norman Maurer <norman.maurer@googlemail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05 09:02:47 -06:00
Chuck Lever
644c9f40cf NFS: Zero-stateid SETATTR should first return delegation
If a write delegation isn't available, the Linux NFS client uses
a zero-stateid when performing a SETATTR.

NFSv4.0 provides no mechanism for an NFS server to match such a
request to a particular client. It recalls all delegations for that
file, even delegations held by the client issuing the request. If
that client happens to hold a read delegation, the server will
recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
DELEGRETURN sequence.

Optimize out this pipeline bubble by having the client return any
delegations it may hold on a file before it issues a
SETATTR(zero-stateid) on that file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-05 10:39:41 -04:00
Linus Torvalds
d849ca483d io_uring-5.9-2020-09-04
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl9SWN8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpvtbD/4yI5dkopv6E2RHVuupFWmGlGoxLhPecnAZ
 UHbKU+LA/tzWWMA7gZuwzzDEK1QWT/KmctpGTI22SXUKQCtjGzO/qnRMfyJ34TdQ
 l4leYdw/QzUOHZG7dKVYUACHiaSxzQSallNuX1I9eM084KSXH3DgUkrwMLoew/8n
 WJHKN+oRhcppnLVDekaLXbZEI9idTnY+gs/Dg8TNsxNSeO6y51OOlKltaNfL+npQ
 dwlgMoolBYWHFozqgVyzIV7sU7fQ9QGppwBIfqBb1jEe9JU2ZymtlcDgfxUVpKcg
 W8/PCoVT60AGiMdjV0EBoQO09r+nvwAcRQUSlWJU7Dn/pcZmFoaJkyse+SnD0Dac
 cLTKhnhgMJSI4Zt3yQidFSNhz0Ouw15J8k7OTftn81zhtkHzPBgGnA7R6b7UUQsZ
 5lJvlZh5aFPNBFp9A0do5+f5/lUMhHkxDpFVmZo+ywPtoNHJeDL2+jzzFawJ8kqv
 IoFvVL8hl4DzqN+vShsJ40jH93+oITF/Jlq6kY8ILKtu42i5qAxpP0wUwycrN6Pz
 /YNTKPveCoPU7zaFDvMfbc7U56Ke6ma+lmtTn6q6JOWFvUAYh7SUY4JGzEMpxfxK
 QVyFMwXnCKhB66ZypJIFdbT4zqkTXmhxvu/Oz5txDv/uoytqT1o+zLHb3USi4Lw8
 89NyvBc0aQ==
 =NLOn
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - EAGAIN with O_NONBLOCK retry fix

 - Two small fixes for registered files (Jiufei)

* tag 'io_uring-5.9-2020-09-04' of git://git.kernel.dk/linux-block:
  io_uring: no read/write-retry on -EAGAIN error and O_NONBLOCK marked file
  io_uring: set table->files[i] to NULL when io_sqe_file_register failed
  io_uring: fix removing the wrong file in __io_sqe_files_update()
2020-09-04 12:55:22 -07:00
Vladis Dronov
e3b9fc7eec debugfs: Fix module state check condition
The '#ifdef MODULE' check in the original commit does not work as intended.
The code under the check is not built at all if CONFIG_DEBUG_FS=y. Fix this
by using a correct check.

Fixes: 275678e7a9 ("debugfs: Check module state before warning in {full/open}_proxy_open()")
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200811150129.53343-1-vdronov@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-04 18:12:52 +02:00
Linus Torvalds
3e8d3bdc2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
    Kivilinna.

 2) Fix loss of RTT samples in rxrpc, from David Howells.

 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu.

 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.

 5) We disable BH for too lokng in sctp_get_port_local(), add a
    cond_resched() here as well, from Xin Long.

 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.

 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
    Yonghong Song.

 8) Missing of_node_put() in mt7530 DSA driver, from Sumera
    Priyadarsini.

 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.

10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.

11) Memory leak in rxkad_verify_response, from Dinghao Liu.

12) In tipc, don't use smp_processor_id() in preemptible context. From
    Tuong Lien.

13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.

14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter.

15) Fix ABI mismatch between driver and firmware in nfp, from Louis
    Peens.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
  net/smc: fix sock refcounting in case of termination
  net/smc: reset sndbuf_desc if freed
  net/smc: set rx_off for SMCR explicitly
  net/smc: fix toleration of fake add_link messages
  tg3: Fix soft lockup when tg3_reset_task() fails.
  doc: net: dsa: Fix typo in config code sample
  net: dp83867: Fix WoL SecureOn password
  nfp: flower: fix ABI mismatch between driver and firmware
  tipc: fix shutdown() of connectionless socket
  ipv6: Fix sysctl max for fib_multipath_hash_policy
  drivers/net/wan/hdlc: Change the default of hard_header_len to 0
  net: gemini: Fix another missing clk_disable_unprepare() in probe
  net: bcmgenet: fix mask check in bcmgenet_validate_flow()
  amd-xgbe: Add support for new port mode
  net: usb: dm9601: Add USB ID of Keenetic Plus DSL
  vhost: fix typo in error message
  net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
  pktgen: fix error message with wrong function name
  net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
  cxgb4: fix thermal zone device registration
  ...
2020-09-03 18:50:48 -07:00
Linus Torvalds
26acd8b07a affs-for-5.9-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl9Qx/MACgkQxWXV+ddt
 WDvT8g//bO9+xB3dRYLVl5FWur826Zz/dmzpp+A3QCI0er1+zh8/3FHD6dvGyBQu
 CHQE1GFTdewry1usuGu08WopGiywnCa1svmp+qCJDugiDAG/ciTq2zQWemV2gRXU
 1/9yZ62B3eBKuiL+lFNPYFzqWhsX0zP21nxHEkTJi2d8enJDNcmrrWjBrVHUVa1o
 D0al0zPvaXpcknadhsNSkJnvgMeTcSz89AEIEC5OVfitbaIX09UAadG9kjPLlGJN
 KNL+/vmAARNe1ze5qjTf3++E1LH28CdJ7rrlhAqyR4L4rApCQw8ZQv3h8EbnidSy
 VJgDQCk35Z6IxVwbUKYEGiperV2dnsZHukzPT/sqjGrHO0Ct+Fy+YeK+4aw9yaof
 mWG7RGeczj9t/FbKEGmELlEOHKmJw+PHi5byTokPfsA8smrfaPKNJKlb5bbjALni
 hOThGiYr18n3RrSidpZ0GVxRcl8j3OwHSVDS/9U3eAF7Z31QtTDoX0MHbK6PrbvW
 KvGkqIwyLknshS868lyAzXSdg6K4wYqvuh2Z/TuFFPV2MGbsXxHEFgM7M1wueV6x
 iLhqUDYn2ki79FqqCYsUqWzDMc8m/4v//T4KPTj+BN9H8ij9bdzTXpaHrZAqLX0Y
 ielkB51hdAlCML/Ctd5xojn5YOXgo5TkSC32QOHvxovSI1qJCIo=
 =lVnf
 -----END PGP SIGNATURE-----

Merge tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull affs fix from David Sterba:
 "One fix to make permissions work the same way as on AmigaOS"

* tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  affs: fix basic permission bits to actually work
2020-09-03 08:41:36 -07:00
Darrick J. Wong
d0c20d38af xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files
The realtime flag only applies to the data fork, so don't use the
realtime block number checks on the attr fork of a realtime file.

Fixes: 30b0984d91 ("xfs: refactor bmap record validation")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-09-03 08:33:50 -07:00
Linus Torvalds
e1d0126ca3 Fixes for 5.9:
- Avoid a log recovery failure for an insert range operation by rolling
 deferred ops incrementally instead of at the end.
 - Fix an off-by-one error when calculating log space reservations for
 anything involving an inode allocation or free.
 - Fix a broken shortform xattr verifier.
 - Ensure that the shortform xattr header padding is always initialized
 to zero.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl9H1PkACgkQ+H93GTRK
 tOsfXA//QYUtx6BccgiXGJCJ7dkOJwM+2aKS1kvadfJlN6hZh7x6jzi424qsBHbH
 jfNGcGjlrvFp3MAex3w+LeQFEK9gX47mKXGVR8MakWjo/khuxNoqpx4rzFI+wdMS
 Sh51f9ovbsZIRrKHPv9YrxBUhO3yzbWbz2bbWYAV1/kN4jxdzKNByVrhIn97yFsg
 NOj9uh1R3UsD1IKez9FeMKwYt/6jXZKs/LzroBKGxcBvdEu99X4vpD9WBqghe0Ye
 e57rf+zb5b5M0ydOp9mEme8wogQcLKmMPgNaZXzduPuhx8GtsqZ4gTKoIpRSBCWA
 z5oVerIr2jMN4+OMOaBWrEI34kRFUh5KG0Z0U4uOLavm04//KfQ3zUWAyea20Gmg
 rkq4me7DJk0cVmWlJkB9X4LbR4OjeS7ANqvM94AmcYVHzxw76O/UhgMYXBgBin2J
 4egsiXVi+Pii5Syw7kU9dTxWETP9mlN+2yO1CQdW0HgFnCJ3ARng0hdJBftQEIVo
 Rec/YmpE9Mi7QSRjfp+I+30ZPNMl3lHMIfAVEj2Nhn32do5l6h9ZN4XP5OSQyqa6
 yoNAi2oB1lqPaiV1/N2hRZwbxIexzeAZZSkGKGpiF/g9u5nwDTGRxVOmc/vTSGVN
 7swYctIl9HuzIAuLGU3d8yCj6FnWvzNM6JGM0GDPi8ZFBqezhY4=
 =9IZj
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Various small corruption fixes that have come in during the past
  month:

   - Avoid a log recovery failure for an insert range operation by
     rolling deferred ops incrementally instead of at the end.

   - Fix an off-by-one error when calculating log space reservations for
     anything involving an inode allocation or free.

   - Fix a broken shortform xattr verifier.

   - Ensure that the shortform xattr header padding is always
     initialized to zero"

* tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: initialize the shortform attr header padding entry
  xfs: fix boundary test in xfs_attr_shortform_verify
  xfs: fix off-by-one in inode alloc block reservation calculation
  xfs: finish dfops on every insert range shift iteration
2020-09-02 11:42:18 -07:00
Linus Torvalds
54e54d5818 Merge branch 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull epoll fixup from Al Viro:
 "Fixup for epoll regression; there's a better solution longer term, but
  this is the least intrusive fix"

* 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix regression in "epoll: Keep a reference on files added to the check list"
2020-09-02 11:29:34 -07:00
Jens Axboe
355afaeb57 io_uring: no read/write-retry on -EAGAIN error and O_NONBLOCK marked file
Actually two things that need fixing up here:

- The io_rw_reissue() -EAGAIN retry is explicit to block devices and
  regular files, so don't ever attempt to do that on other types of
  files.

- If we hit -EAGAIN on a nonblock marked file, don't arm poll handler for
  it. It should just complete with -EAGAIN.

Cc: stable@vger.kernel.org
Reported-by: Norman Maurer <norman.maurer@googlemail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02 10:20:41 -06:00
Al Viro
77f4689de1 fix regression in "epoll: Keep a reference on files added to the check list"
epoll_loop_check_proc() can run into a file already committed to destruction;
we can't grab a reference on those and don't need to add them to the set for
reverse path check anyway.

Tested-by: Marc Zyngier <maz@kernel.org>
Fixes: a9ed4a6560 ("epoll: Keep a reference on files added to the check list")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-02 11:30:48 -04:00
Jiufei Xue
95d1c8e5f8 io_uring: set table->files[i] to NULL when io_sqe_file_register failed
While io_sqe_file_register() failed in __io_sqe_files_update(),
table->files[i] still point to the original file which may freed
soon, and that will trigger use-after-free problems.

Cc: stable@vger.kernel.org
Fixes: f3bd9dae37 ("io_uring: fix memleak in __io_sqe_files_update()")
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02 09:11:59 -06:00