Go to file
Yang Yang 72d04bdcf3 sbitmap: fix io hung due to race on sbitmap_word::cleared
Configuration for sbq:
  depth=64, wake_batch=6, shift=6, map_nr=1

1. There are 64 requests in progress:
  map->word = 0xFFFFFFFFFFFFFFFF
2. After all the 64 requests complete, and no more requests come:
  map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF
3. Now two tasks try to allocate requests:
  T1:                                       T2:
  __blk_mq_get_tag                          .
  __sbitmap_queue_get                       .
  sbitmap_get                               .
  sbitmap_find_bit                          .
  sbitmap_find_bit_in_word                  .
  __sbitmap_get_word  -> nr=-1              __blk_mq_get_tag
  sbitmap_deferred_clear                    __sbitmap_queue_get
  /* map->cleared=0xFFFFFFFFFFFFFFFF */     sbitmap_find_bit
    if (!READ_ONCE(map->cleared))           sbitmap_find_bit_in_word
      return false;                         __sbitmap_get_word -> nr=-1
    mask = xchg(&map->cleared, 0)           sbitmap_deferred_clear
    atomic_long_andnot()                    /* map->cleared=0 */
                                              if (!(map->cleared))
                                                return false;
                                     /*
                                      * map->cleared is cleared by T1
                                      * T2 fail to acquire the tag
                                      */

4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken
up due to the wake_batch being set at 6. If no more requests come, T1
will wait here indefinitely.

This patch achieves two purposes:
1. Check on ->cleared and update on both ->cleared and ->word need to
be done atomically, and using spinlock could be the simplest solution.
2. Add extra check in sbitmap_deferred_clear(), to identify whether
->word has free bits.

Fixes: ea86ea2cdc ("sbitmap: ammortize cost of clearing bits")
Signed-off-by: Yang Yang <yang.yang@vivo.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-19 09:39:32 -06:00
arch block: add a bvec_phys helper 2024-07-08 01:51:05 -06:00
block block: avoid polling configuration errors 2024-07-19 09:35:35 -06:00
certs kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
crypto This push fixes a bug in the new ecc P521 code as well as a buggy 2024-05-20 08:47:54 -07:00
Documentation block: Add core atomic write support 2024-06-20 15:19:17 -06:00
drivers s390/dasd: fix error checks in dasd_copy_pair_store() 2024-07-15 10:57:39 -06:00
fs block: Add atomic write support for statx 2024-06-20 15:19:17 -06:00
include sbitmap: fix io hung due to race on sbitmap_word::cleared 2024-07-19 09:39:32 -06:00
init Driver core changes for 6.10-rc1 2024-05-22 12:13:40 -07:00
io_uring fs: Initial atomic write support 2024-06-20 15:19:17 -06:00
ipc Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
kernel Fix race between perf_event_free_task() and perf_event_release_kernel() 2024-06-08 09:26:59 -07:00
lib sbitmap: fix io hung due to race on sbitmap_word::cleared 2024-07-19 09:39:32 -06:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm: fix xyz_noprof functions calling profiled functions 2024-06-05 19:19:26 -07:00
net nfsd-6.10 fixes: 2024-06-07 15:07:57 -07:00
rust rust: block: fix generated bindings after refactoring of features 2024-06-28 14:27:45 -06:00
samples tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
scripts Kbuild fixes for v6.10 (second) 2024-06-08 10:12:33 -07:00
security tomoyo: update project links 2024-06-03 22:43:11 +09:00
sound ALSA: seq: ump: Fix swapped song position pointer data 2024-05-31 09:51:44 +02:00
tools perf tools fixes for v6.10: 2nd batch 2024-06-09 09:04:51 -07:00
usr kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
virt The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
.clang-format clang-format: Update with v6.7-rc4's for_each macro list 2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: create a list of all built DTB files 2024-02-19 18:20:39 +09:00
.mailmap mailmap: add entry for Weiwen Hu 2024-06-24 12:53:42 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Drop Gustavo Pimentel as PCI DWC Maintainer 2024-03-27 13:41:02 -05:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add entry for Rust block device driver API 2024-06-14 07:45:04 -06:00
Makefile Linux 6.10-rc3 2024-06-09 14:19:43 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.