When the weight of an active iocg is updated, weight_updated() is called
which in turn calls __propagate_weights() to update the active and inuse
weights so that the effective hierarchical weights are update accordingly.
The current implementation is incorrect for inner active nodes. For an
active leaf iocg, inuse can be any value between 1 and active and the
difference represents how much the iocg is donating. When weight is updated,
as long as inuse is clamped between 1 and the new weight, we're alright and
this is what __propagate_weights() currently implements.
However, that's not how an active inner node's inuse is set. An inner node's
inuse is solely determined by the ratio between the sums of inuse's and
active's of its children - ie. they're results of propagating the leaves'
active and inuse weights upwards. __propagate_weights() incorrectly applies
the same clamping as for a leaf when an active inner node's weight is
updated. Consider a hierarchy which looks like the following with saturating
workloads in AA and BB.
R
/ \
A B
| |
AA BB
1. For both A and B, active=100, inuse=100, hwa=0.5, hwi=0.5.
2. echo 200 > A/io.weight
3. __propagate_weights() update A's active to 200 and leave inuse at 100 as
it's already between 1 and the new active, making A:active=200,
A:inuse=100. As R's active_sum is updated along with A's active,
A:hwa=2/3, B:hwa=1/3. However, because the inuses didn't change, the
hwi's remain unchanged at 0.5.
4. The weight of A is now twice that of B but AA and BB still have the same
hwi of 0.5 and thus are doing the same amount of IOs.
Fix it by making __propgate_weights() always calculate the inuse of an
active inner iocg based on the ratio of child_inuse_sum to child_active_sum.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Schatzberg <dschatzberg@fb.com>
Fixes: 7caa47151a ("blkcg: implement blk-iocost")
Cc: stable@vger.kernel.org # v5.4+
Link: https://lore.kernel.org/r/YJsxnLZV1MnBcqjj@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 32b48bf851 ("KVM: PPC: Book3S HV: Fix conversion to gfn-based
MMU notifier callbacks") fixed kvm_unmap_gfn_range_hv() by adding a for
loop over each gfn in the range.
But for the Hash MMU it repeatedly calls kvm_unmap_rmapp() with the
first gfn of the range, rather than iterating through the range.
This exhibits as strange guest behaviour, sometimes crashing in firmare,
or booting and then guest userspace crashing unexpectedly.
Fix it by passing the iterator, gfn, to kvm_unmap_rmapp().
Fixes: 32b48bf851 ("KVM: PPC: Book3S HV: Fix conversion to gfn-based MMU notifier callbacks")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210511105459.800788-1-mpe@ellerman.id.au
UBSAN complains when a pointer is calculated with invalid
'legacy_serial_console' index, allthough the index is verified
before dereferencing the pointer.
Fix it by checking 'legacy_serial_console' validity before
calculating pointers.
Fixes: 0bd3f9e953 ("powerpc/legacy_serial: Use early_ioremap()")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210511010712.750096-1-mpe@ellerman.id.au
When neither CONFIG_VSX nor CONFIG_PPC_FPU_REGS are selected,
unsafe_copy_fpr_to_user() and unsafe_copy_fpr_from_user() are
doing nothing.
Then, unless the 'label' operand is used elsewhere, GCC complains
about it being defined but not used.
To fix that, add an impossible 'goto label'.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cadc0a328bc8e6c5bf133193e7547d5c10ae7895.1620465920.git.christophe.leroy@csgroup.eu
Building kernel mainline with GCC 11 leads to following failure
when starting 'init':
init[1]: bad frame in sys_sigreturn: 7ff5a900 nip 001083cc lr 001083c4
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
This is an issue due to a segfault happening in
__unsafe_restore_general_regs() in a loop copying registers from user
to kernel:
10: 7d 09 03 a6 mtctr r8
14: 80 ca 00 00 lwz r6,0(r10)
18: 80 ea 00 04 lwz r7,4(r10)
1c: 90 c9 00 08 stw r6,8(r9)
20: 90 e9 00 0c stw r7,12(r9)
24: 39 0a 00 08 addi r8,r10,8
28: 39 29 00 08 addi r9,r9,8
2c: 81 4a 00 08 lwz r10,8(r10) <== r10 is clobbered here
30: 81 6a 00 0c lwz r11,12(r10)
34: 91 49 00 08 stw r10,8(r9)
38: 91 69 00 0c stw r11,12(r9)
3c: 39 48 00 08 addi r10,r8,8
40: 39 29 00 08 addi r9,r9,8
44: 42 00 ff d0 bdnz 14 <__unsafe_restore_general_regs+0x14>
As shown above, this is due to r10 being re-used by GCC. This didn't
happen with CLANG.
This is fixed by tagging 'x' output as an earlyclobber operand in
__get_user_asm2_goto().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cf0a050d124d4f426cdc7a74009d17b01d8d8969.1620465917.git.christophe.leroy@csgroup.eu
The hcall tracing code has a recursion check built in, which skips
tracing if we are already tracing an hcall.
However if the tracing code has problems with recursion, this check
may not catch all cases because the tracing code could be invoked from
a different tracepoint first, then make an hcall that gets traced,
then recurse.
Add an explicit warning if recursion is detected here, which might help
to notice tracing code making hcalls. Really the core trace code should
have its own recursion checking and warnings though.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210508101455.1578318-5-npiggin@gmail.com
Rather than special-case H_CEDE in the hcall trace wrappers, make the
idle H_CEDE call use plpar_hcall_norets_notrace().
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210508101455.1578318-4-npiggin@gmail.com
This doesn't seem very useful to trace before the recursion check, even
if the ftrace code has any recursion checks of its own. Be on the safe
side and don't trace the hcall trace wrappers.
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210508101455.1578318-3-npiggin@gmail.com
The paravit queued spinlock slow path adds itself to the queue then
calls pv_wait to wait for the lock to become free. This is implemented
by calling H_CONFER to donate cycles.
When hcall tracing is enabled, this H_CONFER call can lead to a spin
lock being taken in the tracing code, which will result in the lock to
be taken again, which will also go to the slow path because it queues
behind itself and so won't ever make progress.
An example trace of a deadlock:
__pv_queued_spin_lock_slowpath
trace_clock_global
ring_buffer_lock_reserve
trace_event_buffer_lock_reserve
trace_event_buffer_reserve
trace_event_raw_event_hcall_exit
__trace_hcall_exit
plpar_hcall_norets_trace
__pv_queued_spin_lock_slowpath
trace_clock_global
ring_buffer_lock_reserve
trace_event_buffer_lock_reserve
trace_event_buffer_reserve
trace_event_raw_event_rcu_dyntick
rcu_irq_exit
irq_exit
__do_irq
call_do_irq
do_IRQ
hardware_interrupt_common_virt
Fix this by introducing plpar_hcall_norets_notrace(), and using that to
make SPLPAR virtual processor dispatching hcalls by the paravirt
spinlock code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210508101455.1578318-2-npiggin@gmail.com
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(),
cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks()
may check wrong cluster metadata, fix it.
Fixes: 4c8ff7095b ("f2fs: support data compression")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
pos_fsstress testcase complains a panic as belew:
------------[ cut here ]------------
kernel BUG at fs/f2fs/compress.c:1082!
invalid opcode: 0000 [#1] SMP PTI
CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Workqueue: writeback wb_workfn (flush-252:16)
RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs]
Call Trace:
f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs]
f2fs_write_cache_pages+0x468/0x8a0 [f2fs]
f2fs_write_data_pages+0x2a4/0x2f0 [f2fs]
do_writepages+0x38/0xc0
__writeback_single_inode+0x44/0x2a0
writeback_sb_inodes+0x223/0x4d0
__writeback_inodes_wb+0x56/0xf0
wb_writeback+0x1dd/0x290
wb_workfn+0x309/0x500
process_one_work+0x220/0x3c0
worker_thread+0x53/0x420
kthread+0x12f/0x150
ret_from_fork+0x22/0x30
The root cause is truncate() may race with overwrite as below,
so that one reference count left in page can not guarantee the
page attaching in mapping tree all the time, after truncation,
later find_lock_page() may return NULL pointer.
- prepare_compress_overwrite
- f2fs_pagecache_get_page
- unlock_page
- f2fs_setattr
- truncate_setsize
- truncate_inode_page
- delete_from_page_cache
- find_lock_page
Fix this by avoiding referencing updated page.
Fixes: 4c8ff7095b ("f2fs: support data compression")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In error path of f2fs_write_compressed_pages(), it needs to call
f2fs_compress_free_page() to release temporary page.
Fixes: 5e6bbde959 ("f2fs: introduce mempool for {,de}compress intermediate page allocation")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In f2fs_fileattr_set(),
if (!fa->flags_valid)
mask &= FS_COMMON_FL;
In this case, we can set supported flags by mask only instead of BUG_ON.
/* Flags shared betwen flags/xflags */
(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | \
FS_PROJINHERIT_FL)
Fixes: 9b1bb01c8a ("f2fs: convert to fileattr")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
RTC drivers used to leave .set_alarm() NULL in order to signal the RTC
device doesn't support alarms. The drivers are now clearing the
RTC_FEATURE_ALARM bit for that purpose in order to keep the rtc_class_ops
structure const. So now, .set_alarm() is set unconditionally and this
possibly causes the alarmtimer code to select an RTC device that doesn't
support alarms.
Test RTC_FEATURE_ALARM instead of relying on ops->set_alarm to determine
whether alarms are available.
Fixes: 7ae41220ef ("rtc: introduce features bitfield")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210511014516.563031-1-alexandre.belloni@bootlin.com
The rocket driver and documentation were removed in this commit, but
the corresponding entry in index.rst was not removed.
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Fixes: 3b00b6af7a ("tty: rocket, remove the driver")
Link: https://lore.kernel.org/r/20210511134937.2442291-1-desmondcheongzx@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Probably because the original file was pre-processed by some
tool, both i40e.rst and iavf.rst files are using this character:
- U+2013 ('–'): EN DASH
meaning an hyphen when calling a command line application, which
is obviously wrong. So, replace them by an hyphen, ensuring
that it will be properly displayed as literals when building
the documentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/95eb2a48d0ca3528780ce0dfce64359977fa8cb3.1620744606.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
While UTF-8 characters can be used at the Linux documentation,
the best is to use them only when ASCII doesn't offer a good replacement.
So, replace the occurences of the following UTF-8 characters:
- U+2013 ('–'): EN DASH
In this specific case, EN DASH was used instead of a minus
sign. So, replace it by a single hyphen.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/73b3c7c1eef5c12ddc941624d23689313bd56529.1620744606.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmCaiuQACgkQxWXV+ddt
WDv3Ww//bDUlNXqAYEoLKePohy1bupiqG8lKYX4s4bGEq0x0cyh4qVER/Q/lU2l2
AMf8t6Pwr/iBOPwfckreLDuFrhacvWq0K4eMkgpf++3P0Mzbj2sIBX0+XnrWluRL
yFCZudJej+cpM55Ve4l6M8zrk1nbzYJLFPRRdOIFe4HonWkhI/zY6RD7kFybQevW
mAxqMgIpUQAjoj5F/EhwXQ9dk6PXSZj+gaOoNrmQmN7mZMqNgSLHBEoJUHrotm1K
rDlEwIRUTtNPV+rcPxcXD1GFiUxU0cZhg0jts252z89Mvaqb2g/YKaHPAR/IVIt5
enf4llZzoEeiMnHuSj9zCg4HxOvCCFV8zZYXlO7/9IqdgLJjQkElZoqTz45obWdE
aoJrHAWWlulS2jPocJfJ/Zti2xBYGLjQASH0kYS+vjVxjKyqz3fuM1Tsasaf9Mcp
+M2m6yMBjJ0nJMTL2CgBksCd0dHwfiBZ/YYClrMSjYlzYSU6ofA2b2hej0OjqZ4X
FmpEmCBK4lySdJI+JlJKikeneOOxKSpT0xGqU+OMmbpwFH3k1N3oseu0hrG8Xreo
RU1xNbekGTwRbCcCA9l5HQ/RYptT7rt/KqkC70UFEvdIijCNcptOGaTAoYvLS14O
T+yu0Cizt7O0Fdg5E+MAS/qaI2yacXxBfIkMDbPxHGUg7+vUteM=
=Phtq
-----END PGP SIGNATURE-----
Merge tag 'for-5.13-rc1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"Handle transaction start error in btrfs_fileattr_set()
This is fix for code introduced by the new fileattr merge"
* tag 'for-5.13-rc1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: handle transaction start error in btrfs_fileattr_set
Host can send invalid commands and flood the target with error messages.
Demote the error message from pr_err() to pr_debug() in
nvmet_parse_fabrics_cmd() and nvmet_parse_connect_cmd().
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Use the helper nvmet_report_invalid_opcode() to report invalid opcode
so we can remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Host can send invalid commands and flood the target with error messages
for the discovery controller. Demote the error message from pr_err() to
pr_debug( in nvmet_parse_discovery_cmd().
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When handling passthru commands, for inline bio allocation we only
consider the transfer size. This works well when req->sg_cnt fits into
the req->inline_bvec, but it will result in the early return from
bio_add_hw_page() when req->sg_cnt > NVMET_MAX_INLINE_BVEC.
Consider an I/O of size 32768 and first buffer is not aligned to the
page boundary, then I/O is split in following manner :-
[ 2206.256140] nvmet: sg->length 3440 sg->offset 656
[ 2206.256144] nvmet: sg->length 4096 sg->offset 0
[ 2206.256148] nvmet: sg->length 4096 sg->offset 0
[ 2206.256152] nvmet: sg->length 4096 sg->offset 0
[ 2206.256155] nvmet: sg->length 4096 sg->offset 0
[ 2206.256159] nvmet: sg->length 4096 sg->offset 0
[ 2206.256163] nvmet: sg->length 4096 sg->offset 0
[ 2206.256166] nvmet: sg->length 4096 sg->offset 0
[ 2206.256170] nvmet: sg->length 656 sg->offset 0
Now the req->transfer_size == NVMET_MAX_INLINE_DATA_LEN i.e. 32768, but
the req->sg_cnt is (9) > NVMET_MAX_INLINE_BIOVEC which is (8).
This will result in early return in the following code path :-
nvmet_bdev_execute_rw()
bio_add_pc_page()
bio_add_hw_page()
if (bio_full(bio, len))
return 0;
Use previously introduced helper nvmet_use_inline_bvec() to consider
req->sg_cnt when using inline bio. This only affects nvme-loop
transport.
Fixes: dab3902b19 ("nvmet: use inline bio for passthru fast path")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When handling rw commands, for inline bio case we only consider
transfer size. This works well when req->sg_cnt fits into the
req->inline_bvec, but it will result in the warning in
__bio_add_page() when req->sg_cnt > NVMET_MAX_INLINE_BVEC.
Consider an I/O size 32768 and first page is not aligned to the page
boundary, then I/O is split in following manner :-
[ 2206.256140] nvmet: sg->length 3440 sg->offset 656
[ 2206.256144] nvmet: sg->length 4096 sg->offset 0
[ 2206.256148] nvmet: sg->length 4096 sg->offset 0
[ 2206.256152] nvmet: sg->length 4096 sg->offset 0
[ 2206.256155] nvmet: sg->length 4096 sg->offset 0
[ 2206.256159] nvmet: sg->length 4096 sg->offset 0
[ 2206.256163] nvmet: sg->length 4096 sg->offset 0
[ 2206.256166] nvmet: sg->length 4096 sg->offset 0
[ 2206.256170] nvmet: sg->length 656 sg->offset 0
Now the req->transfer_size == NVMET_MAX_INLINE_DATA_LEN i.e. 32768, but
the req->sg_cnt is (9) > NVMET_MAX_INLINE_BIOVEC which is (8).
This will result in the following warning message :-
nvmet_bdev_execute_rw()
bio_add_page()
__bio_add_page()
WARN_ON_ONCE(bio_full(bio, len));
This scenario is very hard to reproduce on the nvme-loop transport only
with rw commands issued with the passthru IOCTL interface from the host
application and the data buffer is allocated with the malloc() and not
the posix_memalign().
Fixes: 73383adfad ("nvmet: don't split large I/Os unconditionally")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
nvme_init_identify and thus nvme_mpath_init can be called multiple
times and thus must not overwrite potentially initialized or in-use
fields. Split out a helper for the basic initialization when the
controller is initialized and make sure the init_identify path does
not blindly change in-use data structures.
Fixes: 0d0b660f21 ("nvme: add ANA support")
Reported-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
__blk_mq_sched_bio_merge() gets the ctx and hctx for the current CPU and
passes the hctx to ->bio_merge(). kyber_bio_merge() then gets the ctx
for the current CPU again and uses that to get the corresponding Kyber
context in the passed hctx. However, the thread may be preempted between
the two calls to blk_mq_get_ctx(), and the ctx returned the second time
may no longer correspond to the passed hctx. This "works" accidentally
most of the time, but it can cause us to read garbage if the second ctx
came from an hctx with more ctx's than the first one (i.e., if
ctx->index_hw[hctx->type] > hctx->nr_ctx).
This manifested as this UBSAN array index out of bounds error reported
by Jakub:
UBSAN: array-index-out-of-bounds in ../kernel/locking/qspinlock.c:130:9
index 13106 is out of range for type 'long unsigned int [128]'
Call Trace:
dump_stack+0xa4/0xe5
ubsan_epilogue+0x5/0x40
__ubsan_handle_out_of_bounds.cold.13+0x2a/0x34
queued_spin_lock_slowpath+0x476/0x480
do_raw_spin_lock+0x1c2/0x1d0
kyber_bio_merge+0x112/0x180
blk_mq_submit_bio+0x1f5/0x1100
submit_bio_noacct+0x7b0/0x870
submit_bio+0xc2/0x3a0
btrfs_map_bio+0x4f0/0x9d0
btrfs_submit_data_bio+0x24e/0x310
submit_one_bio+0x7f/0xb0
submit_extent_page+0xc4/0x440
__extent_writepage_io+0x2b8/0x5e0
__extent_writepage+0x28d/0x6e0
extent_write_cache_pages+0x4d7/0x7a0
extent_writepages+0xa2/0x110
do_writepages+0x8f/0x180
__writeback_single_inode+0x99/0x7f0
writeback_sb_inodes+0x34e/0x790
__writeback_inodes_wb+0x9e/0x120
wb_writeback+0x4d2/0x660
wb_workfn+0x64d/0xa10
process_one_work+0x53a/0xa80
worker_thread+0x69/0x5b0
kthread+0x20b/0x240
ret_from_fork+0x1f/0x30
Only Kyber uses the hctx, so fix it by passing the request_queue to
->bio_merge() instead. BFQ and mq-deadline just use that, and Kyber can
map the queues itself to avoid the mismatch.
Fixes: a6088845c2 ("block: kyber: make kyber more friendly with merging")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/c7598605401a48d5cfeadebb678abd10af22b83f.1620691329.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add error handling in btrfs_fileattr_set in case of an error while
starting a transaction. This fixes btrfs/232 which otherwise used to
fail with below signature on Power.
btrfs/232 [ 1119.474650] run fstests btrfs/232 at 2021-04-21 02:21:22
<...>
[ 1366.638585] BUG: Unable to handle kernel data access on read at 0xffffffffffffff86
[ 1366.638768] Faulting instruction address: 0xc0000000009a5c88
cpu 0x0: Vector: 380 (Data SLB Access) at [c000000014f177b0]
pc: c0000000009a5c88: btrfs_update_root_times+0x58/0xc0
lr: c0000000009a5c84: btrfs_update_root_times+0x54/0xc0
<...>
pid = 24881, comm = fsstress
btrfs_update_inode+0xa0/0x140
btrfs_fileattr_set+0x5d0/0x6f0
vfs_fileattr_set+0x2a8/0x390
do_vfs_ioctl+0x1290/0x1ac0
sys_ioctl+0x6c/0x120
system_call_exception+0x3d4/0x410
system_call_common+0xec/0x278
Fixes: 97fc297754 ("btrfs: convert to fileattr")
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A couple of high priority core fixes and the usual bits scattered
across individual drivers.
core:
* Fix ioctl handler double free.
* Fix an accidental ABI change wrt to error codes when an IOCTL is not
supported.
gp2ap002:
* Runtime pm imbalance on error.
hid-sensors:
* Fix a Kconfig dependency issue in a particularly crazy config.
mpu3050:
* Fix wrong temperature calculation due to a type needing to be signed.
pulsedlight:
* Runtime pm imbalance on error.
tsl2583
* Fix a potential division by zero.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmCae78RHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0Foi+IRAAr+DfHNTGeyHmQAC2I9q3uE4T9wZniAcm
uD4piZTXh96uCe9ebSBOs7WbM16wv1TKGjoYWLQHGi4fyZPqIRUBCFYJ/VfQrL6h
fAyqJWP3fWEw8T4+Z5UdQgo6Aofg1K0CzJOGshnKACIjt9BDa72AmqUKwqvU0wRb
SEDm8tc/x+a+3s8UjEtHDzHhlE26qQVs0Rxaaln9L8XXMcp9tg03bitfMBcb9nPD
bRqUukBYa4YJ210esDMzrckFxAhBULyWKOOgrIR8TdXveF1sbZE9o9iycOFzsxYz
KIkHjNIiwvsaklHhHXlhiskaJBDC0i0MaEXinne4sawOmSqXIyyDehq19z8qOukc
tsRiWtsZGBD5QRJkB6p1LAk+RpYn0IRFbFSyxn6KBfhZsAjwiqUVAxxN5uDq/g13
C1cp329N0RSEZzDKwjqLbI6osW9sdms8CweazwBYRD5cnsjuvNv1XKjhmZldNwsU
S4smwOvRcyLih9w8PhlxHOPvhhOFO6xY1OmA3uZTiJhNWYW24KDhNU80hkcD3v+s
sPbwrozJzc25gcrc7ujlyerQO3ZQ0Ht7uUCEeG9KxVRU04CAeOdFcu9niYxAm5QZ
kxthMfDXjkjm+yJTI35NImSS9gf6fFrmXj8rj7lnnLbCrnmwDW54l9RdmKQ7EDMR
WuML9jtzDCQ=
=2wUO
-----END PGP SIGNATURE-----
Merge tag 'iio-fixes-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
First set of IIO fixes for the 5.13 cycle
A couple of high priority core fixes and the usual bits scattered
across individual drivers.
core:
* Fix ioctl handler double free.
* Fix an accidental ABI change wrt to error codes when an IOCTL is not
supported.
gp2ap002:
* Runtime pm imbalance on error.
hid-sensors:
* Fix a Kconfig dependency issue in a particularly crazy config.
mpu3050:
* Fix wrong temperature calculation due to a type needing to be signed.
pulsedlight:
* Runtime pm imbalance on error.
tsl2583
* Fix a potential division by zero.
* tag 'iio-fixes-5.13a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: tsl2583: Fix division by a zero lux_val
iio: core: return ENODEV if ioctl is unknown
iio: core: fix ioctl handlers removal
iio: gyro: mpu3050: Fix reported temperature value
iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER
iio: proximity: pulsedlight: Fix rumtime PM imbalance on error
iio: light: gp2ap002: Fix rumtime PM imbalance on error
Screen flickers on Innolux eDP 1.3 panel when clock rate 540000 is in use.
According to the panel vendor, though clock rate 540000 is advertised,
but the max clock rate it really supports is 270000.
Ville Syrjälä mentioned that fast and narrow also breaks some eDP 1.4
panel, so use slow and wide training for all panels to resolve the
issue.
User also confirmed that the new strategy doesn't introduce any
regression on XPS 9380.
v2:
- Use slow and wide for everything.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3384
References: https://gitlab.freedesktop.org/drm/intel/-/issues/272
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421052054.1434718-1-kai.heng.feng@canonical.com
(cherry picked from commit acca7762eb)
Fixes: 2bbd6dba84 ("drm/i915: Try to use fast+narrow link on eDP again and fall back to the old max strategy on failure")
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
"o" isn't a common asm() constraint to use; it triggers an assertion in
assert-enabled builds of LLVM that it's not recognized when targeting
aarch64 (though it appears to fall back to "m"). It's fixed in LLVM 13 now,
but there isn't really a good reason to use "o" in particular here. To
avoid causing build issues for those using assert-enabled builds of earlier
LLVM versions, the constraint needs changing.
Instead, if the point is to retain the __builtin_alloca(), make ptr appear
to "escape" via being an input to an empty inline asm block. This is
preferable anyways, since otherwise this looks like a dead store.
While the use of "r" was considered in
https://lore.kernel.org/lkml/202104011447.2E7F543@keescook/
it was only tested as an output (which looks like a dead store, and wasn't
sufficient).
Use "r" as an input constraint instead, which behaves correctly across
compilers and architectures.
Fixes: 39218ff4c6 ("stack: Optionally randomize kernel stack offset each syscall")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://reviews.llvm.org/D100412
Link: https://bugs.llvm.org/show_bug.cgi?id=49956
Link: https://lore.kernel.org/r/20210419231741.4084415-1-keescook@chromium.org
commit 6579c8d97a ("clk: Mark fwnodes when their clock provider is added")
revealed that clk/bcm/clk-raspberrypi.c driver calls
devm_of_clk_add_hw_provider(), with a NULL dev->of_node, which resulted in a
NULL pointer dereference in of_clk_add_hw_provider() when calling
fwnode_dev_initialized().
Returning 0 is reducing the if conditions in driver code and is being
consistent with the CONFIG_OF=n inline stub that returns 0 when CONFIG_OF
is disabled. The downside is that drivers will maybe register clkdev lookups
when they don't need to and waste some memory.
Fixes: 6579c8d97a ("clk: Mark fwnodes when their clock provider is added")
Fixes: 3c9ea42802 ("clk: Mark fwnodes when their clock provider is added/removed")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20210426065618.588144-1-tudor.ambarus@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Fix swapping of cpu_map and stat_config records.
- Fix dynamic libbpf linking.
- Disallow -c and -F option at the same time in 'perf record'.
- Update headers with the kernel originals.
- Silence warning for JSON ArchStd files.
- Fix a build error on arm64 with clang.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYJls5AAKCRCyPKLppCJ+
J5dwAQDU/rlV0hgXT547BmLLQV8oMjOUeEEzdHtpd0R4Q36UEQEAyPacR5bfH8cZ
u2hfq2SyeMJnoDSPRQrsXQFvVK/xYQs=
=b5bE
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v5.13-2021-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix swapping of cpu_map and stat_config records.
- Fix dynamic libbpf linking.
- Disallow -c and -F option at the same time in 'perf record'.
- Update headers with the kernel originals.
- Silence warning for JSON ArchStd files.
- Fix a build error on arm64 with clang.
* tag 'perf-tools-fixes-for-v5.13-2021-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
tools headers UAPI: Sync perf_event.h with the kernel sources
tools headers cpufeatures: Sync with the kernel sources
tools include UAPI powerpc: Sync errno.h with the kernel headers
tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools headers UAPI: Sync files changed by landlock, quotactl_path and mount_settattr new syscalls
perf tools: Fix a build error on arm64 with clang
tools headers kvm: Sync kvm headers with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
perf tools: Fix dynamic libbpf link
perf session: Fix swapping of cpu_map and stat_config records
perf jevents: Silence warning for ArchStd files
perf record: Disallow -c and -F option at the same time
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Update tools's copy of drm.h headers
Removes this annoying warning:
arch/sh/kernel/traps.c: In function ‘nmi_trap_handler’:
arch/sh/kernel/traps.c:183:15: warning: unused variable ‘cpu’ [-Wunused-variable]
183 | unsigned int cpu = smp_processor_id();
Fixes: fe3f1d5d7c ("sh: Get rid of nmi_count()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210414170517.1205430-1-eric.dumazet@gmail.com
A few of the Documentation .rst files begin with a Unicode
byte order mark (BOM). The BOM may signify endianess for
16-bit or 32-bit encodings or indicate that the text stream
is indeed Unicode. We don't need it for either of those uses.
It may also interfere with (confuse) some software.
Since we don't need it and its use is optional, just delete
the uses of it in Documentation/.
https://en.wikipedia.org/wiki/Byte_order_mark
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Link: https://lore.kernel.org/r/20210506231907.14359-1-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This translation file was replaced by
Documentation/translations/zh_CN/admin-guide/security-bugs.rst
which was created in commit 2d15357100 ("docs/zh_CN: Add
zh_CN/admin-guide/security-bugs.rst").
This is a translation left over from history. Remove it.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Wu XiangCheng <bobwxc@email.cn>
Link: https://lore.kernel.org/r/20210508030741.82655-1-wanjiabing@vivo.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmCZnCIACgkQxWXV+ddt
WDuEvhAAmC+Mkrz25GbQnSIp2FKYCCQK34D0rdghml0Bc0cJcDh3yhgIB6ZTHZ7e
Z+UZu84ISK31OHKDzXtX0MINN2wuU4u4kd6PHtYj0wSVl3cX6E/K5j6YcThfI1Ru
vCW5O87V9SCV5NnykIFt3sbYvsPKtF9lhgPQprj4np+wxaSyNlEF2c+zLTI3J7NV
+8OlM4oi8GocZd1aAwGpVM3qUPyQSHEb9oUEp6aV1ERuAs6LIyeGks3Cag6gjPnq
dYz3jV9HyZB5GtX0dmv4LeRFIog1uFi+SIEFl5RpqhB3sXN3n6XHMka4x20FXiWy
PfX9+Nf4bQGx6F9rGsgHNHQP5dVhHAkZcq3E0n0yshIfNe8wDHBRlmk0wbfj4K7I
VYv85SxEYpigG8KzF5gjiar4EqsaJVQcJioMxVE7z9vrW6xlOWD1lf/ViUZnB3wd
IQEyGz2qOe9eqJD+dnyN7QkN9WKGSUr2p1Q/DngCIwFzKWf1qIlETNXrIL+AZ97r
v4G5mMq9dCxs3s8c5SGbdF9qqK8gEuaV3iWQAoKOciuy6fbc553Q90I1v3OhW+by
j2yVoo3nJbBJBuLBNWPDUlwxQF/EHPQ6nh3fvxNRgwksXgRmqywdJb5dQ8hcKgSH
RsvinJhtKo5rTgtgGgmNvmLAjKIieW1lIVG4ha0O/m49HeaohDE=
=GNNs
-----END PGP SIGNATURE-----
Merge tag 'for-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"First batch of various fixes, here's a list of notable ones:
- fix unmountable seed device after fstrim
- fix silent data loss in zoned mode due to ordered extent splitting
- fix race leading to unpersisted data and metadata on fsync
- fix deadlock when cloning inline extents and using qgroups"
* tag 'for-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: initialize return variable in cleanup_free_space_cache_v1
btrfs: zoned: sanity check zone type
btrfs: fix unmountable seed device after fstrim
btrfs: fix deadlock when cloning inline extents and using qgroups
btrfs: fix race leading to unpersisted data and metadata on fsync
btrfs: do not consider send context as valid when trying to flush qgroups
btrfs: zoned: fix silent data loss after failure splitting ordered extent
Commit 4af22ded0e ("arc: fix memory initialization for systems
with two memory banks") fixed highmem, but for the PAE case it causes
bug messages:
| BUG: Bad page state in process swapper pfn:80000
| page:(ptrval) refcount:0 mapcount:1 mapping:00000000 index:0x0 pfn:0x80000 flags: 0x0()
| raw: 00000000 00000100 00000122 00000000 00000000 00000000 00000000 00000000
| raw: 00000000
| page dumped because: nonzero mapcount
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper Not tainted 5.12.0-rc5-00003-g1e43c377a79f #1
This is because the fix expects highmem to be always less than
lowmem and uses min_low_pfn as an upper zone border for highmem.
max_high_pfn should be ok for both highmem and highmem+PAE cases.
Fixes: 4af22ded0e ("arc: fix memory initialization for systems with two memory banks")
Signed-off-by: Vladimir Isaev <isaev@synopsys.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: stable@vger.kernel.org #5.8 onwards
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
32-bit PAGE_MASK can not be used as a mask for physical addresses
when PAE is enabled. PAGE_MASK_PHYS must be used for physical
addresses instead of PAGE_MASK.
Without this, init gets SIGSEGV if pte_modify was called:
| potentially unexpected fatal signal 11.
| Path: /bin/busybox
| CPU: 0 PID: 1 Comm: init Not tainted 5.12.0-rc5-00003-g1e43c377a79f-dirty
| Insn could not be fetched
| @No matching VMA found
| ECR: 0x00040000 EFA: 0x00000000 ERET: 0x00000000
| STAT: 0x80080082 [IE U ] BTA: 0x00000000
| SP: 0x5f9ffe44 FP: 0x00000000 BLK: 0xaf3d4
| LPS: 0x000d093e LPE: 0x000d0950 LPC: 0x00000000
| r00: 0x00000002 r01: 0x5f9fff14 r02: 0x5f9fff20
| ...
| Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Signed-off-by: Vladimir Isaev <isaev@synopsys.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We have NR_syscall syscalls from [0 .. NR_syscall-1].
However the check for invalid syscall number is "> NR_syscall" as
opposed to >=. This off-by-one error erronesously allows "NR_syscall"
to be treated as valid syscall causeing out-of-bounds access into
syscall-call table ensuing a crash (holes within syscall table have a
invalid-entry handler but this is beyond the array implementing the
table).
This problem showed up on v5.6 kernel when testing glibc 2.33 (v5.10
kernel capable, includng faccessat2 syscall 439). The v5.6 kernel has
NR_syscalls=439 (0 to 438). Due to the bug, 439 passed by glibc was
not handled as -ENOSYS but processed leading to a crash.
Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/48
Reported-by: Shahab Vahedi <shahab@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>