When data digest is enabled we should unmap pdu iovec before handling
the data digest pdu.
Signed-off-by: Elad Grupi <elad.grupi@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
From the base spec, Figure 78:
"Controller Configuration, these fields are defined as parameters to
configure an "I/O Controller (IOC)" and not to configure a "Discovery
Controller (DC).
...
If the controller does not support I/O queues, then this field shall
be read-only with a value of 0h
Just perform this check for I/O controllers.
Fixes: a07b4970f4 ("nvmet: add a generic NVMe target")
Reported-by: Belanger, Martin <Martin.Belanger@dell.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues. If we
happen to fail setting the queue count for any reason, we should not allow
this to be a successful reconnect as I/O has no chance in going through.
Instead just fail and schedule another reconnect.
Reported-by: Chao Leng <lengchao@huawei.com>
Fixes: 7110230719 ("nvme-rdma: add a NVMe over Fabrics RDMA host driver")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues. If we
happen to fail setting the queue count for any reason, we should not
allow this to be a successful reconnect as I/O has no chance in going
through. Instead just fail and schedule another reconnect.
Fixes: 3f2304f8c6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
For our pure advisory use-case, we only rely on this call as a hint, so
fix the warning complaints of using the smp_processor_id variants with
preemption enabled.
Fixes: db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq context")
Fixes: ada8317721 ("nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPT")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When the controller sends us a 0-length r2t PDU we should not attempt to
try to set up a h2cdata PDU but rather conclude that this is a buggy
controller (forward progress is not possible) and simply fail it
immediately.
Fixes: 3f2304f8c6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Belanger, Martin <Martin.Belanger@dell.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We voluntarily limit the Write Zeroes sizes to the MDTS value provided by
the hardware, but currently get the units wrong, so fix that.
Fixes: 6e02318eae ("nvme: add support for the Write Zeroes command")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
To avoid an error recovery deadlock where the keep alive work is waiting
for a request and thus can't be flushed to make progress for tearing down
the controller. Also print the error code returned from
blk_mq_alloc_request to help debugging any future issues in this code.
Based on an earlier patch from Hannes Reinecke <hare@suse.de>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Merge nvme_keep_alive into its only caller to prepare for additional
changes to this code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Fabrics drivers currently reserve two tags on the admin queue. But
given that the connect command is only run on a freshly created queue
or after all commands have been force aborted we only need to reserve
a single tag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
[Why?]
Should only reroute gamut remap to mpc unless 3D LUT is not used and all
planes are using the same src->dest.
[How?]
Remove DCN30 specific logic for rerouting gamut remap to mpc.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
DCN30 needs to correctly program reversed gamma curve, which DCN20
already has.
Also needs to fix a bug that 252-255 values are clipped.
[How]
Apply two fixes into DCN30.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Calvin Hou <Calvin.Hou@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Acked-by: Vladimir Stempen <Vladimir.Stempen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
In the situations where the DWC3 gadget stops active transfers, once
calling the dwc3_gadget_giveback(), there is a chance where a function
driver can queue a new USB request in between the time where the dwc3
lock has been released and re-aquired. This occurs after we've already
issued an ENDXFER command. When the stop active transfers continues
to remove USB requests from all dep lists, the newly added request will
also be removed, while controller still has an active TRB for it.
This can lead to the controller accessing an unmapped memory address.
Fix this by ensuring parameters to prevent EP queuing are set before
calling the stop active transfers API.
Fixes: ae7e86108b ("usb: dwc3: Stop active transfers before halting the controller")
Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Link: https://lore.kernel.org/r/1615507142-23097-1-git-send-email-wcheng@codeaurora.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tcpm-source-psy- does not invoke power_supply_changed API when
one of the published power supply properties is changed.
power_supply_changed needs to be called to notify
userspace clients(uevents) and kernel clients.
Fixes: f2a8aa053c ("typec: tcpm: Represent source supply through power_supply")
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210317181249.1062995-1-badhri@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the unused "u32 vdo[3]" part in the tps6598x_rx_identity_reg
struct. This helps avoid "failed to register partner" errors which
happen when tps6598x_read_partner_identity() fails because the
amount of data read is 12 bytes smaller than the struct size.
Note that vdo[3] is already in usb_pd_identity and hence
shouldn't be added to tps6598x_rx_identity_reg as well.
Fixes: f6c56ca91b ("usb: typec: Add the Product Type VDOs to struct usb_pd_identity")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Elias Rudberg <mail@eliasrudberg.se>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210311124710.6563-1-mail@eliasrudberg.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Matthias reports that the Amazon Kindle automatically removes its
emulated media if it doesn't receive another SCSI command within about
one second after a SYNCHRONIZE CACHE. It does so even when the host
has sent a PREVENT MEDIUM REMOVAL command. The reason for this
behavior isn't clear, although it's not hard to make some guesses.
At any rate, the results can be unexpected for anyone who tries to
access the Kindle in an unusual fashion, and in theory they can lead
to data loss (for example, if one file is closed and synchronized
while other files are still in the middle of being written).
To avoid such problems, this patch creates a new usb-storage quirks
flag telling the driver always to issue a REQUEST SENSE following a
SYNCHRONIZE CACHE command, and adds an unusual_devs entry for the
Kindle with the flag set. This is sufficient to prevent the Kindle
from doing its automatic unload, without interfering with proper
operation.
Another possible way to deal with this would be to increase the
frequency of TEST UNIT READY polling that the kernel normally carries
out for removable-media storage devices. However that would increase
the overall load on the system and it is not as reliable, because the
user can override the polling interval. Changing the driver's
behavior is safer and has minimal overhead.
CC: <stable@vger.kernel.org>
Reported-and-tested-by: Matthias Schwarzott <zzam@gentoo.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20210317190654.GA497856@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When gadget is disconnected, running sequence is like this.
. composite_disconnect
. Call trace:
usb_string_copy+0xd0/0x128
gadget_config_name_configuration_store+0x4
gadget_config_name_attr_store+0x40/0x50
configfs_write_file+0x198/0x1f4
vfs_write+0x100/0x220
SyS_write+0x58/0xa8
. configfs_composite_unbind
. configfs_composite_bind
In configfs_composite_bind, it has
"cn->strings.s = cn->configuration;"
When usb_string_copy is invoked. it would
allocate memory, copy input string, release previous pointed memory space,
and use new allocated memory.
When gadget is connected, host sends down request to get information.
Call trace:
usb_gadget_get_string+0xec/0x168
lookup_string+0x64/0x98
composite_setup+0xa34/0x1ee8
If gadget is disconnected and connected quickly, in the failed case,
cn->configuration memory has been released by usb_string_copy kfree but
configfs_composite_bind hasn't been run in time to assign new allocated
"cn->configuration" pointer to "cn->strings.s".
When "strlen(s->s) of usb_gadget_get_string is being executed, the dangling
memory is accessed, "BUG: KASAN: use-after-free" error occurs.
Cc: stable@vger.kernel.org
Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/1615444961-13376-1-git-send-email-macpaul.lin@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently udc->ud.tcp_rx is being assigned twice, the second assignment
is incorrect, it should be to udc->ud.tcp_tx instead of rx. Fix this.
Fixes: 46613c9dfa ("usbip: fix vudc usbip_sockfd_store races leading to gpf")
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: stable <stable@vger.kernel.org>
Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/20210311104445.7811-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
MODULE_SUPPORTED_DEVICE was added in pre-git era and never was
implemented. We can safely remove it, because the kernel has grown
to have many more reliable mechanisms to determine if device is
supported or not.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
table failed to allocate due to a big number of states (Manaf
Meethalavalappu Pallikunhi).
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmBRxJ4ACgkQqDIjiipP
6E+84AgAmp7T+2VaQjukte76KH0gsg5v1mpXT+LK2HUC9ofH76eHTwY7M/6IyLmt
1V5CBetBe393ZvFn3SF2o5ipwnRWufHLgyoTO+AYLTWvJyBQC83rLUjUGqozMpOU
XFO1kLUvxYso//FS6wC3//Dfs8oMLx6SBY7k2qPTG2p0sahNC0QzSar/5Nbls9Ev
YEEpNzvd+FVMNSyZiYrMYlZv6VcFEn7ZPZLLj24Eut7A61Pbg8fc6iV8Ku26roiK
AQs5iZ0eHyzz1uW9I6obi+77OB9Fk8tpJJonlnyzZ2diMFbeaIawbhjzBMWzUgeb
rIJizEsNzlZHqre8ADbZ8yRgpITvcQ==
=rftQ
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal framework fix from Daniel Lezcano:
"Fix NULL pointer access when the cooling device transition stats
table failed to allocate due to a big number of states (Manaf
Meethalavalappu Pallikunhi)"
* tag 'thermal-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/core: Add NULL pointer check before using cooling device stats
r is "u32" always >= 0,mipi_dsi_create_packet may return little than zero.
so r < 0 condition is never accessible.
Fixes coccicheck warnings:
./drivers/gpu/drm/omapdrm/dss/dsi.c:2155:5-6:
WARNING: Unsigned expression compared with zero: r < 0
Signed-off-by: Junlin Yang <yangjunlin@yulong.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210312071445.1721-1-angkery@163.com
SAMPLE_OA parameter enables sampling of OA buffer and results in a call
to init the OA buffer which initializes the OA unit head/tail pointers.
The OA_EXPONENT parameter controls the periodicity of the OA reports in
the OA buffer and results in starting a hrtimer.
Before gen12, all use cases required the use of the OA buffer and i915
enforced this setting when vetting out the parameters passed. In these
platforms the hrtimer was enabled if OA_EXPONENT was passed. This worked
fine since it was implied that SAMPLE_OA is always passed.
With gen12, this changed. Users can use perf without enabling the OA
buffer as in OAR use cases. While an OAR use case should ideally not
start the hrtimer, we see that passing an OA_EXPONENT parameter will
start the hrtimer even though SAMPLE_OA is not specified. This results
in an uninitialized OA buffer, so the head/tail pointers used to track
the buffer are zero.
This itself does not fail, but if we ran a use-case that SAMPLED the OA
buffer previously, then the OA_TAIL register is still pointing to an old
value. When the timer callback runs, it ends up calculating a
wrong/large number of available reports. Since we do a spinlock_irq_save
and start processing a large number of reports, NMI watchdog fires and
causes a crash.
Start the timer only if SAMPLE_OA is specified.
v2:
- Drop SAMPLE OA check when appending samples (Ashutosh)
- Prevent read if OA buffer is not being sampled
Fixes: 00a7f0d715 ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210305210947.58751-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit be0bdd67fd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
On HSW/BDW with VT-d active the first tile row scanned out
after the first async flip of the frame often ends up corrupted.
Whether the corruption happens or not depends on the scanline
on which the async flip happens, but the behaviour seems very
consistent. Ie. the same set of scanlines (which are most scanlines)
always show the corruption. And another set of scanlines (far less
of them) never shows the corruption.
I discovered that disabling the fetch-stride stretching
feature cures the corruption. This is some kind of TLB related
prefetch thing AFAIK. We already disable it on SNB primary
planes due to a documented workaround. The hardware folks
indicated that disabling this should be fine, so let's go
with that.
And while we're here, let's document the relevant bits on all
pre-skl platforms.
Fixes: 2a636e240c ("drm/i915: Implement async flip for ivb/hsw")
Fixes: cda195f13a ("drm/i915: Implement async flips for bdw")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210220103303.3448-1-ville.syrjala@linux.intel.com
Reviewed-by: Karthik B S <karthik.b.s@intel.com>
(cherry picked from commit b7a7053ab2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In ww_acquire_init(), mutex_acquire() is gated by CONFIG_DEBUG_LOCK_ALLOC.
The dep_map in the ww_acquire_ctx structure is also gated by the
same config. However mutex_release() in ww_acquire_fini() is gated by
CONFIG_DEBUG_MUTEXES. It is possible to set CONFIG_DEBUG_MUTEXES without
setting CONFIG_DEBUG_LOCK_ALLOC though it is an unlikely configuration.
That may cause a compilation error as dep_map isn't defined in this
case. Fix this potential problem by enclosing mutex_release() inside
CONFIG_DEBUG_LOCK_ALLOC.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210316153119.13802-3-longman@redhat.com
The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the
function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx)
combination is repetitive.
In fact, ww_ctx should not be used at all if !use_ww_ctx. Simplify
ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() an
clear ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx &&
ww_ctx) by just (ww_ctx).
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20210316153119.13802-2-longman@redhat.com
There is a possible chance that some cooling device stats buffer
allocation fails due to very high cooling device max state value.
Later cooling device update sysfs can try to access stats data
for the same cooling device. It will lead to NULL pointer
dereference issue.
Add a NULL pointer check before accessing thermal cooling device
stats data. It fixes the following bug
[ 26.812833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
[ 27.122960] Call trace:
[ 27.122963] do_raw_spin_lock+0x18/0xe8
[ 27.122966] _raw_spin_lock+0x24/0x30
[ 27.128157] thermal_cooling_device_stats_update+0x24/0x98
[ 27.128162] cur_state_store+0x88/0xb8
[ 27.128166] dev_attr_store+0x40/0x58
[ 27.128169] sysfs_kf_write+0x50/0x68
[ 27.133358] kernfs_fop_write+0x12c/0x1c8
[ 27.133362] __vfs_write+0x54/0x160
[ 27.152297] vfs_write+0xcc/0x188
[ 27.157132] ksys_write+0x78/0x108
[ 27.162050] ksys_write+0xf8/0x108
[ 27.166968] __arm_smccc_hvc+0x158/0x4b0
[ 27.166973] __arm_smccc_hvc+0x9c/0x4b0
[ 27.186005] el0_svc+0x8/0xc
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1607367181-24589-1-git-send-email-manafm@codeaurora.org
As per UEFI spec 2.8B section 8.2, EFI_UNSUPPORTED may be returned by
EFI variable runtime services if no variable storage is supported by
firmware. In this case, there is no point for kernel to continue
efivars initialization. That said, efivar_init() should fail by
returning an error code, so that efivarfs will not be mounted on
/sys/firmware/efi/efivars at all. Otherwise, user space like efibootmgr
will be confused by the EFIVARFS_MAGIC seen there, while EFI variable
calls cannot be made successfully.
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
When KVM_REQ_MASTERCLOCK_UPDATE request is issued (e.g. after migration)
we need to make sure no vCPU sees stale values in PV clock structures and
thus all vCPUs are kicked with KVM_REQ_CLOCK_UPDATE. Hyper-V TSC page
clocksource is global and kvm_guest_time_update() only updates in on vCPU0
but this is not entirely correct: nothing blocks some other vCPU from
entering the guest before we finish the update on CPU0 and it can read
stale values from the page.
Invalidate TSC page in kvm_gen_update_masterclock() to switch all vCPUs
to using MSR based clocksource (HV_X64_MSR_TIME_REF_COUNT).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210316143736.964151-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HV_X64_MSR_TSC_EMULATION_STATUS indicates whether TSC accesses are emulated
after migration (to accommodate for a different host TSC frequency when TSC
scaling is not supported; we don't implement this in KVM). Guest can use
the same MSR to stop TSC access emulation by writing zero. Writing anything
else is forbidden.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210316143736.964151-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two issues for RV32,
1) if use FLATMEM, it is useless to enable SPARSEMEM_STATIC.
2) if use SPARSMEM, both SPARSEMEM_VMEMMAP and SPARSEMEM_STATIC is enabled.
Fixes: d95f1a542c ("RISC-V: Implement sparsemem")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Without this I get a missing prototype warning.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Make sure that writes to kernel page table during KASAN vmalloc
initialization are made visible by adding a sfence.vma.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
init_resources() allocates an array of resources, based on the current
total number of memory regions and reserved memory regions. However,
allocating this array using memblock_alloc() might increase the number
of reserved memory regions. If that happens, populating the array later
based on the new number of regions will cause out-of-bounds writes
beyond the end of the allocated array.
Fix this by allocating one more entry, which may or may not be used.
Fixes: 797f0375dd ("RISC-V: Do not allocate memblock while iterating reserved memblocks")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
When CONFIG_SOC_CANAAN is selected, the K210 sysctl driver is always
compiled. Since this driver early init function calls the function
k210_clk_early_init() implemented by the K210 clk driver, this driver
must also always be selected for compilation ot avoid build failures.
Avoid such build failures by always selecting CONFIG_COMMON_CLK and
CONFIG_COMMON_CLK_K210 when CONFIG_SOC_CANAAN is enabled.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Fixes: c6ca7616f7 ("clk: Add RISC-V Canaan Kendryte K210 clock driver")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
There is a spelling mistake in a comment, fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Use BUG_ON instead of a if condition followed by BUG.
Generated by: scripts/coccinelle/misc/bugon.cocci
Fixes: c22b0bcb1d ("riscv: Add kprobes supported")
CC: Guo Ren <guoren@linux.alibaba.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
mpt3sas_get_port_by_id() can be called when a spinlock is held. Use
GFP_ATOMIC instead of GFP_KERNEL when allocating memory.
Issue spotted by call_kern.cocci:
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 7125 inside lock on line 7123 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 6842 inside lock on line 6839 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 6854 inside lock on line 6851 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 7706 inside lock on line 7702 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 10260 inside lock on line 10256 but uses GFP_KERNEL
Link: https://lore.kernel.org/r/20210220093951.905362-1-christophe.jaillet@wanadoo.fr
Fixes: 324c122fc0 ("scsi: mpt3sas: Add module parameter multipath_on_hba")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Both add_slot_store() and remove_slot_store() try to fix up the
drc_name copied from the store buffer by placing a NUL terminator at
nbyte + 1 or in place of a '\n' if present. However, the static buffer
that we copy the drc_name data into is not zeroed and can contain
anything past the n-th byte.
This is problematic if a '\n' byte appears in that buffer after nbytes
and the string copied into the store buffer was not NUL terminated to
start with as the strchr() search for a '\n' byte will mark this
incorrectly as the end of the drc_name string resulting in a drc_name
string that contains garbage data after the n-th byte.
Additionally it will cause us to overwrite that '\n' byte on the stack
with NUL, potentially corrupting data on the stack.
The following debugging shows an example of the drmgr utility writing
"PHB 4543" to the add_slot sysfs attribute, but add_slot_store()
logging a corrupted string value.
drmgr: drmgr: -c phb -a -s PHB 4543 -d 1
add_slot_store: drc_name = PHB 4543°|<82>!, rc = -19
Fix this by using strscpy() instead of memcpy() to ensure the string
is NUL terminated when copied into the static drc_name buffer.
Further, since the string is now NUL terminated the code only needs to
change '\n' to '\0' when present.
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mpe: Reformat change log and add mention of possible stack corruption]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210315214821.452959-1-tyreld@linux.ibm.com
The "lpm" and "->enabled" are all boolean. We should be using &&
rather than the bit operator.
Link: https://lore.kernel.org/r/1615896915-148864-1-git-send-email-dj0227@163.com
Fixes: 488edafb11 ("scsi: ufs-mediatek: Introduce low-power mode for device power supply")
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: dongjian <dongjian@yulong.com>
Signed-off-by: Yue Hu <huyue2@yulong.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In zonefs_open_zone(), if opened zone count is larger than
.s_max_open_zones threshold, we missed to recover .i_wr_refcnt,
fix this.
Fixes: b5c00e9757 ("zonefs: open/close zone on file open/close")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Commit 6654111c89 ("MIPS: vmlinux.lds.S: align raw appended dtb to 8
bytes") changed the alignment from STRUCT_ALIGNMENT bytes to 8 bytes.
The commit's message makes it sound like it was actually done on
purpose, but this is not the case. The commit was written when raw
appended dtb were not aligned at all. The STRUCT_ALIGN() was added a few
days before, in commit 7a05293af3 ("MIPS: boot/compressed: Copy DTB to
aligned address"). The true purpose of the commit was not to align
specifically to 8 bytes, but to make sure that the generated vmlinux'
size was properly padded to the alignment required for DTBs.
While the switch to 8-byte alignment worked for vmlinux-appended dtb
blobs, it broke vmlinuz-appended dtb blobs, as the decompress routine
moves the blob to a STRUCT_ALIGNMENT aligned address.
Fix this by changing the raw appended dtb blob alignment from 8 bytes
back to STRUCT_ALIGNMENT bytes in vmlinux.lds.S.
Fixes: 6654111c89 ("MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes")
Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Save the current_thread_info()->status of X86 in the new
restart_block->arch_data field so TS_COMPAT_RESTART can be removed again.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210201174716.GA17898@redhat.com
The comment in get_nr_restart_syscall() says:
* The problem is that we can get here when ptrace pokes
* syscall-like values into regs even if we're not in a syscall
* at all.
Yes, but if not in a syscall then the
status & (TS_COMPAT|TS_I386_REGS_POKED)
check below can't really help:
- TS_COMPAT can't be set
- TS_I386_REGS_POKED is only set if regs->orig_ax was changed by
32bit debugger; and even in this case get_nr_restart_syscall()
is only correct if the tracee is 32bit too.
Suppose that a 64bit debugger plays with a 32bit tracee and
* Tracee calls sleep(2) // TS_COMPAT is set
* User interrupts the tracee by CTRL-C after 1 sec and does
"(gdb) call func()"
* gdb saves the regs by PTRACE_GETREGS
* does PTRACE_SETREGS to set %rip='func' and %orig_rax=-1
* PTRACE_CONT // TS_COMPAT is cleared
* func() hits int3.
* Debugger catches SIGTRAP.
* Restore original regs by PTRACE_SETREGS.
* PTRACE_CONT
get_nr_restart_syscall() wrongly returns __NR_restart_syscall==219, the
tracee calls ia32_sys_call_table[219] == sys_madvise.
Add the sticky TS_COMPAT_RESTART flag which survives after return to user
mode. It's going to be removed in the next step again by storing the
information in the restart block. As a further cleanup it might be possible
to remove also TS_I386_REGS_POKED with that.
Test-case:
$ cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests
$ gcc -o erestartsys-trap-debuggee ptrace-tests/tests/erestartsys-trap-debuggee.c --m32
$ gcc -o erestartsys-trap-debugger ptrace-tests/tests/erestartsys-trap-debugger.c -lutil
$ ./erestartsys-trap-debugger
Unexpected: retval 1, errno 22
erestartsys-trap-debugger: ptrace-tests/tests/erestartsys-trap-debugger.c:421
Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174709.GA17895@redhat.com
Move TS_COMPAT back to asm/thread_info.h, close to TS_I386_REGS_POKED.
It was moved to asm/processor.h by b9d989c721 ("x86/asm: Move the
thread_info::status field to thread_struct"), then later 37a8f7c383
("x86/asm: Move 'status' from thread_struct to thread_info") moved the
'status' field back but TS_COMPAT was forgotten.
Preparatory patch to fix the COMPAT case for get_nr_restart_syscall()
Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174649.GA17880@redhat.com
Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT.
Add a new helper which sets restart_block->fn and calls a dummy
arch_set_restart_data() helper.
Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
On a Haswell machine, the perf_fuzzer managed to trigger this message:
[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to
write 0x0400000000000000) at rIP: 0xffffffff8106e4f4
(native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685] intel_pmu_pebs_enable_all+0x31/0x40
[117248.097737] intel_pmu_enable_all+0xa/0x10
[117248.102210] __perf_event_task_sched_in+0x2df/0x2f0
[117248.107511] finish_task_switch.isra.0+0x15f/0x280
[117248.112765] schedule_tail+0xc/0x40
[117248.116562] ret_from_fork+0x8/0x30
A fake event called VLBR_EVENT may use the bit 58 of the PEBS_ENABLE, if
the precise_ip is set. The bit 58 is reserved by the HW. Accessing the
bit causes the unchecked MSR access error.
The fake event doesn't support PEBS. The case should be rejected.
Fixes: 097e4311cd ("perf/x86: Add constraint to create guest LBR event without hw counter")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1615555298-140216-2-git-send-email-kan.liang@linux.intel.com
A repeatable crash can be triggered by the perf_fuzzer on some Haswell
system.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/
For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of the defect, the
commit was introduced to try to avoid dropping the PEBS record for some
cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record, which may trigger the crash, especially for
the large PEBS.
It's possible that all the PEBS records in a large PEBS have the PEBS
status 0. If so, the first get_next_pebs_record_by_bit() in the
__intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
PEBS, the 'count' parameter must > 1. The second
get_next_pebs_record_by_bit() will crash.
Besides the local pebs_status, correct the PEBS status in the PEBS
record as well.
Fixes: 01330d7288 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
When freeing a tree block we may end up adding its extent back to the
free space cache/tree, as long as there are no more references for it,
it was created in the current transaction and writeback for it never
happened. This is generally fine, however when we have tree mod log
operations it can result in inconsistent versions of a btree after
unwinding extent buffers with the recorded tree mod log operations.
This is because:
* We only log operations for nodes (adding and removing key/pointers),
for leaves we don't do anything;
* This means that we can log a MOD_LOG_KEY_REMOVE_WHILE_FREEING operation
for a node that points to a leaf that was deleted;
* Before we apply the logged operation to unwind a node, we can have
that leaf's extent allocated again, either as a node or as a leaf, and
possibly for another btree. This is possible if the leaf was created in
the current transaction and writeback for it never started, in which
case btrfs_free_tree_block() returns its extent back to the free space
cache/tree;
* Then, before applying the tree mod log operation, some task allocates
the metadata extent just freed before, and uses it either as a leaf or
as a node for some btree (can be the same or another one, it does not
matter);
* After applying the MOD_LOG_KEY_REMOVE_WHILE_FREEING operation we now
get the target node with an item pointing to the metadata extent that
now has content different from what it had before the leaf was deleted.
It might now belong to a different btree and be a node and not a leaf
anymore.
As a consequence, the results of searches after the unwinding can be
unpredictable and produce unexpected results.
So make sure we pin extent buffers corresponding to leaves when there
are tree mod log users.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>