Commit Graph

1029818 Commits

Author SHA1 Message Date
Íñigo Huguet
d2a16bde77 sfc: add logs explaining XDP_TX/REDIRECT is not available
If it's not possible to allocate enough channels for XDP, XDP_TX and
XDP_REDIRECT don't work. However, only a message saying that not enough
channels were available was shown, but not saying what are the
consequences in that case. The user didn't know if he/she can use XDP
or not, if the performance is reduced, or what.

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 10:02:41 -07:00
Íñigo Huguet
788bc000d4 sfc: ensure correct number of XDP queues
Commit 99ba0ea616 ("sfc: adjust efx->xdp_tx_queue_count with the real
number of initialized queues") intended to fix a problem caused by a
round up when calculating the number of XDP channels and queues.
However, this was not the real problem. The real problem was that the
number of XDP TX queues had been reduced to half in
commit e26ca4b535 ("sfc: reduce the number of requested xdp ev queues"),
but the variable xdp_tx_queue_count had remained the same.

Once the correct number of XDP TX queues is created again in the
previous patch of this series, this also can be reverted since the error
doesn't actually exist.

Only in the case that there is a bug in the code we can have different
values in xdp_queue_number and efx->xdp_tx_queue_count. Because of this,
and per Edward Cree's suggestion, I add instead a WARN_ON to catch if it
happens again in the future.

Note that the number of allocated queues can be higher than the number
of used ones due to the round up, as explained in the existing comment
in the code. That's why we also have to stop increasing xdp_queue_number
beyond efx->xdp_tx_queue_count.

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 10:02:41 -07:00
Íñigo Huguet
f28100cb9c sfc: fix lack of XDP TX queues - error XDP TX failed (-22)
Fixes: e26ca4b535 sfc: reduce the number of requested xdp ev queues

The buggy commit intended to allocate less channels for XDP in order to
be more unlikely to reach the limit of 32 channels of the driver.

The idea was to use each IRQ/eventqeue for more XDP TX queues than
before, calculating which is the maximum number of TX queues that one
event queue can handle. For example, in EF10 each event queue could
handle up to 8 queues, better than the 4 they were handling before the
change. This way, it would have to allocate half of channels than before
for XDP TX.

The problem is that the TX queues are also contained inside the channel
structs, and there are only 4 queues per channel. Reducing the number of
channels means also reducing the number of queues, resulting in not
having the desired number of 1 queue per CPU.

This leads to getting errors on XDP_TX and XDP_REDIRECT if they're
executed from a high numbered CPU, because there only exist queues for
the low half of CPUs, actually. If XDP_TX/REDIRECT is executed in a low
numbered CPU, the error doesn't happen. This is the error in the logs
(repeated many times, even rate limited):
sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)

This errors happens in function efx_xdp_tx_buffers, where it expects to
have a dedicated XDP TX queue per CPU.

Reverting the change makes again more likely to reach the limit of 32
channels in machines with many CPUs. If this happen, no XDP_TX/REDIRECT
will be possible at all, and we will have this log error messages:

At interface probe:
sfc 0000:5e:00.0: Insufficient resources for 12 XDP event queues (24 other channels, max 32)

At every subsequent XDP_TX/REDIRECT failure, rate limited:
sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)

However, without reverting the change, it makes the user to think that
everything is OK at probe time, but later it fails in an unpredictable
way, depending on the CPU that handles the packet.

It is better to restore the predictable behaviour. If the user sees the
error message at probe time, he/she can try to configure the best way it
fits his/her needs. At least, he/she will have 2 options:
- Accept that XDP_TX/REDIRECT is not available (he/she may not need it)
- Load sfc module with modparam 'rss_cpus' with a lower number, thus
  creating less normal RX queues/channels, letting more free resources
  for XDP, with some performance penalty.

Anyway, let the calculation of maximum TX queues that can be handled by
a single event queue, and use it only if it's less than the number of TX
queues per channel. This doesn't happen in practice, but could happen if
some constant values are tweaked in the future, such us
EFX_MAX_TXQ_PER_CHANNEL, EFX_MAX_EVQ_SIZE or EFX_MAX_DMAQ_SIZE.

Related mailing list thread:
https://lore.kernel.org/bpf/20201215104327.2be76156@carbon/

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 10:02:41 -07:00
Gustavo A. R. Silva
2e7ea96924 cpufreq: Fix fall-through warning for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a
fallthrough warning by simply dropping the empty default case at
the bottom.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-13 11:53:07 -05:00
Pavel Skripkin
deb7178eb9 net: fddi: fix UAF in fza_probe
fp is netdev private data and it cannot be
used after free_netdev() call. Using fp after free_netdev()
can cause UAF bug. Fix it by moving free_netdev() after error message.

Fixes: 61414f5ec9 ("FDDI: defza: Add support for DEC FDDIcontroller 700
TURBOchannel adapter")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:43:50 -07:00
Vladimir Oltean
b0b33b048d net: dsa: sja1105: fix address learning getting disabled on the CPU port
In May 2019 when commit 640f763f98 ("net: dsa: sja1105: Add support
for Spanning Tree Protocol") was introduced, the comment that "STP does
not get called for the CPU port" was true. This changed after commit
0394a63acf ("net: dsa: enable and disable all ports") in August 2019
and went largely unnoticed, because the sja1105_bridge_stp_state_set()
method did nothing different compared to the static setup done by
sja1105_init_mac_settings().

With the ability to turn address learning off introduced by the blamed
commit, there is a new priv->learn_ena port mask in the driver. When
sja1105_bridge_stp_state_set() gets called and we are in
BR_STATE_LEARNING or later, address learning is enabled or not depending
on priv->learn_ena & BIT(port).

So what happens is that priv->learn_ena is not being set from anywhere
for the CPU port, and the static configuration done by
sja1105_init_mac_settings() is being overwritten.

To solve this, acknowledge that the static configuration of STP state is
no longer necessary because the STP state is being set by the DSA core
now, but what is necessary is to set priv->learn_ena for the CPU port.

Fixes: 4d94235495 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:32:41 -07:00
Vladimir Oltean
e56c6bbd98 net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload
The point with a *dev and a *brport_dev is that when we have a LAG net
device that is a bridge port, *dev is an ocelot net device and
*brport_dev is the bonding/team net device. The ocelot net device
beneath the LAG does not exist from the bridge's perspective, so we need
to sync the switchdev objects belonging to the brport_dev and not to the
dev.

Fixes: e4bd44e89d ("net: ocelot: replay switchdev events when joining bridge")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:30:46 -07:00
Yajun Deng
01757f536a net: Use nlmsg_unicast() instead of netlink_unicast()
It has 'if (err >0 )' statement in nlmsg_unicast(), so use nlmsg_unicast()
instead of netlink_unicast(), this looks more concise.

v2: remove the change in netfilter.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13 09:28:29 -07:00
Aaron Liu
adefab4ef3 drm/amd/pm: Add waiting for response of mode-reset message for yellow carp
Remove mdelay process and use smu_cmn_send_smc_msg_with_param to send
mode-reset message to SMC.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:50 -04:00
Eric Huang
5adcd7458a Revert "drm/amdkfd: Add heavy-weight TLB flush after unmapping"
This reverts commit 1098d658be.

Reason for revert: it causes regressions on several Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:41 -04:00
Eric Huang
d605094394 Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update"
This reverts commit 075e8080c1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:30 -04:00
Eric Huang
c37387c354 Revert "drm/amdkfd: Make TLB flush conditional on mapping"
This reverts commit 31f3324378.

Reason for revert: it causes regressions on several Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:22 -04:00
Eric Huang
22762e3766 Revert "drm/amdgpu: Fix warning of Function parameter or member not described"
This reverts commit 7a68d188d1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:14 -04:00
Eric Huang
f5cc09acec Revert "drm/amdkfd: Add memory sync before TLB flush on unmap"
This reverts commit 3be4dca197.

Reason for revert: it causes regressions on several Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:07 -04:00
Chengming Gui
06055d2e1c drm/amd/pm: Fix BACO state setting for Beige_Goby
Correct BACO state setting for Beige_Goby

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:56 -04:00
Emily.Deng
9be26ddf88 drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:48 -04:00
Felix Kuehling
99e7d65ccc drm/amdkfd: Allow CPU access for all VRAM BOs
The thunk needs to mmap all BOs for CPU access to allow the debugger to
access them. Invisible ones are mapped with PROT_NONE.

Fixes: 71df0368e9 ("drm/amdgpu: Implement mmap as GEM object function")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:38 -04:00
Zhan Liu
c010efb7f0 drm/amdgpu/display - only update eDP's backlight level when necessary
[Why]
The original logic is to update eDP's backlight level
on every amdgpu dm atomic commit, which causes excessive
DMUB write. As a result, when playing game or moving window
around, DMUB timeout and system lagging are observed.

[How]
We only need to update eDP's backlight level when current level
doesn't match requested level.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:30 -04:00
Philip Yang
5017bf8214 drm/amdkfd: handle fault counters on invalid address
prange is NULL if vm fault retry on invalid address, for this case, can
not use prange to get pdd, use adev to get gpuidx and then get pdd
instead, then increase pdd vm fault counter.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:13 -04:00
Emily Deng
fa8f311e9e drm/amdgpu: Correct the irq numbers for virtual crtc
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:12 -04:00
Xiaomeng Hou
834b8245d6 drm/amd/display: update header file name
Update the register header file name.

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:12 -04:00
Xiaomeng Hou
21cf0293d5 drm/amd/pm: drop smu_v13_0_1.c|h files for yellow carp
Since there's nothing special in smu implementation for yellow carp,
it's better to reuse the common smu_v13_0 interfaces and drop the
specific smu_v13_0_1.c|h files.

v2: remove the duplicate register offset and shift mask header files as
well.

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:11 -04:00
Dmytro Laktyushkin
9849e71ac0 drm/amd/display: remove faulty assert
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:11 -04:00
Wesley Chalmers
dce9d910eb Revert "drm/amd/display: Always write repeater mode regardless of LTTPR"
This reverts commit 2b7605d73b

Some displays are not lighting up when put in LTTPR Transparent Mode

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-13 11:48:11 -04:00
Nicholas Kazlauskas
e9cfe00ba8 drm/amd/display: Fix updating infoframe for DCN3.1 eDP
[Why]
We're only treating TMDS as a valid target for infoframe updates which
results in PSR being unable to transition from state 4 to state 5.

[How]
Also allow infoframe updates for DCN3.1 - following how we handle
this path for earlier ASIC as well.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:10 -04:00
Luben Tuikov
43a44c5322 drm/amdgpu: Return error if no RAS
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.

Also, if no pointers are set, don't count, since
we've no way of reporting the counts.

Also, give this function a kernel-doc.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: Tom Rix <trix@redhat.com>
Fixes: a46751fbcd ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:10 -04:00
Jingwen Chen
798c511548 drm/amdgpu: SRIOV flr_work should take write_lock
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.

[How]
flr_work should take write_lock to avoid this case.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:09 -04:00
Nathan Chancellor
8cdd23c23c arm64: Restrict ARM64_BTI_KERNEL to clang 12.0.0 and newer
Commit 97fed779f2 ("arm64: bti: Provide Kconfig for kernel mode BTI")
disabled CONFIG_ARM64_BTI_KERNEL when CONFIG_GCOV_KERNEL was enabled and
compiling with clang because of warnings that were seen with
allmodconfig because LLVM was not emitting PAC/BTI instructions for
compiler generated functions:

  | warning: some functions compiled with BTI and some compiled without BTI
  | warning: not setting BTI in feature flags

This dependency was fine for avoiding the warnings with allmodconfig
until commit 51c2ee6d12 ("Kconfig: Introduce ARCH_WANTS_NO_INSTR and
CC_HAS_NO_PROFILE_FN_ATTR"), which prevents CONFIG_GCOV_KERNEL from
being enabled with clang 12.0.0 or older because those versions do not
support the no_profile_instrument_function attribute.

As a result, CONFIG_ARM64_BTI_KERNEL gets enabled with allmodconfig and
there are more warnings like the ones above due to CONFIG_KASAN, which
suffers from the same problem as CONFIG_GCOV_KERNEL. This was most
likely not noticed at the time because allmodconfig +
CONFIG_GCOV_KERNEL=n was not tested. defconfig + CONFIG_KASAN=y is
enough to reproduce the same warnings as above.

The root cause of the warnings was resolved in LLVM during the 12.0.0
release so rather than play whack-a-mole with the dependencies, just
update CONFIG_ARM64_BTI_KERNEL to require clang 12.0.0, which will have
all of the issues ironed out.

Link: https://github.com/ClangBuiltLinux/linux/issues/1428
Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/3010034706?check_suite_focus=true
Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/3010035725?check_suite_focus=true
Link: a88c722e68
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20210712214636.3134425-1-nathan@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-07-13 16:31:31 +01:00
Zhen Lei
0af778269a fbmem: Do not delete the mode that is still in use
The execution of fb_delete_videomode() is not based on the result of the
previous fbcon_mode_deleted(). As a result, the mode is directly deleted,
regardless of whether it is still in use, which may cause UAF.

==================================================================
BUG: KASAN: use-after-free in fb_mode_is_equal+0x36e/0x5e0 \
drivers/video/fbdev/core/modedb.c:924
Read of size 4 at addr ffff88807e0ddb1c by task syz-executor.0/18962

CPU: 2 PID: 18962 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ...
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x137/0x1be lib/dump_stack.c:118
 print_address_description+0x6c/0x640 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report+0x13d/0x1e0 mm/kasan/report.c:562
 fb_mode_is_equal+0x36e/0x5e0 drivers/video/fbdev/core/modedb.c:924
 fbcon_mode_deleted+0x16a/0x220 drivers/video/fbdev/core/fbcon.c:2746
 fb_set_var+0x1e1/0xdb0 drivers/video/fbdev/core/fbmem.c:975
 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 18960:
 kasan_save_stack mm/kasan/common.c:48 [inline]
 kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
 __kasan_slab_free+0x108/0x140 mm/kasan/common.c:422
 slab_free_hook mm/slub.c:1541 [inline]
 slab_free_freelist_hook+0xd6/0x1a0 mm/slub.c:1574
 slab_free mm/slub.c:3139 [inline]
 kfree+0xca/0x3d0 mm/slub.c:4121
 fb_delete_videomode+0x56a/0x820 drivers/video/fbdev/core/modedb.c:1104
 fb_set_var+0x1f3/0xdb0 drivers/video/fbdev/core/fbmem.c:978
 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 13ff178ccd ("fbcon: Call fbcon_mode_deleted/new_modelist directly")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712085544.2828-1-thunder.leizhen@huawei.com
2021-07-13 17:01:38 +02:00
Thomas Zimmermann
4db1cb1338 Merge remote-tracking branch 'drm-misc/drm-misc-next-fixes' into drm-misc-fixes
Picking up left-over patches in drm-misc-next-fixes.
2021-07-13 15:55:57 +02:00
Daniel Vetter
1e7b5812f4 Short summary of fixes pull:
* dma-buf: Fix fence leak in sync_file_merge() error code
  * drm/panel: nt35510: Don't fail on DSI reads
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmDtO2QACgkQaA3BHVML
 eiPzfAgAw+jUC4fTJK1+tS7XZjvQZ3KLq9ElVJXdDOxCQE5SRciIqae147q9y3Yv
 3T+A/mdwlyZOWVpTdmvWODBehpKe8kWmiH622mTWep2YpPyoxR5MWvS6efkXhC6W
 sWPV5Q/Bd7uCq5uI1p4NzWXajhcFrDY7nzJMGCMTT10YmvMRSgWMVL6E+H9iwBtP
 9kMTdrySUtCPKB4yKLnLdy8cdR8aXzr9Gb45IQUCADBWdlDg6tei4V7ko1nx/Why
 jx00GF01pzWj+J1obdyixwWA6cQ+NmteXBL3vqDILsfW32mqBaoZejOUQsOErY4s
 +JkQp+TFrAZN+gaz86oIuvmQRPVNVg==
 =D86/
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2021-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * dma-buf: Fix fence leak in sync_file_merge() error code
 * drm/panel: nt35510: Don't fail on DSI reads

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YO07pEfweKVO+7y0@linux-uq9g
2021-07-13 15:15:17 +02:00
Cristian Marussi
bdb8742dc6 firmware: arm_scmi: Fix range check for the maximum number of pending messages
SCMI message headers carry a sequence number and such field is sized to
allow for MSG_TOKEN_MAX distinct numbers; moreover zero is not really an
acceptable maximum number of pending in-flight messages.

Fix accordingly the checks performed on the value exported by transports
in scmi_desc.max_msg

Link: https://lore.kernel.org/r/20210712141833.6628-3-cristian.marussi@arm.com
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
[sudeep.holla: updated the patch title and error message]
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13 11:42:20 +01:00
Cristian Marussi
187a002b07 firmware: arm_scmi: Avoid padding in sensor message structure
scmi_resp_sensor_reading_complete structure is meant to represent an
SCMI asynchronous reading complete message. The readings field with
a 64bit type forces padding and breaks reads in scmi_sensor_reading_get.

Split it in two adjacent 32bit readings_low/high subfields to avoid the
padding within the structure. Alternatively we could to mark the structure
packed.

Link: https://lore.kernel.org/r/20210628170042.34105-1-cristian.marussi@arm.com
Fixes: e2083d3673 ("firmware: arm_scmi: Add SCMI v3.0 sensors timestamped reads")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13 11:42:14 +01:00
Cristian Marussi
b98cf55ec0 firmware: arm_scmi: Fix kernel doc warnings about return values
Kernel doc validation script still complains about the following:

|No description found for return value of 'scmi_get_protocol_device'
|No description found for return value of 'scmi_devm_notifier_register'
|No description found for return value of 'scmi_devm_notifier_unregister'

Fix adding missing Return kernel-doc statements.

Link: https://lore.kernel.org/r/20210712143504.33541-1-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13 11:39:54 +01:00
Sudeep Holla
5ff6319d46 firmware: arm_scpi: Fix kernel doc warnings
Kernel doc validation script is unhappy and complains with the below set
of warnings.

 | Function parameter or member 'device_domain_id' not described in 'scpi_ops'
 | Function parameter or member 'get_transition_latency' not described in 'scpi_ops'
 | Function parameter or member 'add_opps_to_device' not described in 'scpi_ops'
 | Function parameter or member 'sensor_get_capability' not described in 'scpi_ops'
 | Function parameter or member 'sensor_get_info' not described in 'scpi_ops'
 | Function parameter or member 'sensor_get_value' not described in 'scpi_ops'
 | Function parameter or member 'device_get_power_state' not described in 'scpi_ops'
 | Function parameter or member 'device_set_power_state' not described in 'scpi_ops'

Fix them adding appropriate documents or missing keywords.

Link: https://lore.kernel.org/r/20210712130801.2436492-1-sudeep.holla@arm.com
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13 11:39:48 +01:00
Sudeep Holla
52f83955aa firmware: arm_scmi: Fix kernel doc warnings
Kernel doc validation script is unhappy and complains with the below set
of warnings.

 | Function parameter or member 'fast_switch_possible' not described in 'scmi_perf_proto_ops'
 | Function parameter or member 'power_scale_mw_get' not described in 'scmi_perf_proto_ops'
 | cannot understand function prototype: 'struct scmi_sensor_reading '
 | cannot understand function prototype: 'struct scmi_range_attrs '
 | cannot understand function prototype: 'struct scmi_sensor_axis_info '
 | cannot understand function prototype: 'struct scmi_sensor_intervals_info '

Fix them adding appropriate documents or missing keywords.

Link: https://lore.kernel.org/r/20210712130801.2436492-2-sudeep.holla@arm.com
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-07-13 11:39:42 +01:00
Casey Chen
251ef6f71b nvme-pci: do not call nvme_dev_remove_admin from nvme_remove
nvme_dev_remove_admin could free dev->admin_q and the admin_tagset
while they are being accessed by nvme_dev_disable(), which can be called
by nvme_reset_work via nvme_remove_dead_ctrl.

Commit cb4bfda62a ("nvme-pci: fix hot removal during error handling")
intended to avoid requests being stuck on a removed controller by killing
the admin queue. But the later fix c8e9e9b764 ("nvme-pci: unquiesce
admin queue on shutdown"), together with nvme_dev_disable(dev, true)
right before nvme_dev_remove_admin() could help dispatch requests and
fail them early, so we don't need nvme_dev_remove_admin() any more.

Fixes: cb4bfda62a ("nvme-pci: fix hot removal during error handling")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-07-13 12:03:20 +02:00
Casey Chen
e4b9852a0f nvme-pci: fix multiple races in nvme_setup_io_queues
Below two paths could overlap each other if we power off a drive quickly
after powering it on. There are multiple races in nvme_setup_io_queues()
because of shutdown_lock missing and improper use of NVMEQ_ENABLED bit.

nvme_reset_work()                                nvme_remove()
  nvme_setup_io_queues()                           nvme_dev_disable()
  ...                                              ...
A1  clear NVMEQ_ENABLED bit for admin queue          lock
    retry:                                       B1  nvme_suspend_io_queues()
A2    pci_free_irq() admin queue                 B2  nvme_suspend_queue() admin queue
A3    pci_free_irq_vectors()                         nvme_pci_disable()
A4    nvme_setup_irqs();                         B3    pci_free_irq_vectors()
      ...                                            unlock
A5    queue_request_irq() for admin queue
      set NVMEQ_ENABLED bit
      ...
      nvme_create_io_queues()
A6      result = queue_request_irq();
        set NVMEQ_ENABLED bit
      ...
      fail to allocate enough IO queues:
A7      nvme_suspend_io_queues()
        goto retry

If B3 runs in between A1 and A2, it will crash if irqaction haven't
been freed by A2. B2 is supposed to free admin queue IRQ but it simply
can't fulfill the job as A1 has cleared NVMEQ_ENABLED bit.

Fix: combine A1 A2 so IRQ get freed as soon as the NVMEQ_ENABLED bit
gets cleared.

After solved #1, A2 could race with B3 if A2 is freeing IRQ while B3
is checking irqaction. A3 also could race with B2 if B2 is freeing
IRQ while A3 is checking irqaction.

Fix: A2 and A3 take lock for mutual exclusion.

A3 could race with B3 since they could run free_msi_irqs() in parallel.

Fix: A3 takes lock for mutual exclusion.

A4 could fail to allocate all needed IRQ vectors if A3 and A4 are
interrupted by B3.

Fix: A4 takes lock for mutual exclusion.

If A5/A6 happened after B2/B1, B3 will crash since irqaction is not NULL.
They are just allocated by A5/A6.

Fix: Lock queue_request_irq() and setting of NVMEQ_ENABLED bit.

A7 could get chance to pci_free_irq() for certain IO queue while B3 is
checking irqaction.

Fix: A7 takes lock.

nvme_dev->online_queues need to be protected by shutdown_lock. Since it
is not atomic, both paths could modify it using its own copy.

Co-developed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-07-13 11:40:42 +02:00
Prabhakar Kushwaha
8b43ced64d nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
dev_get_by_name() finds network device by name but it also increases the
reference count.

If a nvme-tcp queue is present and the network device driver is removed
before nvme_tcp, we will face the following continuous log:

  "kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2"

And rmmod further halts. Similar case arises during reboot/shutdown
with nvme-tcp queue present and both never completes.

To fix this, use __dev_get_by_name() which finds network device by
name without increasing any reference counter.

Fixes: 3ede8f72a9 ("nvme-tcp: allow selecting the network interface for connections")
Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: remove the ->ndev member entirely]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-07-13 11:34:24 +02:00
Geert Uytterhoeven
432b52eea3 ARM: shmobile: defconfig: Restore graphical consoles
As of commit f611b1e762 ("drm: Avoid circular dependencies for
CONFIG_FB"), CONFIG_FB is no longer auto-enabled.  While CONFIG_FB may
be considered unneeded for systems where graphics is provided by a DRM
driver, R-Mobile A1 still relies on a frame buffer device driver for
graphics support.

Restore support for graphics on R-Mobile A1 and graphical consoles on
DRM-based systems by explicitly enabling CONFIG_FB in the defconfig for
Renesas ARM systems.

Fixes: f611b1e762 ("drm: Avoid circular dependencies for CONFIG_FB")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/2a4474be1d2c00c6ca97c2714844ea416a9ea9a9.1626084948.git.geert+renesas@glider.be
2021-07-13 09:45:51 +02:00
Gustavo A. R. Silva
e181ad4388 drm/msm: Fix fall-through warning in msm_gem_new_impl()
Fix the following fall-through warning:

drivers/gpu/drm/msm/msm_gem.c: In function 'msm_gem_new_impl':
drivers/gpu/drm/msm/msm_gem.c:1170:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
 1170 |   if (priv->has_cached_coherent)
      |      ^
drivers/gpu/drm/msm/msm_gem.c:1173:2: note: here
 1173 |  default:
      |  ^~~~~~~

by replacing the /* fallthrough */ comment with fallthrough;

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2021-07-12 21:32:25 -05:00
Jaegeuk Kim
053c16ac89 scsi: ufs: core: Add missing host_lock in ufshcd_vops_setup_xfer_req()
This patch adds a host_lock which existed before on
ufshcd_vops_setup_xfer_req().

Link: https://lore.kernel.org/r/20210701005117.3846179-1-jaegeuk@kernel.org
Fixes: a45f937110 ("scsi: ufs: Optimize host lock on transfer requests send/compl paths")
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 22:02:06 -04:00
Sreekanth Reddy
aa0dc6a733 scsi: mpi3mr: Fix W=1 compilation warnings
Fix for the following W=1 compilation warning:

  'strncpy' output may be truncated copying 16 bytes from a string of
  length 64

Link: https://lore.kernel.org/r/20210707081756.20922-1-sreekanth.reddy@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:56:50 -04:00
Randy Dunlap
bb6beabf2f scsi: pm8001: Clean up kernel-doc and comments
Fix kernel-doc warnings then test again, wash, rinse, find more, then
repeat more/again.

Also fix spellos, some grammar, and some punctuation.

../drivers/scsi/pm8001/pm8001_ctl.c:557: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 ** pm8001_ctl_fatal_log_show - fatal error logging
../drivers/scsi/pm8001/pm8001_ctl.c:577: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 ** non_fatal_log_show - non fatal error logging
../drivers/scsi/pm8001/pm8001_ctl.c:622: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 ** pm8001_ctl_gsm_log_show - gsm dump collection

Link: https://lore.kernel.org/r/20210708165723.8594-1-rdunlap@infradead.org
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:54:07 -04:00
Steffen Maier
8b3bdd99c0 scsi: zfcp: Report port fc_security as unknown early during remote cable pull
On remote cable pull, a zfcp_port keeps its status and only gets
ZFCP_STATUS_PORT_LINK_TEST added. Only after an ADISC timeout, we would
actually start port recovery and remove ZFCP_STATUS_COMMON_UNBLOCKED which
zfcp_sysfs_port_fc_security_show() detected and reported as "unknown"
instead of the old and possibly stale zfcp_port->connection_info.

Add check for ZFCP_STATUS_PORT_LINK_TEST for timely "unknown" report.

Link: https://lore.kernel.org/r/20210702160922.2667874-1-maier@linux.ibm.com
Fixes: a17c784600 ("scsi: zfcp: report FC Endpoint Security in sysfs")
Cc: <stable@vger.kernel.org> #5.7+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:51:50 -04:00
Tyrel Datwyler
93aa71ad73 scsi: core: Fix bad pointer dereference when ehandler kthread is invalid
Commit 66a834d092 ("scsi: core: Fix error handling of scsi_host_alloc()")
changed the allocation logic to call put_device() to perform host cleanup
with the assumption that IDA removal and stopping the kthread would
properly be performed in scsi_host_dev_release(). However, in the unlikely
case that the error handler thread fails to spawn, shost->ehandler is set
to ERR_PTR(-ENOMEM).

The error handler cleanup code in scsi_host_dev_release() will call
kthread_stop() if shost->ehandler != NULL which will always be the case
whether the kthread was successfully spawned or not. In the case that it
failed to spawn this has the nasty side effect of trying to dereference an
invalid pointer when kthread_stop() is called. The following splat provides
an example of this behavior in the wild:

scsi host11: error handler thread failed to spawn, error = -4
Kernel attempted to read user page (10c) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x0000010c
Faulting instruction address: 0xc00000000818e9a8
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvscsi(+) scsi_transport_srp dm_multipath dm_mirror dm_region
 hash dm_log dm_mod fuse overlay squashfs loop
CPU: 12 PID: 274 Comm: systemd-udevd Not tainted 5.13.0-rc7 #1
NIP:  c00000000818e9a8 LR: c0000000089846e8 CTR: 0000000000007ee8
REGS: c000000037d12ea0 TRAP: 0300   Not tainted  (5.13.0-rc7)
MSR:  800000000280b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE&gt;  CR: 28228228
XER: 20040001
CFAR: c0000000089846e4 DAR: 000000000000010c DSISR: 40000000 IRQMASK: 0
GPR00: c0000000089846e8 c000000037d13140 c000000009cc1100 fffffffffffffffc
GPR04: 0000000000000001 0000000000000000 0000000000000000 c000000037dc0000
GPR08: 0000000000000000 c000000037dc0000 0000000000000001 00000000fffff7ff
GPR12: 0000000000008000 c00000000a049000 c000000037d13d00 000000011134d5a0
GPR16: 0000000000001740 c0080000190d0000 c0080000190d1740 c000000009129288
GPR20: c000000037d13bc0 0000000000000001 c000000037d13bc0 c0080000190b7898
GPR24: c0080000190b7708 0000000000000000 c000000033bb2c48 0000000000000000
GPR28: c000000046b28280 0000000000000000 000000000000010c fffffffffffffffc
NIP [c00000000818e9a8] kthread_stop+0x38/0x230
LR [c0000000089846e8] scsi_host_dev_release+0x98/0x160
Call Trace:
[c000000033bb2c48] 0xc000000033bb2c48 (unreliable)
[c0000000089846e8] scsi_host_dev_release+0x98/0x160
[c00000000891e960] device_release+0x60/0x100
[c0000000087e55c4] kobject_release+0x84/0x210
[c00000000891ec78] put_device+0x28/0x40
[c000000008984ea4] scsi_host_alloc+0x314/0x430
[c0080000190b38bc] ibmvscsi_probe+0x54/0xad0 [ibmvscsi]
[c000000008110104] vio_bus_probe+0xa4/0x4b0
[c00000000892a860] really_probe+0x140/0x680
[c00000000892aefc] driver_probe_device+0x15c/0x200
[c00000000892b63c] device_driver_attach+0xcc/0xe0
[c00000000892b740] __driver_attach+0xf0/0x200
[c000000008926f28] bus_for_each_dev+0xa8/0x130
[c000000008929ce4] driver_attach+0x34/0x50
[c000000008928fc0] bus_add_driver+0x1b0/0x300
[c00000000892c798] driver_register+0x98/0x1a0
[c00000000810eb60] __vio_register_driver+0x80/0xe0
[c0080000190b4a30] ibmvscsi_module_init+0x9c/0xdc [ibmvscsi]
[c0000000080121d0] do_one_initcall+0x60/0x2d0
[c000000008261abc] do_init_module+0x7c/0x320
[c000000008265700] load_module+0x2350/0x25b0
[c000000008265cb4] __do_sys_finit_module+0xd4/0x160
[c000000008031110] system_call_exception+0x150/0x2d0
[c00000000800d35c] system_call_common+0xec/0x278

Fix this be nulling shost->ehandler when the kthread fails to spawn.

Link: https://lore.kernel.org/r/20210701195659.3185475-1-tyreld@linux.ibm.com
Fixes: 66a834d092 ("scsi: core: Fix error handling of scsi_host_alloc()")
Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:46:24 -04:00
Bart Van Assche
fbf1a58701 scsi: fas216: Fix a build error
Use SAM_STAT_GOOD instead of GOOD since GOOD has been removed.

Link: https://lore.kernel.org/r/20210711033623.11267-1-bvanassche@acm.org
Fixes: 3d45cefc8e ("scsi: core: Drop obsolete Linux-specific SCSI status codes")
Fixes: df13031476 ("scsi: fas216: Use get_status_byte() to avoid using Linux-specific status codes")
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:41:31 -04:00
Bart Van Assche
422969bbb5 scsi: core: Fix the documentation of the scsi_execute() time parameter
The unit of the scsi_execute() timeout parameter is 1/HZ seconds instead of
one second, just like the timeouts used in the block layer. Fix the
documentation header above the definition of the scsi_execute() macro.

Link: https://lore.kernel.org/r/20210709202638.9480-3-bvanassche@acm.org
Fixes: "[SCSI] use scatter lists for all block pc requests and simplify hw handlers" # v2.6.16.28
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-12 21:40:21 -04:00
Paolo Pisati
0c0f6299ba selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test
While the offline memory test obey ratio limit, the same test with
error injection does not and tries to offline all the hotpluggable
memory, spamming system logs with hundreds of thousands of dump_page()
entries, slowing system down (to the point the test itself timesout and
gets terminated) and excessive fs occupation:

...
[ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
[ 9784.393355] def_blk_aops
[ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
[ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
[ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
[ 9784.393359] page dumped because: migration failure
[ 9784.393360] page->mem_cgroup:c00000000490d000
[ 9784.393416] migrating pfn 1f46d failed ret:1
...

$ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
2405558

$ ls -la /var/log/kern.log
-rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-07-12 14:20:01 -06:00
SeongJae Park
df4b0807ca kunit: tool: Assert the version requirement
Commit 87c9c16317 ("kunit: tool: add support for QEMU") on the 'next'
tree adds 'from __future__ import annotations' in 'kunit_kernel.py'.
Because it is supported on only >=3.7 Python, people using older Python
will get below error:

    Traceback (most recent call last):
      File "./tools/testing/kunit/kunit.py", line 20, in <module>
        import kunit_kernel
      File "/home/sjpark/linux/tools/testing/kunit/kunit_kernel.py", line 9
        from __future__ import annotations
        ^
    SyntaxError: future feature annotations is not defined

This commit adds a version assertion in 'kunit.py', so that people get
more explicit error message like below:

    Traceback (most recent call last):
      File "./tools/testing/kunit/kunit.py", line 15, in <module>
        assert sys.version_info >= (3, 7), "Python version is too old"
    AssertionError: Python version is too old

Signed-off-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-07-12 14:02:32 -06:00