Commit Graph

916116 Commits

Author SHA1 Message Date
Taehee Yoo
4dee15b4fd macvlan: fix null dereference in macvlan_device_event()
In the macvlan_device_event(), the list_first_entry_or_null() is used.
This function could return null pointer if there is no node.
But, the macvlan module doesn't check the null pointer.
So, null-ptr-deref would occur.

      bond0
        |
   +----+-----+
   |          |
macvlan0   macvlan1
   |          |
 dummy0     dummy1

The problem scenario.
If dummy1 is removed,
1. ->dellink() of dummy1 is called.
2. NETDEV_UNREGISTER of dummy1 notification is sent to macvlan module.
3. ->dellink() of macvlan1 is called.
4. NETDEV_UNREGISTER of macvlan1 notification is sent to bond module.
5. __bond_release_one() is called and it internally calls
   dev_set_mac_address().
6. dev_set_mac_address() calls the ->ndo_set_mac_address() of macvlan1,
   which is macvlan_set_mac_address().
7. macvlan_set_mac_address() calls the dev_set_mac_address() with dummy1.
8. NETDEV_CHANGEADDR of dummy1 is sent to macvlan module.
9. In the macvlan_device_event(), it calls list_first_entry_or_null().
At this point, dummy1 and macvlan1 were removed.
So, list_first_entry_or_null() will return NULL.

Test commands:
    ip netns add nst
    ip netns exec nst ip link add bond0 type bond
    for i in {0..10}
    do
        ip netns exec nst ip link add dummy$i type dummy
	ip netns exec nst ip link add macvlan$i link dummy$i \
		type macvlan mode passthru
	ip netns exec nst ip link set macvlan$i master bond0
    done
    ip netns del nst

Splat looks like:
[   40.585687][  T146] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEI
[   40.587249][  T146] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[   40.588342][  T146] CPU: 1 PID: 146 Comm: kworker/u8:2 Not tainted 5.7.0-rc1+ #532
[   40.589299][  T146] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   40.590469][  T146] Workqueue: netns cleanup_net
[   40.591045][  T146] RIP: 0010:macvlan_device_event+0x4e2/0x900 [macvlan]
[   40.591905][  T146] Code: 00 00 00 00 00 fc ff df 80 3c 06 00 0f 85 45 02 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff d2
[   40.594126][  T146] RSP: 0018:ffff88806116f4a0 EFLAGS: 00010246
[   40.594783][  T146] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   40.595653][  T146] RDX: 0000000000000000 RSI: ffff88806547ddd8 RDI: ffff8880540f1360
[   40.596495][  T146] RBP: ffff88804011a808 R08: fffffbfff4fb8421 R09: fffffbfff4fb8421
[   40.597377][  T146] R10: ffffffffa7dc2107 R11: 0000000000000000 R12: 0000000000000008
[   40.598186][  T146] R13: ffff88804011a000 R14: ffff8880540f1000 R15: 1ffff1100c22de9a
[   40.599012][  T146] FS:  0000000000000000(0000) GS:ffff888067800000(0000) knlGS:0000000000000000
[   40.600004][  T146] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.600665][  T146] CR2: 00005572d3a807b8 CR3: 000000005fcf4003 CR4: 00000000000606e0
[   40.601485][  T146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   40.602461][  T146] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   40.603443][  T146] Call Trace:
[   40.603871][  T146]  ? nf_tables_dump_setelem+0xa0/0xa0 [nf_tables]
[   40.604587][  T146]  ? macvlan_uninit+0x100/0x100 [macvlan]
[   40.605212][  T146]  ? __module_text_address+0x13/0x140
[   40.605842][  T146]  notifier_call_chain+0x90/0x160
[   40.606477][  T146]  dev_set_mac_address+0x28e/0x3f0
[   40.607117][  T146]  ? netdev_notify_peers+0xc0/0xc0
[   40.607762][  T146]  ? __module_text_address+0x13/0x140
[   40.608440][  T146]  ? notifier_call_chain+0x90/0x160
[   40.609097][  T146]  ? dev_set_mac_address+0x1f0/0x3f0
[   40.609758][  T146]  dev_set_mac_address+0x1f0/0x3f0
[   40.610402][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
[   40.611071][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.611823][  T146]  ? netdev_notify_peers+0xc0/0xc0
[   40.612461][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.613213][  T146]  ? bond_hw_addr_flush+0x77/0x100 [bonding]
[   40.613963][  T146]  ? __local_bh_enable_ip+0xe9/0x1b0
[   40.614631][  T146]  ? bond_time_in_interval.isra.31+0x90/0x90 [bonding]
[   40.615484][  T146]  ? __bond_release_one+0x9f0/0x12c0 [bonding]
[   40.616230][  T146]  __bond_release_one+0x9f0/0x12c0 [bonding]
[   40.616949][  T146]  ? bond_enslave+0x47c0/0x47c0 [bonding]
[   40.617642][  T146]  ? lock_downgrade+0x730/0x730
[   40.618218][  T146]  ? check_flags.part.42+0x450/0x450
[   40.618850][  T146]  ? __mutex_unlock_slowpath+0xd0/0x670
[   40.619519][  T146]  ? trace_hardirqs_on+0x30/0x180
[   40.620117][  T146]  ? wait_for_completion+0x250/0x250
[   40.620754][  T146]  bond_netdev_event+0x822/0x970 [bonding]
[   40.621460][  T146]  ? __module_text_address+0x13/0x140
[   40.622097][  T146]  notifier_call_chain+0x90/0x160
[   40.622806][  T146]  rollback_registered_many+0x660/0xcf0
[   40.623522][  T146]  ? netif_set_real_num_tx_queues+0x780/0x780
[   40.624290][  T146]  ? notifier_call_chain+0x90/0x160
[   40.624957][  T146]  ? netdev_upper_dev_unlink+0x114/0x180
[   40.625686][  T146]  ? __netdev_adjacent_dev_unlink_neighbour+0x30/0x30
[   40.626421][  T146]  ? mutex_is_locked+0x13/0x50
[   40.627016][  T146]  ? unregister_netdevice_queue+0xf2/0x240
[   40.627663][  T146]  unregister_netdevice_many.part.134+0x13/0x1b0
[   40.628362][  T146]  default_device_exit_batch+0x2d9/0x390
[   40.628987][  T146]  ? unregister_netdevice_many+0x40/0x40
[   40.629615][  T146]  ? dev_change_net_namespace+0xcb0/0xcb0
[   40.630279][  T146]  ? prepare_to_wait_exclusive+0x2e0/0x2e0
[   40.630943][  T146]  ? ops_exit_list.isra.9+0x97/0x140
[   40.631554][  T146]  cleanup_net+0x441/0x890
[ ... ]

Fixes: e289fd2817 ("macvlan: fix the problem when mac address changes for passthru mode")
Reported-by: syzbot+5035b1f9dc7ea4558d5a@syzkaller.appspotmail.com
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 15:51:55 -07:00
Jason Yan
c95576a34c e1000: remove unneeded conversion to bool
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/net/ethernet/intel/e1000/e1000_main.c:1479:44-49: WARNING:
conversion to bool not needed here

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 15:45:32 -07:00
Jason Yan
7ff4f0631f i40e: Remove unneeded conversion to bool
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/net/ethernet/intel/i40e/i40e_main.c:1614:52-57: WARNING:
conversion to bool not needed here
drivers/net/ethernet/intel/i40e/i40e_main.c:11439:52-57: WARNING:
conversion to bool not needed here

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 15:45:32 -07:00
Jason Yan
e9a9e51994 ptp: Remove unneeded conversion to bool
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

drivers/ptp/ptp_ines.c:403:55-60: WARNING: conversion to bool not
needed here
drivers/ptp/ptp_ines.c:404:55-60: WARNING: conversion to bool not
needed here

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 15:45:32 -07:00
Jiri Slaby
526f3d96b8 cgroup, netclassid: remove double cond_resched
Commit 018d26fcd1 ("cgroup, netclassid: periodically release file_lock
on classid") added a second cond_resched to write_classid indirectly by
update_classid_task. Remove the one in write_classid.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 15:44:30 -07:00
Linus Torvalds
18bf34080c Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/vm: fix cross-compile build
  coredump: fix null pointer dereference on coredump
  mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path
  shmem: fix possible deadlocks on shmlock_user_lock
  vmalloc: fix remap_vmalloc_range() bounds checks
  mm/shmem: fix build without THP
  mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
  tools/build: tweak unused value workaround
  checkpatch: fix a typo in the regex for $allocFunctions
  mm, gup: return EINTR when gup is interrupted by fatal signals
  mm/hugetlb: fix a addressing exception caused by huge_pte_offset
  MAINTAINERS: add an entry for kfifo
  mm/userfaultfd: disable userfaultfd-wp on x86_32
  slub: avoid redzone when choosing freepointer location
  sh: fix build error in mm/init.c
2020-04-21 13:26:54 -07:00
Linus Torvalds
8160a563cf Bugfixes, and a few cleanups to the newly-introduced assembly language
vmentry code for AMD.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl6fFwoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNgEQf/WK0z8WMKxGDr4YdLlxvJxLHUTd/Z
 uKDMkllRil5+hFy5tq5yeKEzPRtINkJ9bSwrUW3dWtZiCxdED/K3uXOh30znycQL
 KmVX5ZlmD5Gm9YizVUSbhXZj9p4AvtsvmrUUSH5W1FOJ7g4cxK9a29h3CkfJ5EPq
 WFyVfua9JMBjKCyWgjSOlCQ5L0NEB3bezWzuIj1TQW5A82fTjrUyciwBZQ5mnZC6
 nC4kN8M8NWoceRQT/uD/I3l2o+GlYf6xYE6637if0CpaLQRyvYDSwdB4G+1MB0M1
 PtEwkSkwni4PmWwcMyi/gIx37HRA3ycgZIVbb+MUmTA1pakUMCOjqin6hw==
 =Ax1z
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Bugfixes, and a few cleanups to the newly-introduced assembly language
  vmentry code for AMD"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
  kvm: Disable objtool frame pointer checking for vmenter.S
  MAINTAINERS: add a reviewer for KVM/s390
  KVM: s390: Fix PV check in deliverable_irqs()
  kvm: Handle reads of SandyBridge RAPL PMU MSRs rather than injecting #GP
  KVM: Remove CREATE_IRQCHIP/SET_PIT2 race
  KVM: SVM: Fix __svm_vcpu_run declaration.
  KVM: SVM: Do not setup frame pointer in __svm_vcpu_run
  KVM: SVM: Fix build error due to missing release_pages() include
  KVM: SVM: Do not mark svm_vcpu_run with STACK_FRAME_NON_STANDARD
  kvm: nVMX: match comment with return type for nested_vmx_exit_reflected
  kvm: nVMX: reflect MTF VM-exits if injected by L1
  KVM: s390: Return last valid slot if approx index is out-of-bounds
  KVM: Check validity of resolved slot when searching memslots
  KVM: VMX: Enable machine check support for 32bit targets
  KVM: SVM: move more vmentry code to assembly
  KVM: SVM: fix compilation with modular PSP and non-modular KVM
2020-04-21 12:59:10 -07:00
Takashi Iwai
e7b6b3ec01 ASoC: Fixes for v5.7
Quite a lot of fixes here, a lot of driver specific ones but the biggest
 one is the revert of changes to the startup and shutdown sequence for
 DAIs that went in during the merge window - they broke some older x86
 platforms and attempts to fix them didn't succeed so it's safer to just
 roll them back and try to make sure those platforms are handled properly
 in any future attempt.
 
 The rockchip S/PDIF DT stuff was IIRC for validation issues.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl6fSlITHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0ODsB/4whz1ppFUJ4+XLaLoOTCJjKPnQEg6N
 lZRhPS1/gQVL/qUpwOIS3vr3Z+8Zsm9F3VXOEGmMFfPR1f4+Fejz1OJkg0mUuhzr
 Xx+LHOGyaabj9zvoNxZDS1eE00t8dLrkoRx6vce3dRZluZ5DATN6L/4EB3RIhuny
 4ckuQAf6KEGLYpqzAVAJyUp8Q6pBl15vHyH+KMXHZJ4qxQjZaBPA7tPabdzRqStM
 Lq+xKkBkJepDet3vQZvvmwNPhS0pU1aqqcopVxj03zivpMwQRftcSYTpdT2SpIF1
 GM1FGiRV1lbNUsQNnhkImE9V6X+I3zmQR4dLUbGTT+Le/0e8CtKRXYCQ
 =D8Su
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v5.7-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.7

Quite a lot of fixes here, a lot of driver specific ones but the biggest
one is the revert of changes to the startup and shutdown sequence for
DAIs that went in during the merge window - they broke some older x86
platforms and attempts to fix them didn't succeed so it's safer to just
roll them back and try to make sure those platforms are handled properly
in any future attempt.

The rockchip S/PDIF DT stuff was IIRC for validation issues.
2020-04-21 21:41:36 +02:00
Alexander Tsoy
cf9fb7b873 ALSA: usb-audio: Apply async workaround for Scarlett 2i4 2nd gen
Due to rounding error driver sometimes incorrectly calculate next packet
size, which results in audible clicks on devices with synchronous playback
endpoints. For example on a high speed bus and a sample rate 44.1 kHz it
loses one sample every ~40.9 seconds. Fortunately playback interface on
Scarlett 2i4 2nd gen has a working explicit feedback endpoint, so we can
switch playback data endpoint to asynchronous mode as a workaround.

Signed-off-by: Alexander Tsoy <alexander@tsoy.me>
Link: https://lore.kernel.org/r/20200421190908.462860-1-alexander@tsoy.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-04-21 21:30:28 +02:00
Linus Torvalds
189522da8b virtio: fixes, cleanups
Some bug fixes.
 Cleanup a couple of issues that surfaced meanwhile.
 Disable vhost on ARM with OABI for now - to be fixed
 fully later in the cycle or in the next release.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl6d6ZgPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpH3oH/0bJ6o+FiAi8xXgYqm9XXmswrZoZLahjyPay
 dA7Sz5nNKVtdSGH9o0wRdcekt0SOI3ilZSkv9nwt9ep/5YzC3brf2hry+nPvMTsA
 MhI3IAa7sK1vCXkftwOlx+SIeDfIwsqr+h4SCfMRxlIT0yAmOC8fl2ByT2dIbqnj
 dlzwczecHI9LPUEmRWiKH/4Tj5MPZN5IeFSIAE+nA/9cl5h4qVSfYtWD3Y4VQ82g
 Rv3mvVE+chaVbPxewaBZ8Y0Avti4tMyzsE0MY+dz5xfh+75hqMfygg//1osbEAbz
 SiL5dDcANe8Q+QOc/BxHdj4dqpqUp1ldV+3Lge9k4lWAGnsEMEk=
 =GZb2
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes and cleanups from Michael Tsirkin:

 - Some bug fixes

 - Cleanup a couple of issues that surfaced meanwhile

 - Disable vhost on ARM with OABI for now - to be fixed fully later in
   the cycle or in the next release.

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (24 commits)
  vhost: disable for OABI
  virtio: drop vringh.h dependency
  virtio_blk: add a missing include
  virtio-balloon: Avoid using the word 'report' when referring to free page hinting
  virtio-balloon: make virtballoon_free_page_report() static
  vdpa: fix comment of vdpa_register_device()
  vdpa: make vhost, virtio depend on menu
  vdpa: allow a 32 bit vq alignment
  drm/virtio: fix up for include file changes
  remoteproc: pull in slab.h
  rpmsg: pull in slab.h
  virtio_input: pull in slab.h
  remoteproc: pull in slab.h
  virtio-rng: pull in slab.h
  virtgpu: pull in uaccess.h
  tools/virtio: make asm/barrier.h self contained
  tools/virtio: define aligned attribute
  virtio/test: fix up after IOTLB changes
  vhost: Create accessors for virtqueues private_data
  vdpasim: Return status in vdpasim_get_status
  ...
2020-04-21 12:27:18 -07:00
Linus Torvalds
b61f7ff0f6 tpmdd updates for Linux v5.6-rc3
-----BEGIN PGP SIGNATURE-----
 
 iJYEABYIAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCXp4P6iAcamFya2tvLnNh
 a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0sxRAQC3+7HXeakWG39Z
 mmNXwIhpUZsbFa3/JobqtQT/gaz9vAEAqu4+VmCz7a8L2LBVYCE/CvD4AG5u14d+
 KeYc0Zsxfgw=
 =x8S9
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd

Pull tpm fixes from Jarkko Sakkinen:
 "A few bug fixes"

* tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd:
  tpm/tpm_tis: Free IRQ if probing fails
  tpm: fix wrong return value in tpm_pcr_extend
  tpm: ibmvtpm: retry on H_CLOSED in tpm_ibmvtpm_send()
  tpm: Export tpm2_get_cc_attrs_tbl for ibmvtpm driver as module
2020-04-21 12:24:33 -07:00
Linus Torvalds
20f1648909 Two trivial clang-format changes:
- Don't indent C++ namespaces (Ian Rogers)
 
  - The usual clang-format macro list update (Miguel Ojeda)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl6dx9IACgkQGXyLc2ht
 IW0wjw//cyb0ooOTGlsSsJZrhkQLCFWByXx2cwgSgrmN7vZ1H7JF/85hqNZbAPyy
 OaEpiAA4CaEcJssWRS/xyOQd/YbXRJXXHKft5NzNmJPNJ5S8e4oSvciu4t7JVuxm
 OwVYrSytPB151Qge8kBxzroCQQB6alTyCdlL68LrN911TYcEdcuL53Ob47dUxKxJ
 FYoTtsZMHjzPdB2EkYwwd2uY9zUbS3wr8xvOy7J/PLg4ZpOdvx8E7imnHxslNKPj
 EU0X3wsOvGeCdb9OIiKbIWU+UlvjY4geqC9gCjB95vt0xtSdB6cYQ7I0a7UAPxEX
 L5wD/ufU5q7xpxJ0/EWrrgwvFi4OHkAgz/XhwD9f1YKz/FU5rvb0ezP0VhjLz/HR
 UyJfrwgAilB6WAqTVk1QCWy4WuN7mIpwsEtGMCD9+NuJl9+bRq7Ju5NO9t8BLMD8
 d9VRmqCDz97ulGUoMo+DYGBXAw3qzGfHKSDZfao7TKCvLHBytG1XQ3Q1Q+e2/KVp
 zf60L9QXOAnJbYPxAZ8W+XkYzJ5IiPW9/rYpivSx8Oi6N7TvFvpccwOOhHe+I4JE
 yhANFLIJCM9cnnWAAJI0V0x2ZQxiIbuFPI7SNruhGjM+IjGY5ucHqN0gOcZzT5qQ
 eW6jdIdjonlyLyqhSJty0x2vZY2qCV7JRGpl4zcY4Z3hALiIhXU=
 =VKnE
 -----END PGP SIGNATURE-----

Merge tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux

Pull clang-format fixlets from Miguel Ojeda:
 "Two trivial clang-format changes:

   - Don't indent C++ namespaces (Ian Rogers)

   - The usual clang-format macro list update (Miguel Ojeda)"

* tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list
  clang-format: don't indent namespaces
2020-04-21 12:07:42 -07:00
David S. Miller
76fc6a9a9a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) flow_block_cb memleak in nf_flow_table_offload_del_cb(), from Roi Dayan.

2) Fix error path handling in nf_nat_inet_register_fn(), from Hillf Danton.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21 11:50:31 -07:00
Lucas Stach
cf01699ee2 tools/vm: fix cross-compile build
Commit 7ed1c1901f ("tools: fix cross-compile var clobbering") moved
the setup of the CC variable to tools/scripts/Makefile.include to make
the behavior consistent across all the tools Makefiles.

As the vm tools missed the include we end up with the wrong CC in a
cross-compiling evironment.

Fixes: 7ed1c1901f (tools: fix cross-compile var clobbering)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Kelly <martin@martingkelly.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416104748.25243-1-l.stach@pengutronix.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:56 -07:00
Sudip Mukherjee
db973a7289 coredump: fix null pointer dereference on coredump
If the core_pattern is set to "|" and any process segfaults then we get
a null pointer derefernce while trying to coredump. The call stack shows:

    RIP: do_coredump+0x628/0x11c0

When the core_pattern has only "|" there is no use of trying the
coredump and we can check that while formating the corename and exit
with an error.

After this change I get:

    format_corename failed
    Aborting core

Fixes: 315c69261d ("coredump: split pipe command whitespace before expanding template")
Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Wise <pabs3@bonedaddy.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416194612.21418-1-sudipm.mukherjee@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:56 -07:00
Yang Shi
94b7cc01da mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path
Syzbot reported the below lockdep splat:

    WARNING: possible irq lock inversion dependency detected
    5.6.0-rc7-syzkaller #0 Not tainted
    --------------------------------------------------------
    syz-executor.0/10317 just changed the state of lock:
    ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
    ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: shmem_mfill_atomic_pte+0x1012/0x21c0 mm/shmem.c:2407
    but this lock was taken by another, SOFTIRQ-safe lock in the past:
     (&(&xa->xa_lock)->rlock#5){..-.}

    and interrupts could create inverse lock ordering between them.

    other info that might help us debug this:
     Possible interrupt unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&(&info->lock)->rlock);
                                   local_irq_disable();
                                   lock(&(&xa->xa_lock)->rlock#5);
                                   lock(&(&info->lock)->rlock);
      <Interrupt>
        lock(&(&xa->xa_lock)->rlock#5);

     *** DEADLOCK ***

The full report is quite lengthy, please see:

  https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2004152007370.13597@eggly.anvils/T/#m813b412c5f78e25ca8c6c7734886ed4de43f241d

It is because CPU 0 held info->lock with IRQ enabled in userfaultfd_copy
path, then CPU 1 is splitting a THP which held xa_lock and info->lock in
IRQ disabled context at the same time.  If softirq comes in to acquire
xa_lock, the deadlock would be triggered.

The fix is to acquire/release info->lock with *_irq version instead of
plain spin_{lock,unlock} to make it softirq safe.

Fixes: 4c27fe4c4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Reported-by: syzbot+e27980339d305f2dbfd9@syzkaller.appspotmail.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: syzbot+e27980339d305f2dbfd9@syzkaller.appspotmail.com
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Link: http://lkml.kernel.org/r/1587061357-122619-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:56 -07:00
Hugh Dickins
ea0dfeb420 shmem: fix possible deadlocks on shmlock_user_lock
Recent commit 71725ed10c ("mm: huge tmpfs: try to split_huge_page()
when punching hole") has allowed syzkaller to probe deeper, uncovering a
long-standing lockdep issue between the irq-unsafe shmlock_user_lock,
the irq-safe xa_lock on mapping->i_pages, and shmem inode's info->lock
which nests inside xa_lock (or tree_lock) since 4.8's shmem_uncharge().

user_shm_lock(), servicing SysV shmctl(SHM_LOCK), wants
shmlock_user_lock while its caller shmem_lock() holds info->lock with
interrupts disabled; but hugetlbfs_file_setup() calls user_shm_lock()
with interrupts enabled, and might be interrupted by a writeback endio
wanting xa_lock on i_pages.

This may not risk an actual deadlock, since shmem inodes do not take
part in writeback accounting, but there are several easy ways to avoid
it.

Requiring interrupts disabled for shmlock_user_lock would be easy, but
it's a high-level global lock for which that seems inappropriate.
Instead, recall that the use of info->lock to guard info->flags in
shmem_lock() dates from pre-3.1 days, when races with SHMEM_PAGEIN and
SHMEM_TRUNCATE could occur: nowadays it serves no purpose, the only flag
added or removed is VM_LOCKED itself, and calls to shmem_lock() an inode
are already serialized by the caller.

Take info->lock out of the chain and the possibility of deadlock or
lockdep warning goes away.

Fixes: 4595ef88d1 ("shmem: make shmem_inode_info::lock irq-safe")
Reported-by: syzbot+c8a8197c8852f566b9d9@syzkaller.appspotmail.com
Reported-by: syzbot+40b71e145e73f78f81ad@syzkaller.appspotmail.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2004161707410.16322@eggly.anvils
Link: https://lore.kernel.org/lkml/000000000000e5838c05a3152f53@google.com/
Link: https://lore.kernel.org/lkml/0000000000003712b305a331d3b1@google.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:56 -07:00
Jann Horn
bdebd6a283 vmalloc: fix remap_vmalloc_range() bounds checks
remap_vmalloc_range() has had various issues with the bounds checks it
promises to perform ("This function checks that addr is a valid
vmalloc'ed area, and that it is big enough to cover the vma") over time,
e.g.:

 - not detecting pgoff<<PAGE_SHIFT overflow

 - not detecting (pgoff<<PAGE_SHIFT)+usize overflow

 - not checking whether addr and addr+(pgoff<<PAGE_SHIFT) are the same
   vmalloc allocation

 - comparing a potentially wildly out-of-bounds pointer with the end of
   the vmalloc region

In particular, since commit fc9702273e ("bpf: Add mmap() support for
BPF_MAP_TYPE_ARRAY"), unprivileged users can cause kernel null pointer
dereferences by calling mmap() on a BPF map with a size that is bigger
than the distance from the start of the BPF map to the end of the
address space.

This could theoretically be used as a kernel ASLR bypass, by using
whether mmap() with a given offset oopses or returns an error code to
perform a binary search over the possible address range.

To allow remap_vmalloc_range_partial() to verify that addr and
addr+(pgoff<<PAGE_SHIFT) are in the same vmalloc region, pass the offset
to remap_vmalloc_range_partial() instead of adding it to the pointer in
remap_vmalloc_range().

In remap_vmalloc_range_partial(), fix the check against
get_vm_area_size() by using size comparisons instead of pointer
comparisons, and add checks for pgoff.

Fixes: 833423143c ("[PATCH] mm: introduce remap_vmalloc_range()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Link: http://lkml.kernel.org/r/20200415222312.236431-1-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:56 -07:00
Hugh Dickins
0783ac95b4 mm/shmem: fix build without THP
Some optimizers don't notice that shmem_punch_compound() is always true
(PageTransCompound() being false) without CONFIG_TRANSPARENT_HUGEPAGE==y.

Use IS_ENABLED to help them to avoid the BUILD_BUG inside HPAGE_PMD_NR.

Fixes: 71725ed10c ("mm: huge tmpfs: try to split_huge_page() when punching hole")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2004142339170.10035@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Muchun Song
56df70a63e mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
find_mergeable_vma() can return NULL.  In this case, it leads to a crash
when we access vm_mm(its offset is 0x40) later in write_protect_page.
And this case did happen on our server.  The following call trace is
captured in kernel 4.19 with the following patch applied and KSM zero
page enabled on our server.

  commit e86c59b1b1 ("mm/ksm: improve deduplication of zero pages with colouring")

So add a vma check to fix it.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
  Oops: 0000 [#1] SMP NOPTI
  CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
  RIP: try_to_merge_one_page+0xc7/0x760
  Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
        60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
        8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
  RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
  RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
  RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
  RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
  R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
  R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
  FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
    ksm_scan_thread+0x115e/0x1960
    kthread+0xf5/0x130
    ret_from_fork+0x1f/0x30

[songmuchun@bytedance.com: if the vma is out of date, just exit]
  Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com
[akpm@linux-foundation.org: add the conventional braces, replace /** with /*]
Fixes: e86c59b1b1 ("mm/ksm: improve deduplication of zero pages with colouring")
Co-developed-by: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Markus Elfring <Markus.Elfring@web.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com
Link: http://lkml.kernel.org/r/20200414132905.83819-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
George Burgess IV
a21151b9d8 tools/build: tweak unused value workaround
Clang has -Wself-assign enabled by default under -Wall, which always
gets -Werror'ed on this file, causing sync-compare-and-swap to be
disabled by default.

The generally-accepted way to spell "this value is intentionally
unused," is casting it to `void`.  This is accepted by both GCC and
Clang with -Wall enabled: https://godbolt.org/z/qqZ9r3

Signed-off-by: George Burgess IV <gbiv@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lkml.kernel.org/r/20200414195638.156123-1-gbiv@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Christophe JAILLET
461e156536 checkpatch: fix a typo in the regex for $allocFunctions
Here, we look for function such as 'netdev_alloc_skb_ip_align', so a '_'
is missing in the regex.

To make sure:
   grep -r --include=*.c skbip_a * | wc   ==>   0 results
   grep -r --include=*.c skb_ip_a * | wc  ==> 112 results

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200407190029.892-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Michal Hocko
d180870d83 mm, gup: return EINTR when gup is interrupted by fatal signals
EINTR is the usual error code which other killable interfaces return.
This is the case for the other fatal_signal_pending break out from the
same function.  Make the code consistent.

ERESTARTSYS is also quite confusing because the signal is fatal and so
no restart will happen before returning to the userspace.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Link: http://lkml.kernel.org/r/20200409071133.31734-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Longpeng
3c1d7e6ccb mm/hugetlb: fix a addressing exception caused by huge_pte_offset
Our machine encountered a panic(addressing exception) after run for a
long time and the calltrace is:

    RIP: hugetlb_fault+0x307/0xbe0
    RSP: 0018:ffff9567fc27f808  EFLAGS: 00010286
    RAX: e800c03ff1258d48 RBX: ffffd3bb003b69c0 RCX: e800c03ff1258d48
    RDX: 17ff3fc00eda72b7 RSI: 00003ffffffff000 RDI: e800c03ff1258d48
    RBP: ffff9567fc27f8c8 R08: e800c03ff1258d48 R09: 0000000000000080
    R10: ffffaba0704c22a8 R11: 0000000000000001 R12: ffff95c87b4b60d8
    R13: 00005fff00000000 R14: 0000000000000000 R15: ffff9567face8074
    FS:  00007fe2d9ffb700(0000) GS:ffff956900e40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffd3bb003b69c0 CR3: 000000be67374000 CR4: 00000000003627e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
      follow_hugetlb_page+0x175/0x540
      __get_user_pages+0x2a0/0x7e0
      __get_user_pages_unlocked+0x15d/0x210
      __gfn_to_pfn_memslot+0x3c5/0x460 [kvm]
      try_async_pf+0x6e/0x2a0 [kvm]
      tdp_page_fault+0x151/0x2d0 [kvm]
     ...
      kvm_arch_vcpu_ioctl_run+0x330/0x490 [kvm]
      kvm_vcpu_ioctl+0x309/0x6d0 [kvm]
      do_vfs_ioctl+0x3f0/0x540
      SyS_ioctl+0xa1/0xc0
      system_call_fastpath+0x22/0x27

For 1G hugepages, huge_pte_offset() wants to return NULL or pudp, but it
may return a wrong 'pmdp' if there is a race.  Please look at the
following code snippet:

    ...
    pud = pud_offset(p4d, addr);
    if (sz != PUD_SIZE && pud_none(*pud))
        return NULL;
    /* hugepage or swap? */
    if (pud_huge(*pud) || !pud_present(*pud))
        return (pte_t *)pud;

    pmd = pmd_offset(pud, addr);
    if (sz != PMD_SIZE && pmd_none(*pmd))
        return NULL;
    /* hugepage or swap? */
    if (pmd_huge(*pmd) || !pmd_present(*pmd))
        return (pte_t *)pmd;
    ...

The following sequence would trigger this bug:

 - CPU0: sz = PUD_SIZE and *pud = 0 , continue
 - CPU0: "pud_huge(*pud)" is false
 - CPU1: calling hugetlb_no_page and set *pud to xxxx8e7(PRESENT)
 - CPU0: "!pud_present(*pud)" is false, continue
 - CPU0: pmd = pmd_offset(pud, addr) and maybe return a wrong pmdp

However, we want CPU0 to return NULL or pudp in this case.

We must make sure there is exactly one dereference of pud and pmd.

Signed-off-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200413010342.771-1-longpeng2@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Bartosz Golaszewski
5701feb0a9 MAINTAINERS: add an entry for kfifo
Kfifo has been written by Stefani Seibold and she's implicitly expected
to Ack any changes to it.  She's not however officially listed as kfifo
maintainer which leads to delays in patch review.  This patch proposes
to add an explitic entry for kfifo to MAINTAINERS file.

[akpm@linux-foundation.org: alphasort F: entries, per Joe]
[akpm@linux-foundation.org: remove colon, per Bartosz]
Link: http://lkml.kernel.org/r/20200124174533.21815-1-brgl@bgdev.pl
Link: http://lkml.kernel.org/r/20200413104250.26683-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Peter Xu
b64d8d1e1b mm/userfaultfd: disable userfaultfd-wp on x86_32
Userfaultfd-wp is not yet working on 32bit hosts, but it's accidentally
enabled previously.  Disable it.

Fixes: 5a281062af ("userfaultfd: wp: add WP pagetable tracking to x86")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Link: http://lkml.kernel.org/r/20200413141608.109211-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Kees Cook
89b83f282d slub: avoid redzone when choosing freepointer location
Marco Elver reported system crashes when booting with "slub_debug=Z".

The freepointer location (s->offset) was not taking into account that
the "inuse" size that includes the redzone area should not be used by
the freelist pointer.  Change the calculation to save the area of the
object that an inline freepointer may be written into.

Fixes: 3202fa62fb ("slub: relocate freelist pointer to middle of object")
Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Marco Elver <elver@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/202004151054.BD695840@keescook
Link: https://lore.kernel.org/linux-mm/20200415164726.GA234932@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Masahiro Yamada
1eb64c07aa sh: fix build error in mm/init.c
The closing parenthesis is missing.

Fixes: bfeb022f8f ("mm/memory_hotplug: add pgprot_t to mhp_params")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Link: http://lkml.kernel.org/r/20200413014743.16353-1-masahiroy@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-21 11:11:55 -07:00
Ma, Jianpeng
d56deb1e4e block: remove unused header
Dax related code already removed from this file.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-21 09:51:10 -06:00
Waiman Long
d6c8e949a3 blk-iocost: Fix error on iocost_ioc_vrate_adj
Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]"
argument of the iocost_ioc_vrate_adj trace entry defined in
include/trace/events/iocost.h leading to the following error:

  /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8:
  error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
   , u32[]* __tracepoint_arg_missed_ppm

That argument type is indeed rather complex and hard to read. Looking
at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying
the argument to a simple "u32 *missed_ppm" and adjusting the trace
entry accordingly, the compilation error was gone.

Fixes: 7caa47151a ("blkcg: implement blk-iocost")
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-21 09:49:36 -06:00
Eric W. Biederman
61e713bdca signal: Avoid corrupting si_pid and si_uid in do_notify_parent
Christof Meerwald <cmeerw@cmeerw.org> writes:
> Hi,
>
> this is probably related to commit
> 7a0cf09494 (signal: Correct namespace
> fixups of si_pid and si_uid).
>
> With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a
> properly set si_pid field - this seems to happen for multi-threaded
> child processes.
>
> A simple test program (based on the sample from the signalfd man page):
>
> #include <sys/signalfd.h>
> #include <signal.h>
> #include <unistd.h>
> #include <spawn.h>
> #include <stdlib.h>
> #include <stdio.h>
>
> #define handle_error(msg) \
>     do { perror(msg); exit(EXIT_FAILURE); } while (0)
>
> int main(int argc, char *argv[])
> {
>   sigset_t mask;
>   int sfd;
>   struct signalfd_siginfo fdsi;
>   ssize_t s;
>
>   sigemptyset(&mask);
>   sigaddset(&mask, SIGCHLD);
>
>   if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
>     handle_error("sigprocmask");
>
>   pid_t chldpid;
>   char *chldargv[] = { "./sfdclient", NULL };
>   posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL);
>
>   sfd = signalfd(-1, &mask, 0);
>   if (sfd == -1)
>     handle_error("signalfd");
>
>   for (;;) {
>     s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo));
>     if (s != sizeof(struct signalfd_siginfo))
>       handle_error("read");
>
>     if (fdsi.ssi_signo == SIGCHLD) {
>       printf("Got SIGCHLD %d %d %d %d\n",
>           fdsi.ssi_status, fdsi.ssi_code,
>           fdsi.ssi_uid, fdsi.ssi_pid);
>       return 0;
>     } else {
>       printf("Read unexpected signal\n");
>     }
>   }
> }
>
>
> and a multi-threaded client to test with:
>
> #include <unistd.h>
> #include <pthread.h>
>
> void *f(void *arg)
> {
>   sleep(100);
> }
>
> int main()
> {
>   pthread_t t[8];
>
>   for (int i = 0; i != 8; ++i)
>   {
>     pthread_create(&t[i], NULL, f, NULL);
>   }
> }
>
> I tried to do a bit of debugging and what seems to be happening is
> that
>
>   /* From an ancestor pid namespace? */
>   if (!task_pid_nr_ns(current, task_active_pid_ns(t))) {
>
> fails inside task_pid_nr_ns because the check for "pid_alive" fails.
>
> This code seems to be called from do_notify_parent and there we
> actually have "tsk != current" (I am assuming both are threads of the
> current process?)

I instrumented the code with a warning and received the following backtrace:
> WARNING: CPU: 0 PID: 777 at kernel/pid.c:501 __task_pid_nr_ns.cold.6+0xc/0x15
> Modules linked in:
> CPU: 0 PID: 777 Comm: sfdclient Not tainted 5.7.0-rc1userns+ #2924
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> RIP: 0010:__task_pid_nr_ns.cold.6+0xc/0x15
> Code: ff 66 90 48 83 ec 08 89 7c 24 04 48 8d 7e 08 48 8d 74 24 04 e8 9a b6 44 00 48 83 c4 08 c3 48 c7 c7 59 9f ac 82 e8 c2 c4 04 00 <0f> 0b e9 3fd
> RSP: 0018:ffffc9000042fbf8 EFLAGS: 00010046
> RAX: 000000000000000c RBX: 0000000000000000 RCX: ffffc9000042faf4
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81193d29
> RBP: ffffc9000042fc18 R08: 0000000000000000 R09: 0000000000000001
> R10: 000000100f938416 R11: 0000000000000309 R12: ffff8880b941c140
> R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880b941c140
> FS:  0000000000000000(0000) GS:ffff8880bca00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f2e8c0a32e0 CR3: 0000000002e10000 CR4: 00000000000006f0
> Call Trace:
>  send_signal+0x1c8/0x310
>  do_notify_parent+0x50f/0x550
>  release_task.part.21+0x4fd/0x620
>  do_exit+0x6f6/0xaf0
>  do_group_exit+0x42/0xb0
>  get_signal+0x13b/0xbb0
>  do_signal+0x2b/0x670
>  ? __audit_syscall_exit+0x24d/0x2b0
>  ? rcu_read_lock_sched_held+0x4d/0x60
>  ? kfree+0x24c/0x2b0
>  do_syscall_64+0x176/0x640
>  ? trace_hardirqs_off_thunk+0x1a/0x1c
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3

The immediate problem is as Christof noticed that "pid_alive(current) == false".
This happens because do_notify_parent is called from the last thread to exit
in a process after that thread has been reaped.

The bigger issue is that do_notify_parent can be called from any
process that manages to wait on a thread of a multi-threaded process
from wait_task_zombie.  So any logic based upon current for
do_notify_parent is just nonsense, as current can be pretty much
anything.

So change do_notify_parent to call __send_signal directly.

Inspecting the code it appears this problem has existed since the pid
namespace support started handling this case in 2.6.30.  This fix only
backports to 7a0cf09494 ("signal: Correct namespace fixups of si_pid and si_uid")
where the problem logic was moved out of __send_signal and into send_signal.

Cc: stable@vger.kernel.org
Fixes: 6588c1e3ff ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary")
Ref: 921cf9f630 ("signals: protect cinit from unblocked SIG_DFL signals")
Link: https://lore.kernel.org/lkml/20200419201336.GI22017@edge.cmeerw.net/
Reported-by: Christof Meerwald <cmeerw@cmeerw.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-21 09:55:30 -05:00
Mark Rutland
3fabb43818 arm64: sync kernel APIAKey when installing
A direct write to a APxxKey_EL1 register requires a context
synchronization event to ensure that indirect reads made by subsequent
instructions (e.g. AUTIASP, PACIASP) observe the new value.

When we initialize the boot task's APIAKey in boot_init_stack_canary()
via ptrauth_keys_switch_kernel() we miss the necessary ISB, and so there
is a window where instructions are not guaranteed to use the new APIAKey
value. This has been observed to result in boot-time crashes where
PACIASP and AUTIASP within a function used a mixture of the old and new
key values.

Fix this by having ptrauth_keys_switch_kernel() synchronize the new key
value with an ISB. At the same time, __ptrauth_key_install() is renamed
to __ptrauth_key_install_nosync() so that it is obvious that this
performs no synchronization itself.

Fixes: 2832158233 ("arm64: initialize ptrauth keys for kernel booting task")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Will Deacon <will@kernel.org>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Will Deacon <will@kernel.org>
2020-04-21 15:52:56 +01:00
Shengjiu Wang
1e060a453c
ASoC: wm8960: Fix wrong clock after suspend & resume
After suspend & resume, wm8960_hw_params may be called when
bias_level is not SND_SOC_BIAS_ON, then wm8960_configure_clocking
is not called. But if sample rate is changed at that time, then
the output clock rate will be not correct.

So judgement of bias_level is SND_SOC_BIAS_ON in wm8960_hw_params
is not necessary and it causes above issue.

Fixes: 3176bf2d7c ("ASoC: wm8960: update pll and clock setting function")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/1587468525-27514-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-04-21 15:43:22 +01:00
Paolo Bonzini
00a6a5ef39 PPC KVM fix for 5.7
- Fix a regression introduced in the last merge window, which results
   in guests in HPT mode dying randomly.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJeni/pAAoJEJ2a6ncsY3GfTRoIANAQjIZi96AfJcfnrYQ4yUF7
 scxawTiJ9VavvsEJLJ7vsozrJ4xxmvmA0fFWC84uw9+BwPqoLFFvZTjazbGEDVvF
 FGwNBR/k7nfFVMIHS3K9iy9KjvYL3xkL26AgFTDJFq8hmOO9pH0txuk4r7SXb+NX
 bGG0mScAD/Dg/HwAHAS6EP3jT35QtGTK62p8foqVTziTNcmBn9Ywtg0lEzAcq2iY
 Y1BUD4Ov3cggshMI9SqHE8Yyq0XA2Wi6ggcyz/gVzvcbdFQmtg57Tri8nN8661LX
 XKh+VTpYSIxNs5GgjwlNesJzJ9h6CSynJF556qrjQ0XsXcNqvn8fcZdNQ+hnRYw=
 =Y19W
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-fixes-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fix for 5.7

- Fix a regression introduced in the last merge window, which results
  in guests in HPT mode dying randomly.
2020-04-21 09:39:55 -04:00
Paolo Bonzini
3bda03865f KVM: s390: Fix for 5.7 and maintainer update
- Silence false positive lockdep warning
 - add Claudio as reviewer
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJenY6AAAoJEBF7vIC1phx8bykQAK+QZyD+H/zGNuqeUVn0sh8e
 yKUVMR+kuE+l57q77nt2AYVxqpCD9xSKRR+SOSLzhVH/HJf625nm+Ny/WOWMebwJ
 EA/KK+v15T5rga8gFza+4cPg4v/pHwjHhSbjTb1JWg+8cJR1BTj6OxRuTtWr5+25
 GF4RhkJOit/VhNbCo1aIgs7/7F1pPALstdPAUsHYe1PeULdRMVqSVluXT2KTPhpi
 /kzDw8sKKcYgv/eaVdcNoHv+VX1AWIRDAKEttCywyocfbu0ESwadmR7C0qlm1446
 HqowP6F0xCF0Whi/65aN4ZOv7wjO/qrV08DZ7JLA3/oKlXtZ1ieyiE2q/P1frSo1
 gvmuHiH5/UI6t6a/BSCpJwqcilxKYArqAAYBKoGiJhTbsJStqw0wl41klWTKXlTq
 VrCvjoUxQ9JMjFCQ1GXOU+ODNyX2IwZYptJ5vF24HYzBJwUBe3HPG9/BA8YcodzG
 qGQ5IKv0Q1IFTwOqnt557H0MjcBtNIEx54aLJrPy3wldsiNSj39Ft0cuvnbR+Q4F
 QhKk88dHtd7NW1IirfgYmLGe0rB1ANKM7wUGEdM5w2y5Eg8wCs8/P4KeGh0YyFI9
 xPqZDfwof6KkDjOGFXr/CeD/thi+km0/FpePb7cL5Ow4a+JmrCvqQiXrf0TbnFpv
 t5ZlHnGzoSHsEaRgmJ+X
 =d46L
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for 5.7 and maintainer update

- Silence false positive lockdep warning
- add Claudio as reviewer
2020-04-21 09:37:13 -04:00
Ryder Lee
10e41f34a0 MAINTAINERS: update mt76 reviewers
Roy no longer works here. Time to say goodbye, my friend.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/c171e0dfce9f2dad5ca6935eaf6004117f82e259.1587195398.git.ryder.lee@mediatek.com
2020-04-21 15:40:55 +03:00
Luca Coelho
1edd56e69d iwlwifi: fix WGDS check when WRDS is disabled
In the reference BIOS implementation, WRDS can be disabled without
disabling WGDS.  And this happens in most cases where WRDS is
disabled, causing the WGDS without WRDS check and issue an error.

To avoid this issue, we change the check so that we only considered it
an error if the WRDS entry doesn't exist.  If the entry (or the
selected profile is disabled for any other reason), we just silently
ignore WGDS.

Cc: stable@vger.kernel.org # 4.14+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205513
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417133700.72ad25c3998b.I875d935cefd595ed7f640ddcfc7bc802627d2b7f@changeid
2020-04-21 15:40:30 +03:00
Johannes Berg
e6d419f943 iwlwifi: mvm: fix inactive TID removal return value usage
The function iwl_mvm_remove_inactive_tids() returns bool, so we
should just check "if (ret)", not "if (ret >= 0)" (which would
do nothing useful here). We obviously therefore cannot use the
return value of the function for the free_queue, we need to use
the queue (i) we're currently dealing with instead.

Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.9d862ed72535.I9e27ccc3ee3c8855fc13682592b571581925dfbd@changeid
2020-04-21 15:39:05 +03:00
Ilan Peer
38af8d5a90 iwlwifi: mvm: Do not declare support for ACK Enabled Aggregation
As this was not supposed to be enabled to begin with.

Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.53dbc3c6c36b.Idfe118546b92cc31548b2211472a5303c7de5909@changeid
2020-04-21 15:39:04 +03:00
Johannes Berg
e5b72e3bc4 iwlwifi: mvm: limit maximum queue appropriately
Due to some hardware issues, queue 31 isn't usable on devices that have
32 queues (7000, 8000, 9000 families), which is correctly reflected in
the configuration and TX queue initialization.

However, the firmware API and queue allocation code assumes that there
are 32 queues, and if something actually attempts to use #31 this leads
to a NULL-pointer dereference since it's not allocated.

Fix this by limiting to 31 in the IWL_MVM_DQA_MAX_DATA_QUEUE, and also
add some code to catch this earlier in the future, if the configuration
changes perhaps.

Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.98a79be2db6a.I3a4af6b03b87a6bc18db9b1ff9a812f397bee1fc@changeid
2020-04-21 15:39:03 +03:00
Johannes Berg
d8d6639702 iwlwifi: pcie: indicate correct RB size to device
In the context info, we need to indicate the correct RB size
to the device so that it will not think we have 4k when we
only use 2k. This seems to not have caused any issues right
now, likely because the hardware no longer supports putting
multiple entries into a single RB, and practically all of
the entries should be smaller than 2k.

Nevertheless, it's a bug, and we must advertise the right
size to the device.

Note that right now we can only tell it 2k vs. 4k, so for
the cases where we have more, still use 4k. This needs to
be fixed by the firmware first.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: cfdc20efeb ("iwlwifi: pcie: use partial pages if applicable")
Cc: stable@vger.kernel.org # v5.6
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.ae6cd345764f.I0985c55223decf70182b9ef1d8edf4179f537853@changeid
2020-04-21 15:39:02 +03:00
Mordechay Goodstein
290d5e4951 iwlwifi: mvm: beacon statistics shouldn't go backwards
We reset statistics also in case that we didn't reassoc so in
this cases keep last beacon counter.

Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.1f9142751fbc.Ifbfd0f928a0a761110b8f4f2ca5483a61fb21131@changeid
2020-04-21 15:39:01 +03:00
Johannes Berg
b98b33d556 iwlwifi: pcie: actually release queue memory in TVQM
The iwl_trans_pcie_dyn_txq_free() function only releases the frames
that may be left on the queue by calling iwl_pcie_gen2_txq_unmap(),
but doesn't actually free the DMA ring or byte-count tables for the
queue. This leads to pretty large memory leaks (at least before my
queue size improvements), in particular in monitor/sniffer mode on
channel hopping since this happens on every channel change.

This was also now more evident after the move to a DMA pool for the
byte count tables, showing messages such as

  BUG iwlwifi:bc (...): Objects remaining in iwlwifi:bc on __kmem_cache_shutdown()

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=206811.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 6b35ff9157 ("iwlwifi: pcie: introduce a000 TX queues management")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200417100405.f5f4c4193ec1.Id5feebc9b4318041913a9c89fc1378bb5454292c@changeid
2020-04-21 15:39:00 +03:00
Arnd Bergmann
e980121346 arm64: soc: ZynqMP SoC fixes for v5.7
- Fix firmware driver dependency
 - Fix one spare warning in firmware driver
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXp6jyQAKCRDKSWXLKUoM
 IQzTAKCc2Tt4o0esEkq86wp0ppImrRurgwCgjopa+LOY523LeYXvnM3RlBmPxVk=
 =IYSa
 -----END PGP SIGNATURE-----

Merge tag 'zynqmp-soc-for-v5.7-rc3' of https://github.com/Xilinx/linux-xlnx into arm/fixes

arm64: soc: ZynqMP SoC fixes for v5.7

- Fix firmware driver dependency
- Fix one spare warning in firmware driver

* tag 'zynqmp-soc-for-v5.7-rc3' of https://github.com/Xilinx/linux-xlnx:
  firmware: xilinx: make firmware_debugfs_root static
  drivers: soc: xilinx: fix firmware driver Kconfig dependency

Link: https://lore.kernel.org/r/4c6daeb0-bc61-8bdb-6ed6-5f58cd915326@monstr.eu
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-04-21 14:21:20 +02:00
Chris Rorvick
a176e114ac iwlwifi: actually check allocated conf_tlv pointer
Commit 71bc0334a6 ("iwlwifi: check allocated pointer when allocating
conf_tlvs") attempted to fix a typoe introduced by commit 17b809c9b2
("iwlwifi: dbg: move debug data to a struct") but does not implement the
check correctly.

Fixes: 71bc0334a6 ("iwlwifi: check allocated pointer when allocating conf_tlvs")
Tweeted-by: @grsecurity
Signed-off-by: Chris Rorvick <chris@rorvick.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200417074558.12316-1-sedat.dilek@gmail.com
2020-04-21 10:49:52 +03:00
Takashi Iwai
7686e34852 ALSA: usx2y: Fix potential NULL dereference
The error handling code in usX2Y_rate_set() may hit a potential NULL
dereference when an error occurs before allocating all us->urb[].
Add a proper NULL check for fixing the corner case.

Reported-by: Lin Yi <teroincn@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200420075529.27203-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-04-21 08:00:41 +02:00
Gregor Pintar
6f4ea2074d ALSA: usb-audio: Add quirk for Focusrite Scarlett 2i2
Force it to use asynchronous playback.

Same quirk has already been added for Focusrite Scarlett Solo (2nd gen)
with a commit 46f5710f0b ("ALSA: usb-audio: Add quirk for Focusrite
Scarlett Solo").

This also seems to prevent regular clicks when playing at 44100Hz
on Scarlett 2i2 (2nd gen). I did not notice any side effects.

Moved both quirks to snd_usb_audioformat_attributes_quirk() as suggested.

Signed-off-by: Gregor Pintar <grpintar@gmail.com>
Reviewed-by: Alexander Tsoy <alexander@tsoy.me>
Link: https://lore.kernel.org/r/20200420214030.2361-1-grpintar@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-04-21 07:58:54 +02:00
Luke Nelson
d2b6c3ab70 bpf, selftests: Add test for BPF_STX BPF_B storing R10
This patch adds a test to test_verifier that writes the lower 8 bits of
R10 (aka FP) using BPF_B to an array map and reads the result back. The
expected behavior is that the result should be the same as first copying
R10 to R9, and then storing / loading the lower 8 bits of R9.

This test catches a bug that was present in the x86-64 JIT that caused
an incorrect encoding for BPF_STX BPF_B when the source operand is R10.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200418232655.23870-2-luke.r.nels@gmail.com
2020-04-20 19:25:30 -07:00
Luke Nelson
aee194b14d bpf, x86: Fix encoding for lower 8-bit registers in BPF_STX BPF_B
This patch fixes an encoding bug in emit_stx for BPF_B when the source
register is BPF_REG_FP.

The current implementation for BPF_STX BPF_B in emit_stx saves one REX
byte when the operands can be encoded using Mod-R/M alone. The lower 8
bits of registers %rax, %rbx, %rcx, and %rdx can be accessed without using
a REX prefix via %al, %bl, %cl, and %dl, respectively. Other registers,
(e.g., %rsi, %rdi, %rbp, %rsp) require a REX prefix to use their 8-bit
equivalents (%sil, %dil, %bpl, %spl).

The current code checks if the source for BPF_STX BPF_B is BPF_REG_1
or BPF_REG_2 (which map to %rdi and %rsi), in which case it emits the
required REX prefix. However, it misses the case when the source is
BPF_REG_FP (mapped to %rbp).

The result is that BPF_STX BPF_B with BPF_REG_FP as the source operand
will read from register %ch instead of the correct %bpl. This patch fixes
the problem by fixing and refactoring the check on which registers need
the extra REX byte. Since no BPF registers map to %rsp, there is no need
to handle %spl.

Fixes: 622582786c ("net: filter: x86: internal BPF JIT")
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200418232655.23870-1-luke.r.nels@gmail.com
2020-04-20 19:25:30 -07:00
Jann Horn
8ff3571f7e bpf: Fix handling of XADD on BTF memory
check_xadd() can cause check_ptr_to_btf_access() to be executed with
atype==BPF_READ and value_regno==-1 (meaning "just check whether the access
is okay, don't tell me what type it will result in").
Handle that case properly and skip writing type information, instead of
indexing into the registers at index -1 and writing into out-of-bounds
memory.

Note that at least at the moment, you can't actually write through a BTF
pointer, so check_xadd() will reject the program after calling
check_ptr_to_btf_access with atype==BPF_WRITE; but that's after the
verifier has already corrupted memory.

This patch assumes that BTF pointers are not available in unprivileged
programs.

Fixes: 9e15db6613 ("bpf: Implement accurate raw_tp context access via BTF")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200417000007.10734-2-jannh@google.com
2020-04-20 18:41:34 -07:00