2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-24 21:24:00 +08:00
Commit Graph

706472 Commits

Author SHA1 Message Date
Linus Torvalds
42057e1825 Mixed bugfixes. Perhaps the most interesting one is a latent bug
that was finally triggered by PCID support.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJZzmJqAAoJEL/70l94x66D99oH/R4hOMfzDxFOgW3LnaCQJvwo
 n1+tH3as0dfdkpggZ+UmJuKnbVJ0625+qozenrdYkKtk1YiyIIQWG3vdsz4HBfzp
 CYK2NVVymf0dg8DQaluz6iB1R28ke12PggzyFv01s1QyENBDA8J38pslZarPM2OA
 tnpRKC6B59/VmRD0PWS6yRmTXY+HfzWlWg4JMraq2VdybbEXJhh8BNfjjNn30DkZ
 SW8kHq60AUd5Arhb3cmiPiXZCQ7odqF2u2mEcBmnA9hAacaGEheSzKCUOaEIjmZV
 5/jTyG1tZkN7CbrG81ryuoa8A6qTOSyHxo1QkzAmE/p+s2IzGfzzLqmtfIsAWkE=
 =1lM1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Mixed bugfixes. Perhaps the most interesting one is a latent bug that
  was finally triggered by PCID support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/x86: Handle async PF in RCU read-side critical sections
  KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
  KVM: VMX: use cmpxchg64
  KVM: VMX: simplify and fix vmx_vcpu_pi_load
  KVM: VMX: avoid double list add with VT-d posted interrupts
  KVM: VMX: extract __pi_post_block
  KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception
  KVM: nVMX: fix HOST_CR3/HOST_CR4 cache
2017-09-29 12:18:55 -07:00
Linus Torvalds
95d3652eec Merge branch 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull keys fixes from James Morris:
 "Notable here is a rewrite of big_key crypto by Jason Donenfeld to
  address some issues in the original code.

  From Jason's commit log:
   "This started out as just replacing the use of crypto/rng with
    get_random_bytes_wait, so that we wouldn't use bad randomness at
    boot time. But, upon looking further, it appears that there were
    even deeper underlying cryptographic problems, and that this seems
    to have been committed with very little crypto review. So, I rewrote
    the whole thing, trying to keep to the conventions introduced by the
    previous author, to fix these cryptographic flaws."

  There has been positive review of the new code by Eric Biggers and
  Herbert Xu, and it passes basic testing via the keyutils test suite.
  Eric also manually tested it.

  Generally speaking, we likely need to improve the amount of crypto
  review for kernel crypto users including keys (I'll post a note
  separately to ksummit-discuss)"

* 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security/keys: rewrite all of big_key crypto
  security/keys: properly zero out sensitive key material in big_key
  KEYS: use kmemdup() in request_key_auth_new()
  KEYS: restrict /proc/keys by credentials at open time
  KEYS: reset parent each time before searching key_user_tree
  KEYS: prevent KEYCTL_READ on negative key
  KEYS: prevent creating a different user's keyrings
  KEYS: fix writing past end of user-supplied buffer in keyring_read()
  KEYS: fix key refcount leak in keyctl_read_key()
  KEYS: fix key refcount leak in keyctl_assume_authority()
  KEYS: don't revoke uninstantiated key in request_key_auth_new()
  KEYS: fix cred refcount leak in request_key_auth_new()
2017-09-29 10:26:35 -07:00
Boqun Feng
b862789aa5 kvm/x86: Handle async PF in RCU read-side critical sections
Sasha Levin reported a WARNING:

| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
...
| CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
| 1.10.1-1ubuntu1 04/01/2014
| Call Trace:
...
| RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
| RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002
| RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000
| RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410
| RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001
| R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08
| R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040
| __schedule+0x201/0x2240 kernel/sched/core.c:3292
| schedule+0x113/0x460 kernel/sched/core.c:3421
| kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158
| do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271
| async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069
| RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996
| RSP: 0018:ffff88003b2df520 EFLAGS: 00010283
| RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670
| RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140
| RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000
| R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8
| R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000
| vsnprintf+0x173/0x1700 lib/vsprintf.c:2136
| sprintf+0xbe/0xf0 lib/vsprintf.c:2386
| proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23
| get_link fs/namei.c:1047 [inline]
| link_path_walk+0x1041/0x1490 fs/namei.c:2127
...

This happened when the host hit a page fault, and delivered it as in an
async page fault, while the guest was in an RCU read-side critical
section.  The guest then tries to reschedule in kvm_async_pf_task_wait(),
but rcu_preempt_note_context_switch() would treat the reschedule as a
sleep in RCU read-side critical section, which is not allowed (even in
preemptible RCU).  Thus the WARN.

To cure this, make kvm_async_pf_task_wait() go to the halt path if the
PF happens in a RCU read-side critical section.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29 17:05:17 +02:00
Wanpeng Li
305d0ab476 KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
------------[ cut here ]------------
 WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0+ #17
 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 Call Trace:
  ? emulator_read_emulated+0x15/0x20 [kvm]
  ? segmented_read+0xae/0xf0 [kvm]
  vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  x86_emulate_instruction+0x733/0x810 [kvm]
  vmx_handle_exit+0x2f4/0xda0 [kvm_intel]
  ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm]
  kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x23/0xc2

A nested #PF is triggered during L0 emulating instruction for L2. However, it
doesn't consider we should not break L1's vmlauch/vmresme. This patch fixes
it by queuing the #PF exception instead ,requesting an immediate VM exit from
L2 and keeping the exception for L1 pending for a subsequent nested VM exit.

This should actually work all the time, making vmx_inject_page_fault_nested
totally unnecessary.  However, that's not working yet, so this patch can work
around the issue in the meanwhile.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-29 16:30:37 +02:00
Linus Torvalds
770b782f55 ACPI fix for v4.14-rc3
This fixes an APEI problem that may cause a reported error to be
 missed due to a race condition.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZzVkTAAoJEILEb/54YlRxGgcP/1j0kvrkU+/V8D4bZrpH9q/3
 ad1W/krYLmam5Q+IsWEMfK+mwmT0CPHG3wM/OeT6VV/mTF9u6CyCM0m/XvnorKg3
 Yp6wiKzAq7N9HIB6nqZUPwTgB0vIh3pLYBvqA1Dc6hlNU7lrsBuxUmYrpxM4hk6R
 X5BAKFQygFrunjxi22fvJjk2yxxxg6IY4R7JYbJQIJbfKBAfMraMrDVoSE/gHieL
 riOe1qJp0x5enI7kyOlGHQr0Sq+tOIrfJbf4O4Y4p1EwaXwk23mrfIpG9PtUpW3z
 t3jJZC7Rg7liIS1ZrozZmSbNP2KFdF3nbQYqRBEzfbT4isOSJRXHGB2eqzroIpM5
 rEgPjflLb561RWx7pcEQHH9z6cZ6cdbw97XNcdPTsJpxc46FohojdNR4FVY+z90I
 KwakMwVUs5qUEhU7LcLbtRCXZyzCnXHdz72zYEIaqTBOhZ3yXFHzy66ld7Fe7Dwk
 9Cu2u6P8gnnLPPbW5vRQGYhNdb5tfcOdzjQ0kajX+5kj+xlo5Nlhn8/LMOAqipOu
 nwYnJLfu4adMyVCTmepgur32Pwlfp/oDupbe1Fp0dHe6wk6wiqzmo8RCxDT/3uIT
 qxJB664AVQ9xpqOHVfRyZZxa07CRaW3aAYBqxkluIuL9lvEpSNWeStY9OxL1HCL3
 C1R6WiA48V/1dDkbimJ1
 =4NCJ
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "This fixes an APEI problem that may cause a reported error to be
  missed due to a race condition"

* tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / APEI: clear error status before acknowledging the error
2017-09-28 14:38:30 -07:00
Linus Torvalds
74de8187ff Power management fixes for v4.14-rc3
- Fix a deadlock in the operating performance points (OPP)
    framework caused by a notifier callback taking a lock that's
    already held by its caller (Viresh Kumar).
 
  - Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
    attempting to register conflicting device objects which
    triggers a warning from sysfs (Suniel Mahesh).
 
  - Drop a stale reference to a piece of intel_pstate documentation
    that's not in the tree any more (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZzVhfAAoJEILEb/54YlRxG1EQAJT3sG6InKIntxApvNG4o2fM
 0zs2At29EgiOcxN0rBs5DYUiNk2yLMmbt3X/PnJxt5BijIANN/HlvEeD5jip4iHU
 0F4o1gEDgFRkpEGlnR3tjUpCs/YZRwvmsox9zPqU7Nu4G/x6MjU5tlwT6BHCzq/U
 gSG6O8GSy+pK8B+SJu2SpWSNGqdCmv2a1aKgGA+KLFlad+AM7k1cPoX/Wv5fQGZ6
 iS20CLel4U4A6mzgYnnBhPSNsFYL4y0AxJ2SQ+O8PEWdP6hcmOvT5bo3TJTiTqqP
 vQU9DTzsNxS8NL3ShGVCRAKZVWQav0SQHESTx687bjjPaxg7ppMHpodnRAp3niEI
 5uyKGGerbdmJdKqEjEajpRLJWFjU8lcGMqWUUJFWDkIA88soSF1EoelxifH7rxnA
 raLPxQ/FJKX/Og36jVgH96+a0sz+emnFj/BBrWxySKED5tBQ6HqKPKZqV/uFJE6h
 DJ0qcYIxPdHtOCKYwLBsjJ2au2HUpp5fXzX+EOLmgnxIkHl9tsIbCnCAZldBIKVd
 9ENErc1vFXA38SpHSWRf2mT/sGOjnToxik1PRsWOp7zXNkiXyFRqblQe6uJiUCvM
 jU7Wl0HUNRGP0xEhdL3Ij7uGyOdVauHRLPEy5c+CJ9nSMCMYwYIW2pBdV8IgyRmD
 Y4gxTBrJ8nHTTWLucSFb
 =EEyM
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a deadlock in the operating performance points (OPP)
  framework introduced during the 4.11 cycle, more issues with duplicate
  device objects for cpufreq-dt and cpufreq documentation.

  Specifics:

   - Fix a deadlock in the operating performance points (OPP) framework
     caused by a notifier callback taking a lock that's already held by
     its caller (Viresh Kumar).

   - Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
     attempting to register conflicting device objects which triggers a
     warning from sysfs (Suniel Mahesh).

   - Drop a stale reference to a piece of intel_pstate documentation
     that's not in the tree any more (Rafael Wysocki)"

* tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: docs: Drop intel-pstate.txt from index.txt
  cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
  PM / OPP: Call notifier without holding opp_table->lock
2017-09-28 14:35:51 -07:00
Linus Torvalds
02a2b05395 Changes since last update:
- fix various problems with the copy-on-write extent maps getting freed
   at the wrong time
 - fix printk format specifier problems
 - report zeroing operation outcomes instead of dropping them on the
   floor
 - fix some crashes when dio operations partially fail
 - fix a race condition between unwritten extent conversion & dio read
 - fix some incorrect tests in the inode log item processing
 - correct the delayed allocation space reservations on rmap filesystems
 - fix some problems checking for dax support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJZypYxAAoJEPh/dxk0SrTrJ3YQAJFWUCp194an+yuvgOY+MuyL
 PG/vAA3DyJjYbwIsqUE//dlp9nrarccAXcxPITWlLdGZ//qHbXO2MguO3KIQ4iG8
 qmsA+tXetVoYZYxYZLQ0KjX/XJTaAXY64xKTFxMMTTKUoxPygJRUF/FPfFFcTtaq
 Q/ULikS5mhtW7/mQCfXBvtqM5ZD61A9vQRjDL5jRdrDbz49TQqtskp/7F6SEHLxU
 fTCGhN7Ys4MQ4fmtUc+EUh0LPX8oAKIIKiGz3zUqrk/FgNYI2NqnTYvflfN8L9UE
 t+k+4CGrON+dzrau4HrvZaYbfIPhRaJUM4QzFcDIPoaBZOt6DpBI0dEKm9FD7Hw/
 vUvBs0M9asqYycH3PopFHugF+SxW8g7g+5TD8S9rg3j33PZahSNm3gt5gYb1Kiij
 3TZPirst6OeQuEjWX6L5LAruAtqtEXtHL7o4dGn5LdQkJ0EIdKXMd9YGz0F/trTK
 Grqf2Mep/Q8nccMTksaj94X5AhmM4znYmbAnbS/+QfYTgLk92GJltxoKTB6roW/N
 fJ5azjyzGsr4BWdgakK3aA9glaQWGh3PY8Up2VLeEdjwcy3zyscnpZP2PSvt+l9X
 pmMDpMTvQD0E6e5246itB69Il1NXTEoG/t9Hlx/2x9g0R2hjK6CRXXrwPnz9zYkI
 7wFz5B5LmJ27vFGTCxo5
 =7ptY
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

 - fix various problems with the copy-on-write extent maps getting freed
   at the wrong time

 - fix printk format specifier problems

 - report zeroing operation outcomes instead of dropping them on the
   floor

 - fix some crashes when dio operations partially fail

 - fix a race condition between unwritten extent conversion & dio read

 - fix some incorrect tests in the inode log item processing

 - correct the delayed allocation space reservations on rmap filesystems

 - fix some problems checking for dax support

* tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: revert "xfs: factor rmap btree size into the indlen calculations"
  xfs: Capture state of the right inode in xfs_iflush_done
  xfs: perag initialization should only touch m_ag_max_usable for AG 0
  xfs: update i_size after unwritten conversion in dio completion
  iomap_dio_rw: Allocate AIO completion queue before submitting dio
  xfs: validate bdev support for DAX inode flag
  xfs: remove redundant re-initialization of total_nr_pages
  xfs: Output warning message when discard option was enabled even though the device does not support discard
  xfs: report zeroed or not correctly in xfs_zero_range()
  xfs: kill meaningless variable 'zero'
  fs/xfs: Use %pS printk format for direct addresses
  xfs: evict CoW fork extents when performing finsert/fcollapse
  xfs: don't unconditionally clear the reflink flag on zero-block files
2017-09-28 13:27:23 -07:00
Linus Torvalds
e49aa15ef6 Revert "Bluetooth: Add option for disabling legacy ioctl interfaces"
This reverts commit dbbccdc4ce.

It turns out that the "legacy" users aren't so legacy at all, and that
turning off the legacy ioctl will break the current Qt bluetooth stack
for bluetooth LE devices that were released just a couple of months ago.

So it's simply not true that this was a legacy interface that hasn't
been needed and is only limited to old legacy BT devices.  Because I
actually read Kconfig help messages, and actively try to turn off
features that I don't need, I turned the option off.

Then I spent _way_ too much time debugging BLE issues until I realized
that it wasn't the Qt and subsurface development that had broken one of
my dive computer BLE downloads, but simply my broken kernel config.

Maybe in a decade it will be true that this is a legacy interface.  And
maybe with a better help-text and correct dependencies, this kind of
legacy removal might be acceptable.  But as things are right now both
the commit message and the Kconfig help text were misleading, and the
Kconfig option had the wrong dependenencies.

There's no reason to keep that broken Kconfig option in the tree.

Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-28 13:20:32 -07:00
Rafael J. Wysocki
333d177422 Merge branch 'acpi-apei'
* acpi-apei:
  ACPI / APEI: clear error status before acknowledging the error
2017-09-28 22:18:15 +02:00
Linus Torvalds
9173583226 Second -rc update for 4.14 kernel
- a few core fixes
 - a few ipoib fixes
 - a few mlx5 fixes
 - a 7 patch hfi1 related series
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZy8HKAAoJELgmozMOVy/d/3YP/RtJ4I+7dlHAdTrUsLkNIXzj
 6e2sc5A7JQRvhbWa6ZfqkbD4DBz2gkz9bXmlYotP1nVfunBie9xQPi+nN39YNnTv
 VPYa0G7RD53APw71ETCGh0uBBAjc8lGm0AOPj+HpSP7PvrLdH6B68IcAeXCSOf8D
 orzXI0bRpRnLsW4IJ0zN09zShigYuCJVl0Wf59QB0Wrbw4veQD4W7bLSCAUTmuZk
 TPb8bPlXY64Bf731HRftxIRl3HwUrpTPv5DuHcASAbVL/KeucWpPmOAj9XqhXTQp
 tnqtiwBWYDcsLBwS/IS40B2gfN1BCh6hn03pSVbPj+HD/FLY7x8Gf/Lu0qQNmklz
 9nvgMKHL/2h+T4M7DulhS7DTP58bvtkyKG+j77gjEmKX1OI0NXHOntKZDSjGAT2J
 zw2dNx4Y/Sgng1HBCbHAAHMrFUdyj7XpQNR8mzdGvDcwtRfrDKmchGtvhVclPsbl
 R3U9GN2NcAwg2+bIN96hTzUMB10QOZdvddGFvbxuB7FaWkskPaN52O1ptT3+MyWt
 xccZp0iYu40zV80mEm+nF/kZwR8omfE6xM1ujQdIhMHstGe+z29BhqsaQ8Zw1qEG
 oaU7+9m2aK57SvcSimR2S4kdK7Gxw9+BIVKdRREJwe9xvWVf96OvJnhnh5t5Fs56
 BTN1mBn+7LxlK9eDVler
 =HbhA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "Second -rc update for 4.14.

  Both Mellanox and Intel had a series of -rc fixes that landed this
  week. The Mellanox bunch is spread throughout the stack and not just
  in their driver, where as the Intel bunch was mostly in the hfi1
  driver. And, several of the fixes in the hfi1 driver were more than
  just simple 5 line fixes. As a result, the hfi1 driver fixes has a
  sizable LOC count.

  Everything else is as one would expect in an RC cycle in terms of LOC
  count. One item that might jump out and make you think "That's not an
  rc item" is the fix that corrects a typo. But, that change fixes a
  typo in a user visible API that was just added in this merge window,
  so if we fix it now, we can fix it. If we don't, the typo is in the
  API forever. Another that might not appear to be a fix at first glance
  is the Simplify mlx5_ib_cont_pages patch, but the simplification
  allows them to fix a bug in the existing function whenever the length
  of an SGE exceeded page size. We also had to revert one patch from the
  merge window that was wrong.

  Summary:

   - a few core fixes
   - a few ipoib fixes
   - a few mlx5 fixes
   - a 7-patch hfi1 related series"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
  IB/hfi1: On error, fix use after free during user context setup
  Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
  IB/hfi1: Return correct value in general interrupt handler
  IB/hfi1: Check eeprom config partition validity
  IB/hfi1: Only reset QSFP after link up and turn off AOC TX
  IB/hfi1: Turn off AOC TX after offline substates
  IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure
  IB/mlx5: Simplify mlx5_ib_cont_pages
  IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev
  IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock
  IB: Correct MR length field to be 64-bit
  IB/core: Fix qp_sec use after free access
  IB/core: Fix typo in the name of the tag-matching cap struct
2017-09-28 12:12:51 -07:00
Rafael J. Wysocki
abeb19a219 Merge branches 'pm-opp' and 'pm-cpufreq'
* pm-opp:
  PM / OPP: Call notifier without holding opp_table->lock

* pm-cpufreq:
  cpufreq: docs: Drop intel-pstate.txt from index.txt
  cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
2017-09-28 21:08:07 +02:00
Linus Torvalds
26e811cdb9 Fix refcounting bug in CRIU interface, noticed by Chris Salls (Oleg & Tycho).
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZzI6SAAoJEIly9N/cbcAmlg8P/iwmvIvZFXRIXqtjxV+i7Fom
 YxccAYhPGCpjmpewnlw46jErCd6mOhoL5QsA/Ab1uUHr4YwDZGAMSBhZcOtJIT3h
 5Yk+DdQCZkirM2xpBt4+sD2oGpnZaKS0rqmkYxM8XPZMVWsIGQu05oBGONkPQP1Q
 HkourKw1/UEqdUh3MqV0Vej+luniA8PI7nD61v6uYAxK+5XO5xJN9c3BqNBSFjXR
 dIzFHd/zlYICihaaPe+CVsDfHc/pn/kt2o7dvdtlPnJrz/doy69m9jOZUQkmeqkJ
 /ho+BQKchci5hqeaV5guhzZugiVUbYLPCtVEDJKm+hiRr4iu/5yKRNuxjbNQ4VHh
 68zF2yhLlilEqXgGQQ/kOvfUP/cWYKpirpCpFLRBQtuo+XF9EwoOR/MmdhRFu2Id
 lh+UgUeQW87Qdv+OKCIdb7tIJh04N1fXc5YBRjEZ/oSzDbl17HnVEfhHEksul8nf
 cX1uTJpt83d4SzMCkOGFVWYBK7U/xNnc+7hOB4tRYBb1il76xIII3A8jl481t2Ss
 VmTJcQd1t1HHYX8Og7S5yxAUH9+/FSRMwe6IWymPW82EFYayo1s98rWDp6Xpmezf
 ZlR4XdZvUWsY0KnMbbOyS0+9dMjyf0yV/E1Nxpm7oaqFz0rR8wjb9nJ4nvf9WofF
 2hzoKUX1kdn+eLftjDs6
 =Hnjh
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp fix from Kees Cook:
 "Fix refcounting bug in CRIU interface, noticed by Chris Salls (Oleg &
  Tycho)"

* tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
2017-09-28 11:20:52 -07:00
Paolo Bonzini
c0a1666bcb KVM: VMX: use cmpxchg64
This fixes a compilation failure on 32-bit systems.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-28 17:58:41 +02:00
Oleg Nesterov
66a733ea6b seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end
up using different filters. Once we drop ->siglock it is possible for
task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC.

Fixes: f8e529ed94 ("seccomp, ptrace: add support for dumping seccomp filters")
Reported-by: Chris Salls <chrissalls5@gmail.com>
Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[tycho: add __get_seccomp_filter vs. open coding refcount_inc()]
Signed-off-by: Tycho Andersen <tycho@docker.com>
[kees: tweak commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-09-27 22:51:12 -07:00
Rafael J. Wysocki
8aba233390 cpufreq: docs: Drop intel-pstate.txt from index.txt
Commit 33fc30b470 (cpufreq: intel_pstate: Document the current
behavior and user interface) dropped the intel-pstate.txt file
from Documentation/cpu-freq/, but it did not update the index.txt
file in there accordingly, so do that now.

Fixes: 33fc30b470 (cpufreq: intel_pstate: Document the current behavior and user interface)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-28 02:08:43 +02:00
James Morris
2569e7e1d6 Merge commit 'keys-fixes-20170927' into fixes-v4.14-rc3
From David Howells:

"There are two sets of patches here:
 (1) A bunch of core keyrings bug fixes from Eric Biggers.

 (2) Fixing big_key to use safe crypto from Jason A. Donenfeld."
2017-09-28 09:11:28 +10:00
Tyler Baicar
aaf2c2fb0f ACPI / APEI: clear error status before acknowledging the error
Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error acknowledgment and the error status clearing which would
cause the second error's status to be cleared without being handled.
So, clear the error status before acknowledging the errors.

Also, make sure to acknowledge the error if the error status read
fails.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-27 23:13:06 +02:00
Linus Torvalds
9cd6681cb1 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and isofs fixes from Jan Kara:
 "Two quota fixes (fallout of the quota locking changes) and an isofs
  build fix"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Fix quota corruption with generic/232 test
  isofs: fix build regression
  quota: add missing lock into __dquot_transfer()
2017-09-27 12:22:12 -07:00
Linus Torvalds
225d3b6748 linux-kselftest-4.14-rc3-fixes
This update consists of:
 
 - fixes to several existing tests
 - a test for regression introduced by
   b9470c2760 ("inet: kill smallest_size and smallest_port")
 - seccomp support for glibc 2.26 siginfo_t.h
 - fixes to kselftest framework and tests to run make O=dir use-case
 - fixes to silence unnecessary test output to de-clutter test results
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZy7S7AAoJEAsCRMQNDUMcAt0P/iuR279yaBF3RVqHTyXsmr/t
 RO6k4uj4XLYKTrVnV/YTu5hLCGO9fPDhprMmrTqlAGclioEyMDtRTOWDDln4TNFh
 gehbXiOTVVHlLPCOXXRwvU+RsMppgi4O2WRTBK0dnTkBdl+sTLOl4iywGyqFPB11
 O3oj1nNc8ruaxYoUMYwxiGCm1OATrngoSu/Y4mMhZPgT9MnCtZWDlg//kkrxQDHO
 UTD11zk17nBAOw2q4nw3I4un00tgN8RzIOfg9g47Az40LjWSG5c5oAgd/hArqeBv
 7pCUR1PnNKTf0RujX0nfaoQQ+bOEXqpV9GmM67HLo8Q/5e4lYxWdmSdhItPS5qtS
 ZLo1lEMOuRH7+FCQuD236llhwKVMm/+R3jnXgdJcc+SupdGCmpzZ9P8rscX1g11R
 ZDZ9+k8XOA2p7ufxSIGFEILSovn0FUMneOd3Nhwk40R7cIvSiZh+V+Xzdb6Q1K9T
 NBVtH8qvRi5TyHSNwQCDF45fC6bCM80JxGcPToOguFsQTcUL6B0pG6xhxZG73+Ut
 br+Z5y+g+JLWLeGzaBjo4LnqFpeP6w4Jb8CCrqu8BussV3BToIFCJkGX6aOggow/
 D3g03tGDeMjqFMYwn0ZCH5s5u9cicWUUC8CBvoCJp2UZaE/prsNNfRjZjfwYlrVj
 TvWPdPJtwjA/sdq/n2Hl
 =FUuY
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "This update consists of:

   - fixes to several existing tests

   - a test for regression introduced by b9470c2760 ("inet: kill
     smallest_size and smallest_port")

   - seccomp support for glibc 2.26 siginfo_t.h

   - fixes to kselftest framework and tests to run make O=dir use-case

   - fixes to silence unnecessary test output to de-clutter test results"

* tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (28 commits)
  selftests: timers: set-timer-lat: Fix hang when testing unsupported alarms
  selftests: timers: set-timer-lat: fix hang when std out/err are redirected
  selftests/memfd: correct run_tests.sh permission
  selftests/seccomp: Support glibc 2.26 siginfo_t.h
  selftests: futex: Makefile: fix for loops in targets to run silently
  selftests: Makefile: fix for loops in targets to run silently
  selftests: mqueue: Use full path to run tests from Makefile
  selftests: futex: copy sub-dir test scripts for make O=dir run
  selftests: lib.mk: copy test scripts and test files for make O=dir run
  selftests: sync: kselftest and kselftest-clean fail for make O=dir case
  selftests: sync: use TEST_CUSTOM_PROGS instead of TEST_PROGS
  selftests: lib.mk: add TEST_CUSTOM_PROGS to allow custom test run/install
  selftests: watchdog: fix to use TEST_GEN_PROGS and remove clean
  selftests: lib.mk: fix test executable status check to use full path
  selftests: Makefile: clear LDFLAGS for make O=dir use-case
  selftests: lib.mk: kselftest and kselftest-clean fail for make O=dir case
  Makefile: kselftest and kselftest-clean fail for make O=dir case
  selftests/net: msg_zerocopy enable build with older kernel headers
  selftests: actually run the various net selftests
  selftest: add a reuseaddr test
  ...
2017-09-27 10:51:08 -07:00
Linus Torvalds
7031b64125 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu fixes and cleanups from Ingo Molnar:
 "This is _way_ more cleanups than fixes, but the bugs were subtle and
  hard to hit, and the primary reason for them existing was the
  unnecessary historical complexity of some of the x86/fpu interfaces.

  The first bunch of commits clean up and simplify the xstate user copy
  handling functions, in reaction to the collective head-scratching
  about the xstate user-copy handling code that leads up to the fix for
  this SkyLake xstate handling bug:

     0852b37417: x86/fpu: Add FPU state copying quirk to handle XRSTOR failure on Intel Skylake CPUs

  The cleanups don't change any functionality, they just (hopefully)
  make it all clearer, more consistent, more debuggable and more robust.

  Note that most of the linecount increase comes from these commits,
  where we better split the user/kernel copy logic by having more
  variants, instead repeated fragile patterns of:

               if (kbuf) {
                       memcpy(kbuf + pos, data, copy);
               } else {
                       if (__copy_to_user(ubuf + pos, data, copy))
                               return -EFAULT;
               }

  The next bunch of commits simplify the FPU state-machine to get rid of
  old lazy-FPU idiosyncrasies - a defensive simplification to make all
  the code easier to review and fix. No change in functionality.

  Then there's a couple of additional debugging tweaks: static checker
  warning fix and move an FPU related warning to under WARN_ON_FPU(),
  followed by another bunch of commits that represent a finegrained
  split-up of the fixes from Eric Biggers to handle weird xstate bits
  properly.

  I did this finegrained split-up because some of these fixes also
  impact the ABI for weird xstate handling, for which we'd like to have
  good bisection results, should they cause any problems. (We also had
  one regression with the more monolithic fixes, so splitting it all up
  sounded prudent for robustness reasons as well.)

  About the whole series: the commits up to 03eaec81ac have been in
  -next for months - but I've recently rebased them to remove a state
  machine clean-up commit that was objected to, and to make it more
  bisectable - so technically it's a new, rebased tree.

  Robustness history: this series had some regressions along the way,
  and all reported regressions have been fixed. All but one of the
  regressions manifested itself as easy to report warnings. The previous
  version of this latest series was also in linux-next, with one
  (warning-only) regression reported which is fixed in the latest
  version.

  Barring last minute brown paper bag bugs (and the commits are now
  older by a day which I'd hope helps paperbag reduction), I'm
  reasonably confident about its general robustness.

  Famous last words ..."

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()
  x86/fpu: Copy the full header in copy_user_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()
  x86/fpu: Copy the full state_header in copy_kernel_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()
  x86/fpu: Introduce validate_xstate_header()
  x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]()
  x86/fpu: Rename fpu__activate_curr() to fpu__initialize()
  x86/fpu: Simplify and speed up fpu__copy()
  x86/fpu: Fix stale comments about lazy FPU logic
  x86/fpu: Rename fpu::fpstate_active to fpu::initialized
  x86/fpu: Remove fpu__current_fpstate_write_begin/end()
  x86/fpu: Fix fpu__activate_fpstate_read() and update comments
  x86/fpu: Reinitialize FPU registers if restoring FPU state fails
  x86/fpu: Don't let userspace set bogus xcomp_bv
  x86/fpu: Turn WARN_ON() in context switch into WARN_ON_FPU()
  ...
2017-09-27 08:33:26 -07:00
Harish Chegondi
828bcbdc97 IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
Failure to tune PCIe capabilities should not fail driver load. This can
cause the driver load to fail on systems with any of the following:
1. HFI's parent is not root. Example: HFI card is behind a PCIe bridge.
2. HFI's parent is not PCI Express capable.
In these situations, failure to tune PCIe capabilities should be logged
in the system message logs but not cause the driver load to fail.

This patch also ensures pcie capability word DevCtl is written only
after a successful read and the capability tuning process continues
even if read/write of the pcie capability word DevCtl fails.

Fixes: c53df62c7a ("IB/hfi1: Check return values from PCI config API calls")
Fixes: bf70a77577 ("staging/rdma/hfi1: Enable WFR PCIe extended tags from the driver")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Michael J. Ruhl
b8f42738ac IB/hfi1: On error, fix use after free during user context setup
During base context setup, if setup_base_ctxt() fails, the context is
deallocated. This is incorrect because the context is referenced on
return, to notify any waiting subcontext.  If there are no subcontexts
the pointer will be invalid.

Reorganize the error path so that deallocate_ctxt() is called after all
the possible subcontexts have been notified.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Alex Estrin
612601d001 Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
commit 9a9b811269 will cause core to fail UD QP from being destroyed
on ipoib unload, therefore cause resources leakage.
On pkey change event above patch modifies mgid before calling underlying
driver to detach it from QP. Drivers' detach_mcast() will fail to find
modified mgid it was never given to attach in a first place.
Core qp->usecnt will never go down, so ib_destroy_qp() will fail.

IPoIB driver actually does take care of new broadcast mgid based on new
pkey by destroying an old mcast object in ipoib_mcast_dev_flush())
....
	if (priv->broadcast) {
		rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree);
		list_add_tail(&priv->broadcast->list, &remove_list);
		priv->broadcast = NULL;
	}
...

then in restarted ipoib_macst_join_task() creating a new broadcast mcast
object, sending join request and on completion tells the driver to attach
to reinitialized QP:
...
if (!priv->broadcast) {
...
	broadcast = ipoib_mcast_alloc(dev, 0);
...
	memcpy(broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4,
	       sizeof (union ib_gid));
	priv->broadcast = broadcast;
...

Fixes: 9a9b811269 ("IB/ipoib: Update broadcast object if PKey value was changed in index 0")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Kamenee Arumugam
09592af5fd IB/hfi1: Return correct value in general interrupt handler
The general interrupt handler returns IRQ_HANDLED whether an IRQ
was handled or not.
Determine if an IRQ was handled and return the correct value.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Jan Sokolowski
753b19afb1 IB/hfi1: Check eeprom config partition validity
Relying on a trailing magic value is incorrect. There are instances where
this is not present as trailing magic value has a specific purpose which is
not partition validation. Instead use the header magic value which is
present in all variants of the platform configuration and is intended for
validation. This is also used in other locations in the driver.

Fixes: bc5214ee29 (IB/hfi1: Handle missing magic values in config file)
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Sebastian Sanchez
30e10527bc IB/hfi1: Only reset QSFP after link up and turn off AOC TX
QSFP reset enables AOC transmitters by default. They should be off
before moving to high power mode to complete the setup. There is no
need to reset the QSFP during LNI failure as it was reset at link down.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Sebastian Sanchez
df5efdd970 IB/hfi1: Turn off AOC TX after offline substates
Offline.quietDuration was added in the 8051 firmware, and the driver
only turns off the AOC transmitters when offline.quiet is reached.
However, the AOC transmitters need to be turned off at the new state.
Therefore, turn off the AOC transmitters at any offline substates
including offline.quiet and offline.quietDuration, then recheck we
reached offline.quiet to support backwards compatibility.

Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Paolo Bonzini
31afb2ea2b KVM: VMX: simplify and fix vmx_vcpu_pi_load
The simplify part: do not touch pi_desc.nv, we can set it when the
VCPU is first created.  Likewise, pi_desc.sn is only handled by
vmx_vcpu_pi_load, do not touch it in __pi_post_block.

The fix part: do not check kvm_arch_has_assigned_device, instead
check the SN bit to figure out whether vmx_vcpu_pi_put ran before.
This matches what the previous patch did in pi_post_block.

Cc: Huangweidong <weidong.huang@huawei.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: wangxin <wangxinxin.wang@huawei.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Longpeng (Mike) <longpeng2@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27 13:45:42 +02:00
Paolo Bonzini
8b306e2f3c KVM: VMX: avoid double list add with VT-d posted interrupts
In some cases, for example involving hot-unplug of assigned
devices, pi_post_block can forget to remove the vCPU from the
blocked_vcpu_list.  When this happens, the next call to
pi_pre_block corrupts the list.

Fix this in two ways.  First, check vcpu->pre_pcpu in pi_pre_block
and WARN instead of adding the element twice in the list.  Second,
always do the list removal in pi_post_block if vcpu->pre_pcpu is
set (not -1).

The new code keeps interrupts disabled for the whole duration of
pi_pre_block/pi_post_block.  This is not strictly necessary, but
easier to follow.  For the same reason, PI.ON is checked only
after the cmpxchg, and to handle it we just call the post-block
code.  This removes duplication of the list removal code.

Cc: Huangweidong <weidong.huang@huawei.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: wangxin <wangxinxin.wang@huawei.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Longpeng (Mike) <longpeng2@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27 13:45:37 +02:00
Paolo Bonzini
cd39e1176d KVM: VMX: extract __pi_post_block
Simple code movement patch, preparing for the next one.

Cc: Huangweidong <weidong.huang@huawei.com>
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: wangxin <wangxinxin.wang@huawei.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Tested-by: Longpeng (Mike) <longpeng2@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-27 13:45:28 +02:00
Jan Kara
4c6bb69663 quota: Fix quota corruption with generic/232 test
Eric has reported that since commit d2faa41516 "quota: Do not acquire
dqio_sem for dquot overwrites in v2 format" test generic/232
occasionally fails due to quota information being incorrect. Indeed that
commit was too eager to remove dqio_sem completely from the path that
just overwrites quota structure with updated information. Although that
is innocent on its own, another process that inserts new quota structure
to the same block can perform read-modify-write cycle of that block thus
effectively discarding quota information update if they race in a wrong
way.

Fix the problem by acquiring dqio_sem for reading for overwrites of
quota structure. Note that it *is* possible to completely avoid taking
dqio_sem in the overwrite path however that will require modifying path
inserting / deleting quota structures to avoid RMW cycles of the full
block and for now it is not clear whether it is worth the hassle.

Fixes: d2faa41516
Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-09-27 11:33:47 +02:00
Linus Torvalds
dc972a67cc MMC host:
- sdhci-pci: Fix voltage switch for some Intel host controllers
  - tmio: remove broken and noisy debug macro
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZyuXmAAoJEP4mhCVzWIwpUoMQAJihQnVY1i7Y6/OfiJrU1YSE
 TDT/pD5k0GtXs4SSK5UGkGqBVoH647KZTG54vj0FIGVAkNXBShME/ht4qCyecHuZ
 OwzGbhubozMJN77UamCI5SwyON1Wcb3ohIBUo03XfgC2Flp+5w0Ey0RacWnoWwzY
 VHjP1wmnP7tH1a7N3jmZxsaKSZFjQIyKUE5xCvDIdG8zplDkxM0195TEYSuSn9B1
 WJ/zjLVKKFPQI54xl/JqRBO6Z+CMIvZh1g/xaptvy4lwu5ACPUDxZZI1tUy/vWQU
 z/l8/PrJYH1lC8BYMVIokGKWSftuA2TsXRyYcGEd9++q1mW+FgjrnIpfArR2DYx1
 lHS/DjqVgaIfFuh+tcKq5mmUy1G8Ken9MMQixu9/zCxRB1M+KlAPAOoFb7YNKdP0
 U4nz9uvKkmQMMmI7oHd/PItsCHENixRN5VF29vcCyxg90dikS0+yut4Mm9fMMjay
 qoxzvb+LkiKj/fJb7sEdJ0lt2V/n8HQemySykNjFUXQ04TlJ2IvTEjR3angubuvH
 zBK2gqEzA/XmQFt06sw387nhkTFda/4Ch4osfKsyW/Y//eoJSbqxer2WnG349nlF
 FqJlA0O6tTuKPXAzrN2nU3naFIlxJH6LoxNqPFAeBhHDdv0pIuNvykrJE92+XgIZ
 J+v+W4sR4ljcWR2QPeMv
 =QLYF
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

  - sdhci-pci: Fix voltage switch for some Intel host controllers

  - tmio: remove broken and noisy debug macro

* tag 'mmc-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pci: Fix voltage switch for some Intel host controllers
  mmc: tmio: remove broken and noisy debug macro
2017-09-26 16:54:22 -07:00
Andreas Gruenbacher
fc46820b27 vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets
In generic_file_llseek_size, return -ENXIO for negative offsets as well
as offsets beyond EOF.  This affects filesystems which don't implement
SEEK_HOLE / SEEK_DATA internally, possibly because they don't support
holes.

Fixes xfstest generic/448.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-26 13:46:06 -07:00
Darrick J. Wong
5e5c943c1f xfs: revert "xfs: factor rmap btree size into the indlen calculations"
In commit fd26a88093 we added a worst case estimate for rmapbt blocks
needed to satisfy the block mapping request.  Since then, we added the
ability to reserve enough space in each AG such that we should never run
out of blocks to grow the rmapbt, which makes this calculation
unnecessary.  Revert the commit because it makes the extra delalloc
indlen accounting unnecessary and incorrect.

Reported-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-26 10:55:20 -07:00
Carlos Maiolino
842f6e9f78 xfs: Capture state of the right inode in xfs_iflush_done
My previous patch: d3a304b629 check for
XFS_LI_FAILED flag xfs_iflush done, so the failed item can be properly
resubmitted.

In the loop scanning other inodes being completed, it should check the
current item for the XFS_LI_FAILED, and not the initial one.

The state of the initial inode is checked after the loop ends

Kudos to Eric for catching this.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-26 10:55:20 -07:00
Darrick J. Wong
9789dd9e1d xfs: perag initialization should only touch m_ag_max_usable for AG 0
We call __xfs_ag_resv_init to make a per-AG reservation for each AG.
This makes the reservation per-AG, not per-filesystem.  Therefore, it
is incorrect to adjust m_ag_max_usable for each AG.  Adjust it only
when we're reserving AG 0's blocks so that we only do it once per fs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-09-26 10:55:19 -07:00
Eryu Guan
ee70daaba8 xfs: update i_size after unwritten conversion in dio completion
Since commit d531d91d69 ("xfs: always use unwritten extents for
direct I/O writes"), we start allocating unwritten extents for all
direct writes to allow appending aio in XFS.

But for dio writes that could extend file size we update the in-core
inode size first, then convert the unwritten extents to real
allocations at dio completion time in xfs_dio_write_end_io(). Thus a
racing direct read could see the new i_size and find the unwritten
extents first and read zeros instead of actual data, if the direct
writer also takes a shared iolock.

Fix it by updating the in-core inode size after the unwritten extent
conversion. To do this, introduce a new boolean argument to
xfs_iomap_write_unwritten() to tell if we want to update in-core
i_size or not.

Suggested-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-26 10:55:19 -07:00
Chandan Rajendra
546e7be824 iomap_dio_rw: Allocate AIO completion queue before submitting dio
Executing xfs/104 test in a loop on Linux-v4.13 kernel on a ppc64
machine can cause the following NULL pointer dereference,

.queue_work_on+0x4c/0x80
.iomap_dio_bio_end_io+0xbc/0x1f0
.bio_endio+0x118/0x1f0
.blk_update_request+0xd0/0x470
.blk_mq_end_request+0x24/0xc0
.lo_complete_rq+0x40/0xe0
.__blk_mq_complete_request_remote+0x28/0x40
.flush_smp_call_function_queue+0xc4/0x1e0
.smp_ipi_demux_relaxed+0x8c/0x100
.icp_hv_ipi_action+0x54/0xa0
.__handle_irq_event_percpu+0x84/0x2c0
.handle_irq_event_percpu+0x28/0x80
.handle_percpu_irq+0x78/0xc0
.generic_handle_irq+0x40/0x70
.__do_irq+0x88/0x200
.call_do_irq+0x14/0x24
.do_IRQ+0x84/0x130

This occurs due to the following sequence of events,

1. Allocate dio for Direct I/O write.
2. Invoke iomap_apply() until iov_iter_count() bytes have been submitted.
   - Assume that we have submitted atleast one bio. Hence iomap_dio->ref value
     will be >= 2.
   - If during the second iteration, iomap_apply() ends up returning -ENOSPC, we would
     break out of the loop and since the 'ret' value is a negative number we
     end up not allocating memory for super_block->s_dio_done_wq.
3. Meanwhile, iomap_dio_bio_end_io() is invoked for bios that have been
   submitted and here the code ends up dereferencing the NULL pointer stored
   at super_block->s_dio_done_wq.

This commit fixes the bug by allocating memory for
super_block->s_dio_done_wq before iomap_apply() is invoked.

Reported-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-26 10:55:19 -07:00
Ross Zwisler
6851a3db7e xfs: validate bdev support for DAX inode flag
Currently only the blocksize is checked, but we should really be calling
bdev_dax_supported() which also tests to make sure we can get a
struct dax_device and that the dax_direct_access() path is working.

This is the same check that we do for the "-o dax" mount option in
xfs_fs_fill_super().

This does not fix the race issues that caused the XFS DAX inode option to
be disabled, so that option will still be disabled.  If/when we re-enable
it, though, I think we will want this issue to have been fixed.  I also do
think that we want to fix this in stable kernels.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
CC: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-09-26 10:55:19 -07:00
Ingo Molnar
8474c532b5 Merge branch 'WIP.x86/fpu' into x86/fpu, because it's ready
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 10:17:43 +02:00
Eric Biggers
738f48cb5f x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES
This is the canonical method to use.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-11-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:48 +02:00
Eric Biggers
98c0fad9d6 x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()
Tighten the checks in copy_user_to_xstate().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-10-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:48 +02:00
Eric Biggers
3d703477bc x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()
We now have this field in hdr.xfeatures.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-9-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:48 +02:00
Eric Biggers
af2c4322d9 x86/fpu: Copy the full header in copy_user_to_xstate()
This is in preparation to verify the full xstate header as supplied by user-space.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-8-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:47 +02:00
Eric Biggers
af95774b3c x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()
Tighten the checks in copy_kernel_to_xstate().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-7-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:47 +02:00
Eric Biggers
b89eda482d x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()
We have this information in the xstate_header.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-6-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:46 +02:00
Eric Biggers
80d8ae86b3 x86/fpu: Copy the full state_header in copy_kernel_to_xstate()
This is in preparation to verify the full xstate header as supplied by user-space.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-5-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:46 +02:00
Eric Biggers
b11e2e18a7 x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()
Tighten the checks in __fpu__restore_sig() and update comments.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-4-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:46 +02:00
Eric Biggers
cf9df81b13 x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()
Tighten the checks in xstateregs_set().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-3-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:45 +02:00
Eric Biggers
e63e5d5c15 x86/fpu: Introduce validate_xstate_header()
Move validation of user-supplied xstate_header into a helper function,
in preparation of calling it from both the ptrace and sigreturn syscall
paths.

The new function also considers it to be an error if *any* reserved bits
are set, whereas before we were just clearing most of them silently.

This should reduce the chance of bugs that fail to correctly validate
user-supplied XSAVE areas.  It also will expose any broken userspace
programs that set the other reserved bits; this is desirable because
such programs will lose compatibility with future CPUs and kernels if
those bits are ever used for anything.  (There shouldn't be any such
programs, and in fact in the case where the compacted format is in use
we were already validating xfeatures.  But you never know...)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20170924105913.9157-2-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-26 09:43:45 +02:00