Commit Graph

297 Commits

Author SHA1 Message Date
Linus Torvalds
8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of a unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
   running on top of a L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight psuedo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
    in the future.  Anything that tiggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a9: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     no-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of a unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
     guest running on top of a L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight psuedo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that tiggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Linus Torvalds
7a76117f9f platform-drivers-x86 for v6.2-1
Highlights:
  -  Intel:
     -  PMC: Add support for Meteor Lake
     -  Intel On Demand: various updates
  -  ideapad-laptop:
     -  Add support for various Fn keys on new models
     -  Fix touchpad on/off handling in a generic way to avoid having
        to add more and more quirks
  -  android-x86-tablets: Add support for 2 more X86 Android tablet models
  -  New Dell WMI DDV driver
  -  Miscellaneous cleanups and small bugfixes
 
 The following is an automated git shortlog grouped by driver:
 
 ACPI:
  -  battery: Pass battery hook pointer to hook callbacks
 
 ISST:
  -  Fix typo in comments
 
 Move existing HP drivers to a new hp subdir:
  - Move existing HP drivers to a new hp subdir
 
 dell:
  -  Add new dell-wmi-ddv driver
 
 dell-ddv:
  -  Warn if ePPID has a suspicious length
  -  Improve buffer handling
 
 huawei-wmi:
  -  remove unnecessary member
  -  fix return value calculation
  -  do not hard-code sizes
 
 ideapad-laptop:
  -  Make touchpad_ctrl_via_ec a module option
  -  Stop writing VPCCMD_W_TOUCHPAD at probe time
  -  Send KEY_TOUCHPAD_TOGGLE on some models
  -  Only toggle ps2 aux port on/off on select models
  -  Do not send KEY_TOUCHPAD* events on probe / resume
  -  Refactor ideapad_sync_touchpad_state()
  -  support for more special keys in WMI
  -  Add new _CFG bit numbers for future use
  -  Revert "check for touchpad support in _CFG"
 
 intel/pmc:
  -  Relocate Alder Lake PCH support
  -  Relocate Tiger Lake PCH support
  -  Relocate Ice Lake PCH support
  -  Relocate Cannon Lake Point PCH support
  -  Relocate Sunrise Point PCH support
  -  Move variable declarations and definitions to header and core.c
  -  Replace all the reg_map with init functions
 
 intel/pmc/core:
  -  Add Meteor Lake support to pmc core driver
 
 intel_scu_ipc:
  -  fix possible name leak in __intel_scu_ipc_register()
 
 mxm-wmi:
  -  fix memleak in mxm_wmi_call_mx[ds|mx]()
 
 platform/mellanox:
  -  mlxbf-pmc: Fix event typo
  -  Add BlueField-3 support in the tmfifo driver
 
 platform/x86/amd:
  -  pmc: Add a workaround for an s0i3 issue on Cezanne
 
 platform/x86/amd/pmf:
  -  pass the struct by reference
 
 platform/x86/dell:
  -  alienware-wmi: Use sysfs_emit() instead of scnprintf()
 
 platform/x86/intel:
  -  pmc: Fix repeated word in comment
 
 platform/x86/intel/hid:
  -  Add module-params for 5 button array + SW_TABLET_MODE reporting
 
 platform/x86/intel/sdsi:
  -  Add meter certificate support
  -  Support different GUIDs
  -  Hide attributes if hardware doesn't support
  -  Add Intel On Demand text
 
 sony-laptop:
  -  Convert to use sysfs_emit_at() API
 
 thinkpad_acpi:
  -  use strstarts()
  -  Fix max_brightness of thinklight
 
 tools/arch/x86:
  -  intel_sdsi: Add support for reading meter certificates
  -  intel_sdsi: Add support for new GUID
  -  intel_sdsi: Read more On Demand registers
  -  intel_sdsi: Add Intel On Demand text
  -  intel_sdsi: Add support for reading state certificates
 
 uv_sysfs:
  -  Use sysfs_emit() instead of scnprintf()
 
 wireless-hotkey:
  -  use ACPI HID as phys
 
 x86-android-tablets:
  -  Add Advantech MICA-071 extra button
  -  Add Lenovo Yoga Tab 3 (YT3-X90F) charger + fuel-gauge data
  -  Add Medion Lifetab S10346 data
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmOW+QgUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yAPwf/dAYLHiC2ox5YlNTLX2DvU+jOpeBv
 W+EIx4oHQz1+O9jrWMLyvS9zTwTEAf6ANLiMP3damEvtJnB72ClgFITzlJAaB4zN
 yj0SdxoBRMt6zDL2QwMkwitvb5kJonLfO2H7NsMwA6f0KP1X8sio3oVRAMMVwlzz
 nwDKM/VBpuxmy+d880wRRoAkgRkTsPIOwBkYdo1525NU7kkTmtrMpgM+SXQsHTJn
 TB9uQnyuiq5/znh3k1Qn+OGwXQezmGz2Fb76IcW5RzUQDew6n6b3kzILee5ddynT
 Pa7/ibwpV+FtZjm2kS/l4tV+WPdA+s5TSWoq7Hz0jzBX9GdOORcMZmEneg==
 =z16d
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Hans de Goede:

 - Intel:
      - PMC: Add support for Meteor Lake
      - Intel On Demand: various updates

 - Ideapad-laptop:
      - Add support for various Fn keys on new models
      - Fix touchpad on/off handling in a generic way to avoid having to
        add more and more quirks

 - Android x86 tablets:
      - Add support for two more X86 Android tablet models

 - New Dell WMI DDV driver

 - Miscellaneous cleanups and small bugfixes

* tag 'platform-drivers-x86-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (52 commits)
  platform/mellanox: mlxbf-pmc: Fix event typo
  platform/x86: intel_scu_ipc: fix possible name leak in __intel_scu_ipc_register()
  platform/x86: sony-laptop: Convert to use sysfs_emit_at() API
  platform/x86/dell: alienware-wmi: Use sysfs_emit() instead of scnprintf()
  platform/x86: uv_sysfs: Use sysfs_emit() instead of scnprintf()
  platform/x86: mxm-wmi: fix memleak in mxm_wmi_call_mx[ds|mx]()
  platform/x86: x86-android-tablets: Add Advantech MICA-071 extra button
  platform/x86: x86-android-tablets: Add Lenovo Yoga Tab 3 (YT3-X90F) charger + fuel-gauge data
  platform/x86: x86-android-tablets: Add Medion Lifetab S10346 data
  platform/x86: wireless-hotkey: use ACPI HID as phys
  platform/x86/intel/hid: Add module-params for 5 button array + SW_TABLET_MODE reporting
  platform/x86: ideapad-laptop: Make touchpad_ctrl_via_ec a module option
  platform/x86: ideapad-laptop: Stop writing VPCCMD_W_TOUCHPAD at probe time
  platform/x86: ideapad-laptop: Send KEY_TOUCHPAD_TOGGLE on some models
  platform/x86: ideapad-laptop: Only toggle ps2 aux port on/off on select models
  platform/x86: ideapad-laptop: Do not send KEY_TOUCHPAD* events on probe / resume
  platform/x86: ideapad-laptop: Refactor ideapad_sync_touchpad_state()
  tools/arch/x86: intel_sdsi: Add support for reading meter certificates
  tools/arch/x86: intel_sdsi: Add support for new GUID
  tools/arch/x86: intel_sdsi: Read more On Demand registers
  ...
2022-12-12 10:47:10 -08:00
Sean Christopherson
bb056c0f08 tools: KVM: selftests: Convert clear/set_bit() to actual atomics
Convert {clear,set}_bit() to atomics as KVM's ucall implementation relies
on clear_bit() being atomic, they are defined in atomic.h, and the same
helpers in the kernel proper are atomic.

KVM's ucall infrastructure is the only user of clear_bit() in tools/, and
there are no true set_bit() users.  tools/testing/nvdimm/ does make heavy
use of set_bit(), but that code builds into a kernel module of sorts, i.e.
pulls in all of the kernel's header and so is already getting the kernel's
atomic set_bit().

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221119013450.2643007-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 13:22:35 -05:00
Sean Christopherson
36293352ff tools: Drop "atomic_" prefix from atomic test_and_set_bit()
Drop the "atomic_" prefix from tools' atomic_test_and_set_bit() to
match the kernel nomenclature where test_and_set_bit() is atomic,
and __test_and_set_bit() provides the non-atomic variant.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221119013450.2643007-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 13:22:34 -05:00
Javier Martinez Canillas
66a9221d73 KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
The documentation says that the ioctl has been deprecated, but it has been
actually removed and the remaining references are just left overs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Message-Id: <20221202105011.185147-3-javierm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02 12:54:40 -05:00
David E. Box
7fdc03a737 tools/arch/x86: intel_sdsi: Add support for reading meter certificates
Add option to read and decode On Demand meter certificates.

Link: https://github.com/intel/intel-sdsi/blob/master/meter-certificate.rst

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20221119002343.1281885-10-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-21 10:56:17 +01:00
David E. Box
429e789c67 tools/arch/x86: intel_sdsi: Add support for new GUID
The structure and content of the On Demand registers is based on the GUID
which is read from hardware through sysfs. Add support for decoding the
registers of a new GUID 0xF210D9EF.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20221119002343.1281885-9-david.e.box@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-21 10:56:09 +01:00
David E. Box
a8041a89b7 tools/arch/x86: intel_sdsi: Read more On Demand registers
Add decoding of the following On Demand register fields:

1. NVRAM content authorization error status
2. Enabled features: telemetry and attestation
3. Key provisioning status
4. NVRAM update limit
5. PCU_CR3_CAPID_CFG

Link: https://github.com/intel/intel-sdsi/blob/master/state-certificate-encoding.rst
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20221119002343.1281885-8-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-21 10:55:59 +01:00
David E. Box
334599bccb tools/arch/x86: intel_sdsi: Add Intel On Demand text
Intel Software Defined Silicon (SDSi) is now officially known as Intel
On Demand. Change text in tool to indicate this.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20221119002343.1281885-7-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-21 10:55:56 +01:00
David E. Box
3088258ea7 tools/arch/x86: intel_sdsi: Add support for reading state certificates
Add option to read and decode On Demand state certificates.

Link: https://github.com/intel/intel-sdsi/blob/master/state-certificate-encoding.rst
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20221119002343.1281885-6-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-21 10:55:51 +01:00
Peter Gonda
cf4694be2b tools: Add atomic_test_and_set_bit()
Add x86 and generic implementations of atomic_test_and_set_bit() to allow
KVM selftests to atomically manage bitmaps.

Note, the generic version is taken from arch_test_and_set_bit() as of
commit 415d832497 ("locking/atomic: Make test_and_*_bit() ordered on
failure").

Signed-off-by: Peter Gonda <pgonda@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221006003409.649993-5-seanjc@google.com
2022-11-16 16:58:52 -08:00
Borislav Petkov
2632daebaf x86/cpu: Restore AMD's DE_CFG MSR after resume
DE_CFG contains the LFENCE serializing bit, restore it on resume too.
This is relevant to older families due to the way how they do S3.

Unify and correct naming while at it.

Fixes: e4d0e84e49 ("x86/cpu/AMD: Make LFENCE a serializing instruction")
Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Reported-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-11-15 10:15:58 -08:00
Arnaldo Carvalho de Melo
74455fd7e4 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  257449c6a5 ("x86/cpufeatures: Add LbrExtV2 feature bit")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/lkml/Y1g6vGPqPhOrXoaN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
4402e360d0 tools headers: Update the copy of x86's memcpy_64.S used in 'perf bench'
We also need to add SYM_TYPED_FUNC_START() to util/include/linux/linkage.h
and update tools/perf/check_headers.sh to ignore the include cfi_types.h
line when checking if the kernel original files drifted from the copies
we carry.

This is to get the changes from:

  ccace936ee ("x86: Add types to indirectly called assembly functions")

Addressing these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
  diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/lkml/Y1f3VRIec9EBgX6F@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
a3a365655a tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  b8d1d16360 ("x86/apic: Don't disable x2APIC if locked")
  ca5b7c0d96 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-10-14 18:06:34.294561729 -0300
  +++ after	2022-10-14 18:06:41.285744044 -0300
  @@ -264,6 +264,7 @@
   	[0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE",
   	[0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX",
   	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
  +	[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
   	[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
   	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
   	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Ravi Bangoria
160ae99365 perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel
Although new details added into this header is currently used by kernel
only, tools copy needs to be in sync with kernel file to avoid
tools/perf/check-headers.sh warnings.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06 16:27:38 -03:00
Arnaldo Carvalho de Melo
356edeca2e tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  7df548840c ("x86/bugs: Add "unknown" reporting for MMIO Stale Data")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/lkml/YysTRji90sNn2p5f@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-21 16:08:00 -03:00
Nick Desaulniers
a0a12c3ed0 asm goto: eradicate CC_HAS_ASM_GOTO
GCC has supported asm goto since 4.5, and Clang has since version 9.0.0.
The minimum supported versions of these tools for the build according to
Documentation/process/changes.rst are 5.1 and 11.0.0 respectively.

Remove the feature detection script, Kconfig option, and clean up some
fallback code that is no longer supported.

The removed script was also testing for a GCC specific bug that was
fixed in the 4.7 release.

Also remove workarounds for bpftrace using clang older than 9.0.0, since
other BPF backend fixes are required at this point.

Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/
Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637
Acked-by: Borislav Petkov <bp@suse.de>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-21 10:06:28 -07:00
Arnaldo Carvalho de Melo
e5bc0deae5 tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  43bb9e000e ("KVM: x86: Tweak name of MONITOR/MWAIT #UD quirk to make it #UD specific")
  94dfc73e7c ("treewide: uapi: Replace zero-length arrays with flexible-array members")
  bfbcc81bb8 ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior")
  b172862241 ("KVM: x86: PIT: Preserve state of speaker port data bit")
  ed2351174e ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Chenyi Qiang <chenyi.qiang@intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Durrant <pdurrant@amazon.com>
Link: https://lore.kernel.org/lkml/Yv6OMPKYqYSbUxwZ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-19 15:30:34 -03:00
Arnaldo Carvalho de Melo
eea085d114 tools headers UAPI: Sync KVM's vmx.h header with the kernel sources
To pick the changes in:

  2f4073e08f ("KVM: VMX: Enable Notify VM exit")

That makes 'perf kvm-stat' aware of this new NOTIFY exit reason, thus
addressing the following perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
  diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Tao Xu <tao3.xu@intel.com>
Link: http://lore.kernel.org/lkml/Yv6LavXMZ+njijpq@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-19 15:30:34 -03:00
Arnaldo Carvalho de Melo
62ed93d199 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  2b12993220 ("x86/speculation: Add RSB VM Exit protections")
  28a99e95f5 ("x86/amd: Use IBPB for firmware calls")
  4ad3278df6 ("x86/speculation: Disable RRSBA behavior")
  26aae8ccbc ("x86/cpu/amd: Enumerate BTC_NO")
  9756bba284 ("x86/speculation: Fill RSB on vmexit for IBRS")
  3ebc170068 ("x86/bugs: Add retbleed=ibpb")
  2dbb887e87 ("x86/entry: Add kernel IBRS implementation")
  6b80b59b35 ("x86/bugs: Report AMD retbleed vulnerability")
  a149180fbc ("x86: Add magic AMD return-thunk")
  15e67227c4 ("x86: Undo return-thunk damage")
  a883d624ae ("x86/cpufeatures: Move RETPOLINE flags to word 11")
  aae99a7c9a ("x86/cpufeatures: Introduce x2AVIC CPUID bit")
  6f33a9daff ("x86: Fix comment for X86_FEATURE_ZEN")
  5180218615 ("x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Wyes Karny <wyes.karny@amd.com>
Link: https://lore.kernel.org/lkml/Yvznmu5oHv0ZDN2w@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-19 15:30:33 -03:00
Arnaldo Carvalho de Melo
7f7f86a7bd tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  2b12993220 ("x86/speculation: Add RSB VM Exit protections")
  4af184ee8b ("tools/power turbostat: dump secondary Turbo-Ratio-Limit")
  4ad3278df6 ("x86/speculation: Disable RRSBA behavior")
  d7caac991f ("x86/cpu/amd: Add Spectral Chicken")
  6ad0ad2bf8 ("x86/bugs: Report Intel retbleed vulnerability")
  c59a1f106f ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
  465932db25 ("x86/cpu: Add new VMX feature, Tertiary VM-Execution control")
  027bbb884b ("KVM: x86/speculation: Disable Fill buffer clear within guests")
  5180218615 ("x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-08-17 09:05:13.938246475 -0300
  +++ after	2022-08-17 09:05:22.221455851 -0300
  @@ -161,6 +161,7 @@
   	[0x0000048f] = "IA32_VMX_TRUE_EXIT_CTLS",
   	[0x00000490] = "IA32_VMX_TRUE_ENTRY_CTLS",
   	[0x00000491] = "IA32_VMX_VMFUNC",
  +	[0x00000492] = "IA32_VMX_PROCBASED_CTLS3",
   	[0x000004c1] = "IA32_PMC0",
   	[0x000004d0] = "IA32_MCG_EXT_CTL",
   	[0x00000560] = "IA32_RTIT_OUTPUT_BASE",
  @@ -212,6 +213,7 @@
   	[0x0000064D] = "PLATFORM_ENERGY_STATUS",
   	[0x0000064e] = "PPERF",
   	[0x0000064f] = "PERF_LIMIT_REASONS",
  +	[0x00000650] = "SECONDARY_TURBO_RATIO_LIMIT",
   	[0x00000658] = "PKG_WEIGHTED_CORE_C0_RES",
   	[0x00000659] = "PKG_ANY_CORE_C0_RES",
   	[0x0000065A] = "PKG_ANY_GFXE_C0_RES",
  $

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Like Xu <like.xu@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/lkml/YvzbT24m2o5U%2F7+q@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-19 15:30:33 -03:00
Linus Torvalds
5318b987fe More from the CPU vulnerability nightmares front:
Intel eIBRS machines do not sufficiently mitigate against RET
 mispredictions when doing a VM Exit therefore an additional RSB,
 one-entry stuffing is needed.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLqsGsACgkQEsHwGGHe
 VUpXGg//ZEkxhf3Ri7X9PknAWNG6eIEqigKqWcdnOw+Oq/GMVb6q7JQsqowK7KBZ
 AKcY5c/KkljTJNohditnfSOePyCG5nDTPgfkjzIawnaVdyJWMRCz/L4X2cv6ykDl
 2l2EvQm4Ro8XAogYhE7GzDg/osaVfx93OkLCQj278VrEMWgM/dN2RZLpn+qiIkNt
 DyFlQ7cr5UASh/svtKLko268oT4JwhQSbDHVFLMJ52VaLXX36yx4rValZHUKFdox
 ZDyj+kiszFHYGsI94KAD0dYx76p6mHnwRc4y/HkVcO8vTacQ2b9yFYBGTiQatITf
 0Nk1RIm9m3rzoJ82r/U0xSIDwbIhZlOVNm2QtCPkXqJZZFhopYsZUnq2TXhSWk4x
 GQg/2dDY6gb/5MSdyLJmvrTUtzResVyb/hYL6SevOsIRnkwe35P6vDDyp15F3TYK
 YvidZSfEyjtdLISBknqYRQD964dgNZu9ewrj+WuJNJr+A2fUvBzUebXjxHREsugN
 jWp5GyuagEKTtneVCvjwnii+ptCm6yfzgZYLbHmmV+zhinyE9H1xiwVDvo5T7DDS
 ZJCBgoioqMhp5qR59pkWz/S5SNGui2rzEHbAh4grANy8R/X5ASRv7UHT9uAo6ve1
 xpw6qnE37CLzuLhj8IOdrnzWwLiq7qZ/lYN7m+mCMVlwRWobbOo=
 =a8em
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 eIBRS fixes from Borislav Petkov:
 "More from the CPU vulnerability nightmares front:

  Intel eIBRS machines do not sufficiently mitigate against RET
  mispredictions when doing a VM Exit therefore an additional RSB,
  one-entry stuffing is needed"

* tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add LFENCE to RSB fill sequence
  x86/speculation: Add RSB VM Exit protections
2022-08-09 09:29:07 -07:00
Linus Torvalds
48a577dc1b perf tools changes for v6.0: 1st batch
- Introduce 'perf lock contention' subtool, using new lock contention
   tracepoints and using BPF for in kernel aggregation and then userspace
   processing using the perf tooling infrastructure for resolving symbols, target
   specification, etc.
 
   Since the new lock contention tracepoints don't provide lock names, get up to
   8 stack traces and display the first non-lock function symbol name as a caller:
 
   $ perf lock report -F acquired,contended,avg_wait,wait_total
 
                   Name   acquired  contended     avg wait    total wait
 
    update_blocked_a...         40         40      3.61 us     144.45 us
    kernfs_fop_open+...          5          5      3.64 us      18.18 us
     _nohz_idle_balance          3          3      2.65 us       7.95 us
    tick_do_update_j...          1          1      6.04 us       6.04 us
     ep_scan_ready_list          1          1      3.93 us       3.93 us
 
   Supports the usual 'perf record' + 'perf report' workflow as well as a
   BCC/bpftrace like mode where you start the tool and then press control+C to get
   results:
 
    $ sudo perf lock contention -b
   ^C
    contended   total wait     max wait     avg wait         type   caller
 
           42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
           23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
            6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
            3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
            1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
            1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
            2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
            1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
            2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
            1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
   ...
 
 - Add new 'perf kwork' tool to trace time properties of kernel work (such as
   softirq, and workqueue), uses eBPF skeletons to collect info in kernel space,
   aggregating data that then gets processed by the userspace tool, e.g.:
 
   # perf kwork report
 
    Kwork Name      | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
   ----------------------------------------------------------------------------------------------------
    nvme0q5:130     | 004 |      1.101 ms |    49 |    0.051 ms |    26035.056403 s |  26035.056455 s |
    amdgpu:162      | 002 |      0.176 ms |     9 |    0.046 ms |    26035.268020 s |  26035.268066 s |
    nvme0q24:149    | 023 |      0.161 ms |    55 |    0.009 ms |    26035.655280 s |  26035.655288 s |
    nvme0q20:145    | 019 |      0.090 ms |    33 |    0.014 ms |    26035.939018 s |  26035.939032 s |
    nvme0q31:156    | 030 |      0.075 ms |    21 |    0.010 ms |    26035.052237 s |  26035.052247 s |
    nvme0q8:133     | 007 |      0.062 ms |    12 |    0.021 ms |    26035.416840 s |  26035.416861 s |
    nvme0q6:131     | 005 |      0.054 ms |    22 |    0.010 ms |    26035.199919 s |  26035.199929 s |
    nvme0q19:144    | 018 |      0.052 ms |    14 |    0.010 ms |    26035.110615 s |  26035.110625 s |
    nvme0q7:132     | 006 |      0.049 ms |    13 |    0.007 ms |    26035.125180 s |  26035.125187 s |
    nvme0q18:143    | 017 |      0.033 ms |    14 |    0.007 ms |    26035.169698 s |  26035.169705 s |
    nvme0q17:142    | 016 |      0.013 ms |     1 |    0.013 ms |    26035.565147 s |  26035.565160 s |
    enp5s0-rx-0:164 | 006 |      0.004 ms |     4 |    0.002 ms |    26035.928882 s |  26035.928884 s |
    enp5s0-tx-0:166 | 008 |      0.003 ms |     3 |    0.002 ms |    26035.870923 s |  26035.870925 s |
   --------------------------------------------------------------------------------------------------------
 
   See commit log messages for more examples with extra options to limit the events time window, etc.
 
 - Add support for new AMD IBS (Instruction Based Sampling) features:
 
   With the DataSrc extensions, the source of data can be decoded among:
  - Local L3 or other L1/L2 in CCX.
  - A peer cache in a near CCX.
  - Data returned from DRAM.
  - A peer cache in a far CCX.
  - DRAM address map with "long latency" bit set.
  - Data returned from MMIO/Config/PCI/APIC.
  - Extension Memory (S-Link, GenZ, etc - identified by the CS target
     and/or address map at DF's choice).
  - Peer Agent Memory.
 
 - Support hardware tracing with Intel PT on guest machines, combining the
   traces with the ones in the host machine.
 
 - Add a "-m" option to 'perf buildid-list' to show kernel and modules
   build-ids, to display all of the information needed to do external
   symbolization of kernel stack traces, such as those collected by
   bpf_get_stackid().
 
 - Add arch TSC frequency information to perf.data file headers.
 
 - Handle changes in the binutils disassembler function signatures in
   perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
 
 - Fix building the perf perl binding with the newest gcc in distros such
   as fedora rawhide, where some new warnings were breaking the build as
   perf uses -Werror.
 
 - Add 'perf test' entry for branch stack sampling.
 
 - Add ARM SPE system wide 'perf test' entry.
 
 - Add user space counter reading tests to 'perf test'.
 
 - Build with python3 by default, if available.
 
 - Add python converter script for the vendor JSON event files.
 
 - Update vendor JSON files for alderlake, bonnell, broadwell, broadwellde,
   broadwellx, cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
   haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, knightslanding,
   nehalemep, nehalemex, sandybridge, sapphirerapids, silvermont, skylake,
   skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp and westmereex.
 
 - Add vendor JSON File for Intel meteorlake.
 
 - Add Arm Cortex-A78C and X1C JSON vendor event files.
 
 - Add workaround to symbol address reading from ELF files without phdr,
   falling back to the previoous equation.
 
 - Convert legacy map definition to BTF-defined in the perf BPF script test.
 
 - Rework prologue generation code to stop using libbpf deprecated APIs.
 
 - Add default hybrid events for 'perf stat' on x86.
 
 - Add topdown metrics in the default 'perf stat' on the hybrid machines (big/little cores).
 
 - Prefer sampled CPU when exporting JSON in 'perf data convert'
 
 - Fix ('perf stat CSV output linter') and ("Check branch stack sampling") 'perf test' entries on s390.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYuw6gwAKCRCyPKLppCJ+
 J5+iAP0RL6sKMhzdkRjRYfG8CluJ401YaPHadzv5jxP8gOZz2gEAsuYDrMF9t1zB
 4DqORfobdX9UQEJjP9oRltU73GM0swI=
 =2/M0
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Introduce 'perf lock contention' subtool, using new lock contention
   tracepoints and using BPF for in kernel aggregation and then
   userspace processing using the perf tooling infrastructure for
   resolving symbols, target specification, etc.

   Since the new lock contention tracepoints don't provide lock names,
   get up to 8 stack traces and display the first non-lock function
   symbol name as a caller:

    $ perf lock report -F acquired,contended,avg_wait,wait_total

                    Name   acquired  contended     avg wait    total wait

     update_blocked_a...         40         40      3.61 us     144.45 us
     kernfs_fop_open+...          5          5      3.64 us      18.18 us
      _nohz_idle_balance          3          3      2.65 us       7.95 us
     tick_do_update_j...          1          1      6.04 us       6.04 us
      ep_scan_ready_list          1          1      3.93 us       3.93 us

   Supports the usual 'perf record' + 'perf report' workflow as well as
   a BCC/bpftrace like mode where you start the tool and then press
   control+C to get results:

     $ sudo perf lock contention -b
    ^C
    contended   total wait     max wait     avg wait         type   caller

            42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
            23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
             6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
             3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
             1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
             1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
             2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
             1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
             2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
             1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
    ...

 - Add new 'perf kwork' tool to trace time properties of kernel work
   (such as softirq, and workqueue), uses eBPF skeletons to collect info
   in kernel space, aggregating data that then gets processed by the
   userspace tool, e.g.:

    # perf kwork report

     Kwork Name      | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
    ----------------------------------------------------------------------------------------------------
     nvme0q5:130     | 004 |      1.101 ms |    49 |    0.051 ms |    26035.056403 s |  26035.056455 s |
     amdgpu:162      | 002 |      0.176 ms |     9 |    0.046 ms |    26035.268020 s |  26035.268066 s |
     nvme0q24:149    | 023 |      0.161 ms |    55 |    0.009 ms |    26035.655280 s |  26035.655288 s |
     nvme0q20:145    | 019 |      0.090 ms |    33 |    0.014 ms |    26035.939018 s |  26035.939032 s |
     nvme0q31:156    | 030 |      0.075 ms |    21 |    0.010 ms |    26035.052237 s |  26035.052247 s |
     nvme0q8:133     | 007 |      0.062 ms |    12 |    0.021 ms |    26035.416840 s |  26035.416861 s |
     nvme0q6:131     | 005 |      0.054 ms |    22 |    0.010 ms |    26035.199919 s |  26035.199929 s |
     nvme0q19:144    | 018 |      0.052 ms |    14 |    0.010 ms |    26035.110615 s |  26035.110625 s |
     nvme0q7:132     | 006 |      0.049 ms |    13 |    0.007 ms |    26035.125180 s |  26035.125187 s |
     nvme0q18:143    | 017 |      0.033 ms |    14 |    0.007 ms |    26035.169698 s |  26035.169705 s |
     nvme0q17:142    | 016 |      0.013 ms |     1 |    0.013 ms |    26035.565147 s |  26035.565160 s |
     enp5s0-rx-0:164 | 006 |      0.004 ms |     4 |    0.002 ms |    26035.928882 s |  26035.928884 s |
     enp5s0-tx-0:166 | 008 |      0.003 ms |     3 |    0.002 ms |    26035.870923 s |  26035.870925 s |
    --------------------------------------------------------------------------------------------------------

   See commit log messages for more examples with extra options to limit
   the events time window, etc.

 - Add support for new AMD IBS (Instruction Based Sampling) features:

   With the DataSrc extensions, the source of data can be decoded among:
     - Local L3 or other L1/L2 in CCX.
     - A peer cache in a near CCX.
     - Data returned from DRAM.
     - A peer cache in a far CCX.
     - DRAM address map with "long latency" bit set.
     - Data returned from MMIO/Config/PCI/APIC.
     - Extension Memory (S-Link, GenZ, etc - identified by the CS target
       and/or address map at DF's choice).
     - Peer Agent Memory.

 - Support hardware tracing with Intel PT on guest machines, combining
   the traces with the ones in the host machine.

 - Add a "-m" option to 'perf buildid-list' to show kernel and modules
   build-ids, to display all of the information needed to do external
   symbolization of kernel stack traces, such as those collected by
   bpf_get_stackid().

 - Add arch TSC frequency information to perf.data file headers.

 - Handle changes in the binutils disassembler function signatures in
   perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).

 - Fix building the perf perl binding with the newest gcc in distros
   such as fedora rawhide, where some new warnings were breaking the
   build as perf uses -Werror.

 - Add 'perf test' entry for branch stack sampling.

 - Add ARM SPE system wide 'perf test' entry.

 - Add user space counter reading tests to 'perf test'.

 - Build with python3 by default, if available.

 - Add python converter script for the vendor JSON event files.

 - Update vendor JSON files for most Intel cores.

 - Add vendor JSON File for Intel meteorlake.

 - Add Arm Cortex-A78C and X1C JSON vendor event files.

 - Add workaround to symbol address reading from ELF files without phdr,
   falling back to the previoous equation.

 - Convert legacy map definition to BTF-defined in the perf BPF script
   test.

 - Rework prologue generation code to stop using libbpf deprecated APIs.

 - Add default hybrid events for 'perf stat' on x86.

 - Add topdown metrics in the default 'perf stat' on the hybrid machines
   (big/little cores).

 - Prefer sampled CPU when exporting JSON in 'perf data convert'

 - Fix ('perf stat CSV output linter') and ("Check branch stack
   sampling") 'perf test' entries on s390.

* tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
  perf stat: Refactor __run_perf_stat() common code
  perf lock: Print the number of lost entries for BPF
  perf lock: Add --map-nr-entries option
  perf lock: Introduce struct lock_contention
  perf scripting python: Do not build fail on deprecation warnings
  genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
  perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test
  perf parse-events: Break out tracepoint and printing
  perf parse-events: Don't #define YY_EXTRA_TYPE
  tools bpftool: Don't display disassembler-four-args feature test
  tools bpftool: Fix compilation error with new binutils
  tools bpf_jit_disasm: Don't display disassembler-four-args feature test
  tools bpf_jit_disasm: Fix compilation error with new binutils
  tools perf: Fix compilation error with new binutils
  tools include: add dis-asm-compat.h to handle version differences
  tools build: Don't display disassembler-four-args feature test
  tools build: Add feature test for init_disassemble_info API changes
  perf test: Add ARM SPE system wide test
  perf tools: Rework prologue generation code
  perf bpf: Convert legacy map definition to BTF-defined
  ...
2022-08-06 09:36:08 -07:00
Daniel Sneddon
2b12993220 x86/speculation: Add RSB VM Exit protections
tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
documented for RET instructions after VM exits. Mitigate it with a new
one-entry RSB stuffing mechanism and a new LFENCE.

== Background ==

Indirect Branch Restricted Speculation (IBRS) was designed to help
mitigate Branch Target Injection and Speculative Store Bypass, i.e.
Spectre, attacks. IBRS prevents software run in less privileged modes
from affecting branch prediction in more privileged modes. IBRS requires
the MSR to be written on every privilege level change.

To overcome some of the performance issues of IBRS, Enhanced IBRS was
introduced.  eIBRS is an "always on" IBRS, in other words, just turn
it on once instead of writing the MSR on every privilege level change.
When eIBRS is enabled, more privileged modes should be protected from
less privileged modes, including protecting VMMs from guests.

== Problem ==

Here's a simplification of how guests are run on Linux' KVM:

void run_kvm_guest(void)
{
	// Prepare to run guest
	VMRESUME();
	// Clean up after guest runs
}

The execution flow for that would look something like this to the
processor:

1. Host-side: call run_kvm_guest()
2. Host-side: VMRESUME
3. Guest runs, does "CALL guest_function"
4. VM exit, host runs again
5. Host might make some "cleanup" function calls
6. Host-side: RET from run_kvm_guest()

Now, when back on the host, there are a couple of possible scenarios of
post-guest activity the host needs to do before executing host code:

* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
touched and Linux has to do a 32-entry stuffing.

* on eIBRS hardware, VM exit with IBRS enabled, or restoring the host
IBRS=1 shortly after VM exit, has a documented side effect of flushing
the RSB except in this PBRSB situation where the software needs to stuff
the last RSB entry "by hand".

IOW, with eIBRS supported, host RET instructions should no longer be
influenced by guest behavior after the host retires a single CALL
instruction.

However, if the RET instructions are "unbalanced" with CALLs after a VM
exit as is the RET in #6, it might speculatively use the address for the
instruction after the CALL in #3 as an RSB prediction. This is a problem
since the (untrusted) guest controls this address.

Balanced CALL/RET instruction pairs such as in step #5 are not affected.

== Solution ==

The PBRSB issue affects a wide variety of Intel processors which
support eIBRS. But not all of them need mitigation. Today,
X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates
PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e.,
eIBRS systems which enable legacy IBRS explicitly.

However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT
and most of them need a new mitigation.

Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.

The lighter-weight mitigation performs a CALL instruction which is
immediately followed by a speculative execution barrier (INT3). This
steers speculative execution to the barrier -- just like a retpoline
-- which ensures that speculation can never reach an unbalanced RET.
Then, ensure this CALL is retired before continuing execution with an
LFENCE.

In other words, the window of exposure is opened at VM exit where RET
behavior is troublesome. While the window is open, force RSB predictions
sampling for RET targets to a dead end at the INT3. Close the window
with the LFENCE.

There is a subset of eIBRS systems which are not vulnerable to PBRSB.
Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.

  [ bp: Massage, incorporate review comments from Andy Cooper. ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-08-03 11:23:52 +02:00
Linus Torvalds
e2b5421007 flexible-array transformations in UAPI for 6.0-rc1
Hi Linus,
 
 Please, pull the following treewide patch that replaces zero-length arrays
 with flexible-array members in UAPI. This patch has been baking in
 linux-next for 5 weeks now.
 
 -fstrict-flex-arrays=3 is coming and we need to land these changes
 to prevent issues like these in the short future:
 
 ../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
 but the source string has length 2 (including NUL byte) [-Wfortify-source]
 		strcpy(de3->name, ".");
 		^
 
 Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
 this breaks anything, we can use a union with a new member name.
 
 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
 
 Thanks
 --
 Gustavo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmLoNdcACgkQRwW0y0cG
 2zEVeg//QYJ3j2pbKt9zB6muO3SkrNoMPc5wpY/SITUeiDscukLvGzJG88eIZskl
 NaEjbmacHmdlQrBkUdr10i1+hkb2zRd6/j42GIDXEhhKTMoT2UxJCBp47KSvd7VY
 dKNLGsgQs3kwmmxLEGu6w6vywWpI5wxXTKWL1Q/RpUXoOnLmsMEbzKTjf12a1Edl
 9gPNY+tMHIHyB0pGIRXDY/ZF5c+FcRFn6kKeMVzJL0bnX7FI4UmYe83k9ajEiLWA
 MD3JAw/mNv2X0nizHHuQHIjtky8Pr+E8hKs5ni88vMYmFqeABsTw4R1LJykv/mYa
 NakU1j9tHYTKcs2Ju+gIvSKvmatKGNmOpti/8RAjEX1YY4cHlHWNsigVbVRLqfo7
 SKImlSUxOPGFS3HAJQCC9P/oZgICkUdD6sdLO1PVBnE1G3Fvxg5z6fGcdEuEZkVR
 PQwlYDm1nlTuScbkgVSBzyU/AkntVMJTuPWgbpNo+VgSXWZ8T/U8II0eGrFVf9rH
 +y5dAS52/bi6OP0la7fNZlq7tcPfNG9HJlPwPb1kQtuPT4m6CBhth/rRrDJwx8za
 0cpJT75Q3CI0wLZ7GN4yEjtNQrlAeeiYiS4LMQ/SFFtg1KzvmYYVmWDhOf0+mMDA
 f7bq4cxEg2LHwrhRgQQWowFVBu7yeiwKbcj9sybfA27bMqCtfto=
 =8yMq
 -----END PGP SIGNATURE-----

Merge tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull uapi flexible array update from Gustavo Silva:
 "A treewide patch that replaces zero-length arrays with flexible-array
  members in UAPI. This has been baking in linux-next for 5 weeks now.

  '-fstrict-flex-arrays=3' is coming and we need to land these changes
  to prevent issues like these in the short future:

    fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source]
		strcpy(de3->name, ".");
		^

  Since these are all [0] to [] changes, the risk to UAPI is nearly
  zero. If this breaks anything, we can use a union with a new member
  name"

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

* tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  treewide: uapi: Replace zero-length arrays with flexible-array members
2022-08-02 19:50:47 -07:00
Arnaldo Carvalho de Melo
18808564aa Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up the fixes that went upstream via acme/perf/urgent and to get
to v5.19.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 08:59:31 -03:00
Arnaldo Carvalho de Melo
553de6e115 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  28a99e95f5 ("x86/amd: Use IBPB for firmware calls")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27 11:17:50 -03:00
Arnaldo Carvalho de Melo
0698461ad2 Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.

Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:

  49c692b7df ("perf offcpu: Accept allowed sample types only")

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18 10:36:11 -03:00
Arnaldo Carvalho de Melo
91d248c3b9 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  4ad3278df6 ("x86/speculation: Disable RRSBA behavior")
  d7caac991f ("x86/cpu/amd: Add Spectral Chicken")

That cause no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

Just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YtQTm9wsB3hxQWvy@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:50:50 -03:00
Arnaldo Carvalho de Melo
f098addbdb tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  f43b9876e8 ("x86/retbleed: Add fine grained Kconfig knobs")
  a149180fbc ("x86: Add magic AMD return-thunk")
  15e67227c4 ("x86: Undo return-thunk damage")
  369ae6ffc4 ("x86/retpoline: Cleanup some #ifdefery")
  4ad3278df6 x86/speculation: Disable RRSBA behavior
  26aae8ccbc x86/cpu/amd: Enumerate BTC_NO
  9756bba284 x86/speculation: Fill RSB on vmexit for IBRS
  3ebc170068 x86/bugs: Add retbleed=ibpb
  2dbb887e87 x86/entry: Add kernel IBRS implementation
  6b80b59b35 x86/bugs: Report AMD retbleed vulnerability
  a149180fbc x86: Add magic AMD return-thunk
  15e67227c4 x86: Undo return-thunk damage
  a883d624ae x86/cpufeatures: Move RETPOLINE flags to word 11
  5180218615 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/YtQM40VmiLTkPND2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:49:14 -03:00
Pawan Gupta
4ad3278df6 x86/speculation: Disable RRSBA behavior
Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.

Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.

A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).

For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-07-09 13:12:45 +02:00
Gustavo A. R. Silva
94dfc73e7c treewide: uapi: Replace zero-length arrays with flexible-array members
There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].

This code was transformed with the help of Coccinelle:
(linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)

@@
identifier S, member, array;
type T1, T2;
@@

struct S {
  ...
  T1 member;
  T2 array[
- 0
  ];
};

-fstrict-flex-arrays=3 is coming and we need to land these changes
to prevent issues like these in the short future:

../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
but the source string has length 2 (including NUL byte) [-Wfortify-source]
		strcpy(de3->name, ".");
		^

Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
this breaks anything, we can use a union with a new member name.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/78
Build-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/
Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2022-06-28 21:26:05 +02:00
Arnaldo Carvalho de Melo
f8d8661940 tools headers UAPI: Synch KVM's svm.h header with the kernel
To pick up the changes from:

  d5af44dde5 ("x86/sev: Provide support for SNP guest request NAEs")
  0afb6b660a ("x86/sev: Use SEV-SNP AP creation to start secondary CPUs")
  dc3f3d2474 ("x86/mm: Validate memory when changing the C-bit")
  cbd3d4f7c4 ("x86/sev: Check SEV-SNP features support")

That gets these new SVM exit reasons:

+       { SVM_VMGEXIT_PSC,              "vmgexit_page_state_change" }, \
+       { SVM_VMGEXIT_GUEST_REQUEST,    "vmgexit_guest_request" }, \
+       { SVM_VMGEXIT_EXT_GUEST_REQUEST, "vmgexit_ext_guest_request" }, \
+       { SVM_VMGEXIT_AP_CREATION,      "vmgexit_ap_creation" }, \
+       { SVM_VMGEXIT_HV_FEATURES,      "vmgexit_hypervisor_feature" }, \

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
  diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h

This causes these changes:

  CC      /tmp/build/perf-urgent/arch/x86/util/kvm-stat.o
  LD      /tmp/build/perf-urgent/arch/x86/util/perf-in.o
  LD      /tmp/build/perf-urgent/arch/x86/perf-in.o
  LD      /tmp/build/perf-urgent/arch/perf-in.o
  LD      /tmp/build/perf-urgent/perf-in.o
  LINK    /tmp/build/perf-urgent/perf

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:55 -03:00
Arnaldo Carvalho de Melo
4b3f7644ae tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  d6d0c7f681 ("x86/cpufeatures: Add PerfMonV2 feature bit")
  296d5a17e7 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
  f30903394e ("x86/cpufeatures: Add virtual TSC_AUX feature bit")
  8ad7e8f696 ("x86/fpu/xsave: Support XSAVEC in the kernel")
  59bd54a84d ("x86/tdx: Detect running as a TDX guest in early boot")
  a77d41ac3a ("x86/cpufeatures: Add AMD Fam19h Branch Sampling feature")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YrDkgmwhLv+nKeOo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26 12:32:43 -03:00
Ravi Bangoria
c1f4f92b7d perf tool ibs: Sync AMD IBS header file
IBS support has been enhanced with two new features in upcoming uarch:

1. DataSrc extension
2. L3 miss filtering.

Additional set of bits has been introduced in IBS registers to exploit
these features.

New bits are already defining in arch/x86/ header. Sync it with tools
header file. Also rename existing ibs_op_data field 'data_src' to
'data_src_lo'.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-8-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-24 13:18:55 -03:00
Arnaldo Carvalho de Melo
2e323f360a tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  f1a9761fbb ("KVM: x86: Allow userspace to opt out of hypercall patching")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/lkml/Yq8qgiMwRcl9ds+f@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-19 11:22:59 -03:00
Linus Torvalds
8e8afafb0b Yet another hw vulnerability with a software mitigation: Processor MMIO
Stale Data.
 
 They are a class of MMIO-related weaknesses which can expose stale data
 by propagating it into core fill buffers. Data which can then be leaked
 using the usual speculative execution methods.
 
 Mitigations include this set along with microcode updates and are
 similar to MDS and TAA vulnerabilities: VERW now clears those buffers
 too.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKXMkMTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoWGPD/idalLIhhV5F2+hZIKm0WSnsBxAOh9K
 7y8xBxpQQ5FUfW3vm7Pg3ro6VJp7w2CzKoD4lGXzGHriusn3qst3vkza9Ay8xu8g
 RDwKe6hI+p+Il9BV9op3f8FiRLP9bcPMMReW/mRyYsOnJe59hVNwRAL8OG40PY4k
 hZgg4Psfvfx8bwiye5efjMSe4fXV7BUCkr601+8kVJoiaoszkux9mqP+cnnB5P3H
 zW1d1jx7d6eV1Y063h7WgiNqQRYv0bROZP5BJkufIoOHUXDpd65IRF3bDnCIvSEz
 KkMYJNXb3qh7EQeHS53NL+gz2EBQt+Tq1VH256qn6i3mcHs85HvC68gVrAkfVHJE
 QLJE3MoXWOqw+mhwzCRrEXN9O1lT/PqDWw8I4M/5KtGG/KnJs+bygmfKBbKjIVg4
 2yQWfMmOgQsw3GWCRjgEli7aYbDJQjany0K/qZTq54I41gu+TV8YMccaWcXgDKrm
 cXFGUfOg4gBm4IRjJ/RJn+mUv6u+/3sLVqsaFTs9aiib1dpBSSUuMGBh548Ft7g2
 5VbFVSDaLjB2BdlcG7enlsmtzw0ltNssmqg7jTK/L7XNVnvxwUoXw+zP7RmCLEYt
 UV4FHXraMKNt2ZketlomC8ui2hg73ylUp4pPdMXCp7PIXp9sVamRTbpz12h689VJ
 /s55bWxHkR6S
 =LBxT
 -----END PGP SIGNATURE-----

Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 MMIO stale data fixes from Thomas Gleixner:
 "Yet another hw vulnerability with a software mitigation: Processor
  MMIO Stale Data.

  They are a class of MMIO-related weaknesses which can expose stale
  data by propagating it into core fill buffers. Data which can then be
  leaked using the usual speculative execution methods.

  Mitigations include this set along with microcode updates and are
  similar to MDS and TAA vulnerabilities: VERW now clears those buffers
  too"

* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation/mmio: Print SMT warning
  KVM: x86/speculation: Disable Fill buffer clear within guests
  x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
  x86/speculation/srbds: Update SRBDS mitigation selection
  x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
  x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
  x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
  x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
  x86/speculation: Add a common function for MD_CLEAR mitigation update
  x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
  Documentation: Add documentation for Processor MMIO Stale Data
2022-06-14 07:43:15 -07:00
Arnaldo Carvalho de Melo
9dde6cadb9 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  db1af12929 ("x86/msr-index: Define INTEGRITY_CAPABILITIES MSR")
  089be16d59 ("x86/msr: Add PerfCntrGlobal* registers")
  f52ba93190 ("tools/power turbostat: Add Power Limit4 support")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-05-26 12:50:01.228612839 -0300
  +++ after	2022-05-26 12:50:07.699776166 -0300
  @@ -116,6 +116,7 @@
   	[0x0000026f] = "MTRRfix4K_F8000",
   	[0x00000277] = "IA32_CR_PAT",
   	[0x00000280] = "IA32_MC0_CTL2",
  +	[0x000002d9] = "INTEGRITY_CAPS",
   	[0x000002ff] = "MTRRdefType",
   	[0x00000309] = "CORE_PERF_FIXED_CTR0",
   	[0x0000030a] = "CORE_PERF_FIXED_CTR1",
  @@ -176,6 +177,7 @@
   	[0x00000586] = "IA32_RTIT_ADDR3_A",
   	[0x00000587] = "IA32_RTIT_ADDR3_B",
   	[0x00000600] = "IA32_DS_AREA",
  +	[0x00000601] = "VR_CURRENT_CONFIG",
   	[0x00000606] = "RAPL_POWER_UNIT",
   	[0x0000060a] = "PKGC3_IRTL",
   	[0x0000060b] = "PKGC6_IRTL",
  @@ -260,6 +262,10 @@
   	[0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE",
   	[0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX",
   	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
  +	[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
  +	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
  +	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
  +	[0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR",
   };

   #define x86_AMD_V_KVM_MSRs_offset 0xc0010000
  @@ -318,4 +324,5 @@
   	[0xc00102b4 - x86_AMD_V_KVM_MSRs_offset] = "AMD_CPPC_STATUS",
   	[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
   	[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
  +	[0xc0010300 - x86_AMD_V_KVM_MSRs_offset] = "AMD_SAMP_BR_FROM",
   };
  $

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/lkml/Yo+i%252Fj5+UtE9dcix@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-27 13:22:14 -03:00
Linus Torvalds
cfeb2522c3 Perf events changes for this cycle were:
Platform PMU changes:
 =====================
 
  - x86/intel:
     - Add new Intel Alder Lake and Raptor Lake support
 
  - x86/amd:
     - AMD Zen4 IBS extensions support
     - Add AMD PerfMonV2 support
     - Add AMD Fam19h Branch Sampling support
 
 Generic changes:
 ================
 
  - signal: Deliver SIGTRAP on perf event asynchronously if blocked
 
    Perf instrumentation can be driven via SIGTRAP, but this causes a problem
    when SIGTRAP is blocked by a task & terminate the task.
 
    Allow user-space to request these signals asynchronously (after they get
    unblocked) & also give the information to the signal handler when this
    happens:
 
      " To give user space the ability to clearly distinguish synchronous from
        asynchronous signals, introduce siginfo_t::si_perf_flags and
        TRAP_PERF_FLAG_ASYNC (opted for flags in case more binary information is
        required in future).
 
        The resolution to the problem is then to (a) no longer force the signal
        (avoiding the terminations), but (b) tell user space via si_perf_flags
        if the signal was synchronous or not, so that such signals can be
        handled differently (e.g. let user space decide to ignore or consider
        the data imprecise). "
 
  - Unify/standardize the /sys/devices/cpu/events/* output format.
 
  - Misc fixes & cleanups.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmKLuiURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ioSRAAgM3PneFHn5MFiuV/8ZfP3xMHNUOYOCgN
 JhALRcUhDdL4N9pS0DSImfXvAlYPJ/TZK8qBRNDsRgygp5vjrbr9zH2HdZBW1gyV
 qi3bpuNS+METnfNyumAoBeOYbMIvpm3NDUX+w68Xvkd1g8ykyno8Zc2H2hj3IDsW
 cK3ErP0CZLsnBZsymy29/bxCYhfxsED6J06hOa8R3Tvl4XYg/27Z+tEuZ4GYeFS8
 VikulYB9RhRWUbhkzwjyRSbTWyvsuXP+xD28ymUIxXaNCDOwxK8uYtVepUFIBO8X
 cZgtwT2faV3y5ZAnz02M+/JZl+Jz5EPm037vNQp9aJsTuAbAGnxh/hL0cBVuDqhv
 Nh9wkqS8FqwAbtpvg/IeamzqN5z/Yn2Q/Jyk/4oWipmeddXWUL7sYVoSduTGJJkz
 cZz2ciNQbnOCzv0ZSjihrGMqPaT+/wI/iLW3ouLoZXpfTtVVRiiLuI1DDAZ1rd2r
 D6djV8JjHIs71V/6E9ahVATxq8yMdikd7u734rA5K3XSxIBTYrdshbOhddzgeE7d
 chQ7XvpQXDoFrZtxkHXP5iIeNF7fU9MWNWaEcsrZaWEB/8UpD6eL2if1Kl8mog+h
 J4+zR1LWRHh8TNRfos3yCP2PSbbS6LPVsYLJzP+bb+pxgqdJ+urxfmxoCtY5trNI
 zHT52xfdxSo=
 =UqYA
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "Platform PMU changes:

   - x86/intel:
      - Add new Intel Alder Lake and Raptor Lake support

   - x86/amd:
      - AMD Zen4 IBS extensions support
      - Add AMD PerfMonV2 support
      - Add AMD Fam19h Branch Sampling support

  Generic changes:

   - signal: Deliver SIGTRAP on perf event asynchronously if blocked

     Perf instrumentation can be driven via SIGTRAP, but this causes a
     problem when SIGTRAP is blocked by a task & terminate the task.

     Allow user-space to request these signals asynchronously (after
     they get unblocked) & also give the information to the signal
     handler when this happens:

       "To give user space the ability to clearly distinguish
        synchronous from asynchronous signals, introduce
        siginfo_t::si_perf_flags and TRAP_PERF_FLAG_ASYNC (opted for
        flags in case more binary information is required in future).

        The resolution to the problem is then to (a) no longer force the
        signal (avoiding the terminations), but (b) tell user space via
        si_perf_flags if the signal was synchronous or not, so that such
        signals can be handled differently (e.g. let user space decide
        to ignore or consider the data imprecise). "

   - Unify/standardize the /sys/devices/cpu/events/* output format.

   - Misc fixes & cleanups"

* tag 'perf-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  perf/x86/amd/core: Fix reloading events for SVM
  perf/x86/amd: Run AMD BRS code only on supported hw
  perf/x86/amd: Fix AMD BRS period adjustment
  perf/x86/amd: Remove unused variable 'hwc'
  perf/ibs: Fix comment
  perf/amd/ibs: Advertise zen4_ibs_extensions as pmu capability attribute
  perf/amd/ibs: Add support for L3 miss filtering
  perf/amd/ibs: Use ->is_visible callback for dynamic attributes
  perf/amd/ibs: Cascade pmu init functions' return value
  perf/x86/uncore: Add new Alder Lake and Raptor Lake support
  perf/x86/uncore: Clean up uncore_pci_ids[]
  perf/x86/cstate: Add new Alder Lake and Raptor Lake support
  perf/x86/msr: Add new Alder Lake and Raptor Lake support
  perf/x86: Add new Alder Lake and Raptor Lake support
  perf/amd/ibs: Use interrupt regs ip for stack unwinding
  perf/x86/amd/core: Add PerfMonV2 overflow handling
  perf/x86/amd/core: Add PerfMonV2 counter control
  perf/x86/amd/core: Detect available counters
  perf/x86/amd/core: Detect PerfMonV2 support
  x86/msr: Add PerfCntrGlobal* registers
  ...
2022-05-24 10:59:38 -07:00
Linus Torvalds
c5a3d3c01e - Remove a bunch of chicken bit options to turn off CPU features which
are not really needed anymore
 
 - Misc fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLdfgACgkQEsHwGGHe
 VUpB5Q//TIGVgmnSd0YYxY2cIe047lfcd34D+3oEGk0d2FidtirP/tjgBqIXRuY5
 UncoveqBuI/6/7bodP/ANg9DNVXv2489eFYyZtEOLSGnfzV2AU10aw95cuQQG+BW
 YIc6bGSsgfiNo8Vtj4L3xkVqxOrqaCYnh74GTSNNANht3i8KH8Qq9n3qZTuMiF6R
 fH9xWak3TZB2nMzHdYrXh0sSR6eBHN3KYSiT0DsdlU9PUlavlSPFYQRiAlr6FL6J
 BuYQdlUaCQbINvaviGW4SG7fhX32RfF/GUNaBajB40TO6H98KZLpBBvstWQ841xd
 /o44o5wbghoGP1ne8OKwP+SaAV2bE6twd5eO1lpwcpXnQfATvjQ2imxvOiRhy5LY
 pFPt/hko9gKWJ6SI0SQ4tiKJALFPLWD6561scHU6PoriFhv0SRIaPmJyEsDYynMz
 bCXaPPsoovRwwwBfAxxQjljIlhQSBVt3gWZ8NWD1tYbNaqM+WK7xKBaONGh3OCw3
 iK7lsbbljtM0zmANImYyeo7+Hr1NVOmMiK2WZYbxhxgzH3l8v/6EbDt3I70WU57V
 9apCU3/nk/HFpX65SdW5qmuiWLVdH9NXrEqbvaUB4ApT18MdUUugewBhcGnf3Umu
 wEtltzziqcIkxzDoXXpBGWpX31S7PsM2XVDqYC7dwuNttgEw2Fc=
 =7AUX
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CPU feature updates from Borislav Petkov:

 - Remove a bunch of chicken bit options to turn off CPU features which
   are not really needed anymore

 - Misc fixes and cleanups

* tag 'x86_cpu_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add missing prototype for unpriv_ebpf_notify()
  x86/pm: Fix false positive kmemleak report in msr_build_context()
  x86/speculation/srbds: Do not try to turn mitigation off when not supported
  x86/cpu: Remove "noclflush"
  x86/cpu: Remove "noexec"
  x86/cpu: Remove "nosmep"
  x86/cpu: Remove CONFIG_X86_SMAP and "nosmap"
  x86/cpu: Remove "nosep"
  x86/cpu: Allow feature bit names from /proc/cpuinfo in clearcpuid=
2022-05-23 18:01:31 -07:00
Pawan Gupta
027bbb884b KVM: x86/speculation: Disable Fill buffer clear within guests
The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an
accurate indicator on all CPUs of whether the VERW instruction will
overwrite fill buffers. FB_CLEAR enumeration in
IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not
vulnerable to MDS/TAA, indicating that microcode does overwrite fill
buffers.

Guests running in VMM environments may not be aware of all the
capabilities/vulnerabilities of the host CPU. Specifically, a guest may
apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable
to MDS/TAA even when the physical CPU is not. On CPUs that enumerate
FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill
buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS
during VMENTER and resetting on VMEXIT. For guests that enumerate
FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM
will not use FB_CLEAR_DIS.

Irrespective of guest state, host overwrites CPU buffers before VMENTER
to protect itself from an MMIO capable guest, as part of mitigation for
MMIO Stale Data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:41:35 +02:00
Pawan Gupta
5180218615 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For more details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst

Add the Processor MMIO Stale Data bug enumeration. A microcode update
adds new bits to the MSR IA32_ARCH_CAPABILITIES, define them.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:30 +02:00
Ravi Bangoria
9cb23f598c perf/ibs: Fix comment
s/IBS Op Data 2/IBS Op Data 1/ for MSR 0xc0011035.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220509044914.1473-9-ravi.bangoria@amd.com
2022-05-11 16:27:10 +02:00
Pawan Gupta
400331f8ff x86/tsx: Disable TSX development mode at boot
A microcode update on some Intel processors causes all TSX transactions
to always abort by default[*]. Microcode also added functionality to
re-enable TSX for development purposes. With this microcode loaded, if
tsx=on was passed on the cmdline, and TSX development mode was already
enabled before the kernel boot, it may make the system vulnerable to TSX
Asynchronous Abort (TAA).

To be on safer side, unconditionally disable TSX development mode during
boot. If a viable use case appears, this can be revisited later.

  [*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557

  [ bp: Drop unstable web link, massage heavily. ]

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
2022-04-11 09:58:40 +02:00
Borislav Petkov
dbae0a934f x86/cpu: Remove CONFIG_X86_SMAP and "nosmap"
Those were added as part of the SMAP enablement but SMAP is currently
an integral part of kernel proper and there's no need to disable it
anymore.

Rip out that functionality. Leave --uaccess default on for objtool as
this is what objtool should do by default anyway.

If still needed - clearcpuid=smap.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220127115626.14179-4-bp@alien8.de
2022-04-04 10:16:57 +02:00
Arnaldo Carvalho de Melo
5ced812435 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  991625f3dd ("x86/ibt: Add IBT feature, MSR and #CP handling")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YkSCx2kr4ambH+Qe@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
672b259fed tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  991625f3dd ("x86/ibt: Add IBT feature, MSR and #CP handling")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-03-29 16:23:07.678740040 -0300
  +++ after	2022-03-29 16:23:16.960978524 -0300
  @@ -220,6 +220,13 @@
   	[0x00000669] = "MC6_DEMOTION_POLICY_CONFIG",
   	[0x00000680] = "LBR_NHM_FROM",
   	[0x00000690] = "CORE_PERF_LIMIT_REASONS",
  +	[0x000006a0] = "IA32_U_CET",
  +	[0x000006a2] = "IA32_S_CET",
  +	[0x000006a4] = "IA32_PL0_SSP",
  +	[0x000006a5] = "IA32_PL1_SSP",
  +	[0x000006a6] = "IA32_PL2_SSP",
  +	[0x000006a7] = "IA32_PL3_SSP",
  +	[0x000006a8] = "IA32_INT_SSP_TAB",
   	[0x000006B0] = "GFX_PERF_LIMIT_REASONS",
   	[0x000006B1] = "RING_PERF_LIMIT_REASONS",
   	[0x000006c0] = "LBR_NHM_TO",
  $

And this gets rebuilt:

  CC      /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  LD      /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD      /tmp/build/perf/trace/beauty/perf-in.o
  CC      /tmp/build/perf/util/amd-sample-raw.o
  LD      /tmp/build/perf/util/perf-in.o
  LD      /tmp/build/perf/perf-in.o
  LINK    /tmp/build/perf/perf

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written with:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/YkNd7Ky+vi7H2Zl2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Linus Torvalds
7b58b82b86 perf tools changes for v5.18: 1st batch
New features:
 
 perf ftrace:
 
 - Add -n/--use-nsec option to the 'latency' subcommand.
 
   Default: usecs:
 
   $ sudo perf ftrace latency -T dput -a sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |    2098375 | #############################  |
        1 - 2    us |         61 |                                |
        2 - 4    us |         33 |                                |
        4 - 8    us |         13 |                                |
        8 - 16   us |        124 |                                |
       16 - 32   us |        123 |                                |
       32 - 64   us |          1 |                                |
       64 - 128  us |          0 |                                |
      128 - 256  us |          1 |                                |
      256 - 512  us |          0 |                                |
 
   Better granularity with nsec:
 
   $ sudo perf ftrace latency -T dput -a -n sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |          0 |                                |
        1 - 2    ns |          0 |                                |
        2 - 4    ns |          0 |                                |
        4 - 8    ns |          0 |                                |
        8 - 16   ns |          0 |                                |
       16 - 32   ns |          0 |                                |
       32 - 64   ns |          0 |                                |
       64 - 128  ns |    1163434 | ##############                 |
      128 - 256  ns |     914102 | #############                  |
      256 - 512  ns |        884 |                                |
      512 - 1024 ns |        613 |                                |
        1 - 2    us |         31 |                                |
        2 - 4    us |         17 |                                |
        4 - 8    us |          7 |                                |
        8 - 16   us |        123 |                                |
       16 - 32   us |         83 |                                |
 
 perf lock:
 
 - Add -c/--combine-locks option to merge lock instances in the same class into
   a single entry.
 
   # perf lock report -c
                  Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)
 
         rcu_read_lock   251225         0            0              0            0            0
    hrtimer_bases.lock    39450         0            0              0            0            0
   &sb->s_type->i_l...    10301         1          662            662          662          662
      ptlock_ptr(page)    10173         2          701           1402          760          642
   &(ei->i_block_re...     8732         0            0              0            0            0
          &xa->xa_lock     8088         0            0              0            0            0
           &base->lock     6705         0            0              0            0            0
           &p->pi_lock     5549         0            0              0            0            0
   &dentry->d_lockr...     5010         4         1274           5097         1844          789
             &ep->lock     3958         0            0              0            0            0
 
 - Add -F/--field option to customize the list of fields to output:
 
   $ perf lock report -F contended,wait_max -k avg_wait
                   Name contended max wait(ns) avg wait(ns)
 
         slock-AF_INET6         1        23543        23543
      &lruvec->lru_lock         5        18317        11254
         slock-AF_INET6         1        10379        10379
             rcu_node_1         1         2104         2104
    &dentry->d_lockr...         1         1844         1844
    &dentry->d_lockr...         1         1672         1672
       &newf->file_lock        15         2279         1025
    &dentry->d_lockr...         1          792          792
 
 - Add --synth=no option for record, as there is no need to symbolize,
   lock names comes from the tracepoints.
 
 perf record:
 
 - Threaded recording, opt-in, via the new --threads command line option.
 
 - Improve AMD IBS (Instruction-Based Sampling) error handling messages.
 
 perf script:
 
 - Add 'brstackinsnlen' field (use it with -F) for branch stacks.
 
 - Output branch sample type in 'perf script'.
 
 perf report:
 
 - Add "addr_from" and "addr_to" sort dimensions.
 
 - Print branch stack entry type in 'perf report --dump-raw-trace'
 
 - Fix symbolization for chrooted workloads.
 
 Hardware tracing:
 
 Intel PT:
 
 - Add CFE (Control Flow Event) and EVD (Event Data) packets support.
 
 - Add MODE.Exec IFLAG bit support.
 
 Explanation about these features from the "Intel® 64 and IA-32 architectures
 software developer’s manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C,
 3D, and 4" PDF at:
 
   https://cdrdv2.intel.com/v1/dl/getContent/671200
 
 At page 3951:
 
 <quote>
 32.2.4
 
 Event Trace is a capability that exposes details about the asynchronous
 events, when they are generated, and when their corresponding software
 event handler completes execution. These include:
 
 o Interrupts, including NMI and SMI, including the interrupt vector when
 defined.
 
 o Faults, exceptions including the fault vector.
 
 — Page faults additionally include the page fault address, when in context.
 
 o Event handler returns, including IRET and RSM.
 
 o VM exits and VM entries.¹
 
 — VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields.
 INIT and SIPI events.
 
 o TSX aborts, including the abort status returned for the RTM instructions.
 
 o Shutdown.
 
 Additionally, it provides indication of the status of the Interrupt Flag
 (IF), to indicate when interrupts are masked.
 </quote>
 
 ARM CoreSight:
 
 - Use advertised caps/min_interval as default sample_period on ARM spe.
 
 - Update deduction of TRCCONFIGR register for branch broadcast on ARM's CoreSight ETM.
 
 Vendor Events (JSON):
 
 Intel:
 
 - Update events and metrics for:
 
     Alderlake, Broadwell, Broadwell DE, BroadwellX, CascadelakeX, Elkhartlake,
     Bonnell, Goldmont, GoldmontPlus, Westmere EP-DP, Haswell, HaswellX,
     Icelake, IcelakeX, Ivybridge, Ivytown, Jaketown, Knights Landing,
     Nehalem EP, Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
     Tigerlake, TremontX, Westmere EP-SP, Westmere EX.
 
 ARM:
 
 - Add support for HiSilicon CPA PMU aliasing.
 
 perf stat:
 
 - Fix forked applications enablement of counters.
 
 - The 'slots' should only be printed on a different order than the one specified
   on the command line when 'topdown' events are present, fix it.
 
 Miscellaneous:
 
 - Sync msr-index, cpufeatures header files with the kernel sources.
 
 - Stop using some deprecated libbpf APIs in 'perf trace'.
 
 - Fix some spelling mistakes.
 
 - Refactor the maps pointers usage to pave the way for using refcount debugging.
 
 - Only offer the --tui option on perf top, report and annotate when perf was
   built with libslang.
 
 - Don't mention --to-ctf in 'perf data --help' when not linking with the required
   library, libbabeltrace.
 
 - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by array_size.cocci.
 
 - Enhance the matching of sub-commands abbreviations:
 	'perf c2c rec' -> 'perf c2c record'
 	'perf c2c recport -> error
 
 - Set build-id using build-id header on new mmap records.
 
 - Fix generation of 'perf --version' string.
 
 perf test:
 
 - Add test for the arm_spe event.
 
 - Add test to check unwinding using fame-pointer (fp) mode on arm64.
 
 - Make metric testing more robust in 'perf test'.
 
 - Add error message for unsupported branch stack cases.
 
 libperf:
 
 - Add API for allocating new thread map array.
 
 - Fix typo in perf_evlist__open() failure error messages in libperf tests.
 
 perf c2c:
 
 - Replace bitmap_weight() with bitmap_empty() where appropriate.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYj8viwAKCRCyPKLppCJ+
 J8K3AQDpN45P4/TWJxVWhZlvYzJtWDSboXHZJfmBiEd4Xu2zbwD7BFW02f1ATHPr
 dGBFXxRQQufBIqfE+OQXG59Awp1m8wE=
 =1l8S
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "New features:

  perf ftrace:

   - Add -n/--use-nsec option to the 'latency' subcommand.

     Default: usecs:

     $ sudo perf ftrace latency -T dput -a sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |    2098375 | #############################  |
          1 - 2    us |         61 |                                |
          2 - 4    us |         33 |                                |
          4 - 8    us |         13 |                                |
          8 - 16   us |        124 |                                |
         16 - 32   us |        123 |                                |
         32 - 64   us |          1 |                                |
         64 - 128  us |          0 |                                |
        128 - 256  us |          1 |                                |
        256 - 512  us |          0 |                                |

     Better granularity with nsec:

     $ sudo perf ftrace latency -T dput -a -n sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |          0 |                                |
          1 - 2    ns |          0 |                                |
          2 - 4    ns |          0 |                                |
          4 - 8    ns |          0 |                                |
          8 - 16   ns |          0 |                                |
         16 - 32   ns |          0 |                                |
         32 - 64   ns |          0 |                                |
         64 - 128  ns |    1163434 | ##############                 |
        128 - 256  ns |     914102 | #############                  |
        256 - 512  ns |        884 |                                |
        512 - 1024 ns |        613 |                                |
          1 - 2    us |         31 |                                |
          2 - 4    us |         17 |                                |
          4 - 8    us |          7 |                                |
          8 - 16   us |        123 |                                |
         16 - 32   us |         83 |                                |

  perf lock:

   - Add -c/--combine-locks option to merge lock instances in the same
     class into a single entry.

     # perf lock report -c
                    Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)

           rcu_read_lock   251225         0            0              0            0            0
      hrtimer_bases.lock    39450         0            0              0            0            0
     &sb->s_type->i_l...    10301         1          662            662          662          662
        ptlock_ptr(page)    10173         2          701           1402          760          642
     &(ei->i_block_re...     8732         0            0              0            0            0
            &xa->xa_lock     8088         0            0              0            0            0
             &base->lock     6705         0            0              0            0            0
             &p->pi_lock     5549         0            0              0            0            0
     &dentry->d_lockr...     5010         4         1274           5097         1844          789
               &ep->lock     3958         0            0              0            0            0

      - Add -F/--field option to customize the list of fields to output:

     $ perf lock report -F contended,wait_max -k avg_wait
                     Name contended max wait(ns) avg wait(ns)

           slock-AF_INET6         1        23543        23543
        &lruvec->lru_lock         5        18317        11254
           slock-AF_INET6         1        10379        10379
               rcu_node_1         1         2104         2104
      &dentry->d_lockr...         1         1844         1844
      &dentry->d_lockr...         1         1672         1672
         &newf->file_lock        15         2279         1025
      &dentry->d_lockr...         1          792          792

   - Add --synth=no option for record, as there is no need to symbolize,
     lock names comes from the tracepoints.

  perf record:

   - Threaded recording, opt-in, via the new --threads command line
     option.

   - Improve AMD IBS (Instruction-Based Sampling) error handling
     messages.

  perf script:

   - Add 'brstackinsnlen' field (use it with -F) for branch stacks.

   - Output branch sample type in 'perf script'.

  perf report:

   - Add "addr_from" and "addr_to" sort dimensions.

   - Print branch stack entry type in 'perf report --dump-raw-trace'

   - Fix symbolization for chrooted workloads.

  Hardware tracing:

  Intel PT:

   - Add CFE (Control Flow Event) and EVD (Event Data) packets support.

   - Add MODE.Exec IFLAG bit support.

     Explanation about these features from the "Intel® 64 and IA-32
     architectures software developer’s manual combined volumes: 1, 2A,
     2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:

        https://cdrdv2.intel.com/v1/dl/getContent/671200

     At page 3951:
      "32.2.4

       Event Trace is a capability that exposes details about the
       asynchronous events, when they are generated, and when their
       corresponding software event handler completes execution. These
       include:

        o Interrupts, including NMI and SMI, including the interrupt
          vector when defined.

        o Faults, exceptions including the fault vector.

           - Page faults additionally include the page fault address,
             when in context.

        o Event handler returns, including IRET and RSM.

        o VM exits and VM entries.¹

           - VM exits include the values written to the “exit reason”
             and “exit qualification” VMCS fields. INIT and SIPI events.

        o TSX aborts, including the abort status returned for the RTM
          instructions.

        o Shutdown.

       Additionally, it provides indication of the status of the
       Interrupt Flag (IF), to indicate when interrupts are masked"

  ARM CoreSight:

   - Use advertised caps/min_interval as default sample_period on ARM
     spe.

   - Update deduction of TRCCONFIGR register for branch broadcast on
     ARM's CoreSight ETM.

  Vendor Events (JSON):

  Intel:

   - Update events and metrics for: Alderlake, Broadwell, Broadwell DE,
     BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont,
     GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX,
     Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP,
     Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
     Tigerlake, TremontX, Westmere EP-SP, and Westmere EX.

  ARM:

   - Add support for HiSilicon CPA PMU aliasing.

  perf stat:

   - Fix forked applications enablement of counters.

   - The 'slots' should only be printed on a different order than the
     one specified on the command line when 'topdown' events are
     present, fix it.

  Miscellaneous:

   - Sync msr-index, cpufeatures header files with the kernel sources.

   - Stop using some deprecated libbpf APIs in 'perf trace'.

   - Fix some spelling mistakes.

   - Refactor the maps pointers usage to pave the way for using refcount
     debugging.

   - Only offer the --tui option on perf top, report and annotate when
     perf was built with libslang.

   - Don't mention --to-ctf in 'perf data --help' when not linking with
     the required library, libbabeltrace.

   - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by
     array_size.cocci.

   - Enhance the matching of sub-commands abbreviations:
	'perf c2c rec' -> 'perf c2c record'
	'perf c2c recport -> error

   - Set build-id using build-id header on new mmap records.

   - Fix generation of 'perf --version' string.

  perf test:

   - Add test for the arm_spe event.

   - Add test to check unwinding using fame-pointer (fp) mode on arm64.

   - Make metric testing more robust in 'perf test'.

   - Add error message for unsupported branch stack cases.

  libperf:

   - Add API for allocating new thread map array.

   - Fix typo in perf_evlist__open() failure error messages in libperf
     tests.

  perf c2c:

   - Replace bitmap_weight() with bitmap_empty() where appropriate"

* tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits)
  perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
  perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
  perf tools: Enhance the matching of sub-commands abbreviations
  libperf tests: Fix typo in perf_evlist__open() failure error messages
  tools arm64: Import cputype.h
  perf lock: Add -F/--field option to control output
  perf lock: Extend struct lock_key to have print function
  perf lock: Add --synth=no option for record
  tools headers cpufeatures: Sync with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  perf stat: Fix forked applications enablement of counters
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf evsel: Make evsel__env() always return a valid env
  perf build-id: Fix spelling mistake "Cant" -> "Can't"
  perf header: Fix spelling mistake "could't" -> "couldn't"
  perf script: Add 'brstackinsnlen' for branch stacks
  perf parse-events: Move slots only with topdown
  perf ftrace latency: Update documentation
  perf ftrace latency: Add -n/--use-nsec option
  perf tools: Fix version kernel tag
  ...
2022-03-27 13:42:32 -07:00
Linus Torvalds
1464677662 platform-drivers-x86 for v5.18-1
Highlights:
 - new drivers:
   - AMD Host System Management Port (HSMP)
   - Intel Software Defined Silicon
 - removed drivers (functionality folded into other drivers):
   - intel_cht_int33fe_microb
   - surface3_button
 - amd-pmc:
   - s2idle bug-fixes
   - Support for AMD Spill to DRAM STB feature
 - hp-wmi:
   - Fix SW_TABLET_MODE detection method (and other fixes)
   - Support omen thermal profile policy v1
 - serial-multi-instantiate:
   - Add SPI device support
   - Add support for CS35L41 amplifiers used in new laptops
 - think-lmi:
   - syfs-class-firmware-attributes Certificate authentication support
 - thinkpad_acpi:
   - Fixes + quirks
   - Add platform_profile support on AMD based ThinkPads
 - x86-android-tablets
   - Improve Asus ME176C / TF103C support
   - Support Nextbook Ares 8, Lenovo Tab 2 830 and 1050 tablets
 - Lots of various other small fixes and hardware-id additions
 
 The following is an automated git shortlog grouped by driver:
 
 ACPI / scan:
  -  Create platform device for CS35L41
 
 ACPI / x86:
  -  Add support for LPS0 callback handler
 
 ALSA:
  -  hda/realtek: Add support for HP Laptops
 
 Add AMD system management interface:
  - Add AMD system management interface
 
 Add Intel Software Defined Silicon driver:
  - Add Intel Software Defined Silicon driver
 
 Documentation:
  -  syfs-class-firmware-attributes: Lenovo Certificate support
  -  Add x86/amd_hsmp driver
 
 ISST:
  -  Fix possible circular locking dependency detected
 
 Input:
  -  soc_button_array - add support for Microsoft Surface 3 (MSHW0028) buttons
 
 Merge remote-tracking branch 'pdx86/platform-drivers-x86-pinctrl-pmu_clk' into review-hans-gcc12:
  - Merge remote-tracking branch 'pdx86/platform-drivers-x86-pinctrl-pmu_clk' into review-hans-gcc12
 
 Merge tag 'platform-drivers-x86-serial-multi-instantiate-1' into review-hans:
  - Merge tag 'platform-drivers-x86-serial-multi-instantiate-1' into review-hans
 
 Replace acpi_bus_get_device():
  - Replace acpi_bus_get_device()
 
 amd-pmc:
  -  Only report STB errors when STB enabled
  -  Drop CPU QoS workaround
  -  Output error codes in messages
  -  Move to later in the suspend process
  -  Validate entry into the deepest state on resume
  -  uninitialized variable in amd_pmc_s2d_init()
  -  Set QOS during suspend on CZN w/ timer wakeup
  -  Add support for AMD Spill to DRAM STB feature
  -  Correct usage of SMU version
  -  Make amd_pmc_stb_debugfs_fops static
 
 asus-tf103c-dock:
  -  Make 2 global structs static
 
 asus-wmi:
  -  Fix regression when probing for fan curve control
 
 hp-wmi:
  -  support omen thermal profile policy v1
  -  Changing bios_args.data to be dynamically allocated
  -  Fix 0x05 error code reported by several WMI calls
  -  Fix SW_TABLET_MODE detection method
  -  Fix hp_wmi_read_int() reporting error (0x05)
 
 huawei-wmi:
  -  check the return value of device_create_file()
 
 i2c-multi-instantiate:
  -  Rename it for a generic serial driver name
 
 int3472:
  -  Add terminator to gpiod_lookup_table
 
 intel-uncore-freq:
  -  fix uncore_freq_common_init() error codes
 
 intel_cht_int33fe:
  -  Move to intel directory
  -  Drop Lenovo Yogabook YB1-X9x code
  -  Switch to DMI modalias based loading
 
 intel_crystal_cove_charger:
  -  Fix IRQ masking / unmasking
 
 lg-laptop:
  -  Move setting of battery charge limit to common location
 
 pinctrl:
  -  baytrail: Add pinconf group + function for the pmu_clk
 
 platform/dcdbas:
  -  move EXPORT_SYMBOL after function
 
 platform/surface:
  -  Remove Surface 3 Button driver
  -  surface3-wmi: Simplify resource management
  -  Replace acpi_bus_get_device()
  -  Reinstate platform dependency
 
 platform/x86/intel-uncore-freq:
  -  Split common and enumeration part
 
 platform/x86/intel/uncore-freq:
  -  Display uncore current frequency
  -  Use sysfs API to create attributes
  -  Move to uncore-frequency folder
 
 selftests:
  -  sdsi: test sysfs setup
 
 serial-multi-instantiate:
  -  Add SPI support
  -  Reorganize I2C functions
 
 spi:
  -  Add API to count spi acpi resources
  -  Support selection of the index of the ACPI Spi Resource before alloc
  -  Create helper API to lookup ACPI info for spi device
  -  Make spi_alloc_device and spi_add_device public again
 
 surface:
  -  surface3_power: Fix battery readings on batteries without a serial number
 
 think-lmi:
  -  Certificate authentication support
 
 thinkpad_acpi:
  -  consistently check fan_get_status return.
  -  Don't use test_bit on an integer
  -  Fix compiler warning about uninitialized err variable
  -  clean up dytc profile convert
  -  Add PSC mode support
  -  Add dual fan probe
  -  Add dual-fan quirk for T15g (2nd gen)
  -  Fix incorrect use of platform profile on AMD platforms
  -  Add quirk for ThinkPads without a fan
 
 tools arch x86:
  -  Add Intel SDSi provisiong tool
 
 touchscreen_dmi:
  -  Add info for the RWC NANOTE P8 AY07J 2-in-1
 
 x86-android-tablets:
  -  Depend on EFI and SPI
  -  Lenovo Yoga Tablet 2 830/1050 sound support
  -  Workaround Lenovo Yoga Tablet 2 830/1050 poweroff hang
  -  Add Lenovo Yoga Tablet 2 830 / 1050 data
  -  Fix EBUSY error when requesting IOAPIC IRQs
  -  Minor charger / fuel-gauge improvements
  -  Add Nextbook Ares 8 data
  -  Add IRQ to Asus ME176C accelerometer info
  -  Add lid-switch gpio-keys pdev to Asus ME176C + TF103C
  -  Add x86_android_tablet_get_gpiod() helper
  -  Add Asus ME176C/TF103C charger and fuelgauge props
  -  Add battery swnode support
  -  Trivial typo fix for MODULE_AUTHOR
  -  Fix the buttons on CZC P10T tablet
  -  Constify the gpiod_lookup_tables arrays
  -  Add an init() callback to struct x86_dev_info
  -  Add support for disabling ACPI _AEI handlers
  -  Correct crystal_cove_charger module name
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmI8SjEUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wYUwf/cdUMPFy5cwpHq1LuqGy+PxVCRHCe
 71PFd2Ycj+HGOtrt66RxSiCC1Seb4tylr7FvudToDaqWjlBf5n6LhpDudg4ds7Qw
 lCuRlaXTIrF7p3nOLIsWvJPRqacMG79KkRM62MLTS2evtRYjbnKvFzNPJPzr8827
 1AhCakE92S8gkR5lUZYYHtsaz9rZ4z4TrEtjO6GdlbL2bDw0l18dNNwdMomfVpNS
 bBIHIDLeufDuMJ4PxIHlE5MB3AuZAuc0HTJWihozyJX/h5FMGI6qVm0/s9RAfHgX
 XdMCpADtS/JjHCmkFgLZYIzvXTxwQVZRo5VO0Wrv5Mis6gSpxJXCd0aKlA==
 =1x9/
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Hans de Goede:
  "New drivers:
    - AMD Host System Management Port (HSMP)
    - Intel Software Defined Silicon

  Removed drivers (functionality folded into other drivers):
    - intel_cht_int33fe_microb
    - surface3_button

  amd-pmc:
    - s2idle bug-fixes
    - Support for AMD Spill to DRAM STB feature

  hp-wmi:
    - Fix SW_TABLET_MODE detection method (and other fixes)
    - Support omen thermal profile policy v1

  serial-multi-instantiate:
    - Add SPI device support
    - Add support for CS35L41 amplifiers used in new laptops

  think-lmi:
    - syfs-class-firmware-attributes Certificate authentication support

  thinkpad_acpi:
    - Fixes + quirks
    - Add platform_profile support on AMD based ThinkPads

  x86-android-tablets:
    - Improve Asus ME176C / TF103C support
    - Support Nextbook Ares 8, Lenovo Tab 2 830 and 1050 tablets

  Lots of various other small fixes and hardware-id additions"

* tag 'platform-drivers-x86-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (60 commits)
  platform/x86: think-lmi: Certificate authentication support
  Documentation: syfs-class-firmware-attributes: Lenovo Certificate support
  platform/x86: amd-pmc: Only report STB errors when STB enabled
  platform/x86: amd-pmc: Drop CPU QoS workaround
  platform/x86: amd-pmc: Output error codes in messages
  platform/x86: amd-pmc: Move to later in the suspend process
  ACPI / x86: Add support for LPS0 callback handler
  platform/x86: thinkpad_acpi: consistently check fan_get_status return.
  platform/x86: hp-wmi: support omen thermal profile policy v1
  platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated
  platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls
  platform/x86: hp-wmi: Fix SW_TABLET_MODE detection method
  platform/x86: hp-wmi: Fix hp_wmi_read_int() reporting error (0x05)
  platform/x86: amd-pmc: Validate entry into the deepest state on resume
  platform/x86: thinkpad_acpi: Don't use test_bit on an integer
  platform/x86: thinkpad_acpi: Fix compiler warning about uninitialized err variable
  platform/x86: thinkpad_acpi: clean up dytc profile convert
  platform/x86: x86-android-tablets: Depend on EFI and SPI
  platform/x86: amd-pmc: uninitialized variable in amd_pmc_s2d_init()
  platform/x86: intel-uncore-freq: fix uncore_freq_common_init() error codes
  ...
2022-03-25 12:14:39 -07:00