Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update the CPU features to include identifying and reporting on the
Secure Memory Encryption (SME) feature. SME is identified by CPUID
0x8000001f, but requires BIOS support to enable it (set bit 23 of
MSR_K8_SYSCFG). Only show the SME feature as available if reported by
CPUID, enabled by BIOS and not configured as CONFIG_X86_32=y.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Toshimitsu Kani <toshi.kani@hpe.com>
Cc: kasan-dev@googlegroups.com
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/85c17ff450721abccddc95e611ae8df3f4d9718b.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts pending.
ARM:
- VCPU request overhaul
- allow timer and PMU to have their interrupt number selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
s390:
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups and fixes
x86:
- nested VMX bugfixes and improvements
- more reliable NMI window detection on AMD
- APIC timer optimizations
Generic:
- VCPU request overhaul + documentation of common code patterns
- kvm_stat improvements
There is a small conflict in arch/s390 due to an arch-wide field rename.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJZW4XTAAoJEL/70l94x66DkhMH/izpk54KI17PtyQ9VYI2sYeZ
BWK6Kl886g3ij4pFi3pECqjDJzWaa3ai+vFfzzpJJ8OkCJT5Rv4LxC5ERltVVmR8
A3T1I/MRktSC0VJLv34daPC2z4Lco/6SPipUpPnL4bE2HATKed4vzoOjQ3tOeGTy
dwi7TFjKwoVDiM7kPPDRnTHqCe5G5n13sZ49dBe9WeJ7ttJauWqoxhlYosCGNPEj
g8ZX8+cvcAhVnz5uFL8roqZ8ygNEQq2mgkU18W8ZZKuiuwR0gdsG0gSBFNTdwIMK
NoreRKMrw0+oLXTIB8SZsoieU6Qi7w3xMAMabe8AJsvYtoersugbOmdxGCr1lsA=
=OD7H
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"PPC:
- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts pending.
ARM:
- VCPU request overhaul
- allow timer and PMU to have their interrupt number selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
s390:
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups and fixes
x86:
- nested VMX bugfixes and improvements
- more reliable NMI window detection on AMD
- APIC timer optimizations
Generic:
- VCPU request overhaul + documentation of common code patterns
- kvm_stat improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
Update my email address
kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12
kvm: x86: mmu: allow A/D bits to be disabled in an mmu
x86: kvm: mmu: make spte mmio mask more explicit
x86: kvm: mmu: dead code thanks to access tracking
KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code
KVM: PPC: Book3S HV: Close race with testing for signals on guest entry
KVM: PPC: Book3S HV: Simplify dynamic micro-threading code
KVM: x86: remove ignored type attribute
KVM: LAPIC: Fix lapic timer injection delay
KVM: lapic: reorganize restart_apic_timer
KVM: lapic: reorganize start_hv_timer
kvm: nVMX: Check memory operand to INVVPID
KVM: s390: Inject machine check into the nested guest
KVM: s390: Inject machine check into the guest
tools/kvm_stat: add new interactive command 'b'
tools/kvm_stat: add new command line switch '-i'
tools/kvm_stat: fix error on interactive command 'g'
KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit
...
- Rework suspend-to-idle to allow it to take wakeup events signaled
by the EC into account on ACPI-based platforms in order to properly
support power button wakeup from suspend-to-idle on recent Dell
laptops (Rafael Wysocki).
That includes the core suspend-to-idle code rework, support for
the Low Power S0 _DSM interface, and support for the ACPI INT0002
Virtual GPIO device from Hans de Goede (required for USB keyboard
wakeup from suspend-to-idle to work on some machines).
- Stop trying to export the current CPU frequency via /proc/cpuinfo
on x86 as that is inaccurate and confusing (Len Brown).
- Rework the way in which the current CPU frequency is exported by
the kernel (over the cpufreq sysfs interface) on x86 systems with
the APERF and MPERF registers by always using values read from
these registers, when available, to compute the current frequency
regardless of which cpufreq driver is in use (Len Brown).
- Rework the PCI/ACPI device wakeup infrastructure to remove the
questionable and artificial distinction between "devices that
can wake up the system from sleep states" and "devices that can
generate wakeup signals in the working state" from it, which
allows the code to be simplified quite a bit (Rafael Wysocki).
- Fix the wakeup IRQ framework by making it use SRCU instead of
RCU which doesn't allow sleeping in the read-side critical
sections, but which in turn is expected to be allowed by the
IRQ bus locking infrastructure (Thomas Gleixner).
- Modify some computations in the intel_pstate driver to avoid
rounding errors resulting from them (Srinivas Pandruvada).
- Reduce the overhead of the intel_pstate driver in the HWP
(hardware-managed P-states) mode and when the "performance"
P-state selection algorithm is in use by making it avoid
registering scheduler callbacks in those cases (Len Brown).
- Rework the energy_performance_preference sysfs knob in
intel_pstate by changing the values that correspond to
different symbolic hint names used by it (Len Brown).
- Make it possible to use more than one cpuidle driver at the same
time on ARM (Daniel Lezcano).
- Make it possible to prevent the cpuidle menu governor from using
the 0 state by disabling it via sysfs (Nicholas Piggin).
- Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1
on AMD systems (Yazen Ghannam).
- Make the CPPC cpufreq driver take the lowest nonlinear performance
information into account (Prashanth Prakash).
- Add support for hi3660 to the cpufreq-dt driver, fix the
imx6q driver and clean up the sfi, exynos5440 and intel_pstate
drivers (Colin Ian King, Krzysztof Kozlowski, Octavian Purdila,
Rafael Wysocki, Tao Wang).
- Fix a few minor issues in the generic power domains (genpd)
framework and clean it up somewhat (Krzysztof Kozlowski,
Mikko Perttunen, Viresh Kumar).
- Fix a couple of minor issues in the operating performance points
(OPP) framework and clean it up somewhat (Viresh Kumar).
- Fix a CONFIG dependency in the hibernation core and clean it up
slightly (Balbir Singh, Arvind Yadav, BaoJun Luo).
- Add rk3228 support to the rockchip-io adaptive voltage scaling
(AVS) driver (David Wu).
- Fix an incorrect bit shift operation in the RAPL power capping
driver (Adam Lessnau).
- Add support for the EPP field in the HWP (hardware managed
P-states) control register, HWP.EPP, to the x86_energy_perf_policy
tool and update msr-index.h with HWP.EPP values (Len Brown).
- Fix some minor issues in the turbostat tool (Len Brown).
- Add support for AMD family 0x17 CPUs to the cpupower tool and fix
a minor issue in it (Sherry Hurwitz).
- Assorted cleanups, mostly related to the constification of some
data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof
Kozlowski).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZWrICAAoJEILEb/54YlRxZYMQAIRhfbyDxKq+ByvSilUS8kTA
AItwJ8FFzykhiwN75Cqabg4rAGyWma7IRs1vzU7zeC1aEQIn+bTQtvk+utZNI+g2
ANFlDha20q/sXsP/CDMMTIAdW9tSOC0TOvFI9s2V2Y8dJZhoekO4ctx34FAfUS5d
Ao6rwSAWCMsCXcGaTAlqTA+TEJmBG7u6Iq6hq6ngltoFwOv3mWWBVn52VVaJ7SMp
9/IPbbLGMFAedrgEBRGCR+MME1xZZpvcZIJaTt1Mgn7Cx3cJaysIUAvqY/SsvFGq
5FcUTcF2qpK3+AGawiAxZIjvOBsGRtIwqKinNIzYWs/NjiIdzmgVAmTeuPtTqp+5
HFehUdtkFcnuDnLqSNzAaZUa7tw84cJkwnbVMnesx0MkG6rZ1SeL22E2Sabpcdsh
3Yo1ThzJSxi59DhiiE92EQnNCEjmCldRy+8q5Ag035muxl6EJYvuNBMnZv/BMCUn
ltSNOrmps1DlN+Col8ORIeNzQ1YjYzWMqKAYzSbyccm4ug/iSHx0/DuESmQ4GTlF
YCwkmqyWiHrBwpl51jc+4a7SGlMmKRqU+MJes0CjagaaqoUAb8qeBOpzEJ0yNwjZ
wtI41l6blE6kbMD3yqGdCfiB2S7GlPVoxa15eX1wRyLH3fLjwwrzJirEaiBS86tI
1PzHZEOmBlh3DYC6DBKA
=Wsph
-----END PGP SIGNATURE-----
Merge tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The big ticket items here are the rework of suspend-to-idle in order
to add proper support for power button wakeup from it on recent Dell
laptops and the rework of interfaces exporting the current CPU
frequency on x86.
In addition to that, support for a few new pieces of hardware is
added, the PCI/ACPI device wakeup infrastructure is simplified
significantly and the wakeup IRQ framework is fixed to unbreak the IRQ
bus locking infrastructure.
Also, there are some functional improvements for intel_pstate, tools
updates and small fixes and cleanups all over.
Specifics:
- Rework suspend-to-idle to allow it to take wakeup events signaled
by the EC into account on ACPI-based platforms in order to properly
support power button wakeup from suspend-to-idle on recent Dell
laptops (Rafael Wysocki).
That includes the core suspend-to-idle code rework, support for the
Low Power S0 _DSM interface, and support for the ACPI INT0002
Virtual GPIO device from Hans de Goede (required for USB keyboard
wakeup from suspend-to-idle to work on some machines).
- Stop trying to export the current CPU frequency via /proc/cpuinfo
on x86 as that is inaccurate and confusing (Len Brown).
- Rework the way in which the current CPU frequency is exported by
the kernel (over the cpufreq sysfs interface) on x86 systems with
the APERF and MPERF registers by always using values read from
these registers, when available, to compute the current frequency
regardless of which cpufreq driver is in use (Len Brown).
- Rework the PCI/ACPI device wakeup infrastructure to remove the
questionable and artificial distinction between "devices that can
wake up the system from sleep states" and "devices that can
generate wakeup signals in the working state" from it, which allows
the code to be simplified quite a bit (Rafael Wysocki).
- Fix the wakeup IRQ framework by making it use SRCU instead of RCU
which doesn't allow sleeping in the read-side critical sections,
but which in turn is expected to be allowed by the IRQ bus locking
infrastructure (Thomas Gleixner).
- Modify some computations in the intel_pstate driver to avoid
rounding errors resulting from them (Srinivas Pandruvada).
- Reduce the overhead of the intel_pstate driver in the HWP
(hardware-managed P-states) mode and when the "performance" P-state
selection algorithm is in use by making it avoid registering
scheduler callbacks in those cases (Len Brown).
- Rework the energy_performance_preference sysfs knob in intel_pstate
by changing the values that correspond to different symbolic hint
names used by it (Len Brown).
- Make it possible to use more than one cpuidle driver at the same
time on ARM (Daniel Lezcano).
- Make it possible to prevent the cpuidle menu governor from using
the 0 state by disabling it via sysfs (Nicholas Piggin).
- Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1 on
AMD systems (Yazen Ghannam).
- Make the CPPC cpufreq driver take the lowest nonlinear performance
information into account (Prashanth Prakash).
- Add support for hi3660 to the cpufreq-dt driver, fix the imx6q
driver and clean up the sfi, exynos5440 and intel_pstate drivers
(Colin Ian King, Krzysztof Kozlowski, Octavian Purdila, Rafael
Wysocki, Tao Wang).
- Fix a few minor issues in the generic power domains (genpd)
framework and clean it up somewhat (Krzysztof Kozlowski, Mikko
Perttunen, Viresh Kumar).
- Fix a couple of minor issues in the operating performance points
(OPP) framework and clean it up somewhat (Viresh Kumar).
- Fix a CONFIG dependency in the hibernation core and clean it up
slightly (Balbir Singh, Arvind Yadav, BaoJun Luo).
- Add rk3228 support to the rockchip-io adaptive voltage scaling
(AVS) driver (David Wu).
- Fix an incorrect bit shift operation in the RAPL power capping
driver (Adam Lessnau).
- Add support for the EPP field in the HWP (hardware managed
P-states) control register, HWP.EPP, to the x86_energy_perf_policy
tool and update msr-index.h with HWP.EPP values (Len Brown).
- Fix some minor issues in the turbostat tool (Len Brown).
- Add support for AMD family 0x17 CPUs to the cpupower tool and fix a
minor issue in it (Sherry Hurwitz).
- Assorted cleanups, mostly related to the constification of some
data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof
Kozlowski)"
* tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (69 commits)
cpufreq: Update scaling_cur_freq documentation
cpufreq: intel_pstate: Clean up after performance governor changes
PM: hibernate: constify attribute_group structures.
cpuidle: menu: allow state 0 to be disabled
intel_idle: Use more common logging style
PM / Domains: Fix missing default_power_down_ok comment
PM / Domains: Fix unsafe iteration over modified list of domains
PM / Domains: Fix unsafe iteration over modified list of domain providers
PM / Domains: Fix unsafe iteration over modified list of device links
PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device
PM / Domains: Call driver's noirq callbacks
PM / core: Drop run_wake flag from struct dev_pm_info
PCI / PM: Simplify device wakeup settings code
PCI / PM: Drop pme_interrupt flag from struct pci_dev
ACPI / PM: Consolidate device wakeup settings code
ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags
PM / QoS: constify *_attribute_group.
PM / AVS: rockchip-io: add io selectors and supplies for rk3228
powercap/RAPL: prevent overridding bits outside of the mask
PM / sysfs: Constify attribute groups
...
Bits 11:2 must be zero and the linear addess in bits 63:12 must be
canonical. Otherwise, WRMSR(BNDCFGS) should raise #GP.
Fixes: 0dd376e709 ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Currently, the SMIs are visible to all performance counters, because
many users want to measure everything including SMIs. But in some
cases, the SMI cycles should not be counted - for example, to calculate
the cost of an SMI itself. So a knob is needed.
When setting FREEZE_WHILE_SMM bit in IA32_DEBUGCTL, all performance
counters will be effected. There is no way to do per-counter freeze
on SMI. So it should not use the per-event interface (e.g. ioctl or
event attribute) to set FREEZE_WHILE_SMM bit.
Adds sysfs entry /sys/device/cpu/freeze_on_smi to set FREEZE_WHILE_SMM
bit in IA32_DEBUGCTL. When set, freezes perfmon and trace messages
while in SMM.
Value has to be 0 or 1. It will be applied to all processors.
Also serialize the entire setting so we don't get multiple concurrent
threads trying to update to different values.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: bp@alien8.de
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
x = 1
ulong_long = x << 32;
results in:
warning: left shift count >= width of type
x = 8
ulong_long = x << 24;
results in a sign extended ulong_long
Cast x to unsigned long long in these macros
to prevent these errors.
Signed-off-by: Len Brown <len.brown@intel.com>
The Hardware Performance State request MSR has a field
to express the "Energy Performance Preference" (HWP.EPP).
Decode that field so the definition may be shared by
by the intel_pstate driver and any utilities that
decode the same register.
Signed-off-by: Len Brown <len.brown@intel.com>
Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. Exposing this feature to userspace will allow a
ptracer to trap and emulate the CPUID instruction.
When supported, this feature is controlled by toggling bit 0 of
MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991
Implement a new pair of arch_prctls, available on both x86-32 and x86-64.
ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting
is enabled (and thus the CPUID instruction is not available) or 1 if
CPUID faulting is not enabled.
ARCH_SET_CPUID: Set the CPUID state to the second argument. If
cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will
be deactivated. Returns ENODEV if CPUID faulting is not supported on
this system.
The state of the CPUID faulting flag is propagated across forks, but reset
upon exec.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-9-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
When enabled, the processor will fault on attempts to execute the CPUID
instruction with CPL>0. This will allow a ptracer to emulate the CPUID
instruction.
Bit 31 of MSR_PLATFORM_INFO advertises support for this feature. It is
documented in detail in Section 2.3.2 of
https://bugzilla.kernel.org/attachment.cgi?id=243991
Detect support for this feature and expose it as X86_FEATURE_CPUID_FAULT.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-8-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This matches the only public Intel documentation of this MSR, in the
"Virtualization Technology FlexMigration Application Note"
(preserved at https://bugzilla.kernel.org/attachment.cgi?id=243991)
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Robert O'Callahan <robert@ocallahan.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: David Matlack <dmatlack@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170320081628.18952-2-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The debug control MSR is "highly magical" as the blockstep bit can be
cleared by hardware under not well documented circumstances.
So a task switch relying on the bit set by the previous task (according to
the previous tasks thread flags) can trip over this and not update the flag
for the next task.
To fix this its required to handle DEBUGCTLMSR_BTF when either the previous
or the next or both tasks have the TIF_BLOCKSTEP flag set.
While at it avoid branching within the TIF_BLOCKSTEP case and evaluating
boot_cpu_data twice in kernels without CONFIG_X86_DEBUGCTLMSR.
x86_64: arch/x86/kernel/process.o
text data bss dec hex
3024 8577 16 11617 2d61 Before
3008 8577 16 11601 2d51 After
i386: No change
[ tglx: Made the shift value explicit, use a local variable to make the
code readable and massaged changelog]
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170214081104.9244-3-khuey@kylehuey.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull changes related to turbostat for v4.11 from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits)
tools/power turbostat: version 17.02.24
tools/power turbostat: bugfix: --add u32 was printed as u64
tools/power turbostat: show error on exec
tools/power turbostat: dump p-state software config
tools/power turbostat: show package number, even without --debug
tools/power turbostat: support "--hide C1" etc.
tools/power turbostat: move --Package and --processor into the --cpu option
tools/power turbostat: turbostat.8 update
tools/power turbostat: update --list feature
tools/power turbostat: use wide columns to display large numbers
tools/power turbostat: Add --list option to show available header names
tools/power turbostat: fix zero IRQ count shown in one-shot command mode
tools/power turbostat: add --cpu parameter
tools/power turbostat: print sysfs C-state stats
tools/power turbostat: extend --add option to accept /sys path
tools/power turbostat: skip unused counters on BDX
tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
tools/power turbostat: skip unused counters on SKX
tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
tools/power turbostat: initial Gemini Lake SOC support
...
This non-architectural MSR has disable bits
for various prefetchers on modern processors.
While these bits are generally touched only by the BIOS,
say, via BIOS SETUP, it is useful to dump them
when examining options that can alter performance.
Cc: x86@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
The Baytrail SOC, with its Silvermont core, has some unique properties:
1. a hardware CC1 residency counter
2. a module-c6 residency counter
3. a package-c6 counter at traditional package-c7 counter address.
The SOC does not support c3, pc3, c7 or pc7 counters.
Signed-off-by: Len Brown <len.brown@intel.com>
The two users, intel_idle driver and turbostat utility
are using the new name, MSR_PKG_CST_CONFIG_CONTROL
Cc: x86@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
define MSR_PKG_CST_CONFIG_CONTROL (0xE2),
which is the string used by Intel Documentation.
We use this MSR in intel_idle and turbostat by a previous name,
to be updated in the next patch.
Cc: x86@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Define new MSR MISC_FEATURE_ENABLES (0x140).
On supported CPUs if bit 1 of this MSR is set, then calling MONITOR and
MWAIT instructions outside of ring 0 will not cause invalid-opcode
exception.
The MSR MISC_FEATURE_ENABLES is not yet documented in the SDM. Here is the
relevant documentation:
Hex Dec Name Scope
140H 320 MISC_FEATURE_ENABLES Thread
0 Reserved
1 If set to 1, the MONITOR and MWAIT instructions do not
cause invalid-opcode exceptions when executed with CPL > 0
or in virtual-8086 mode. If MWAIT is executed when CPL > 0
or in virtual-8086 mode, and if EAX indicates a C-state
other than C0 or C1, the instruction operates as if EAX
indicated the C-state C1.
63:2 Reserved
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Piotr.Luc@intel.com
Cc: dave.hansen@linux.intel.com
Link: http://lkml.kernel.org/r/1484918557-15481-2-git-send-email-grzegorz.andrejczuk@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Intel Xeons from Ivy Bridge onwards support a processor identification
number set in the factory. To the user this is a handy unique number to
identify a particular CPU. Intel can decode this to the fab/production
run to track errors. On systems that have it, include it in the machine
check record. I'm told that this would be helpful for users that run
large data centers with multi-socket servers to keep track of which CPUs
are seeing errors.
Boris:
* Add some clarifying comments and spacing.
* Mask out [63:2] in the disabled-but-not-locked case
* Call the MSR variable "val" for more readability.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20161123114855.njguoaygp3qnbkia@pd.tnic
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove MSR_NHM_TURBO_RATIO_LIMIT and MSR_IVT_TURBO_RATIO_LIMIT as
they are duplicate.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nothing outside of the Intel PT driver should ever care about its MSR
bits, so there is no reason to keep them in msr-index.h. This patch
moves them to a pt-local header.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-3-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add Skylake client support for RAPL domains. In addition to RAPL domains
in Broadwell clients, it has support for platform domain (aka PSys). The
PSys domain controls the entire SoC instead of just a CPU package. Unlike
package domain, PSys support requires more than just processor level
implementation. The other parts in the system need additional HW level
signaling, which OEMs need to support. When not supported, the energy
counter register in PSys domain returns 0.
Also corrected error in comment for GPU counter, which previously was
DRAM counter.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com
[ Cnverted to model_match stuff. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: jacob.jun.pan@linux.intel.com
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/1460930581-29748-2-git-send-email-srinivas.pandruvada@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
* pm-core:
PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
PM / runtime: Document steps for device removal
* powercap:
powercap: intel_rapl: Add missing Haswell model
* pm-tools:
tools/power turbostat: work around RC6 counter wrap
tools/power turbostat: initial KBL support
tools/power turbostat: initial SKX support
tools/power turbostat: decode BXT TSC frequency via CPUID
tools/power turbostat: initial BXT support
tools/power turbostat: print IRTL MSRs
tools/power turbostat: SGX state should print only if --debug
Some processors use the Interrupt Response Time Limit (IRTL) MSR value
to describe the maximum IRQ response time latency for deep
package C-states. (Though others have the register, but do not use it)
Lets print it out to give insight into the cases where it is used.
IRTL begain in SNB, with PC3/PC6/PC7, and HSW added PC8/PC9/PC10.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter
which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100
MHz rate in all P-states, and C states, S0, or S1. The frequency is
about 100MHz. This counter will be used to calculate processor power
and other parts. So add an interface into the MSR PMU to get the PTSC
counter value.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.huang@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The list of CPU model specific registers contains two copies of TDP
registers, remove the one, which is out of numerical order in the
list.
Fixes: 6a35fc2d6c ("cpufreq: intel_pstate: get P1 from TAR when available")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Kristen Carlson
Accardi <kristen@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: http://lkml.kernel.org/r/1459018020-24577-1-git-send-email-vladimir_zapolskiy@mentor.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Redesign of cpufreq governors and the intel_pstate driver to
make them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers
for that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it
more straightforward and fix some concurrency problems in it
(Rafael Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve
its handling of voltage regulators and device clocks and updates
of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization
and cleanup problems in it and correct its worker thread handling
with respect to CPU offline, new powernv_throttle tracepoint
(Shilpasri Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced
by previos changes in the ACPICA code (Bob Moore, Lv Zheng,
David Box, Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers)
and ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as
a valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reductioin of the number of syscalls made,
fixes related to Xeon x200 processors, compiler warning fixes) and
cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJW50NXAAoJEILEb/54YlRxvr8QAIktC9+ft0y5AmU46hDcBWcK
QutyWJL9X9BS6DWBJZA2qclDYFmhMfi5Fza1se0gQ9TnLB/KrBwHWLsiYoTsb1k+
nPKf214aPk+qAhkVuyB4leNWML9Qz9n9jwku/EYxWWpgtbSRf3+0ioIKZeWWc/8V
JvuaOu4O+g/tkmL7QTrnGWBwhIIssAAV85QPsHkx+g68MrCj4UMMzm7z9G21SPXX
bmP8yIHsczX/XnRsY0W2NSno7Vdk6ImHpDJ26IAZg28WRNPWICHgGYHvB0TTWMvb
tts+yqfF7/7QLRjT/M8k9CzDBDE/DnVqoZ0fNJ+aYr7hNKF32mtAN+jH9ZB9dl/P
fEFapJkPxnWyzAoVoB9Dz0rkcZkYMlbxlLWzUGpaPq0JflUUTzLk0ApSjmMn4HRO
UddwCDdyHTaYThp3gn6GbOb0pIP0SdOVbI1M2QV2x/4PLcT2Ft8Np1+1RFWOeinZ
Bdl9AE890big0808mqbBzw/buETwr9FjHtCdDPXpP0vJpkBLu3nIYRNb0LCt39es
mWMp6dFhGgvGj3D3ahTuV3GI8hdpDkh9SObexa11RCjkTKrXcwEmFxHxLeFXwKYq
alG278bo6cSChRMziS1lis+W/3tsJRN4TXUSv1PPzJHrFgptQVFRStU9ngBKP+pN
WB+itPc4Fw0YHOrAFsrx
=cfty
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the majority of changes go into cpufreq and they are
significant.
First off, the way CPU frequency updates are triggered is different
now. Instead of having to set up and manage a deferrable timer for
each CPU in the system to evaluate and possibly change its frequency
periodically, cpufreq governors set up callbacks to be invoked by the
scheduler on a regular basis (basically on utilization updates). The
"old" governors, "ondemand" and "conservative", still do all of their
work in process context (although that is triggered by the scheduler
now), but intel_pstate does it all in the callback invoked by the
scheduler with no need for any additional asynchronous processing.
Of course, this eliminates the overhead related to the management of
all those timers, but also it allows the cpufreq governor code to be
simplified quite a bit. On top of that, the common code and data
structures used by the "ondemand" and "conservative" governors are
cleaned up and made more straightforward and some long-standing and
quite annoying problems are addressed. In particular, the handling of
governor sysfs attributes is modified and the related locking becomes
more fine grained which allows some concurrency problems to be avoided
(particularly deadlocks with the core cpufreq code).
In principle, the new mechanism for triggering frequency updates
allows utilization information to be passed from the scheduler to
cpufreq. Although the current code doesn't make use of it, in the
works is a new cpufreq governor that will make decisions based on the
scheduler's utilization data. That should allow the scheduler and
cpufreq to work more closely together in the long run.
In addition to the core and governor changes, cpufreq drivers are
updated too. Fixes and optimizations go into intel_pstate, the
cpufreq-dt driver is updated on top of some modification in the
Operating Performance Points (OPP) framework and there are fixes and
other updates in the powernv cpufreq driver.
Apart from the cpufreq updates there is some new ACPICA material,
including a fix for a problem introduced by previous ACPICA updates,
and some less significant changes in the ACPI code, like CPPC code
optimizations, ACPI processor driver cleanups and support for loading
ACPI tables from initrd.
Also updated are the generic power domains framework, the Intel RAPL
power capping driver and the turbostat utility and we have a bunch of
traditional assorted fixes and cleanups.
Specifics:
- Redesign of cpufreq governors and the intel_pstate driver to make
them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers for
that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it more
straightforward and fix some concurrency problems in it (Rafael
Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve its
handling of voltage regulators and device clocks and updates of the
cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization and
cleanup problems in it and correct its worker thread handling with
respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced by
previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers) and
ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as a
valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reductioin of the number of syscalls
made, fixes related to Xeon x200 processors, compiler warning
fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"
* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
tools/power turbostat: bugfix: TDP MSRs print bits fixing
tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
tools/power turbostat: call __cpuid() instead of __get_cpuid()
tools/power turbostat: indicate SMX and SGX support
tools/power turbostat: detect and work around syscall jitter
tools/power turbostat: show GFX%rc6
tools/power turbostat: show GFXMHz
tools/power turbostat: show IRQs per CPU
tools/power turbostat: make fewer systems calls
tools/power turbostat: fix compiler warnings
tools/power turbostat: add --out option for saving output in a file
tools/power turbostat: re-name "%Busy" field to "Busy%"
tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
tools/power turbostat: allow sub-sec intervals
ACPI / APEI: ERST: Fixed leaked resources in erst_init
ACPI / APEI: Fix leaked resources
intel_pstate: Do not skip samples partially
intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
...
Pull turbostat updates for 4.6 from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: bugfix: TDP MSRs print bits fixing
tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
tools/power turbostat: call __cpuid() instead of __get_cpuid()
tools/power turbostat: indicate SMX and SGX support
tools/power turbostat: detect and work around syscall jitter
tools/power turbostat: show GFX%rc6
tools/power turbostat: show GFXMHz
tools/power turbostat: show IRQs per CPU
tools/power turbostat: make fewer systems calls
tools/power turbostat: fix compiler warnings
tools/power turbostat: add --out option for saving output in a file
tools/power turbostat: re-name "%Busy" field to "Busy%"
tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
tools/power turbostat: allow sub-sec intervals
tools/power turbostat: Decode MSR_MISC_PWR_MGMT
tools/power turbostat: decode HWP registers
x86 msr-index: Simplify syntax for HWP fields
tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
tools/power turbostat: decode more CPUID fields
In order to keep this file's size sensible and not cause too much
unnecessary churn, make the rule explicit - similar to pci_ids.h - that
only MSRs which are used in multiple compilation units, should get added
to it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: alex.williamson@redhat.com
Cc: gleb@kernel.org
Cc: joro@8bytes.org
Cc: kvm@vger.kernel.org
Cc: sherry.hurwitz@amd.com
Cc: wei@redhat.com
Link: http://lkml.kernel.org/r/1455612202-14414-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kernel accesses IC_CFG MSR (0xc0011021) on AMD because it
checks whether the way access filter is enabled on some F15h
models, and, if so, disables it.
kvm doesn't handle that MSR access and complains about it, which
can get really noisy in dmesg when one starts kvm guests all the
time for testing. And it is useless anyway - guest kernel
shouldn't be doing such changes anyway so tell it that that
filter is disabled.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1448273546-2567-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
MSR_NHM_PLATFORM_INFO has been replaced by...
MSR_PLATFORM_INFO
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After Ivybridge, the max non turbo ratio obtained from platform info msr
is not always guaranteed P1 on client platforms. The max non turbo
activation ratio (TAR), determines the max for the current level of TDP.
The ratio in platform info is physical max. The TAR MSR can be locked,
so updating this value is not possible on all platforms.
This change gets this ratio from MSR TURBO_ACTIVATION_RATIO if
available,
but also do some sanity checking to make sure that this value is
correct.
The sanity check involves reading the TDP ratio for the current tdp
control value when platform has configurable TDP present and matching
TAC
with this.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull perf fixes from Thomas Gleixner:
"Another pile of fixes for perf:
- Plug overflows and races in the core code
- Sanitize the flow of the perf syscall so we error out before
handling the more complex and hard to undo setups
- Improve and fix Broadwell and Skylake hardware support
- Revert a fix which broke what it tried to fix in perf tools
- A couple of smaller fixes in various places of perf tools"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix copying of /proc/kcore
perf intel-pt: Remove no_force_psb from documentation
perf probe: Use existing routine to look for a kernel module by dso->short_name
perf/x86: Change test_aperfmperf() and test_intel() to static
tools lib traceevent: Fix string handling in heterogeneous arch environments
perf record: Avoid infinite loop at buildid processing with no samples
perf: Fix races in computing the header sizes
perf: Fix u16 overflows
perf: Restructure perf syscall point of no return
perf/x86/intel: Fix Skylake FRONTEND MSR extrareg mask
perf/x86/intel/pebs: Add PEBS frontend profiling for Skylake
perf/x86/intel: Make the CYCLE_ACTIVITY.* constraint on Broadwell more specific
perf tools: Bool functions shouldn't return -1
tools build: Add test for presence of __get_cpuid() gcc builtin
tools build: Add test for presence of numa_num_possible_cpus() in libnuma
Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum"
perf stat: Fix per-pkg event reporting bug
These have roughly the same purpose as the SMRR, which we do not need
to implement in KVM. However, Linux accesses MSR_K8_TSEG_ADDR at
boot, which causes problems when running a Xen dom0 under KVM.
Just return 0, meaning that processor protection of SMRAM is not
in effect.
Reported-by: M A Young <m.a.young@durham.ac.uk>
Cc: stable@vger.kernel.org
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Skylake has a new FRONTEND_LATENCY PEBS event to accurately profile
frontend problems (like ITLB or decoding issues).
The new event is configured through a separate MSR, which selects
a range of sub events.
Define the extra MSR as a extra reg and export support for it
through sysfs. To avoid duplicating the existing
tables use a new function to add new entries to existing tables.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1435707205-6676-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- ACPICA update to upstream revision 20150818 including method
tracing extensions to allow more in-depth AML debugging in the
kernel and a number of assorted fixes and cleanups (Bob Moore,
Lv Zheng, Markus Elfring).
- ACPI sysfs code updates and a documentation update related to
AML method tracing (Lv Zheng).
- ACPI EC driver fix related to serialized evaluations of _Qxx
methods and ACPI tools updates allowing the EC userspace tool
to be built from the kernel source (Lv Zheng).
- ACPI processor driver updates preparing it for future
introduction of CPPC support and ACPI PCC mailbox driver
updates (Ashwin Chaugule).
- ACPI interrupts enumeration fix for a regression related
to the handling of IRQ attribute conflicts between MADT
and the ACPI namespace (Jiang Liu).
- Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar).
- ACPI device registration code reorganization to separate the
sysfs-related code and bus type operations from the rest (Rafael
J Wysocki).
- Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
- ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups
(Pan Xinhui, Rafael J Wysocki).
- cpufreq core cleanups on top of the previous changes allowing it
to preseve its sysfs directories over system suspend/resume (Viresh
Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
- cpufreq fixes and cleanups related to governors (Viresh Kumar).
- cpufreq updates (core and the cpufreq-dt driver) related to the
turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
- New DT bindings for Operating Performance Points (OPP), support
for them in the OPP framework and in the cpufreq-dt driver plus
related OPP framework fixes and cleanups (Viresh Kumar).
- cpufreq powernv driver updates (Shilpasri G Bhat).
- New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
- Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
- intel_pstate driver updates including Skylake-S support, support
for enabling HW P-states per CPU and an additional vendor bypass
list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
- cpuidle core fixes related to the handling of coupled idle states
(Xunlei Pang).
- intel_idle driver updates including Skylake Client support and
support for freeze-mode-specific idle states (Len Brown).
- Driver core updates related to power management (Andy Shevchenko,
Rafael J Wysocki).
- Generic power domains framework fixes and cleanups (Jon Hunter,
Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
- Device PM QoS framework update to allow the latency tolerance
setting to be exposed to user space via sysfs (Mika Westerberg).
- devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
- System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
- rockchip-io AVS support updates (Heiko Stuebner).
- PM core clocks support fixup (Colin Ian King).
- Power capping RAPL driver update including support for Skylake H/S
and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
- Generic device properties framework fixes related to the handling
of static (driver-provided) property sets (Andy Shevchenko).
- turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
Shreyas B Prabhu).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m
ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC
R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/
fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X
9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O
1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ
walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/
fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6
H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q
P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA
gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi
I3rfxlXoc/5xJWCgNB8f
=fTgI
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"From the number of commits perspective, the biggest items are ACPICA
and cpufreq changes with the latter taking the lead (over 50 commits).
On the cpufreq front, there are many cleanups and minor fixes in the
core and governors, driver updates etc. We also have a new cpufreq
driver for Mediatek MT8173 chips.
ACPICA mostly updates its debug infrastructure and adds a number of
fixes and cleanups for a good measure.
The Operating Performance Points (OPP) framework is updated with new
DT bindings and support for them among other things.
We have a few updates of the generic power domains framework and a
reorganization of the ACPI device enumeration code and bus type
operations.
And a lot of fixes and cleanups all over.
Included is one branch from the MFD tree as it contains some
PM-related driver core and ACPI PM changes a few other commits are
based on.
Specifics:
- ACPICA update to upstream revision 20150818 including method
tracing extensions to allow more in-depth AML debugging in the
kernel and a number of assorted fixes and cleanups (Bob Moore, Lv
Zheng, Markus Elfring).
- ACPI sysfs code updates and a documentation update related to AML
method tracing (Lv Zheng).
- ACPI EC driver fix related to serialized evaluations of _Qxx
methods and ACPI tools updates allowing the EC userspace tool to be
built from the kernel source (Lv Zheng).
- ACPI processor driver updates preparing it for future introduction
of CPPC support and ACPI PCC mailbox driver updates (Ashwin
Chaugule).
- ACPI interrupts enumeration fix for a regression related to the
handling of IRQ attribute conflicts between MADT and the ACPI
namespace (Jiang Liu).
- Fixes related to ACPI device PM (Mika Westerberg, Srinidhi
Kasagar).
- ACPI device registration code reorganization to separate the
sysfs-related code and bus type operations from the rest (Rafael J
Wysocki).
- Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
- ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan
Xinhui, Rafael J Wysocki).
- cpufreq core cleanups on top of the previous changes allowing it to
preseve its sysfs directories over system suspend/resume (Viresh
Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
- cpufreq fixes and cleanups related to governors (Viresh Kumar).
- cpufreq updates (core and the cpufreq-dt driver) related to the
turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
- New DT bindings for Operating Performance Points (OPP), support for
them in the OPP framework and in the cpufreq-dt driver plus related
OPP framework fixes and cleanups (Viresh Kumar).
- cpufreq powernv driver updates (Shilpasri G Bhat).
- New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
- Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
- intel_pstate driver updates including Skylake-S support, support
for enabling HW P-states per CPU and an additional vendor bypass
list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
- cpuidle core fixes related to the handling of coupled idle states
(Xunlei Pang).
- intel_idle driver updates including Skylake Client support and
support for freeze-mode-specific idle states (Len Brown).
- Driver core updates related to power management (Andy Shevchenko,
Rafael J Wysocki).
- Generic power domains framework fixes and cleanups (Jon Hunter,
Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
- Device PM QoS framework update to allow the latency tolerance
setting to be exposed to user space via sysfs (Mika Westerberg).
- devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
- System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
- rockchip-io AVS support updates (Heiko Stuebner).
- PM core clocks support fixup (Colin Ian King).
- Power capping RAPL driver update including support for Skylake H/S
and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
- Generic device properties framework fixes related to the handling
of static (driver-provided) property sets (Andy Shevchenko).
- turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
Shreyas B Prabhu)"
* tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits)
cpufreq: speedstep-lib: Use monotonic clock
cpufreq: powernv: Increase the verbosity of OCC console messages
cpufreq: sfi: use kmemdup rather than duplicating its implementation
cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
cpufreq: remove redundant 'policy' field from user_policy
cpufreq: remove redundant 'governor' field from user_policy
cpufreq: update user_policy.* on success
cpufreq: use memcpy() to copy policy
cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
cpufreq: mediatek: Add MT8173 cpufreq driver
dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
PM / Domains: Fix typo in description of genpd_dev_pm_detach()
PM / Domains: Remove unusable governor dummies
PM / Domains: Make pm_genpd_init() available to modules
PM / domains: Align column headers and data in pm_genpd_summary output
powercap / RAPL: disable the 2nd power limit properly
tools: cpupower: Fix error when running cpupower monitor
PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
PM / OPP: Fix static checker warning (broken 64bit big endian systems)
...
Pull turbostat changes for v4.3 from Len Brown.
* 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: fix typo on DRAM column in Joules-mode
tools/power turbostat: fix parameter passing for forked command
tools/power turbostat: dump CONFIG_TDP
tools/power turbostat: cpu0 is no longer hard-coded, so update output
tools/power turbostat: update turbostat(8)
Add new MSRs (LBR_INFO) and some new MSR bits used by the Intel Skylake
PMU driver.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Intel PT chapter in the new Intel Architecture SDM adds several packets
corresponding enable bits and registers that control packet generation.
Also, additional bits in the Intel PT CPUID leaf were added to enumerate
presence and parameters of these new packets and features.
The packets and enables are:
* CYC: cycle accurate mode, provides the number of cycles elapsed since
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between
full TSC packets; its presence and available resolution options are
enumerated via CPUID;
* PSB packet period is now configurable, available period values are
enumerated via CPUID.
This patch adds corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for
the new featurens and extends event configuration validation function
to take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This header containing all MSRs and respective bit definitions
got exported to userspace in conjunction with the big UAPI
shuffle.
But, it doesn't belong in the UAPI headers because userspace can
do its own MSR defines and exporting them from the kernel blocks
us from doing cleanups/renames in that header. Which is
ridiculous - it is not kernel's job to export such a header and
keep MSRs list and their names stable.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433436928-31903-19-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Pull KVM updates from Marcelo Tosatti:
"Considerable KVM/PPC work, x86 kvmclock vsyscall support,
IA32_TSC_ADJUST MSR emulation, amongst others."
Fix up trivial conflict in kernel/sched/core.c due to cross-cpu
migration notifier added next to rq migration call-back.
* tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits)
KVM: emulator: fix real mode segment checks in address linearization
VMX: remove unneeded enable_unrestricted_guest check
KVM: VMX: fix DPL during entry to protected mode
x86/kexec: crash_vmclear_local_vmcss needs __rcu
kvm: Fix irqfd resampler list walk
KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump
x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary
KVM: MMU: optimize for set_spte
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
KVM: PPC: bookehv: Add guest computation mode for irq delivery
KVM: PPC: Make EPCR a valid field for booke64 and bookehv
KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
KVM: PPC: e500: Add emulation helper for getting instruction ea
KVM: PPC: bookehv64: Add support for interrupt handling
KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
KVM: PPC: booke: Fix get_tb() compile error on 64-bit
KVM: PPC: e500: Silence bogus GCC warning in tlb code
...
CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported
Basic design is to emulate the MSR by allowing reads and writes to a guest
vcpu specific location to store the value of the emulated MSR while adding
the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will
be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This
is of course as long as the "use TSC counter offsetting" VM-execution control
is enabled as well as the IA32_TSC_ADJUST control.
However, because hardware will only return the TSC + IA32_TSC_ADJUST +
vmsc tsc_offset for a guest process when it does and rdtsc (with the correct
settings) the value of our virtualized IA32_TSC_ADJUST must be stored in one
of these three locations. The argument against storing it in the actual MSR
is performance. This is likely to be seldom used while the save/restore is
required on every transition. IA32_TSC_ADJUST was created as a way to solve
some issues with writing TSC itself so that is not an option either.
The remaining option, defined above as our solution has the problem of
returning incorrect vmcs tsc_offset values (unless we intercept and fix, not
done here) as mentioned above. However, more problematic is that storing the
data in vmcs tsc_offset will have a different semantic effect on the system
than does using the actual MSR. This is illustrated in the following example:
The hypervisor set the IA32_TSC_ADJUST, then the guest sets it and a guest
process performs a rdtsc. In this case the guest process will get
TSC + IA32_TSC_ADJUST_hyperviser + vmsc tsc_offset including
IA32_TSC_ADJUST_guest. While the total system semantics changed the semantics
as seen by the guest do not and hence this will not cause a problem.
Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If the TSC deadline mode is supported, LAPIC timer one-shot mode can be
implemented using IA32_TSC_DEADLINE MSR. An interrupt will be generated
when the TSC value equals or exceeds the value in the IA32_TSC_DEADLINE
MSR.
This enables us to skip the APIC calibration during boot. Also, in
xapic mode, this enables us to skip the uncached apic access to re-arm
the APIC timer.
As this timer ticks at the high frequency TSC rate, we use the
TSC_DIVISOR (32) to work with the 32-bit restrictions in the
clockevent API's to avoid 64-bit divides etc (frequency is u32 and
"unsigned long" in the set_next_event(), max_delta limits the next
event to 32-bit for 32-bit kernel).
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: venki@google.com
Cc: len.brown@intel.com
Link: http://lkml.kernel.org/r/1350941878.6017.31.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull perf updates from Ingo Molnar:
"This tree includes some late late perf items that missed the first
round:
tools:
- Bash auto completion improvements, now we can auto complete the
tools long options, tracepoint event names, etc, from Namhyung Kim.
- Look up thread using tid instead of pid in 'perf sched'.
- Move global variables into a perf_kvm struct, from David Ahern.
- Hists refactorings, preparatory for improved 'diff' command, from
Jiri Olsa.
- Hists refactorings, preparatory for event group viewieng work, from
Namhyung Kim.
- Remove double negation on optional feature macro definitions, from
Namhyung Kim.
- Remove several cases of needless global variables, on most
builtins.
- misc fixes
kernel:
- sysfs support for IBS on AMD CPUs, from Robert Richter.
- Support for an upcoming Intel CPU, the Xeon-Phi / Knights Corner
HPC blade PMU, from Vince Weaver.
- misc fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
perf: Fix perf_cgroup_switch for sw-events
perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu
perf/AMD/IBS: Add sysfs support
perf hists: Add more helpers for hist entry stat
perf hists: Move he->stat.nr_events initialization to a template
perf hists: Introduce struct he_stat
perf diff: Removing the total_period argument from output code
perf tool: Add hpp interface to enable/disable hpp column
perf tools: Removing hists pair argument from output path
perf hists: Separate overhead and baseline columns
perf diff: Refactor diff displacement possition info
perf hists: Add struct hists pointer to struct hist_entry
perf tools: Complete tracepoint event names
perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU
perf evlist: Remove some unused methods
perf evlist: Introduce add_newtp method
perf kvm: Move global variables into a perf_kvm struct
perf tools: Convert to BACKTRACE_SUPPORT
perf tools: Long option completion support for each subcommands
perf tools: Complete long option names of perf command
...
The following patch adds perf_event support for the Xeon-Phi
PMU, as documented in the "Intel Xeon Phi Coprocessor (codename:
Knights Corner) Performance Monitoring Units" manual.
Even though it is a co-processor, a Phi runs a full Linux
environment and can support performance counters.
This is just barebones support, it does not add support for
interesting new features such as the SPFLT intruction that
allows starting/stopping events without entering the kernel.
The PMU internally is just like that of an original Pentium, but
a "P6-like" MSR interface is provided. The interface is
different enough from a real P6 that it's not easy (or
practical) to re-use the code in perf_event_p6.c
Acked-by: Lawrence F Meadows <lawrence.f.meadows@intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: eranian@gmail.com
Cc: Lawrence F <lawrence.f.meadows@intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1209261405320.8398@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some AMD systems may round the frequencies in ACPI tables to 100MHz
boundaries. We can obtain the real frequencies from MSRs, so add a quirk
to fix these frequencies up on AMD systems.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The programming model for P-states on modern AMD CPUs is very similar to
that of Intel and VIA. It makes sense to consolidate this support into one
driver rather than duplicating functionality between two of them. This
patch adds support for AMDs with hardware P-state control to acpi-cpufreq.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch implements code to handle ibs interrupts. If ibs data
is available a raw perf_event data sample is created and sent
back to the userland. This patch only implements the storage of
ibs data in the raw sample, but this could be extended in a
later patch by generating generic event data such as the rip
from the ibs sampling data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the LBR definitions for NHM/WSM/SNB and Core.
It also adds the definitions for the architected LBR MSR:
LBR_SELECT, LBRT_TOS.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This pre-defination is preparing for KVM tsc deadline timer emulation, but
theirself are not kvm specific.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Fix a trivial typo in the name of the constant
ENERGY_PERF_BIAS_POWERSAVE. This didn't cause trouble because this
constant is not currently used for anything.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/tip-abe48b108247e9b90b4c6739662a2e5c765ed114@git.kernel.org
Since 2.6.36 (23016bf0d2), Linux prints the existence of "epb" in /proc/cpuinfo,
Since 2.6.38 (d5532ee7b4), the x86_energy_perf_policy(8) utility has
been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS.
However, the typical BIOS fails to initialize the MSR, presumably
because this is handled by high-volume shrink-wrap operating systems...
Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8).
As a result, WSM-EP, SNB, and later hardware from Intel will run in its
default hardware power-on state (performance), which assumes that users
care for performance at all costs and not for energy efficiency.
While that is fine for performance benchmarks, the hardware's intended default
operating point is "normal" mode...
Initialize the MSR to the "normal" by default during kernel boot.
x86_energy_perf_policy(8) is available to change the default after boot,
should the user have a different preference.
Signed-off-by: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
When the guest can use VMX instructions (when the "nested" module option is
on), it should also be able to read and write VMX MSRs, e.g., to query about
VMX capabilities. This patch adds this support.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
the BIOS forgets to do is (or is just too old). Letting
these errors enabled can cause a sync-flood on the CPU
causing a reboot.
The AMD BKDG recommends disabling GART TLB Wlk Error completely.
This patch is the fix for
https://bugzilla.kernel.org/show_bug.cgi?id=33012
on my machine.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
A correction to Intel cpu model CPUID data (patch queued)
caused winxp to BSOD when booted with a Penryn model.
This was traced to the CPUID "model" field correction from
6 -> 23 (as is proper for a Penryn class of cpu). Only in
this case does the problem surface.
The cause for this failure is winxp accessing the BBL_CR_CTL3
MSR which is unsupported by current kvm, appears to be a
legacy MSR not fully characterized yet existing in current
silicon, and is apparently carried forward in MSR space to
accommodate vintage code as here. It is not yet conclusive
whether this MSR implements any of its legacy functionality
or is just an ornamental dud for compatibility. While I
found no silicon version specific documentation link to
this MSR, a general description exists in Intel's developer's
reference which agrees with the functional behavior of
other bootloader/kernel code I've examined accessing
BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19
called out as "reserved" in the above document.
So to minimally accommodate this MSR, kvm msr get will provide
the equivalent mock data and kvm msr write will simply toss the
guest passed data without interpretation. While this treatment
of BBL_CR_CTL3 addresses the immediate problem, the approach may
be modified pending clarification from Intel.
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Change logs against Andi's original version:
- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
- Fixed a major event scheduling issue. There cannot be a ref++ on an
event that has already done ref++ once and without calling
put_constraint() in between. (Stephane Eranian)
- Use thread_cpumask for percore allocation. (Lin Ming)
- Use MSR names in the extra reg lists. (Lin Ming)
- Remove redundant "c = NULL" in intel_percore_constraints
- Fix comment of perf_event_attr::config1
Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
that can be used to monitor any offcore accesses from a core.
This is a very useful event for various tunings, and it's
also needed to implement the generic LLC-* events correctly.
Unfortunately this event requires programming a mask in a separate
register. And worse this separate register is per core, not per
CPU thread.
This patch:
- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
The extra parameters are passed by user space in the
perf_event_attr::config1 field.
- Adds support to the Intel perf_event core to schedule per
core resources. This adds fairly generic infrastructure that
can be also used for other per core resources.
The basic code has is patterned after the similar AMD northbridge
constraints code.
Thanks to Stephane Eranian who pointed out some problems
in the original version and suggested improvements.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just as we had to disable auto-demotion for NHM/WSM,
we need to do the same for Atom (Lincroft version).
In particular, auto-demotion will prevent Lincroft
from entering the S0i3 idle power saving state.
https://bugzilla.kernel.org/show_bug.cgi?id=25252
Signed-off-by: Len Brown <len.brown@intel.com>
Hardware C-state auto-demotion is a mechanism where the HW overrides
the OS C-state request, instead demoting to a shallower state,
which is less expensive, but saves less power.
Modern Linux should generally get exactly the states it requests.
In particular, when a CPU is taken off-line, it must not be demoted, else
it can prevent the entire package from reaching deep C-states.
https://bugzilla.kernel.org/show_bug.cgi?id=25252
Signed-off-by: Len Brown <len.brown@intel.com>
* 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64, asm: Use fxsaveq/fxrestorq in more places
* 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, hwmon: Add core threshold notification to therm_throt.c
* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, paravirt: Use native_halt on a halt, not native_safe_halt
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
locking, lockdep: Convert sprintf_symbol to %pS
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq: Better struct irqaction layout
This patch adds code to therm_throt.c to notify core thermal threshold
events. These thresholds are supported by the IA32_THERM_INTERRUPT register.
The status/log for the same is monitored using the IA32_THERM_STATUS register.
The necessary #defines are in msr-index.h. A call back is added to mce.h, to
further notify the thermal stack, about the threshold events.
Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch adds support for up to 6 hardware counters for AMD family
15h cpus. There is a new MSR range for hardware counters beginning at
MSRC001_0200 Performance Event Select (PERF_CTL0).
Signed-off-by: Robert Richter <robert.richter@amd.com>
Candidate memory ranges were not calculated properly (start
addresses got needlessly rounded down, and end addresses didn't
get rounded up at all), address comparison for secondary CPUs
was done on only part of the address, and disabled status wasn't
tracked properly.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <4CE24DF40200007800022737@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds support for IBS branch target address reporting. A new
MSR (MSRC001_103B IBS Branch Target Address) has been added that
provides the logical address in canonical form for the branch
target. The size of the IBS sample that is transferred to the userland
has been increased.
For backward compatibility, the userland daemon must explicit enable
the feature by writing to the oprofilefs file
ibs_op/branch_target
After enabling branch target address reporting, the userland daemon
must handle the extended size of the IBS sample.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch enables setting of efer bit 13 which is allowed
in all SVM capable processors. This is necessary for the
SLES11 version of Xen 4.0 to boot with nested svm.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add package level thermal and power limit feature support.
The two MSRs and features are new starting with Intel's Sandy Bridge processor.
Please check Intel 64 and IA-32 Architectures SDMV Vol 3A 14.5.6 Power Limit
Notification and 14.6 Package Level Thermal Management.
This patch also fixes a bug which defines reverse THERM_INT_LOW_ENABLE bit and
THERM_INT_HIGH_ENABLE bit.
[ hpa: fixed up against current tip:x86/cpu ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
LKML-Reference: <1280448826-12004-2-git-send-email-fenghua.yu@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
MSR_K6_EFER is unused, and MSR_K6_STAR is redundant with MSR_STAR.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1279371808-24804-1-git-send-email-brgerst@gmail.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The new IA32_ENERGY_PERF_BIAS MSR allows system software to give
hardware a hint whether OS policy favors more power saving,
or more performance. This allows the OS to have some influence
on internal hardware power/performance tradeoffs where the OS
has previously had no influence.
The support for this feature is indicated by CPUID.06H.ECX.bit3,
as documented in the Intel Architectures Software Developer's Manual.
This patch discovers support of this feature and displays it
as "epb" in /proc/cpuinfo.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
LKML-Reference: <alpine.LFD.2.00.1006032310160.6669@localhost.localdomain>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Rename CMCI_EN to MCI_CTL2_CMCI_EN and CMCI_THRESHOLD_MASK to
MCI_CTL2_CMCI_THRESHOLD_MASK to make naming consistent.
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1275977348.3444.659.camel@yhuang-dev.sh.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This patch implements a workaround for AMD erratum 383 into
KVM. Without this erratum fix it is possible for a guest to
kill the host machine. This patch implements the suggested
workaround for hypervisors which will be published by the
next revision guide update.
[jan: fix overflow warning on i386]
[xiao: fix unused variable warning]
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The MSR IA32_TEMPERATURE_TARGET contains the TjMax value in the newer
Intel processors.
Signed-off-by: Huaxu Wan <huaxu.wan@linux.intel.com>
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Yong Wang <yong.y.wang@linux.intel.com>
Cc: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per document, for feature control MSR:
Bit 1 enables VMXON in SMX operation. If the bit is clear, execution
of VMXON in SMX operation causes a general-protection exception.
Bit 2 enables VMXON outside SMX operation. If the bit is clear, execution
of VMXON outside SMX operation causes a general-protection exception.
This patch is to enable this kind of check with SMX for VMXON in KVM.
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently c1e_idle returns true for all CPUs greater than or equal to
family 0xf model 0x40. This covers too many CPUs.
Meanwhile a respective erratum for the underlying problem was filed
(#400). This patch adds the logic to check whether erratum #400
applies to a given CPU.
Especially for CPUs where SMI/HW triggered C1e is not supported,
c1e_idle() doesn't need to be used. We can check this by looking at
the respective OSVW bit for erratum #400.
Cc: <stable@kernel.org> # .32.x .33.x
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100319110922.GA19614@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Move the HT bit setting code from p4_pmu_event_map to
p4_hw_config. So the cache events can get HT bit set correctly.
Tested on my P4 desktop, below 6 cache events work:
L1-dcache-load-misses
LLC-load-misses
dTLB-load-misses
dTLB-store-misses
iTLB-loads
iTLB-load-misses
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1268908392.13901.128.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use NodeId MSR to get NodeId and number of nodes per processor.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091216144355.GB28798@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Clean up write_tsc() and write_tscp_aux() by replacing
hardcoded values.
No change in functionality.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
x86, mce: Fix compilation with !CONFIG_DEBUG_FS in mce-severity.c
x86, mce: CE in last bank prevents panic by unknown MCE
x86, mce: Fake panic support for MCE testing
x86, mce: Move debugfs mce dir creating to mce.c
x86, mce: Support specifying raise mode for software MCE injection
x86, mce: Support specifying context for software mce injection
x86, mce: fix reporting of Thermal Monitoring mechanism enabled
x86, mce: remove never executed code
x86, mce: add missing __cpuinit tags
x86, mce: fix "mce" boot option handling for CONFIG_X86_NEW_MCE
x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs
x86: mce: Lower maximum number of banks to architecture limit
x86: mce: macros to compute banks MSRs
x86: mce: Move per bank data in a single datastructure
x86: mce: Move code in mce.c
x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE
x86: mce: Remove old i386 machine check code
x86: mce: Update X86_MCE description in x86/Kconfig
x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE
x86, mce: use atomic_inc_return() instead of add by 1
...
Manually fixed up trivial conflicts:
Documentation/feature-removal-schedule.txt
arch/x86/kernel/cpu/mcheck/mce.c
Early Pentium M models use different method for enabling TM2
(per paragraph 13.5.2.3 of the "Intel 64 and IA-32 Architectures
Software Developer's Manual Volume 3A: System Programming Guide,
Part 1").
Tested on the affected Pentium M variant (model == 13).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>