Add pci_ats_supported(), which checks whether a device has an ATS
capability, and whether it is trusted. A device is untrusted if it is
plugged into an external-facing port such as Thunderbolt and could be
spoofing an existing device to exploit weaknesses in the IOMMU
configuration. PCIe ATS is one such weaknesses since it allows
endpoints to cache IOMMU translations and emit transactions with
'Translated' Address Type (10b) that partially bypass the IOMMU
translation.
The SMMUv3 and VT-d IOMMU drivers already disallow ATS and transactions
with 'Translated' Address Type for untrusted devices. Add the check to
pci_enable_ats() to let other drivers (AMD IOMMU for now) benefit from
it.
By checking ats_cap, the pci_ats_supported() helper also returns whether
ATS was globally disabled with pci=noats, and could later include more
things, for example whether the whole PCIe hierarchy down to the
endpoint supports ATS.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20200520152201.3309416-2-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Some Google Apex Edge TPU devices have a class code of 0
(PCI_CLASS_NOT_DEFINED). This prevents the PCI core from assigning
resources for the Apex BARs because __dev_sort_resources() ignores
classless devices, host bridges, and IOAPICs.
On x86, firmware typically assigns those resources, so this was not a
problem. But on some architectures, firmware does *not* assign BARs, and
since the PCI core didn't do it either, the Apex device didn't work
correctly:
apex 0000:01:00.0: can't enable device: BAR 0 [mem 0x00000000-0x00003fff 64bit pref] not claimed
apex 0000:01:00.0: error enabling PCI device
f390d08d8b ("staging: gasket: apex: fixup undefined PCI class") added a
quirk to fix the class code, but it was in the apex driver, and if the
driver was built as a module, it was too late to help.
Move the quirk to the PCI core, where it will always run early enough that
the PCI core will assign resources if necessary.
Link: https://lore.kernel.org/r/CAEzXK1r0Er039iERnc2KJ4jn7ySNUOG9H=Ha8TD8XroVqiZjgg@mail.gmail.com
Fixes: f390d08d8b ("staging: gasket: apex: fixup undefined PCI class")
Reported-by: Luís Mendes <luis.p.mendes@gmail.com>
Debugged-by: Luís Mendes <luis.p.mendes@gmail.com>
Tested-by: Luis Mendes <luis.p.mendes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Todd Poynor <toddpoynor@google.com>
Including:
- ARM-SMMU support for the TLB range invalidation command in
SMMUv3.2.
- ARM-SMMU introduction of command batching helpers to batch up
CD and ATC invalidation.
- ARM-SMMU support for PCI PASID, along with necessary PCI
symbol exports.
- Introduce a generic (actually rename an existing) IOMMU
related pointer in struct device and reduce the IOMMU related
pointers.
- Some fixes for the OMAP IOMMU driver to make it build on 64bit
architectures.
- Various smaller fixes and improvements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl6MmlQACgkQK/BELZcB
GuP9ug//QtyPYRYdO4ltD6mPvfB7V0qdXksJz+ZVbPOMvqUs1jr1FVYFH1HkOVu5
mFD6OJuQQJrrGukXMERyVgDhUqNr+xHrkGS+X67NrOkUrguyvUfLYSU/GmOH/kdk
w1Smp7pTcHHAMmxGyQWTSFa9jSxKes5ZYBo065Z3/SlcIcTTkbw7V87N3RPrlnCX
s/K7CtSGnKJMpL9DVuNH27eqGlfiuIrQhj/vTQVSn1nF7TjaGKXaRXj+3fcUgrIt
KAfflWiTJncMY6WLjz65iiUtUvgA2Mmgn3CKJnWjgECd70+NybLQ9OAvQO+A2H6s
8XO9DsOOe8HFq/ljev1JGSw5LgB5Ip1RtSk7Ost6mkUFzLlaeTBJFQeHbECI9dne
hksRYL4R8bwiQu+MkQe7HLa6TDb+asqjsayIO3M1oIpF+8mIz/oNOGCeP0cqSiuj
lVMnblAWatrsZrf+AlxZKddIJWiduXoTjtpV64HTTvZeL4/g3kY0ykBXpS4xLj5V
s0KvR6kjR1LYUgpe9jJ3CJTdIlU4MzSlrtq4CYFZvRa7rBLmk2cGsR1jiA3GTGpn
bcqOQNgb5X1mpAzmOZb//pbjozgvCjQpQexyU4tRzs38yk+TK5OnOe5z4M1srHPY
7dTZoUEpAcRm4K+JFQ3+yOtxRTsINYyFUL/Qt8ALbWy4hXluRGY=
=nhuS
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- ARM-SMMU support for the TLB range invalidation command in SMMUv3.2
- ARM-SMMU introduction of command batching helpers to batch up CD and
ATC invalidation
- ARM-SMMU support for PCI PASID, along with necessary PCI symbol
exports
- Introduce a generic (actually rename an existing) IOMMU related
pointer in struct device and reduce the IOMMU related pointers
- Some fixes for the OMAP IOMMU driver to make it build on 64bit
architectures
- Various smaller fixes and improvements
* tag 'iommu-updates-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (39 commits)
iommu: Move fwspec->iommu_priv to struct dev_iommu
iommu/virtio: Use accessor functions for iommu private data
iommu/qcom: Use accessor functions for iommu private data
iommu/mediatek: Use accessor functions for iommu private data
iommu/renesas: Use accessor functions for iommu private data
iommu/arm-smmu: Use accessor functions for iommu private data
iommu/arm-smmu: Refactor master_cfg/fwspec usage
iommu/arm-smmu-v3: Use accessor functions for iommu private data
iommu: Introduce accessors for iommu private data
iommu/arm-smmu: Fix uninitilized variable warning
iommu: Move iommu_fwspec to struct dev_iommu
iommu: Rename struct iommu_param to dev_iommu
iommu/tegra-gart: Remove direct access of dev->iommu_fwspec
drm/msm/mdp5: Remove direct access of dev->iommu_fwspec
ACPI/IORT: Remove direct access of dev->iommu_fwspec
iommu: Define dev_iommu_fwspec_get() for !CONFIG_IOMMU_API
iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE
iommu/virtio: Fix freeing of incomplete domains
iommu/virtio: Fix sparse warning
iommu/vt-d: Add build dependency on IOASID
...
- A large series from Nick for 64-bit to further rework our exception vectors,
and rewrite portions of the syscall entry/exit and interrupt return in C. The
result is much easier to follow code that is also faster in general.
- Cleanup of our ptrace code to split various parts out that had become badly
intertwined with #ifdefs over the years.
- Changes to our NUMA setup under the PowerVM hypervisor which should
hopefully avoid non-sensical topologies which can lead to warnings from the
workqueue code and other problems.
- MAINTAINERS updates to remove some of our old orphan entries and update the
status of others.
- Quite a few other small changes and fixes all over the map.
Thanks to:
Abdul Haleem, afzal mohammed, Alexey Kardashevskiy, Andrew Donnellan, Aneesh
Kumar K.V, Balamuruhan S, Cédric Le Goater, Chen Zhou, Christophe JAILLET,
Christophe Leroy, Christoph Hellwig, Clement Courbet, Daniel Axtens, David
Gibson, Douglas Miller, Fabiano Rosas, Fangrui Song, Ganesh Goudar, Gautham R.
Shenoy, Greg Kroah-Hartman, Greg Kurz, Gustavo Luiz Duarte, Hari Bathini, Ilie
Halip, Jan Kara, Joe Lawrence, Joe Perches, Kajol Jain, Larry Finger,
Laurentiu Tudor, Leonardo Bras, Libor Pechacek, Madhavan Srinivasan, Mahesh
Salgaonkar, Masahiro Yamada, Masami Hiramatsu, Mauricio Faria de Oliveira,
Michael Neuling, Michal Suchanek, Mike Rapoport, Nageswara R Sastry, Nathan
Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers,
Oliver O'Halloran, Po-Hsu Lin, Pratik Rajesh Sampat, Rasmus Villemoes, Ravi
Bangoria, Roman Bolshakov, Sam Bobroff, Sandipan Das, Santosh S, Sedat Dilek,
Segher Boessenkool, Shilpasri G Bhat, Sourabh Jain, Srikar Dronamraju, Stephen
Rothwell, Tyrel Datwyler, Vaibhav Jain, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl6JypATHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgOTyD/0U90tXb3VXlQcc4OFIb8vWIj76k4Zn
ZSZ7RyOuvb5pCISBZjSK79XkR9eMHT77qagX4V41q64k4yQl8nbgLeVnwL76hLLc
IJCs23f4nsO0uqX/MhSCc5dfOOOS2i8V+OQYtsYWsH5QaG95v0cHIqVaHHMlfQxu
507GO/W5W6KTd4x008b5unQOuE51zMKlKvqEJXkT59obQFpaa2S5Wn7OzhsnarCH
YSRNxaC7vtgBKLA9wUnFh8UUbh0FbOwXBCaq4OhHMhgRihdteVBCzlcR/6c+IRbt
EoZxKzfQ0hI1z5f++kJNaRXMtUbSpM8D1HdKKHgiWjpdBSD0eu2X106KQT2R2ZOF
qhX8xPLWNzdBglA6L43AaZUu+4ayd3QrrJIkjDv/K1rCHZjfGOzSQfoZgTEBNLFA
tC0crhEfw8m98e4EwhCtekGQxdczRdLS9YvtC/h6mU2xkpA35yNSwB1/iuVQdkYD
XyrEqImAQ1PJla7NL0hxSy5ZxrBtMeKT4WZZ0BNgKXryemldg8Tuv3AEyach3BHz
eU0pIwpbnPm1JAPyrpDQ1yEf7QsD77gTPfEvilEci60R9DhvIMGAY+pt0qfME3yX
wOLp2yVBEXlRmvHk/y/+r+m4aCsmwSrikbWwmLLwAAA6JehtzFOWxTEfNpACP23V
mZyyZznsHIIE3Q==
=ARdm
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Slightly late as I had to rebase mid-week to insert a bug fix:
- A large series from Nick for 64-bit to further rework our exception
vectors, and rewrite portions of the syscall entry/exit and
interrupt return in C. The result is much easier to follow code
that is also faster in general.
- Cleanup of our ptrace code to split various parts out that had
become badly intertwined with #ifdefs over the years.
- Changes to our NUMA setup under the PowerVM hypervisor which should
hopefully avoid non-sensical topologies which can lead to warnings
from the workqueue code and other problems.
- MAINTAINERS updates to remove some of our old orphan entries and
update the status of others.
- Quite a few other small changes and fixes all over the map.
Thanks to: Abdul Haleem, afzal mohammed, Alexey Kardashevskiy, Andrew
Donnellan, Aneesh Kumar K.V, Balamuruhan S, Cédric Le Goater, Chen
Zhou, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Clement
Courbet, Daniel Axtens, David Gibson, Douglas Miller, Fabiano Rosas,
Fangrui Song, Ganesh Goudar, Gautham R. Shenoy, Greg Kroah-Hartman,
Greg Kurz, Gustavo Luiz Duarte, Hari Bathini, Ilie Halip, Jan Kara,
Joe Lawrence, Joe Perches, Kajol Jain, Larry Finger, Laurentiu Tudor,
Leonardo Bras, Libor Pechacek, Madhavan Srinivasan, Mahesh Salgaonkar,
Masahiro Yamada, Masami Hiramatsu, Mauricio Faria de Oliveira, Michael
Neuling, Michal Suchanek, Mike Rapoport, Nageswara R Sastry, Nathan
Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick
Desaulniers, Oliver O'Halloran, Po-Hsu Lin, Pratik Rajesh Sampat,
Rasmus Villemoes, Ravi Bangoria, Roman Bolshakov, Sam Bobroff,
Sandipan Das, Santosh S, Sedat Dilek, Segher Boessenkool, Shilpasri G
Bhat, Sourabh Jain, Srikar Dronamraju, Stephen Rothwell, Tyrel
Datwyler, Vaibhav Jain, YueHaibing"
* tag 'powerpc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits)
powerpc: Make setjmp/longjmp signature standard
powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type
powerpc: Suppress .eh_frame generation
powerpc: Drop -fno-dwarf2-cfi-asm
powerpc/32: drop unused ISA_DMA_THRESHOLD
powerpc/powernv: Add documentation for the opal sensor_groups sysfs interfaces
selftests/powerpc: Fix try-run when source tree is not writable
powerpc/vmlinux.lds: Explicitly retain .gnu.hash
powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c
powerpc/ptrace: create ppc_gethwdinfo()
powerpc/ptrace: create ptrace_get_debugreg()
powerpc/ptrace: split out ADV_DEBUG_REGS related functions.
powerpc/ptrace: move register viewing functions out of ptrace.c
powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.
powerpc/ptrace: split out SPE related functions.
powerpc/ptrace: split out ALTIVEC related functions.
powerpc/ptrace: split out VSX related functions.
powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET
powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64
powerpc/ptrace: remove unused header includes
...
- Update maintainers. Niklas Schnelle takes over zpci and Vineeth Vijayan
common io code.
- Extend cpuinfo to include topology information.
- Add new extended counters for IBM z15 and sampling buffer allocation
rework in perf code.
- Add control over zeroing out memory during system restart.
- CCA protected key block version 2 support and other fixes/improvements
in crypto code.
- Convert to new fallthrough; annotations.
- Replace zero-length arrays with flexible-arrays.
- QDIO debugfs and other small improvements.
- Drop 2-level paging support optimization for compat tasks. Varios
mm cleanups.
- Remove broken and unused hibernate / power management support.
- Remove fake numa support which does not bring any benefits.
- Exclude offline CPUs from CPU topology masks to be more consistent
with other architectures.
- Prevent last branching instruction address leaking to userspace.
- Other small various fixes and improvements all over the code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl6Ig2YACgkQjYWKoQLX
FBj2gggAibnHOl9d0ngX1mVT4nz51R3V8z5sEQjNMr2uHBmaTqs7pi/00gaFMxoC
NngVEXvL443jSogQivthGgXPpRCV9xdKE3sp38j7fF4LgHoeuDtGd1oaX4W9Rqk0
7Yii35EaO2e2WHdOKaAbu+ZvDRunFjERyntc51MYaIUivFosogSo07vC73vFIArF
VGStS09fJ4Ny76ott896T7Ulx1Iek/MkF1vponEMLGNUIcLIQbbxZxOwgz0pHuEF
SlyyJBnhOIaAJGOYlKREQDt1cew+hsxluPU+a01bwdsmdZv9LH1BGwLayDqTH58i
QWvtEpzJFmDvo9jGM1v81ebaGnyCKg==
=hiGF
-----END PGP SIGNATURE-----
Merge tag 's390-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Update maintainers. Niklas Schnelle takes over zpci and Vineeth
Vijayan common io code.
- Extend cpuinfo to include topology information.
- Add new extended counters for IBM z15 and sampling buffer allocation
rework in perf code.
- Add control over zeroing out memory during system restart.
- CCA protected key block version 2 support and other
fixes/improvements in crypto code.
- Convert to new fallthrough; annotations.
- Replace zero-length arrays with flexible-arrays.
- QDIO debugfs and other small improvements.
- Drop 2-level paging support optimization for compat tasks. Varios mm
cleanups.
- Remove broken and unused hibernate / power management support.
- Remove fake numa support which does not bring any benefits.
- Exclude offline CPUs from CPU topology masks to be more consistent
with other architectures.
- Prevent last branching instruction address leaking to userspace.
- Other small various fixes and improvements all over the code.
* tag 's390-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (57 commits)
s390/mm: cleanup init_new_context() callback
s390/mm: cleanup virtual memory constants usage
s390/mm: remove page table downgrade support
s390/qdio: set qdio_irq->cdev at allocation time
s390/qdio: remove unused function declarations
s390/ccwgroup: remove pm support
s390/ap: remove power management code from ap bus and drivers
s390/zcrypt: use kvmalloc instead of kmalloc for 256k alloc
s390/mm: cleanup arch_get_unmapped_area() and friends
s390/ism: remove pm support
s390/cio: use fallthrough;
s390/vfio: use fallthrough;
s390/zcrypt: use fallthrough;
s390: use fallthrough;
s390/cpum_sf: Fix wrong page count in error message
s390/diag: fix display of diagnose call statistics
s390/ap: Remove ap device suspend and resume callbacks
s390/pci: Improve handling of unset UID
s390/pci: Fix zpci_alloc_domain() over allocation
s390/qdio: pass ISC as parameter to chsc_sadc()
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl6GTQMUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vy3PhAAmqpYBRobOsG8QbmKDjoJEFtkqdvD
z6+4zf/R+hF11RyXjMDwihIe8d+tkQ4eAaYu6Oh5PrTyanz0G0PgeCrivZeytULk
thqQIWzDQMVA5vN/2/Vy8s5s+3HzP8z/MZOFScJ7+xA1MndXptPRTNmFUbjx+GAv
x8/pTp0u9AF6m7itX65DxXvwkzjWamt+Ar4Yx2IcuKAU/M5RtfuZO3PpDnqn7/wk
JFlkRoYeFB6qNnnkPdeyPHl9dALhuhzgdTyklQEnKVW3nf3xThYDhcEwdh6kBQgl
0dH8lL5LXy7PKGN8RES4wB0Vqndw/HlsCF5O4wkkfItbnbJxGJtS139e5973m0ud
sgWvF4yJAT2jCKhIeNz34sePQJMyWALhv0XzZCsJ0YeGHsrV1jrHELkwUT1+eIsT
3UV0iZ6aL06zQJDyKUbbIcQzEQ/wwBC+x9VgsyL54K1quCQZ1N1Nl/dvrb4cRG9m
m9EhJK/brDf4c0uFlOmMTSxV1t5J+z6ZSQnh1ShD/o5yBsxqN6q5brDT6LEs+jbM
LsIkA18jJOd4OyiDs98YiFKvIfFQbQ0LEBQpJwhF0snvfBFMMbUYN/T/NYneWON/
F0TpkFoP7PXDuq55iNaLdnObfzrpC9kdzUyWvePUvjxIl55bkf+/qtUny+H48t4L
dNggvW052d7BHes=
=deWu
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg)
- Add more 32 GT/s link speed decoding and improve the implementation
(Yicong Yang)
Resource management:
- Add support for sizing programmable host bridge apertures and fix a
related alpha Nautilus regression (Ivan Kokshaysky)
Interrupts:
- Add boot interrupt quirk mechanism for Xeon chipsets and document
boot interrupts (Sean V Kelley)
PCIe native device hotplug:
- When possible, disable in-band presence detect and use PDS
(Alexandru Gagniuc)
- Add DMI table for devices that don't use in-band presence detection
but don't advertise that correctly (Stuart Hayes)
- Fix hang when powering slots up/down via sysfs (Lukas Wunner)
- Fix an MSI interrupt race (Stuart Hayes)
Virtualization:
- Add ACS quirks for Zhaoxin devices (Raymond Pang)
Error handling:
- Add Error Disconnect Recover (EDR) support so firmware can report
devices disconnected via DPC and we can try to recover (Kuppuswamy
Sathyanarayanan)
Peer-to-peer DMA:
- Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew
Maier)
ASPM:
- Reduce severity of common clock config message (Chris Packham)
- Clear the correct bits when enabling L1 substates, so we don't go
to the wrong state (Yicong Yang)
Endpoint framework:
- Replace EPF linkup ops with notifier call chain and improve locking
(Kishon Vijay Abraham I)
- Fix concurrent memory allocation in OB address region (Kishon Vijay
Abraham I)
- Move PF function number assignment to EPC core to support multiple
function creation methods (Kishon Vijay Abraham I)
- Fix issue with clearing configfs "start" entry (Kunihiko Hayashi)
- Fix issue with endpoint MSI-X ignoring BAR Indicator and Table
Offset (Kishon Vijay Abraham I)
- Add support for testing DMA transfers (Kishon Vijay Abraham I)
- Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I)
- Add support for tests to clear IRQ (Kishon Vijay Abraham I)
- Add common DT schema for endpoint controllers (Kishon Vijay Abraham I)
Amlogic Meson PCIe controller driver:
- Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi
Pommarel)
- Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi
Pommarel)
Cadence PCIe controller driver:
- Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay
Abraham I)
Intel VMD host bridge driver:
- Add two VMD Device IDs that require bus restriction mode (Sushma
Kalakota)
Mobiveil PCIe controller driver:
- Refactor and modularize mobiveil driver (Hou Zhiqiang)
- Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang)
Microsoft Hyper-V host bridge driver:
- Add support for Hyper-V PCI protocol version 1.3 and
PCI_BUS_RELATIONS2 (Long Li)
- Refactor to prepare for virtual PCI on non-x86 architectures (Boqun
Feng)
- Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui)
NVIDIA Tegra PCIe controller driver:
- Use pci_parse_request_of_pci_ranges() (Rob Herring)
- Add support for endpoint mode and related DT updates (Vidya Sagar)
- Reduce -EPROBE_DEFER error message log level (Thierry Reding)
Qualcomm PCIe controller driver:
- Restrict class fixup to specific Qualcomm devices (Bjorn Andersson)
Synopsys DesignWare PCIe controller driver:
- Refactor core initialization code for endpoint mode (Vidya Sagar)
- Fix endpoint MSI-X to use correct table address (Kishon Vijay
Abraham I)
TI DRA7xx PCIe controller driver:
- Fix MSI IRQ handling (Vignesh Raghavendra)
TI Keystone PCIe controller driver:
- Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I)
Miscellaneous:
- Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng
Feng)
- Use ioremap(), not phys_to_virt(), for platform ROM to fix video
ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)"
* tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits)
misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS
PCI: tegra: Print -EPROBE_DEFER error message at debug level
misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq()
misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
tools: PCI: Add 'e' to clear IRQ
misc: pci_endpoint_test: Add ioctl to clear IRQ
misc: pci_endpoint_test: Avoid using module parameter to determine irqtype
PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt
PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address
PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments
misc: pci_endpoint_test: Add support to get DMA option from userspace
tools: PCI: Add 'd' command line option to support DMA
misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation
PCI: endpoint: functions/pci-epf-test: Print throughput information
PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data
PCI: pciehp: Fix MSI interrupt race
PCI: pciehp: Fix indefinite wait on sysfs requests
PCI: endpoint: Fix clearing start entry in configfs
PCI: tegra: Add support for PCIe endpoint mode in Tegra194
PCI: sysfs: Revert "rescan" file renames
...
- Apply Qualcomm class fixup only to PCIe host bridges, not to all
PCI_VENDOR_ID_QCOM devices (Bjorn Andersson)
* remotes/lorenzo/pci/qcom:
PCI: qcom: Fix the fixup of PCI_VENDOR_ID_QCOM
- Restructure mobiveil driver to support either Root Complex mode or
Endpoint mode (Hou Zhiqiang)
- Collect host initialization into one place (Hou Zhiqiang)
- Collect interrupt-related code into one place (Hou Zhiqiang)
- Split mobiveil into separate files under
drivers/pci/controller/mobiveil for easier reuse (Hou Zhiqiang)
- Add callbacks for interrupt initialization and linkup checking (Hou
Zhiqiang)
- Add 8- and 16-bit CSR accessors (Hou Zhiqiang)
- Initialize host driver only if Header Type is "bridge" (Hou Zhiqiang)
- Add DT bindings for NXP Layerscape SoCs PCIe Gen4 controller (Hou
Zhiqiang)
- Add PCIe Gen4 RC driver for Layerscape SoCs (Hou Zhiqiang)
- Add pcie-mobiveil __iomem annotations (Hou Zhiqiang)
- Add PCI_MSI_IRQ_DOMAIN Kconfig dependency (Hou Zhiqiang)
* remotes/lorenzo/pci/mobiveil:
PCI: mobiveil: Fix unmet dependency warning for PCIE_MOBIVEIL_PLAT
PCI: mobiveil: Fix sparse different address space warnings
PCI: mobiveil: Add PCIe Gen4 RC driver for Layerscape SoCs
dt-bindings: PCI: Add NXP Layerscape SoCs PCIe Gen4 controller
PCI: mobiveil: Add Header Type field check
PCI: mobiveil: Add 8-bit and 16-bit CSR register accessors
PCI: mobiveil: Allow mobiveil_host_init() to be used to re-init host
PCI: mobiveil: Add callback function for link up check
PCI: mobiveil: Add callback function for interrupt initialization
PCI: mobiveil: Modularize the Mobiveil PCIe Host Bridge IP driver
PCI: mobiveil: Collect the interrupt related operations into a function
PCI: mobiveil: Move the host initialization into a function
PCI: mobiveil: Introduce a new structure mobiveil_root_port
- Fix memory leak in hv probe path (Dexuan Cui)
- Add support for Hyper-V protocol 1.3 (Long Li)
- Replace zero-length array with flexible-array member (Gustavo A. R.
Silva)
- Move hypercall definitions to <asm/hyperv-tlfs.h> (Boqun Feng)
- Move retarget definitions to <asm/hyperv-tlfs.h> and make them packed
(Boqun Feng)
- Add struct hv_msi_entry and hv_set_msi_entry_from_desc() to prepare for
future virtual PCI on non-x86 (Boqun Feng)
* remotes/lorenzo/pci/hv:
PCI: hv: Introduce hv_msi_entry
PCI: hv: Move retarget related structures into tlfs header
PCI: hv: Move hypercall related definitions into tlfs header
PCI: hv: Replace zero-length array with flexible-array member
PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2
PCI: hv: Decouple the func definition in hv_dr_state from VSP message
PCI: hv: Add missing kfree(hbus) in hv_pci_probe()'s error handling path
PCI: hv: Remove unnecessary type casting from kzalloc
- Use notification chain instead of EPF linkup ops for EPC events (Kishon
Vijay Abraham I)
- Protect concurrent allocation in endpoint outbound address region
(Kishon Vijay Abraham I)
- Protect concurrent access to pci_epf_ops (Kishon Vijay Abraham I)
- Assign function number for each PF in endpoint core (Kishon Vijay
Abraham I)
- Refactor endpoint mode core initialization (Vidya Sagar)
- Add API to notify when core initialization completes (Vidya Sagar)
- Add test framework support to defer core initialization (Vidya Sagar)
- Update Tegra SoC ABI header to support uninitialization of UPHY PLL
when in endpoint mode without reference clock (Vidya Sagar)
- Add DT and driver support for Tegra194 PCIe endpoint nodes (Vidya
Sagar)
- Add endpoint test support for DMA data transfer (Kishon Vijay
Abraham I)
- Print throughput information in endpoint test (Kishon Vijay Abraham I)
- Use streaming DMA APIs for endpoint test buffer allocation (Kishon
Vijay Abraham I)
- Add endpoint test command line option for DMA (Kishon Vijay Abraham I)
- When stopping a controller via configfs, clear endpoint "start" entry
to prevent WARN_ON (Kunihiko Hayashi)
- Update endpoint ->set_msix() to pay attention to MSI-X BAR Indicator
and offset when finding MSI-X tables (Kishon Vijay Abraham I)
- MSI-X tables are in local memory, not in the PCI address space. Update
pcie-designware-ep to account for this (Kishon Vijay Abraham I)
- Allow AM654 PCIe Endpoint to raise MSI-X interrupts (Kishon Vijay
Abraham I)
- Avoid using module parameter to determine irqtype for endpoint test
(Kishon Vijay Abraham I)
- Add ioctl to clear IRQ for endpoint test (Kishon Vijay Abraham I)
- Add endpoint test 'e' option to clear IRQ (Kishon Vijay Abraham I)
- Bump limit on number of endpoint test devices from 10 to 10,000 (Kishon
Vijay Abraham I)
- Use full pci-endpoint-test name in request_irq() for easier profiling
(Kishon Vijay Abraham I)
- Reduce log level of -EPROBE_DEFER error messages to debug (Thierry
Reding)
* remotes/lorenzo/pci/endpoint:
misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS
PCI: tegra: Print -EPROBE_DEFER error message at debug level
misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq()
misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
tools: PCI: Add 'e' to clear IRQ
misc: pci_endpoint_test: Add ioctl to clear IRQ
misc: pci_endpoint_test: Avoid using module parameter to determine irqtype
PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt
PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address
PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments
misc: pci_endpoint_test: Add support to get DMA option from userspace
tools: PCI: Add 'd' command line option to support DMA
misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation
PCI: endpoint: functions/pci-epf-test: Print throughput information
PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data
PCI: endpoint: Fix clearing start entry in configfs
PCI: tegra: Add support for PCIe endpoint mode in Tegra194
dt-bindings: PCI: tegra: Add DT support for PCIe EP nodes in Tegra194
soc/tegra: bpmp: Update ABI header
PCI: pci-epf-test: Add support to defer core initialization
PCI: dwc: Add API to notify core initialization completion
PCI: endpoint: Add notification for core init completion
PCI: dwc: Refactor core initialization code for EP mode
PCI: endpoint: Add core init notifying feature
PCI: endpoint: Assign function number for each PF in EPC core
PCI: endpoint: Protect concurrent access to pci_epf_ops with mutex
PCI: endpoint: Fix for concurrent memory allocation in OB address region
PCI: endpoint: Replace spinlock with mutex
PCI: endpoint: Use notification chain mechanism to notify EPC events to EPF
- Use ioremap(), not phys_to_virt() for platform ROM, to fix video ROM
mapping with CONFIG_HIGHMEM (Mikel Rychliski)
- Add support for root bus sizing so we don't have to assume host bridge
windows are known a priori (Ivan Kokshaysky)
- Fix alpha Nautilus PCI setup, which has been broken since we started
enforcing window limits in resource allocation (Ivan Kokshaysky)
* pci/resource:
alpha: Fix nautilus PCI setup
PCI: Add support for root bus sizing
PCI: Use ioremap(), not phys_to_virt() for platform ROM
- Move _HPX type array from stack to static data (Colin Ian King)
- Avoid an ASMedia XHCI USB PME# defect; apparently it doesn't assert
PME# when USB3.0 devices are hotplugged in D0 (Kai-Heng Feng)
- Revert sysfs "rescan" file renames that broke an application (Kelsey
Skunberg)
* pci/misc:
PCI: sysfs: Revert "rescan" file renames
PCI: Avoid ASMedia XHCI USB PME# from D0 defect
PCI/ACPI: Move pcie_to_hpx3_type[] from stack to static data
- Disable in-band presence detection when possible (Alexandru Gagniuc)
- Poll for presence detect if in-band presence detection is disabled
(Alexandru Gagniuc)
- Add DMI table of systems that don't support in-band presence detection
(Stuart Hayes)
- Fix indefinite pciehp wait caused by race in handling sysfs requests
(Lukas Wunner)
- Fix pciehp MSI interrupt race that caused us to miss interrupts (Stuart
Hayes)
* pci/hotplug:
PCI: pciehp: Fix MSI interrupt race
PCI: pciehp: Fix indefinite wait on sysfs requests
PCI: pciehp: Add DMI table for in-band presence detection disabled
PCI: pciehp: Wait for PDS if in-band presence is disabled
PCI: pciehp: Disable in-band presence detect when possible
- Update error status after reset_link() so we don't report "recovery
failed" when it in fact succeeded (Kuppuswamy Sathyanarayanan)
- Move DPC data into struct pci_dev instead of allocating a separate
struct dpc_dev (Bjorn Helgaas)
- Remove AER/DPC service dependency to simplify error recovery
(Kuppuswamy Sathyanarayanan)
- Return error recovery status for future use by EDR, which needs to tell
firmware whether recovery was successful (Kuppuswamy Sathyanarayanan)
- Cache DPC capability info in core since it's needed by EDR as well as
DPC driver (Kuppuswamy Sathyanarayanan)
- Add pci_aer_raw_clear_status() to allow EDR recovery path to clear AER
status even when OS doesn't own the AER capability (Kuppuswamy
Sathyanarayanan)
- Add Error Disconnect Recover (EDR) support, so firmware can use ACPI
notification to tell the OS that devices have been disconnected, e.g.,
via DPC, and that OS should attempt recovery (Kuppuswamy
Sathyanarayanan)
- Rename AER error status clearing interfaces to be more consistent
(Kuppuswamy Sathyanarayanan)
* pci/edr:
PCI/AER: Rationalize error status register clearing
PCI/DPC: Add Error Disconnect Recover (EDR) support
PCI/DPC: Expose dpc_process_error(), dpc_reset_link() for use by EDR
PCI/AER: Add pci_aer_raw_clear_status() to unconditionally clear Error Status
PCI/DPC: Cache DPC capabilities in pci_init_capabilities()
PCI/ERR: Return status of pcie_do_recovery()
PCI/ERR: Remove service dependency in pcie_do_recovery()
PCI/DPC: Move DPC data into struct pci_dev
PCI/ERR: Update error status after reset_link()
PCI/ERR: Combine pci_channel_io_frozen cases
Probe deferral is an expected error condition that will usually be
recovered from. Print such error messages at debug level to make them
available for diagnostic purposes when building with debugging enabled
and hide them otherwise to not spam the kernel log with them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Tested-by: Vidya Sagar <vidyas@nvidia.com>
AM654 PCIe EP controller has MSI-X capability register and has the
ability to raise MSI-X interrupt. Add support in pci-keystone.c
for PCIe endpoint controller in AM654 to raise MSI-X interrupts.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit beb4641a78 ("PCI: dwc: Add MSI-X callbacks handler"),
in order to raise MSI-X interrupt, obtained MSIX table address from
Base Address Register (BAR). However BAR only holds PCI address
programmed by the host whereas the MSI-X table should be in the local
memory.
Store the MSI-X table address (virtual address) as part of ->set_bar()
callback and use that to get the message address and message data
here.
Fixes: beb4641a78 ("PCI: dwc: Add MSI-X callbacks handler")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit 8963106eab ("PCI: endpoint: Add MSI-X interfaces") while
adding support to raise MSI-X interrupts from endpoint didn't include
BAR Indicator register (BIR) configuration and MSI-X table offset as
arguments in pci_epc_set_msix(). This would result in endpoint
controller register using random BAR indicator register, the memory
for which might not be allocated by the endpoint function driver.
Add BAR indicator register and MSI-X table offset as arguments in
pci_epc_set_msix() and allocate space for MSI-X table and pending
bit array (PBA) in pci-epf-test endpoint function driver.
Fixes: 8963106eab ("PCI: endpoint: Add MSI-X interfaces")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Print throughput information in KB/s after every completed transfer,
including information on whether DMA is used or not.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Alan Mikhak <alan.mikhak@sifive.com>
Use dmaengine API and add support for transferring data using DMA.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Alan Mikhak <alan.mikhak@sifive.com>
Pull networking updates from David Miller:
"Highlights:
1) Fix the iwlwifi regression, from Johannes Berg.
2) Support BSS coloring and 802.11 encapsulation offloading in
hardware, from John Crispin.
3) Fix some potential Spectre issues in qtnfmac, from Sergey
Matyukevich.
4) Add TTL decrement action to openvswitch, from Matteo Croce.
5) Allow paralleization through flow_action setup by not taking the
RTNL mutex, from Vlad Buslov.
6) A lot of zero-length array to flexible-array conversions, from
Gustavo A. R. Silva.
7) Align XDP statistics names across several drivers for consistency,
from Lorenzo Bianconi.
8) Add various pieces of infrastructure for offloading conntrack, and
make use of it in mlx5 driver, from Paul Blakey.
9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.
10) Lots of parallelization improvements during configuration changes
in mlxsw driver, from Ido Schimmel.
11) Add support to devlink for generic packet traps, which report
packets dropped during ACL processing. And use them in mlxsw
driver. From Jiri Pirko.
12) Support bcmgenet on ACPI, from Jeremy Linton.
13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei
Starovoitov, and your's truly.
14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.
15) Fix sysfs permissions when network devices change namespaces, from
Christian Brauner.
16) Add a flags element to ethtool_ops so that drivers can more simply
indicate which coalescing parameters they actually support, and
therefore the generic layer can validate the user's ethtool
request. Use this in all drivers, from Jakub Kicinski.
17) Offload FIFO qdisc in mlxsw, from Petr Machata.
18) Support UDP sockets in sockmap, from Lorenz Bauer.
19) Fix stretch ACK bugs in several TCP congestion control modules,
from Pengcheng Yang.
20) Support virtual functiosn in octeontx2 driver, from Tomasz
Duszynski.
21) Add region operations for devlink and use it in ice driver to dump
NVM contents, from Jacob Keller.
22) Add support for hw offload of MACSEC, from Antoine Tenart.
23) Add support for BPF programs that can be attached to LSM hooks,
from KP Singh.
24) Support for multiple paths, path managers, and counters in MPTCP.
From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
and others.
25) More progress on adding the netlink interface to ethtool, from
Michal Kubecek"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
cxgb4/chcr: nic-tls stats in ethtool
net: dsa: fix oops while probing Marvell DSA switches
net/bpfilter: remove superfluous testing message
net: macb: Fix handling of fixed-link node
net: dsa: ksz: Select KSZ protocol tag
netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
net: stmmac: create dwmac-intel.c to contain all Intel platform
net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
net: dsa: bcm_sf2: Add support for matching VLAN TCI
net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
net: dsa: bcm_sf2: Disable learning for ASP port
net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
net: dsa: b53: Restore VLAN entries upon (re)configuration
net: dsa: bcm_sf2: Fix overflow checks
hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
...
Without this commit, a PCIe hotplug port can stop generating interrupts on
hotplug events, so device adds and removals will not be seen:
The pciehp interrupt handler pciehp_isr() reads the Slot Status register
and then writes back to it to clear the bits that caused the interrupt. If
a different interrupt event bit gets set between the read and the write,
pciehp_isr() returns without having cleared all of the interrupt event
bits. If this happens when the MSI isn't masked (which by default it isn't
in handle_edge_irq(), and which it will never be when MSI per-vector
masking is not supported), we won't get any more hotplug interrupts from
that device.
That is expected behavior, according to the PCIe Base Spec r5.0, section
6.7.3.4, "Software Notification of Hot-Plug Events".
Because the Presence Detect Changed and Data Link Layer State Changed event
bits can both get set at nearly the same time when a device is added or
removed, this is more likely to happen than it might seem. The issue was
found (and can be reproduced rather easily) by connecting and disconnecting
an NVMe storage device on at least one system model where the NVMe devices
were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).
Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
Status register immediately after writing to it, until it sees that all of
the event status bits have been cleared.
[lukas: drop loop count limitation, write "events" instead of "status",
don't loop back in INTx and poll modes, tweak code comment & commit msg]
Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
David Hoyer reports that powering pciehp slots up or down via sysfs may
hang: The call to wait_event() in pciehp_sysfs_enable_slot() and
_disable_slot() does not return because ctrl->ist_running remains true.
This flag, which was introduced by commit 157c1062fc ("PCI: pciehp: Avoid
returning prematurely from sysfs requests"), signifies that the IRQ thread
pciehp_ist() is running. It is set to true at the top of pciehp_ist() and
reset to false at the end. However there are two additional return
statements in pciehp_ist() before which the commit neglected to reset the
flag to false and wake up waiters for the flag.
That omission opens up the following race when powering up the slot:
* pciehp_ist() runs because a PCI_EXP_SLTSTA_PDC event was requested
by pciehp_sysfs_enable_slot()
* pciehp_ist() turns on slot power via the following call stack:
pciehp_handle_presence_or_link_change() -> pciehp_enable_slot() ->
__pciehp_enable_slot() -> board_added() -> pciehp_power_on_slot()
* after slot power is turned on, the link comes up, resulting in a
PCI_EXP_SLTSTA_DLLSC event
* the IRQ handler pciehp_isr() stores the event in ctrl->pending_events
and returns IRQ_WAKE_THREAD
* the IRQ thread is already woken (it's bringing up the slot), but the
genirq code remembers to re-run the IRQ thread after it has finished
(such that it can deal with the new event) by setting IRQTF_RUNTHREAD
via __handle_irq_event_percpu() -> __irq_wake_thread()
* the IRQ thread removes PCI_EXP_SLTSTA_DLLSC from ctrl->pending_events
via board_added() -> pciehp_check_link_status() in order to deal with
presence and link flaps per commit 6c35a1ac3d ("PCI: pciehp:
Tolerate initially unstable link")
* after pciehp_ist() has successfully brought up the slot, it resets
ctrl->ist_running to false and wakes up the sysfs requester
* the genirq code re-runs pciehp_ist(), which sets ctrl->ist_running
to true but then returns with IRQ_NONE because ctrl->pending_events
is empty
* pciehp_sysfs_enable_slot() is finally woken but notices that
ctrl->ist_running is true, hence continues waiting
The only way to get the hung task going again is to trigger a hotplug
event which brings down the slot, e.g. by yanking out the card.
The same race exists when powering down the slot because remove_board()
likewise clears link or presence changes in ctrl->pending_events per commit
3943af9d01 ("PCI: pciehp: Ignore Link State Changes after powering off a
slot") and thereby may cause a re-run of pciehp_ist() which returns with
IRQ_NONE without resetting ctrl->ist_running to false.
Fix by adding a goto label before the teardown steps at the end of
pciehp_ist() and jumping to that label from the two return statements which
currently neglect to reset the ctrl->ist_running flag.
Fixes: 157c1062fc ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
Link: https://lore.kernel.org/r/cca1effa488065cb055120aa01b65719094bdcb5.1584530321.git.lukas@wunner.de
Reported-by: David Hoyer <David.Hoyer@netapp.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # v4.19+
After an endpoint is started through configfs, if 0 is written to the
configfs entry 'start', the controller stops but the epc_group->start
value remains 1.
A subsequent unlinking of the function from the controller would trigger
a spurious WARN_ON_ONCE() in pci_epc_epf_unlink() despite right
behavior.
Fix it by setting epc_group->start = 0 when a controller is stopped
using configfs.
Fixes: d746799116 ("PCI: endpoint: Introduce configfs entry for configuring EP functions")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Add support for the endpoint mode of Synopsys DesignWare core based
dual mode PCIe controllers present in Tegra194 SoC.
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Treewide:
- Cleanup of setup_irq() which is not longer required because the
memory allocator is available early. Most cleanup changes come
through the various maintainer trees, so the final removal of
setup_irq() is postponed towards the end of the merge window.
Core:
- Protection against unsafe invocation of interrupt handlers and unsafe
interrupt injection including a fixup of the offending PCI/AER error
injection mechanism.
Invoking interrupt handlers from arbitrary contexts, i.e. outside of
an actual interrupt, can cause inconsistent state on the fragile
x86 interrupt affinity changing hardware trainwreck.
Drivers:
- Second wave of support for the new ARM GICv4.1
- Multi-instance support for Xilinx and PLIC interrupt controllers
- CPU-Hotplug support for PLIC
- The obligatory new driver for X1000 TCU
- Enhancements, cleanups and fixes all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B888THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoeMJD/9v8GcI/DSY87Fmo7s4odLFVU0J8zZ6
7QlYjSPm4yWv4pqn1TEnEF2pKz5X9Euhoh8BmdMKtdXBqlS4Ix9N+pH8ModcxyQo
aX97zuRUxvqfeeVE+yQRwbbMREj9jj9RW8FRtA39+l5H3uC1GDcc+2aAMIaykQ7+
8lo/6wBd8ZrZ0gsNf4KjlBwMDYAlQSRWxrff38PQ2XRpGKowdp8JFYZuq5Vp0ljJ
r2cE75ldmFSfmtuhhVroBRY0GAqW4/8v8/syAN3Q9jOEII60qhA0dqR085B9veWa
DHSqgLmzyUFFXN7Ntzt/fDirJVsIM4BE9qGu3ftCYHMaPB8hG+xqjbZe9E3D2e/d
+0Pb3TG8EHVOIwzv1t9+6462qYGkBhmBXtbj6GptPYk2Ai4HZlNaSsa8jUNyHvGz
WDegdRjt7O5RjqDH/VwrQxW/AEp05f/1egweBXbq9aF6j9nqeOur75c/PdxZxAX5
WUMtouXP2WN+sMW8k1T5cmVMGWxLGBB0wwG4LC/mXzHnkDiN1+2wEUHmhS8Voi3q
3HXeYBJeukUYbVvMKRvWVAD330TxFjAyd6pPwCdoNY2ZngJnQWlDD9vbYYX2osoW
kP+KhIANNBVqdK7NqlLoqcr3SdHn01pQYuVHejNzxb7E6/mmpMlaYDJc/rMPi/eM
0/rzl8fAj/WyBQ==
=DZ/G
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:
Treewide:
- Cleanup of setup_irq() which is not longer required because the
memory allocator is available early.
Most cleanup changes come through the various maintainer trees, so
the final removal of setup_irq() is postponed towards the end of
the merge window.
Core:
- Protection against unsafe invocation of interrupt handlers and
unsafe interrupt injection including a fixup of the offending
PCI/AER error injection mechanism.
Invoking interrupt handlers from arbitrary contexts, i.e. outside
of an actual interrupt, can cause inconsistent state on the
fragile x86 interrupt affinity changing hardware trainwreck.
Drivers:
- Second wave of support for the new ARM GICv4.1
- Multi-instance support for Xilinx and PLIC interrupt controllers
- CPU-Hotplug support for PLIC
- The obligatory new driver for X1000 TCU
- Enhancements, cleanups and fixes all over the place"
* tag 'irq-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
unicore32: Replace setup_irq() by request_irq()
sh: Replace setup_irq() by request_irq()
hexagon: Replace setup_irq() by request_irq()
c6x: Replace setup_irq() by request_irq()
alpha: Replace setup_irq() by request_irq()
irqchip/gic-v4.1: Eagerly vmap vPEs
irqchip/gic-v4.1: Add VSGI property setup
irqchip/gic-v4.1: Add VSGI allocation/teardown
irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer
irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacks
irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacks
irqchip/gic-v4.1: Plumb mask/unmask SGI callbacks
irqchip/gic-v4.1: Add initial SGI configuration
irqchip/gic-v4.1: Plumb skeletal VSGI irqchip
irqchip/stm32: Retrigger both in eoi and unmask callbacks
irqchip/gic-v3: Move irq_domain_update_bus_token to after checking for NULL domain
irqchip/xilinx: Do not call irq_set_default_host()
irqchip/xilinx: Enable generic irq multi handler
irqchip/xilinx: Fill error code when irq domain registration fails
irqchip/xilinx: Add support for multiple instances
...
Pull perf updates from Ingo Molnar:
"The main changes in this cycle were:
Kernel side changes:
- A couple of x86/cpu cleanups and changes were grandfathered in due
to patch dependencies. These clean up the set of CPU model/family
matching macros with a consistent namespace and C99 initializer
style.
- A bunch of updates to various low level PMU drivers:
* AMD Family 19h L3 uncore PMU
* Intel Tiger Lake uncore support
* misc fixes to LBR TOS sampling
- optprobe fixes
- perf/cgroup: optimize cgroup event sched-in processing
- misc cleanups and fixes
Tooling side changes are to:
- perf {annotate,expr,record,report,stat,test}
- perl scripting
- libapi, libperf and libtraceevent
- vendor events on Intel and S390, ARM cs-etm
- Intel PT updates
- Documentation changes and updates to core facilities
- misc cleanups, fixes and other enhancements"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits)
cpufreq/intel_pstate: Fix wrong macro conversion
x86/cpu: Cleanup the now unused CPU match macros
hwrng: via_rng: Convert to new X86 CPU match macros
crypto: Convert to new CPU match macros
ASoC: Intel: Convert to new X86 CPU match macros
powercap/intel_rapl: Convert to new X86 CPU match macros
PCI: intel-mid: Convert to new X86 CPU match macros
mmc: sdhci-acpi: Convert to new X86 CPU match macros
intel_idle: Convert to new X86 CPU match macros
extcon: axp288: Convert to new X86 CPU match macros
thermal: Convert to new X86 CPU match macros
hwmon: Convert to new X86 CPU match macros
platform/x86: Convert to new CPU match macros
EDAC: Convert to new X86 CPU match macros
cpufreq: Convert to new X86 CPU match macros
ACPI: Convert to new X86 CPU match macros
x86/platform: Convert to new CPU match macros
x86/kernel: Convert to new CPU match macros
x86/kvm: Convert to new CPU match macros
x86/perf/events: Convert to new CPU match macros
...
We changed these sysfs filenames:
.../pci_bus/<domain:bus>/rescan -> .../pci_bus/<domain:bus>/bus_rescan
.../<domain🚌dev.fn>/rescan -> .../<domain🚌dev.fn>/dev_rescan
and Ruslan reported [1] that this broke a userspace application.
Revert these name changes so both files are named "rescan" again.
Note that we have to use __ATTR() to assign custom C symbols, i.e.,
"struct device_attribute <symbol>".
[1] https://lore.kernel.org/r/CAB=otbSYozS-ZfxB0nCiNnxcbqxwrHOSYxJJtDKa63KzXbXgpw@mail.gmail.com
[bhelgaas: commit log, use __ATTR() both places so we don't have to rename
the attributes]
Fixes: 8bdfa145f5 ("PCI: sysfs: Define device attributes with DEVICE_ATTR*()")
Fixes: 4e2b79436e ("PCI: sysfs: Change DEVICE_ATTR() to DEVICE_ATTR_WO()")
Link: https://lore.kernel.org/r/20200325151708.32612-1-skunberg.kelsey@gmail.com
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org # v5.4+
In certain cases we should be able to enumerate IO and MEM ranges of all
PCI devices installed in the system, and then set respective host bridge
apertures basing on calculated size and alignment. Particularly when
firmware is broken and fails to assign bridge windows properly, like on
Alpha UP1500 platform.
Actually, almost everything is already in place, and required changes are
minimal:
- add "size_windows" flag to struct pci_host_bridge: when set, it
instructs __pci_bus_size_bridges() to continue with the root bus;
- in the __pci_bus_size_bridges() path: add checks for bus->self,
as it can legitimately be null for the root bus.
Link: https://lore.kernel.org/r/20200314194355.GA12510@mail.rc.ru
Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
On some EFI systems, the video BIOS is provided by the EFI firmware. The
boot stub code stores the physical address of the ROM image in pdev->rom.
Currently we attempt to access this pointer using phys_to_virt(), which
doesn't work with CONFIG_HIGHMEM.
On these systems, attempting to load the radeon module on a x86_32 kernel
can result in the following:
BUG: unable to handle page fault for address: 3e8ed03c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 317 Comm: systemd-udevd Not tainted 5.6.0-rc3-next-20200228 #2
Hardware name: Apple Computer, Inc. MacPro1,1/Mac-F4208DC8, BIOS MP11.88Z.005C.B08.0707021221 07/02/07
EIP: radeon_get_bios+0x5ed/0xe50 [radeon]
Code: 00 00 84 c0 0f 85 12 fd ff ff c7 87 64 01 00 00 00 00 00 00 8b 47 08 8b 55 b0 e8 1e 83 e1 d6 85 c0 74 1a 8b 55 c0 85 d2 74 13 <80> 38 55 75 0e 80 78 01 aa 0f 84 a4 03 00 00 8d 74 26 00 68 dc 06
EAX: 3e8ed03c EBX: 00000000 ECX: 3e8ed03c EDX: 00010000
ESI: 00040000 EDI: eec04000 EBP: eef3fc60 ESP: eef3fbe0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206
CR0: 80050033 CR2: 3e8ed03c CR3: 2ec77000 CR4: 000006d0
Call Trace:
r520_init+0x26/0x240 [radeon]
radeon_device_init+0x533/0xa50 [radeon]
radeon_driver_load_kms+0x80/0x220 [radeon]
drm_dev_register+0xa7/0x180 [drm]
radeon_pci_probe+0x10f/0x1a0 [radeon]
pci_device_probe+0xd4/0x140
Fix the issue by updating all drivers which can access a platform provided
ROM. Instead of calling the helper function pci_platform_rom() which uses
phys_to_virt(), call ioremap() directly on the pdev->rom.
radeon_read_platform_bios() previously directly accessed an __iomem
pointer. Avoid this by calling memcpy_fromio() instead of kmemdup().
pci_platform_rom() now has no remaining callers, so remove it.
Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Many Zhaoxin Root Ports and Switch Downstream Ports do provide ACS-like
capability but have no ACS Capability Structure. Peer-to-Peer transactions
could be blocked between these ports, so add quirk so devices behind them
could be assigned to different IOMMU group.
Link: https://lore.kernel.org/r/20200327091148.5190-4-RaymondPang-oc@zhaoxin.com
Signed-off-by: Raymond Pang <RaymondPang-oc@zhaoxin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some Zhaoxin endpoints are implemented as multi-function devices without an
ACS capability, but they actually don't support peer-to-peer transactions.
Add ACS quirks to declare DMA isolation.
Link: https://lore.kernel.org/r/20200327091148.5190-3-RaymondPang-oc@zhaoxin.com
Signed-off-by: Raymond Pang <RaymondPang-oc@zhaoxin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When the UEFI/BIOS or bootloader has not initialised a PCIe device we would
get the following message:
kern.warning: pci 0000:00:01.0: ASPM: current common clock configuration is broken, reconfiguring
"warning" and "broken" are slightly misleading. On an embedded system it is
quite possible for the bootloader to avoid configuring PCIe devices if they
are not needed.
Downgrade the message to pci_info() and change "broken" to "inconsistent"
since we fix up the inconsistency in the code immediately following the
message (and emit an error if that fails).
Link: https://lore.kernel.org/r/20200323035530.11569-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The AER interfaces to clear error status registers were a confusing mess:
- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.
- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.
- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.
Rename them to make them consistent:
From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()
Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.
[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Error Disconnect Recover (EDR) is a feature that allows ACPI firmware to
notify OSPM that a device has been disconnected due to an error condition
(ACPI v6.3, sec 5.6.6). OSPM advertises its support for EDR on PCI devices
via _OSC (see [1], sec 4.5.1, table 4-4). The OSPM EDR notify handler
should invalidate software state associated with disconnected devices and
may attempt to recover them. OSPM communicates the status of recovery to
the firmware via _OST (sec 6.3.5.2).
For PCIe, firmware may use Downstream Port Containment (DPC) to support
EDR. Per [1], sec 4.5.1, table 4-6, even if firmware has retained control
of DPC, OSPM may read/write DPC control and status registers during the EDR
notification processing window, i.e., from the time it receives an EDR
notification until it clears the DPC Trigger Status.
Note that per [1], sec 4.5.1 and 4.5.2.4,
1. If the OS supports EDR, it should advertise that to firmware by
setting OSC_PCI_EDR_SUPPORT in _OSC Support.
2. If the OS sets OSC_PCI_EXPRESS_DPC_CONTROL in _OSC Control to request
control of the DPC capability, it must also set OSC_PCI_EDR_SUPPORT in
_OSC Support.
Add an EDR notify handler to attempt recovery.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
[bhelgaas: squash add/enable patches into one]
Link: https://lore.kernel.org/r/90f91fe6d25c13f9d2255d2ce97ca15be307e1bb.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
If firmware controls DPC, it is generally responsible for managing the DPC
capability and events, and the OS should not access the DPC capability.
However, if firmware controls DPC and both the OS and the platform support
Error Disconnect Recover (EDR) notifications, the OS EDR notify handler is
responsible for recovery, and the notify handler may read/write the DPC
capability until it clears the DPC Trigger Status bit. See [1], sec 4.5.1,
table 4-6.
Expose some DPC error handling functions so they can be used by the EDR
notify handler.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
Link: https://lore.kernel.org/r/e9000bb15b3a4293e81d98bb29ead7c84a6393c9.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per the SFI _OSC and DPC Updates ECN [1] implementation note flowchart, the
OS seems to be expected to clear AER status even if it doesn't have
ownership of the AER capability. Unlike the DPC capability, where a DPC
ECN [2] specifies a window when the OS is allowed to access DPC registers
even if it doesn't have ownership, there is no clear model for AER.
Add pci_aer_raw_clear_status() to clear the AER error status registers
unconditionally. This is intended for use only by the EDR path (see [2]).
[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
2020, affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/14076
[2] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
[bhelgaas: changelog]
Link: https://lore.kernel.org/r/c19ad28f3633cce67448609e89a75635da0da07d.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
As per the DPC Enhancements ECN [1], sec 4.5.1, table 4-4, if the OS
supports Error Disconnect Recover (EDR), it must invalidate the software
state associated with child devices of the port without attempting to
access the child device hardware. In addition, if the OS supports DPC, it
must attempt to recover the child devices if the port implements the DPC
Capability. If the OS continues operation, the OS must inform the firmware
of the status of the recovery operation via the _OST method.
Return the result of pcie_do_recovery() so we can report it to firmware via
_OST.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
Link: https://lore.kernel.org/r/eb60ec89448769349c6722954ffbf2de163155b5.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously we passed the PCIe service type parameter to pcie_do_recovery(),
where reset_link() looked up the underlying pci_port_service_driver and its
.reset_link() function pointer. Instead of using this roundabout way, we
can just pass the driver-specific .reset_link() callback function when
calling pcie_do_recovery() function.
This allows us to call pcie_do_recovery() from code that is not a PCIe port
service driver, e.g., Error Disconnect Recover (EDR) support.
Remove pcie_port_find_service() and pcie_port_service_driver.reset_link
since they are now unused.
Link: https://lore.kernel.org/r/60e02b87b526cdf2930400059d98704bf0a147d1.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit bdb5ac8577 ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But during fatal error
recovery, if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER then even after successful recovery (using
reset_link()) pcie_do_recovery() will report the recovery result as
failure. Update the status of error after reset_link().
You can reproduce this issue by triggering a SW DPC using "DPC Software
Trigger" bit in "DPC Control Register". You should see recovery failed
dmesg log as below:
pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
pcieport 0000:00:16.0: DPC: software trigger detected
pci 0000:04:00.0: AER: can't recover (no error_detected callback)
pcieport 0000:00:16.0: AER: device recovery failed
Fixes: bdb5ac8577 ("PCI/ERR: Handle fatal error recovery")
Link: https://lore.kernel.org/r/a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: split pci_channel_io_frozen simplification to separate patch]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>