Commit Graph

5182 Commits

Author SHA1 Message Date
Bjorn Helgaas
2583157141 Merge branch 'pci/virtualization' into next
* pci/virtualization:
  PCI: Add comments about ROM BAR updating
  PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
  PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
  PCI: Don't update VF BARs while VF memory space is enabled
  PCI: Separate VF BAR updates from standard BAR updates
  PCI: Update BARs using property bits appropriate for type
  PCI: Ignore BAR updates on virtual functions
  PCI: Do any VF BAR updates before enabling the BARs
  PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
  PCI: Convert Mellanox broken INTx quirks to be for listed devices only
  PCI: Convert broken INTx masking quirks from HEADER to FINAL
  net/mlx4_core: Use device ID defines
  PCI: Add Mellanox device IDs
2016-12-12 11:25:05 -06:00
Bjorn Helgaas
daaed10443 Merge branch 'pci/pm' into next
* pci/pm:
  x86/platform/intel-mid: Constify mid_pci_platform_pm
  PCI: pciehp: Add runtime PM support for PCIe hotplug ports
  ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
  ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
  PCI: Unfold conditions to block runtime PM on PCIe ports
  PCI: Consolidate conditions to allow runtime PM on PCIe ports
  PCI: Activate runtime PM on a PCIe port only if it can suspend
  PCI: Speed up algorithm in pci_bridge_d3_update()
  PCI: Autosense device removal in pci_bridge_d3_update()
  PCI: Don't acquire ref on parent in pci_bridge_d3_update()
  USB: UHCI: report non-PME wakeup signalling for Intel hardware
  PCI: Check for PME in targeted sleep state
2016-12-12 11:25:04 -06:00
Bjorn Helgaas
db5ba86412 Merge branch 'pci/msi' into next
* pci/msi:
  PCI/MSI: Check for NULL affinity mask in pci_irq_get_affinity()
2016-12-12 11:25:04 -06:00
Bjorn Helgaas
c1f2e80c19 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
  PCI: Expand "VPD access disabled" quirk message
  PCI: pciehp: Remove loading message
  PCI: hotplug: Remove hotplug core message
  PCI: Remove service driver load/unload messages
  PCI/AER: Log AER IRQ when claiming Root Port
  PCI/AER: Log errors with PCI device, not PCIe service device
  PCI/AER: Remove unused version macros
  PCI/PME: Log PME IRQ when claiming Root Port
  PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
  PCI: Move config space size macros to pci_regs.h
2016-12-12 11:25:03 -06:00
Bjorn Helgaas
4617aedbd2 Merge branch 'pci/hotplug' into next
* pci/hotplug:
  PCI: pciehp: Leave power indicator on when enabling already-enabled slot
  PCI: pciehp: Prioritize data-link event over presence detect
  PCI: cpqphp: Add missing call to pci_disable_device()
2016-12-12 11:25:03 -06:00
Bjorn Helgaas
2f0f3733c4 Merge branch 'pci/enumeration' into next
* pci/enumeration:
  PCI: Warn on possible RW1C corruption for sub-32 bit config writes
  PCI: Create revision file in sysfs
2016-12-12 11:25:02 -06:00
Bjorn Helgaas
5e0ad9f686 Merge branch 'pci/ecam' into next
* pci/ecam:
  PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy
  PCI: Add MCFG quirks for X-Gene host controller
  PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller
  PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller
  PCI: thunder-pem: Factor out resource lookup
  PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers
  PCI: Add MCFG quirks for Qualcomm QDF2432 host controller
  PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform
  PCI/ACPI: Check for platform-specific MCFG quirks
  PCI/ACPI: Extend pci_mcfg_lookup() to return ECAM config accessors
  arm64: PCI: Exclude ACPI "consumer" resources from host bridge windows
  arm64: PCI: Manage controller-specific data on per-controller basis
  arm64: PCI: Search ACPI namespace to ensure ECAM space is reserved
  arm64: PCI: Add local struct device pointers
  ACPI: Add acpi_resource_consumer() to find device that claims a resource
2016-12-12 11:25:02 -06:00
Alexey Kardashevskiy
1c7de2b4ff PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below).  However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".

Since 4e1a635552 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device.  The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().

Add a quirk to override the VPD size to a bigger value.  The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary.  The quirk is registered for all devices supported by the cxgb3
driver.

This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data.  However
vfio-pci itself does not have quirks mechanism so we add it to PCI.

This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]

This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
 0000 Large item 42 bytes; name 0x2 Identifier String
	b'10 Gigabit Ethernet-SR PCI Express Adapter'
 002d Large item 74 bytes; name 0x10
	#00 [EC] len=7: b'D76809 '
	#0a [FN] len=7: b'46K7897'
	#14 [PN] len=7: b'46K7897'
	#1e [MN] len=4: b'1037'
	#25 [FC] len=4: b'5769'
	#2c [SN] len=12: b'YL102035603V'
	#3b [NA] len=12: b'00145E992ED1'
 007a Small item 1 bytes; name 0xf End Tag

 0c00 Large item 16 bytes; name 0x2 Identifier String
	b'S310E-SR-X      '
 0c13 Large item 234 bytes; name 0x10
	#00 [PN] len=16: b'TBD             '
	#13 [EC] len=16: b'110107730D2     '
	#26 [SN] len=16: b'97YL102035603V  '
	#39 [NA] len=12: b'00145E992ED1'
	#48 [V0] len=6: b'175000'
	#51 [V1] len=6: b'266666'
	#5a [V2] len=6: b'266666'
	#63 [V3] len=6: b'2000  '
	#6c [V4] len=2: b'1 '
	#71 [V5] len=6: b'c2    '
	#7a [V6] len=6: b'0     '
	#83 [V7] len=2: b'1 '
	#88 [V8] len=2: b'0 '
	#8d [V9] len=2: b'0 '
	#92 [VA] len=2: b'0 '
	#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0d00 Large item 252 bytes; name 0x11
	#00 [VC] len=16: b'122310_1222 dp  '
	#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
	#26 [VE] len=16: b'122310_1353 fp  '
	#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
	#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0dff Small item 0 bytes; name 0xf End Tag

10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:24 -06:00
Bjorn Helgaas
044bc425bb PCI: Expand "VPD access disabled" quirk message
It's not very enlightening to see

  pci 0000:07:00.0: [Firmware Bug]: VPD access disabled

in the dmesg log because there's no clue about what the firmware bug is.
Expand the message to explain why we're disabling VPD.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:24 -06:00
Bjorn Helgaas
5fbeef6377 PCI: pciehp: Remove loading message
Remove the "PCI Express Hot Plug Controller Driver" version message.  I
don't think it contains any useful information.  Remove unused #defines
and move the author information to a comment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:24 -06:00
Bjorn Helgaas
d9b47d5496 PCI: hotplug: Remove hotplug core message
Remove the "PCI Hot Plug PCI Core" version message.  I don't think it
contains any useful information.  Remove unused #defines and move the
author information to a comment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:24 -06:00
Bjorn Helgaas
98892fae40 PCI: Remove service driver load/unload messages
Remove the "service driver %s loaded" and unloaded messages.  All service
drivers already log something in their probe functions, where they can log
more useful details.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:24 -06:00
Bjorn Helgaas
68a55ae5c0 PCI/AER: Log AER IRQ when claiming Root Port
Add a log message when we enable AER on a Root Port and the hierarchy below
it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:23 -06:00
Bjorn Helgaas
576700b67a PCI/AER: Log errors with PCI device, not PCIe service device
All other AER-related log messages use the PCI device, e.g.,
"pci 0000:00:1c.0", not the PCIe service device, e.g.,
"aer 0000:00:1c.0:pcie02".

Change the probe error messages to match the rest and include a little
context.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:23 -06:00
Bjorn Helgaas
2298a7aaa8 PCI/AER: Remove unused version macros
Remove the unused DRIVER_VERSION, DRIVER_AUTHOR, and DRIVER_DESC macros.
The author information is already included in a comment above.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:23 -06:00
Bjorn Helgaas
a902d81ac8 PCI/PME: Log PME IRQ when claiming Root Port
We already log a "Signaling PME" whenever the PME service driver claims a
Root Port.  In fact, we also log the same message for every device in the
hierarchy below the Root Port.

Log the "Signaling PME" once (only for the Root Port, since we can
trivially find out which devices are below the Root Port), and include the
IRQ number in the message to help connect the dots with /proc/interrupts.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-12 10:05:23 -06:00
Bjorn Helgaas
0a1e1b26f5 PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
Since we register pcie_pme_driver only for PCI_EXP_TYPE_ROOT_PORT, the PME
driver never claims Root Complex Event Collectors.

Remove unused code related to Root Complex Event Collectors.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-12 10:05:23 -06:00
Wang Sheng-Hui
cc10385b6f PCI: Move config space size macros to pci_regs.h
Move PCI configuration space size macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE) from drivers/pci/pci.h to
include/uapi/linux/pci_regs.h so they can be used by more drivers and
eliminate duplicate definitions.

[bhelgaas: Expand comment to include PCI-X details]
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 10:05:22 -06:00
Lukas Wunner
c931225480 x86/platform/intel-mid: Constify mid_pci_platform_pm
This struct never needs to be modified.  The size of pci-mid.o ELF
sections changes thusly:

  -.data          56
  +.data           0
  -.rodata        32
  +.rodata        88

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-12 09:45:47 -06:00
David Daney
e53f9a28be PCI/ASPM: Don't retrain link if ASPM not possible
Some (defective) PCIe devices are not able to reliably do link retraining.

Check to see if ASPM is possible between link partners before configuring
common clocking, and doing the resulting link retraining.  If ASPM is not
possible, there is no reason to risk losing access to a device due to an
unnecessary link retraining.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-08 14:44:22 -06:00
Ashok Raj
c4ae2adedb PCI: pciehp: Leave power indicator on when enabling already-enabled slot
If an error occurs when enabling a slot, pciehp_power_thread() turns off
the power indicator.  But if the only error is that the slot was already
enabled, we should leave the power indicator on.

Return success if called to enable an already-enabled slot.
This is in the same spirit of the special handling for EEXISTS when
pciehp_configure_device() determines the slot devices already exist.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2016-12-08 12:02:25 -06:00
Ashok Raj
385895fef6 PCI: pciehp: Prioritize data-link event over presence detect
If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.

When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status.  The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.

Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event.  Skip checking the PDC status if we handled a link
event in the same handler.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2016-12-07 17:00:44 -06:00
Bjorn Helgaas
ca5ab37b19 PCI: Explain ARM64 ACPI/MCFG quirk Kconfig and build strategy
Add Makefile comments to explain the Kconfig and build strategy for ARM64
drivers that work around not-quite-ECAM issues.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-07 14:34:58 -06:00
Duc Dang
c5d4603961 PCI: Add MCFG quirks for X-Gene host controller
PCIe controllers in X-Gene SoCs are not ECAM compliant: software needs to
configure additional controller's register to address device at
bus:dev:function.

Add a quirk to discover controller MMIO register space and configure
controller registers to select and address the target secondary device.

The quirk will only be applied for X-Gene PCIe MCFG table with
OEM revison 1, 2, 3 or 4 (PCIe controller v1 and v2 on X-Gene SoCs).

Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:50 -06:00
Tomasz Nowicki
648d93fc77 PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller
ThunderX pass1.x requires to emulate the EA headers for on-chip devices
hence it has to use custom pci_thunder_ecam_ops for accessing PCI config
space (pci-thunder-ecam.c). Add new entries to MCFG quirk array where it
can be applied while probing ACPI based PCI host controller.

ThunderX pass1.x is using the same way for accessing off-chip devices
(so-called PEM) as silicon pass-2.x so we need to add PEM quirk entries
too.

Quirk is considered for ThunderX silicon pass1.x only which is identified
via MCFG revision 2.

ThunderX pass 1.x requires the following accessors:

  NUMA node 0 PCI segments  0- 3: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 10-13: pci_thunder_ecam_ops (MCFG quirk)
  NUMA node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: change Makefile/ifdefs so quirk doesn't depend on
CONFIG_PCI_HOST_THUNDER_ECAM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:50 -06:00
Tomasz Nowicki
44f22bd91e PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller
ThunderX PCIe controller to off-chip devices (so-called PEM) is not fully
compliant with ECAM standard. It uses non-standard configuration space
accessors (see thunder_pem_ecam_ops) and custom configuration space
granulation (see bus_shift = 24). In order to access configuration space
and probe PEM as ACPI-based PCI host controller we need to add MCFG quirk
infrastructure. This involves:
1. A new thunder_pem_acpi_init() init function to locate PEM-specific
   register ranges using ACPI.
2. Export PEM thunder_pem_ecam_ops structure so it is visible to MCFG quirk
   code.
3. New quirk entries for each PEM segment. Each contains platform IDs,
   mentioned thunder_pem_ecam_ops and CFG resources.

Quirk is considered for ThunderX silicon pass2.x only which is identified
via MCFG revision 1.

ThunderX pass 2.x requires the following accessors:

  NUMA Node 0 PCI segments  0- 3: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 0 PCI segments  4- 9: thunder_pem_ecam_ops (MCFG quirk)
  NUMA Node 1 PCI segments 10-13: pci_generic_ecam_ops (ECAM-compliant)
  NUMA Node 1 PCI segments 14-19: thunder_pem_ecam_ops (MCFG quirk)

[bhelgaas: adapt to use acpi_get_rc_resources(), update Makefile/ifdefs so
quirk doesn't depend on CONFIG_PCI_HOST_THUNDER_PEM]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Bjorn Helgaas
0d414268fb PCI: thunder-pem: Factor out resource lookup
Pull the register resource lookup out of thunder_pem_init() so we can
easily add a corresponding lookup using ACPI.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Dongdong Liu
5f00f1a017 PCI: Add MCFG quirks for HiSilicon Hip05/06/07 host controllers
The PCIe controller in Hip05/Hip06/Hip07 SoCs is not completely
ECAM-compliant.  It is non-ECAM only for the RC bus config space; for any
other bus underneath the root bus it does support ECAM access.

Add specific quirks for PCI config space accessors.  This involves:
1. New initialization call hisi_pcie_init() to obtain RC base
addresses from PNP0C02 at the root of the ACPI namespace (under \_SB).
2. New entry in common quirk array.

[bhelgaas: move to pcie-hisi.c and change Makefile/ifdefs so quirk doesn't
depend on CONFIG_PCI_HISI]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Christopher Covington
2ca5b8ddc6 PCI: Add MCFG quirks for Qualcomm QDF2432 host controller
The Qualcomm Technologies QDF2432 SoC does not support accesses smaller
than 32 bits to the PCI configuration space.  Register the appropriate
quirk.

[bhelgaas: add QCOM_ECAM32 macro, ifdef for ACPI and PCI_QUIRKS]
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Dongdong Liu
169de969c0 PCI/ACPI: Provide acpi_get_rc_resources() for ARM64 platform
The acpi_get_rc_resources() is used to get the RC register address that can
not be described in MCFG.  It takes the _HID & segment to look for and
outputs the RC address resource.  Use PNP0C02 devices to describe such RC
address resource.  Use _UID to match segment to tell which root bus the
PNP0C02 resource belongs to.

[bhelgaas: add dev argument, wrap in #ifdef CONFIG_PCI_QUIRKS]
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-12-06 13:45:49 -06:00
Bjorn Helgaas
0b457dde3c PCI: Add comments about ROM BAR updating
pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR.  We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled.  That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860 ("Ignore disabled ROM resources at
setup") and 8085ce084c ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
7a6d312b50 PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
286c2378aa PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar().  That makes
pci_iov_resource_bar() unused, so remove that as well.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
546ba9f8f2 PCI: Don't update VF BARs while VF memory space is enabled
If we update a VF BAR while it's enabled, there are two potential problems:

  1) Any driver that's using the VF has a cached BAR value that is stale
     after the update, and

  2) We can't update 64-bit BARs atomically, so the intermediate state
     (new lower dword with old upper dword) may conflict with another
     device, and an access by a driver unrelated to the VF may cause a bus
     error.

Warn about attempts to update VF BARs while they are enabled.  This is a
programming error, so use dev_WARN() to get a backtrace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
6ffa2489c5 PCI: Separate VF BAR updates from standard BAR updates
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

  - Renames pci_update_resource() to pci_std_update_resource(),
  - Adds pci_iov_update_resource(),
  - Makes pci_update_resource() a wrapper that calls the appropriate one,

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
45d004f4af PCI: Update BARs using property bits appropriate for type
The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.

Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.

Use PCI_ROM_ADDRESS_MASK for ROM BARs.  This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-11-29 08:14:47 -06:00
Bjorn Helgaas
63880b230a PCI: Ignore BAR updates on virtual functions
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-28 11:37:56 -06:00
Gavin Shan
f40ec3c748 PCI: Do any VF BAR updates before enabling the BARs
Previously we enabled VFs and enable their memory space before calling
pcibios_sriov_enable().  But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled.  The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-11-23 17:21:42 -06:00
Quentin Lambert
b11d207fb2 PCI: cpqphp: Add missing call to pci_disable_device()
Most error branches following the call to pci_enable_device() contain a
call to pci_disable_device().  Add these calls where they are missing.

This issue was found with Hector.

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-11-23 16:54:32 -06:00
Noa Osherovich
1600f62534 PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+
Mellanox devices were marked as having INTx masking ability broken.  As a
result, the VFIO driver fails to start when more than one device function
is passed-through to a VM if both have the same INTx pin.

Prior to Connect-IB, Mellanox devices exposed to the operating system one
PCI function per all ports.  Starting from Connect-IB, the devices are
function-per-port.  When passing the second function to a VM, VFIO will
fail to start.

Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox
devices marked as having broken INTx masking:

- ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx
  masking is supported, we unmark the broken INTx masking.
- Connect-IB does not support INTx currently so will not cause any
  problem.

[bhelgaas: call pci_disable_device() always, after iounmap()]
Fixes: 11e42532ad ("PCI: Assume all Mellanox devices have broken INTx masking")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-23 12:33:32 -06:00
Noa Osherovich
d76d2fe05f PCI: Convert Mellanox broken INTx quirks to be for listed devices only
Change Mellanox's broken_intx_masking() quirk from an "all Mellanox
devices" to a quirk for listed devices only.

[bhelgaas: remove #defines, reorder to keep other quirks together]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-23 12:33:32 -06:00
Noa Osherovich
b88214ce4d PCI: Convert broken INTx masking quirks from HEADER to FINAL
Convert all quirk_broken_intx_masking() quirks from HEADER to FINAL.

The quirk sets dev->broken_intx_masking, which is only used by
pci_intx_mask_supported(), which is not needed until after FINAL
quirks have been run.

[bhelgaas: changelog]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-23 12:33:31 -06:00
Bjorn Helgaas
fb26592301 PCI: Warn on possible RW1C corruption for sub-32 bit config writes
Hardware that supports only 32-bit config writes is not spec-compliant.
For example, if software performs a 16-bit write, we must do a 32-bit read,
merge in the 16 bits we intend to write, followed by a 32-bit write.  If
the 16 bits we *don't* intend to write happen to have any RW1C (write-one-
to-clear) bits set, we just inadvertently cleared something we shouldn't
have.

Add a rate-limited warning when we do sub-32 bit config writes.  Remove
similar probe-time warnings from some of the affected host bridge drivers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Enthusiastically-Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>	# rockchip
Acked-by: Thierry Reding <treding@nvidia.com>
2016-11-21 16:25:39 -06:00
Emil Velikov
702ed3be1b PCI: Create revision file in sysfs
Currently the revision isn't available via sysfs/libudev thus if one wants
to know the value one needs to read through the config file, which can be
quite time-consuming because it wakes/powers up the device.

There are at least two userspace components which could make use the new
file: libpciaccess and libdrm.  The former wakes up _every_ PCI device,
which can be observed via glxinfo when using Mesa 10.0+ drivers.  The
latter, in association with Mesa 13.0, can lead to 2-3 second delays while
starting firefox, thunderbird or chromium.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Tested-by: Mauro Santos <registo.mailling@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch
CC: Greg KH <gregkh@linuxfoundation.org>
2016-11-21 16:25:32 -06:00
Lukas Wunner
68db9bc814 PCI: pciehp: Add runtime PM support for PCIe hotplug ports
Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e49a ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports.  Those are now afforded runtime PM by the present
commit.

Hotplug ports require a few extra considerations:

- The configuration space of the port remains accessible in D3hot, so all
  the functions to read or modify the Slot Status and Slot Control
  registers need not be modified.  Even turning on slot power doesn't seem
  to require the port to be in D0, at least the PCIe spec doesn't say so
  and I confirmed that by testing with a Thunderbolt controller.

- However D0 is required to access devices on the secondary bus.  This
  happens in pciehp_check_link_status() and pciehp_configure_device() (both
  called from board_added()) and in pciehp_unconfigure_device() (called
  from remove_board()), so acquire a runtime PM ref for their invocation.

- The hotplug port stays active as long as it has active children.  If all
  hotplugged devices below the port runtime suspend, the port is allowed to
  runtime suspend as well.  Plug and unplug detection continues to work in
  D3hot.

- Hotplug interrupts are delivered in-band, so while the hotplug port
  itself is allowed to go to D3hot, its parent ports must stay in D0 for
  interrupts to come through.  Add a corresponding restriction to
  pci_dev_check_d3cold().

- Runtime PM may only be allowed if the hotplug port is handled natively by
  the OS.  On ACPI systems, the port may alternatively be handled by the
  firmware and things break if the OS puts the port into D3 behind the
  firmware's back:  E.g. Thunderbolt hotplug ports on non-Macs are handled
  by Intel's firmware in System Management Mode and the firmware is known
  to access devices on the port's secondary bus without checking first if
  the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-11-17 19:00:29 -06:00
Lukas Wunner
437eb7bf7b ACPI / hotplug / PCI: Make device_is_managed_by_native_pciehp() public
We're about to add runtime PM of hotplug ports, but we need to restrict it
to ports that are handled natively by the OS:  If they're handled by the
firmware (which is the case for Thunderbolt on non-Macs), things would
break if the OS put the ports into D3hot behind the firmware's back.

To determine if a hotplug port is handled natively, one has to walk up from
the port to the root bridge and check the cached _OSC Control Field for the
value of the "PCI Express Native Hot Plug control" bit.  There's already a
function to do that, device_is_managed_by_native_pciehp(), but it's private
to drivers/pci/hotplug/acpiphp_glue.c and only compiled in if
CONFIG_HOTPLUG_PCI_ACPI is enabled.

Make it public and move it to drivers/pci/pci-acpi.c, so that it is
available in the more general CONFIG_ACPI case.

The function contains a check if the device in question is a hotplug port
and returns false if it's not.  The caller we're going to add doesn't need
this as it only calls the function if it actually *is* a hotplug port.
Move the check out of the function into the single existing caller.

Rename it to pciehp_is_native() and add some kerneldoc and polish.

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-17 18:47:58 -06:00
Lukas Wunner
6ef13824e0 ACPI / hotplug / PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
We cache the PCI_EXP_SLTCAP_HPC bit in pci_dev->is_hotplug_bridge on device
probe, so there's no need to read it again when adding the ACPI hotplug
context.

Here's the call chain to prove that no ordering issue is introduced:

pci_scan_child_bus [drivers/pci/probe.c]
  pci_scan_slot
    pci_scan_single_device
      pci_scan_device
        pci_setup_device
          set_pcie_hotplug_bridge
            [is_hotplug_bridge bit is set here]
  pci_scan_bridge
    pci_add_new_bus
      pci_alloc_child_bus
        pcibios_add_bus  [arch/(x86|arm64|ia64)/...]
          acpi_pci_add_bus [drivers/pci/pci-acpi.c]
            acpiphp_enumerate_slots [drivers/pci/hotplug/acpiphp_glue.c]
              acpiphp_add_context
                device_is_managed_by_native_pciehp
                  [is_hotplug_bridge bit is queried here]

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-17 18:47:38 -06:00
Lukas Wunner
718a0609ae PCI: Unfold conditions to block runtime PM on PCIe ports
The conditions to block D3 on parent ports are currently condensed into a
single expression in pci_dev_check_d3cold().  Upcoming commits will add
further conditions for hotplug ports, making this expression fairly large
and impenetrable.  Unfold the conditions to maintain readability when they
are amended.

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-17 18:47:05 -06:00
Lukas Wunner
97a90aee5d PCI: Consolidate conditions to allow runtime PM on PCIe ports
The conditions to allow runtime PM on PCIe ports are currently spread
across two different files:  The condition relating to hotplug ports is
located in portdrv_pci.c whereas all other conditions are located in pci.c.

Consolidate all conditions in a single place in pci.c, thus making it
easier to follow the logic and amend conditions down the road.

Note that the condition relating to hotplug ports is inserted *before* the
condition relating to the "pcie_port_pm=force" command line option, so
runtime PM is not afforded to hotplug ports even if this option is given.
That's exactly how the code behaved up until now.  If this is not desired,
the ordering of the conditions can simply be reversed.

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-17 18:46:22 -06:00
Lukas Wunner
c6a6330706 PCI: Activate runtime PM on a PCIe port only if it can suspend
Currently pcie_portdrv_probe() activates runtime PM on a PCIe port even
if it will never actually suspend because the BIOS is too old or the
"pcie_port_pm=off" option was specified on the kernel command line.

A few CPU cycles can be saved by not activating runtime PM at all in these
cases, because rpm_idle() and rpm_suspend() will bail out right at the
beginning when calling rpm_check_suspend_allowed(), instead of carrying out
various locking and assignments, invoking rpm_callback(), getting back
-EBUSY and rolling everything back.

The conditions checked in pci_bridge_d3_possible() are all static, they
never change during uptime of the system, hence it's safe to call this to
determine if runtime PM should be activated.

No functional change intended.

Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-17 18:46:06 -06:00