Update the ls_add_pcie_port() signature to keep it consistent with the
other DesignWare-based host drivers.
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
For the LS1021a PCIe controller, some status registers are located in SCFG,
unlike other Layerscape devices.
Move SCFG-related code to ls1021_pcie_host_init() and rename
ls_pcie_link_up() to ls1021_pcie_link_up() because LTSSM status is also in
SCFG.
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Layerscape PCIe controller supports root complex (RC) and endpoint (EP)
modes, which can be set by RCW.
If not in RC mode, return -ENODEV without claiming the controller.
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ls_pcie_establish_link() does not do any real operation, except to wait for
the linkup establishment. In fact, this is not necessary. Moreover, each
PCIe controller not inserted device will increase the Linux startup time
about 200ms.
Remove ls_pcie_establish_link().
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously, dw_pcie_host_init() created the PCI host bridge with
pci_common_init_dev(), an ARM-specific function that supplies the ARM-
specific pci_sys_data structure as the PCI "sysdata".
Make pcie-designware.c arch-agnostic by reimplementing the functionality of
pci_common_init_dev() directly in dw_pcie_host_init().
Note that this changes the bridge sysdata from the ARM pci_sys_data to the
DesignWare pcie_port structure. This doesn't affect the ARM sysdata users
because they are all specific to non-DesignWare host bridges, which will
still have pci_sys_data.
[bhelgaas: changelog]
Tested-by: James Morse <james.morse@arm.com>
Tested-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Tested-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Use the new of_pci_get_host_bridge_resources() API in place of the PCI OF
DT parser.
[bhelgaas: changelog]
Tested-by: James Morse <james.morse@arm.com>
Tested-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Tested-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Revert f4c55c5a3f ("PCI: designware: Program ATU with untranslated
address").
Note that dra7xx_pcie_host_init() now modifies pp->io_base, but we still
need the original value for dw_pcie_setup() in the path below, so this adds
a new io_base_tmp member. It will be removed later when dw_pcie_setup() is
removed.
dra7xx_add_pcie_port
dw_pcie_host_init
pp->io_base = range.cpu_addr
pp->io_base_tmp = range.cpu_addr # <-- added
pp->ops->host_init
dra7xx_pcie_host_init # ops->host_init
pp->io_base &= DRA7XX_CPU_TO_BUS_ADDR # <-- modified
pci_common_init_dev(..., &dw_pci)
pcibios_init_hw
hw->setup
dw_pcie_setup # hw_pci.setup
pci_ioremap_io(..., pp->io_base_tmp) # <-- original addr required
[bhelgaas: changelog]
Tested-by: James Morse <james.morse@arm.com>
Tested-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Tested-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Commit f4c55c5a3f ("PCI: designware: Program ATU with untranslated
address") added the calculation of PCI bus addresses in pcie-designware.c,
storing them in new fields added in struct pcie_port. This calculation is
done for every DesignWare user even though it only applies to DRA7xx.
Move the calculation of the bus addresses to the DRA7xx driver to allow the
rework of DesignWare to use the new DT parsing API.
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Currently "num-lanes" is read in dw_pcie_host_init(), but it is only used
if we call dw_pcie_setup_rc() while bringing up the link. If the link has
already been brought up by firmware, we need not call dw_pcie_setup_rc(),
and "num-lanes" is unnecessary.
Only complain about "num-lanes" if we actually need it and we didn't find a
valid value.
[bhelgaas: changelog]
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add sanity checks on "addr" input parameter in dw_pcie_cfg_read() and
dw_pcie_cfg_write(). These checks make sure that accesses are aligned on
their size, e.g., a 4-byte config access is aligned on a 4-byte boundary.
[bhelgaas: changelog, set *val = 0 in failure case]
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Callers of dw_pcie_cfg_read() and dw_pcie_cfg_write() previously had to
split the address into "addr" and "where". The callees assumed "addr" was
32-bit aligned (with zeros in the low two bits) and they used only the low
two bits of "where".
Accept the entire address in "addr" and drop the now-redundant "where"
argument. As an example, this replaces this:
int dw_pcie_cfg_read(void __iomem *addr, int where, int size, u32 *val)
*val = readb(addr + (where & 1));
with this:
int dw_pcie_cfg_read(void __iomem *addr, int size, u32 *val)
*val = readb(addr):
[bhelgaas: changelog, split access size change to separate patch]
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
dw_pcie_cfg_write() uses the exact 8-, 16-, or 32-bit access size
requested, but dw_pcie_cfg_read() previously performed a 32-bit read and
masked out the bits requested.
Use the exact access size in dw_pcie_cfg_read(). For example, if we want
an 8-bit read, use readb() instead of using readl() and masking out the 8
bits we need. This makes it symmetric with dw_pcie_cfg_write().
[bhelgaas: split into separate patch, set *val = 0 in failure case]
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The first argument of dw_pcie_cfg_read/write() is a 32-bit aligned address.
The second argument is the byte offset into a 32-bit word, and
dw_pcie_cfg_read/write() only look at the low two bits.
SPEAr13xx used dw_pcie_cfg_read() and dw_pcie_cfg_write() incorrectly: it
passed important address bits in the second argument, where they were
ignored.
Pass the complete 32-bit word address in the first argument and only the
2-bit offset into that word in the second argument.
Without this fix, SPEAr13xx host will never work with few buggy gen1 card
which connects with only gen1 host and also with any endpoint which would
generate a read request of more than 128 bytes.
[bhelgaas: changelog]
Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.17+
Set up the high part of the MSI target address to allow the MSI target to
be above 4GB on 64bit and PAE systems.
[bhelgaas: changelog]
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
* pm-sleep:
PM / hibernate: fix a comment typo
input: i8042: Avoid resetting controller on system suspend/resume
PM / PCI / ACPI: Kick devices that might have been reset by firmware
PM / sleep: Add flags to indicate platform firmware involvement
PM / sleep: Drop pm_request_idle() from pm_generic_complete()
PCI / PM: Avoid resuming more devices during system suspend
PM / wakeup: wakeup_source_create: use kstrdup_const
PM / sleep: Report interrupt that caused system wakeup
NUMA
- Prevent out of bounds access in sysfs numa_node override (Sasha Levin)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWMqyJAAoJEFmIoMA60/r8PXMP/jk33V2VHXmcdAnRpdUM09at
VEosZlsPXmAGwlcbf0y9BMdwu8M8Kd1LAtiTXhbFchoWi3NG8ECPsIlYnQfqVzhm
VEEE4IyMHNfLOgQAf+ZfD5duBjsTDoyl3D2XYa8ugV1jJVs4Vpf2lyWVnwrG32E9
MTf2plaHWtjsser78PA0hQ5w5jJz41acgv9P88mdWmYyr+u2h+G8w+Ro2bLyVsiW
dcSIM7L1R6j9Kp52BqXq31rwHXQIF8v+yDaHNTKR6PzcufyuHKsK2fALa7LSam2P
EJEj7D8FVPFqYs2XRdPiYI+/wjMcM59CETIZ5NtEzjkQvoeTQhLa3iA8LrS4OMNI
JQWbPIHu9dB2Y2fFyeO31kW8+G8zgSKPcdhg9gAdoPspVX387+KHR+aiSMOlGsTu
wCyMQsuQSqcNkKGAyPcaQe6AUaI+3Ri3awuBV3/o20tNq2upPqeljvZa6v3W/Ua+
OSKE9rdRxsMzi1M3sLIDYIg0mD3K+horH52A3cjoOXehhSFX8pucbuk6bvYszPxq
0rPLX7fasbVo/yTLz4RgIk9LK2yxpg7TO1MRQb4byCbBqVJU+7R9JxakqstmJGXv
W0huOvn776rtcpxItfbckyfCsVhqcZ13xP1osjCcFLciSe+eq4dKCY1iH0OyvSOu
S+TkJWdpKEU9L0Tfxjvo
=nuxn
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Sorry for this last-minute update; it's been in -next for quite a
while, but I forgot about it until I started getting ready for the
merge window.
It's small and fixes a way a user could cause a panic via sysfs, so I
think it's worth getting it in v4.3.
NUMA:
- Prevent out of bounds access in sysfs numa_node override (Sasha Levin)"
* tag 'pci-v4.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Prevent out of bounds access in numa_node override
Disable VFs if pcibios_enable_sriov() fails, just like we do for other
errors in sriov_enable(). Call pcibios_sriov_disable() if virtfn_add()
fails.
[bhelgaas: changelog, split to separate patch for reviewability]
Fixes: 995df527f3 ("PCI: Add pcibios_sriov_enable() and pcibios_sriov_disable()")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing
VF Enable before reading any field in the SR-IOV Extended Capability.
Wait 1 second before calling pci_iov_set_numvfs(), which reads
PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF.
[bhelgaas: split to separate patch for reviewability, add spec reference]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move pcibios_sriov_disable() up so it's defined before a future use.
[bhelgaas: split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
If virtfn_add() fails, we call virtfn_remove() for any previously added
devices. Remove the devices in reverse order (first-added is
last-removed), which is more natural and doesn't require an additional
variable.
[bhelgaas: changelog, split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
On ARM64, setting the root bus number to -1 causes probe failure.
Moreover, we should use the bus number specified in the DT as we could have
multiple PCIe controllers with different bus ranges.
Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
The R-Car PCIe host controller driver uses pci_common_init_dev(), which is
ARM-specific and requires the ARM struct hw_pci. The part of
pci_common_init_dev() that is needed is limited and can be done here
without using hw_pci.
Note that the ARM pcibios functions expect the PCI sysdata to be a pointer
to a struct pci_sys_data. Add a struct pci_sys_data as the first element
in struct gen_pci so that when we use a gen_pci pointer as sysdata, it is
also a pointer to a struct pci_sys_data.
Create and scan the root bus directly without using the ARM
pci_common_init_dev() interface.
Based on 499733e0cc ("PCI: generic: Remove dependency on ARM-specific
struct hw_pci").
Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Make PCI aware of the I/O resources.
Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
The pcie-rcar.c driver (controlled by PCI_RCAR_GEN2_PCIE) uses struct
pci_sys_data and pci_ioremap_io(), which only exist on ARM. Building it on
other arches, e.g., arm64/shmobile, causes errors like this:
drivers/pci/host/pcie-rcar.c:138:52: warning: 'struct pci_sys_data' declared inside parameter list
drivers/pci/host/pcie-rcar.c:380:4: error: implicit declaration of function 'pci_ioremap_io' [-Werror=implicit-function-declaration]
Build pcie-rcar.c only on ARM.
[bhelgaas: changelog, split to separate pci-rcar-gen2 from pcie-rcar]
Reported-by: Wolfram Sang <wsa@the-dreams.de> (pci_ioremap_io())
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci-rcar-gen2.c driver (controlled by PCI_RCAR_GEN2) uses struct
pci_sys_data, which only exists on ARM. Building it on other arches, e.g.,
arm64/shmobile, causes errors like this:
drivers/pci/host/pci-rcar-gen2.c: In function 'rcar_pci_cfg_base': drivers/pci/host/pci-rcar-gen2.c:112:34: error: dereferencing pointer to incomplete type
struct rcar_pci_priv *priv = sys->private_data;
^
Build pci-rcar-gen2.c only on ARM.
[bhelgaas: changelog, split to separate pci-rcar-gen2 from pcie-rcar]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
An Enhanced Allocation Capability entry with BEI 0 fills in
dev->resource[0] just like a real BAR 0 would, but non-EA experts might not
connect "EA - BEI 0" with BAR 0.
Decode the EA jargon a little bit, e.g., change this:
pci 0002:01:00.0: EA - BEI 0, Prop 0x00: [mem 0x84300000-0x84303fff]
to this:
pci 0002:01:00.0: BAR 0: [mem 0x84300000-0x84303fff] (from Enhanced Allocation, properties 0x00)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Expand bitmask #defines completely. This puts the shift in the code
instead of in the #define, but it makes it more obvious in the header file
how fields in the register are laid out.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
SR-IOV BARs can be specified via EA entries. Extend the EA parser to
extract the SRIOV BAR resources, and modify sriov_init() to use resources
previously obtained via EA.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean O. Stalley <sean.stalley@intel.com>
Add support for devices using Enhanced Allocation entries instead of BARs.
This allows the kernel to parse the EA Extended Capability structure in PCI
config space and claim the BAR-equivalent resources.
See https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_Allocation_23_Oct_2014_Final.pdf
[bhelgaas: add spec URL, s/pci_ea_set_flags/pci_ea_flags/, consolidate
declarations, print unknown property in hex to match spec]
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
[david.daney@cavium.com: Add more support/checking for Entry Properties,
allow EA behind bridges, rewrite some error messages.]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The new Enhanced Allocation (EA) capability support (patches to follow)
creates resources with the IORESOURCE_PCI_FIXED set. During resource
assignment in pci_bus_assign_resources(), IORESOURCE_PCI_FIXED resources
are not given a parent. This, in turn, causes pci_enable_resources() to
fail with a "not claimed" error.
So, in __pci_bus_assign_resources(), for IORESOURCE_PCI_FIXED resources,
try to request the resource from a parent bus.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean O. Stalley <sean.stalley@intel.com>
The new Enhanced Allocation (EA) capability support (patches to follow)
creates resources with the IORESOURCE_PCI_FIXED set. Since these resources
cannot be relocated or resized, their alignment is not really defined, and
it is therefore not specified. This causes a problem in pbus_size_mem()
where resources with unspecified alignment are disabled.
So, in pbus_size_mem() skip IORESOURCE_PCI_FIXED resources, instead of
disabling them.
[bhelgaas: folded in "flags & IORESOURCE_PCI_FIXED" fix from David]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean O. Stalley <sean.stalley@intel.com>
Previously, we read, validated, and cached PCI_SRIOV_VF_OFFSET and
PCI_SRIOV_VF_STRIDE in sriov_enable(). But sriov_init() now does
that via compute_max_vf_buses(), so we don't need to do it again.
Remove the PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE config reads from
sriov_enable(). The pci_sriov structure already contains the offset and
stride corresponding to the current NumVFs.
[bhelgaas: split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>
The enumeration path should leave NumVFs set to zero. But after
4449f07972 ("PCI: Calculate maximum number of buses required for VFs"),
we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
This NumVFs change is visible via lspci and sysfs until a driver enables
SR-IOV.
Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
computing the maximum number of buses. Validate offset and stride in
the loop, so we can test it at every possible NumVFs setting. Rename
virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
side effect of updating iov->max_VF_buses.
[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
reverse sense of error path]
Fixes: 4449f07972 ("PCI: Calculate maximum number of buses required for VFs")
Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
For some SR-IOV devices, the number of available virtual functions, i.e.,
TotalVFs, increases after setting the ARI Capable Hierarchy bit in the
SR-IOV Control register. This violates the SR-IOV spec, r1.1, sec 3.3.6,
which says TotalVFs is HwInit, but we don't need TotalVFs before setting
the ARI Capable bit anyway.
Set the ARI Capable Hierarchy bit (if ARI is enabled in the upstream
bridge) before reading TotalVFs.
[bhelgaas: changelog]
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add the Altera PCIe host controller driver.
[bhelgaas: whitespace, fold in DT and maintainer updates, OF_PCI
dependency from Arnd]
Signed-off-by: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Rob Herring <robh@kernel.org> (DT binding)
The Chelsio T5 has a PCIe compliance erratum that causes Malformed TLP or
Unexpected Completion errors in some systems, which may cause device access
timeouts.
Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same values
for the Attribute as were supplied in the header of the corresponding
Request, except as explicitly allowed when IDO is used."
Instead of copying the Attributes from the Request to the Completion, the
T5 always generates Completions with zero Attributes. The receiver of a
Completion whose Attributes don't match the Request may accept it (which
itself seems non-compliant based on sec 2.3.2), or it may handle it as a
Malformed TLP or an Unexpected Completion, which will probably lead to a
device access timeout.
Work around this by disabling "Relaxed Ordering" and "No Snoop" in the Root
Port so it always generate Requests with zero Attributes.
This does affect all other devices which are downstream of that Root Port,
but these are performance optimizations that should not make a functional
difference.
Note that Configuration Space accesses are never supposed to have TLP
Attributes, so we're safe waiting till after any Configuration Space
accesses to do the Root Port "fixup".
Based on original work by Casey Leedom <leedom@chelsio.com>
[bhelgaas: changelog, comments, rename to pci_find_pcie_root_port(), rework
to use pci_upstream_bridge() and check for Root Port device type, edit
diagnostics to clarify intent and devices affected]
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Up to now, work items to be queued to be handled by pciehp_power_thread()
are allocated using kmalloc() in three different locations. If not needed,
kfree() is called to free the allocated data.
Introduce a separate function to allocate the work item and queue it, and
call it only if needed. This reduces code duplication and avoids having to
free memory if the work item does not need to get executed.
[bhelgaas: tweak "no memory" message, make pciehp_queue_power_work() static]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Certain SoCs require the PCIe outbound mapping to be configured in
software. Add support for those chips.
[jonmason: Use %pap format when printing size_t to avoid warnings in 32-bit
build.]
[arnd: Use div64_u64() instead of "%" to avoid __aeabi_uldivmod link error
in 32-bit build.]
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Jon Mason <jonmason@broadcom.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
So far, we've always considered that for a given PCI device, its
MSI controller was either set by the architecture-specific
pcibios hook, or simply inherited from the host bridge.
This doesn't cover things like firmware-defined topologies like
msi-map (DT) or IORT (ACPI), which can provide information about
which MSI controller to use on a per-device basis.
This patch adds the necessary hook into the MSI code to allow this
feature, and provides the msi-map functionnality as a first
implementation.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, we have considered that the MSI domain for a device was
either set via the architecture-dependent pcibios implementation
or inherited from the host bridge.
As we're about to break that assumption, add pci_dev_msi_domain
which is the equivalent of pci_host_bridge_msi_domain, but for
a single device.
Other than moving things around a bit, this patch on its own
has no effect.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Now that we have a function that implements the complexity of the
"msi-parent" property parsing, switch to that.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add pci_msi_domain_get_msi_rid() to return the MSI requester id (RID).
Initially needed by gic-v3 based systems. It will be used by follow on
patch to drivers/irqchip/irq-gic-v3-its-pci-msi.c
Initially supports mapping the RID via OF device tree. In the future,
this could be extended to use ACPI _IORT tables as well.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When we create a generic MSI domain, that MSI_FLAG_USE_DEF_CHIP_OPS
is set, and that any of .mask or .unmask are NULL in the irq_chip
structure, we set them to pci_msi_[un]mask_irq.
This is a bad idea for at least two reasons:
- PCI_MSI might not be selected, kernel fails to build (yes, this is
legitimate, at least on arm64!)
- This may not be a PCI/MSI domain at all (platform MSI, for example)
Either way, this looks wrong. Move the overriding of mask/unmask to
the PCI counterpart, and panic is any of these two methods is not
set in the core code (they really should be present).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1444760085-27857-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irqbalance uses sysfs attributes to populate its internal database, which
is then used to bind the IRQ to the appropriate NUMA node.
On a device accepting multiple MSIs and with interrupt remapping enabled,
only the first IRQ entry is exported in the "msi_irqs" directory. This
results in irqbalance having no clue of the NUMA affinity for the extra
IRQs, so it can't bind them to the correct node.
Export all MSI interrupts as sysfs attributes when relevant.
[bhelgaas: changelog]
Signed-off-by: Romain Bezut <rbezut@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
There is a concern that if the platform firmware was involved in
the system resume that's being completed, some devices might have
been reset by it and if those devices had the power.direct_complete
flag set during the preceding suspend transition, they may stay
in a reset-power-on state indefinitely (until they are runtime-resumed
and then suspended again). That may not be a big deal from the
individual device's perspective, but if the system is an SoC, it may
be prevented from entering deep SoC-wide low-power states on idle
because of that.
The devices that are most likely to be affected by this issue are
PCI devices and ACPI-enumerated devices using the general ACPI PM
domain, so to prevent it from happening for those devices, force a
runtime resume for them if they have their power.direct_complete
flags set and the platform firmware was involved in the resume
transition currently in progress.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The pm_request_idle() in pm_generic_complete() is pointless as it is
called with the runtime PM usage counter different from zero (bumped
up by the core during the prepare phase of system suspend) and the
core calls pm_runtime_put() for all devices after executing their
complete callbacks, so drop it.
This allows the PCI PM layer to use pm_generic_complete() too.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
As we continue to push of_node towards the outskirts of irq domains,
let's start tackling the case of msi_create_irq_domain and its little
friends.
This has limited impact in both PCI/MSI, platform MSI, and a few
drivers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-17-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit bac2a909a0 (PCI / PM: Avoid resuming PCI devices during
system suspend) introduced a mechanism by which some PCI devices that
were runtime-suspended at the system suspend time might be left in
that state for the duration of the system suspend-resume cycle.
However, it overlooked devices that were marked as capable of waking
up the system just because PME support was detected in their PCI
config space.
Namely, in that case, device_can_wakeup(dev) returns 'true' for the
device and if the device is not configured for system wakeup,
device_may_wakeup(dev) returns 'false' and it will be resumed during
system suspend even though configuring it for system wakeup may not
really make sense at all.
To avoid this problem, simply disable PME for PCI devices that have
not been configured for system wakeup and are runtime-suspended at
the system suspend time for the duration of the suspend-resume cycle.
If the device is in D3cold, its config space is not available and it
shouldn't be written to, but that's only possible if the device
has platform PM support and the platform code is responsible for
checking whether or not the device's configuration is suitable for
system suspend in that case.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make the offset from the beginning of the "reg" property be from the
starting bus number, rather than zero. Hoist the invariant size
calculation out of the mapping for loop.
Update host-generic-pci.txt to clarify the semantics of the "reg" property
with respect to non-zero starting bus numbers.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rob Herring <robh@kernel.org>
Now that we advertise a PCIe capability, the Linux PCI layer will not scan
the bus for devices other than in slot 0. This makes the work-around to
trap accesses to devices other than slot 0 unnecessary.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Add a PCI Express root complex capability block so the PCI layer identifies
the bridge as a PCI Express device.
We expose this as a version 1 PCIe capability block, with slot support. We
disable the clock power management capability as this depends on boards
wiring the CLKREQ# signal.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Add an implementation to handle clock and reset handling that is compliant
with the PCIe specification. The clock should be running and stable for
100us prior to reset being released, and we should re-assert reset prior to
stopping the clock.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Use a gpio_desc to carry around the gpio, so we can then make use of the
GPIOF_ACTIVE_LOW property rather than carrying that around as well. This
also avoids needing to use gpio_is_valid() to check whether we have a GPIO;
checking for a non-NULL descriptor is simpler.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Rather than using devm_kzalloc() and multiplying the element and number,
use the provided devm_kcalloc() helper for this.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
We are in a context where we can sleep, and the PCIe reset gpio may be on
an I2C expander. Use the cansleep() variant when setting the GPIO value.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Split the PCIe port DT parsing and resource claiming from setting up the
actual ports. This allows us to gather all the resources first, before
touching the hardware. This is important as some of these resources (such
as the GPIO for the PCIe reset) may defer probing.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The mvebu PCI port parsing is weak due to:
1) allocations via kasprintf() were not cleaned up when we encounter an
error or decide to skip the port.
2) kasprintf() wasn't checked for failure.
3) of_get_named_gpio_flags() returns EPROBE_DEFER if the GPIO is not
present, not devm_gpio_request_one().
4) the of_node was not being put when terminating the loop.
Fix these oversights.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Move the PCIe port parsing and resource claiming to a separate function in
preparation to add proper cleanup of claimed resources.
Tested-by: Willy Tarreau <w@1wt.eu> (Iomega iConnect Kirkwood, MiraBox Armada 370)
Tested-by: Andrew Lunn <andrew@lunn.ch> (D-Link DIR664 Kirkwood)
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Use the port->name string which we previously formatted when referring to
the name of a port, rather than manually creating the port name each time.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
If we have a missing required property, report the full node name rather
than a vague "PCIe DT node" statement. This allows the exact node in error
to be identified immediately.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Rather than using for_each_child_of_node() and testing each child's
availability, use the for_each_available_child_of_node() helper instead.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Rather than open-coding of_get_available_child_count(), use the provided
helper instead.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
The idea that you can arbitarily read 32-bits from PCI configuration space,
modify a sub-field (like the command register) and write it back without
consequence is deeply flawed.
Status registers (such as the status register, PCIe device status register,
etc) contain status bits which are read, write-one-to-clear.
What this means is that reading 32-bits from the command register,
modifying the command register, and then writing it back has the effect of
clearing any status bits that were indicating at that time. Same for the
PCIe device control register clearing bits in the PCIe device status
register.
Since the Armada chips support byte, 16-bit and 32-bit accesses to the
registers (unless otherwise stated) and the PCI configuration data register
does not specify otherwise, it seems logical that the chip can indeed
generate the proper configuration access cycles down to byte level.
Testing with an ASM1062 PCIe to SATA mini-PCIe card on Armada 388. PCIe
capability at 0x80, DevCtl at 0x88, DevSta at 0x8a.
Before:
/# setpci -s 1:0.0 0x88.l - DevSta: CorrErr+
00012810
/# setpci -s 1:0.0 0x88.w=0x2810 - Write DevCtl only
/# setpci -s 1:0.0 0x88.l - CorrErr cleared - FAIL
00002810
After:
/# setpci -s 1:0.0 0x88.l - DevSta: CorrErr+
00012810
/# setpci -s 1:0.0 0x88.w=0x2810 - check DevCtl only write
/# setpci -s 1:0.0 0x88.l - CorErr remains set
00012810
/# setpci -s 1:0.0 0x88.w=0x281f - check DevCtl write works
/# setpci -s 1:0.0 0x88.l - devctl field updated
0001281f
/# setpci -s 1:0.0 0x8a.w=0xffff - clear DevSta
/# setpci -s 1:0.0 0x88.l - CorrErr now cleared
0000281f
/# setpci -s 1:0.0 0x88.w=0x2810 - restore DevCtl
/# setpci -s 1:0.0 0x88.l - check
00002810
[bhelgaas: changelog]
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
PCI requires reads to reserved or unimplemented configuration space to
return zero and complete normally (see PCI r3.0, sec 6.1). However, the
root port software implementation was returning 0xfffffff and
PCIBIOS_BAD_REGISTER_NUMBER.
Return zero when reading reserved or unimplemented config space.
[bhelgaas: changelog]
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> (Armada XP GP)
Tested-by: Andrew Lunn <andrew@lunn.ch> (Kirkwood DIR665)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
If the bus is being configured with a bus-range that does not start at
zero, pass that starting bus number to pci_scan_root_bus(). Passing the
incorrect value of zero causes attempted config accesses outside of the
supported range, which cascades to an OOPs spew and eventual kernel panic.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
The generic driver kept a global struct pci_ops ("gen_pci_ops") which it
patched with the .map_bus() method appropriate for the bus device. This is
a problem when we have two different types of bus devices: the .map_bus()
method for the last device probed clobbers the method for previous devices.
The result is that only the last bus device probed has the correct
.map_bus(), and the others fail.
Move the struct pci_ops into the bus-specific structure and initialize a
pointer to it when the bus device is probed.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
63692df103 ("PCI: Allow numa_node override via sysfs") didn't check that
the numa node provided by userspace is valid. Passing a node number too
high would attempt to access invalid memory and trigger a kernel panic.
Fixes: 63692df103 ("PCI: Allow numa_node override via sysfs")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.19+
After 8d63bc7bea ("PCI/MSI: pci-xgene-msi: Get rid of struct
msi_controller"), it is no longer required to assign msi_controller for
X-Gene PCIe host bridge to support MSI.
Remove this unnecessary code. This also avoids a warning message ("failed
to enable MSI") during boot.
[bhelgaas: changelog]
Signed-off-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Tanmay Inamdar <tinamdar@apm.com>
Improve the link detection logic by explicitly querying the link status
register to ensure link is active.
Also force class to PCI_CLASS_BRIDGE_PCI (0x0604) through the host
configuration space register.
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Scott Branden <sbranden@broadcom.com>
The current reset logic does not always properly reset the device. For
example, in the case when the perst_b signal is already de-asserted in the
bootloader, the current reset logic fails to trigger a proper assert ->
de-assert reset sequence.
Fix the issue by always triggering the proper reset sequence.
Also explicitly select the desired reset source, i.e., perst_b, and reduce
the wait time after the device comes out of reset from 250 ms to 100 ms,
based on recommendation from the ASIC team.
Tested-by: Vladimir Dreizin <vdreizin@broadcom.com>
Tested-by: Darren Edamura <dedamura@broadcom.com>
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Vladimir Dreizin <vdreizin@broadcom.com>
Reviewed-by: Trac Hoang <trhoang@broadcom.com>
Reviewed-by: Scott Branden <sbranden@broadcom.com>
After 459a07721c ("PCI: Build setup-irq.o for arm64"), we build
setup-irq.o for arm64, so we can use pci_fixup_irqs() on both arm and
arm64.
Remove the "#ifdef CONFIG_ARM" around the call to pci_fixup_irqs().
[bhelgaas: changelog]
Signed-off-by: Ray Jui <rjui@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Resource management
- Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
- Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas)
MSI
- Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
Renesas R-Car host bridge driver
- Add R8A7794 support (Sergei Shtylyov)
Miscellaneous
- Fix devfn for VPD access through function 0 (Alex Williamson)
- Use function 0 VPD only for identical functions (Alex Williamson)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWBUq2AAoJEFmIoMA60/r8wbwP/0D/+fKEPYJlB6hx1wLHpVk3
K//vEwH0RgA3v2X53QUoHg94gTYhSZLKX0zdAFshbphE0HCZ6AO3UO+/ZJ3cui6J
PYvKOnhby2ErNotZqrs3DQIM8rGgl0ZVgoFQrAWEvwiHRHI/r2ArK/oR4PiBjxJT
StYuJoTkZIlJyHXza6tvHDcWi+Jc8t8r0HC4Vs32BlaVBQM0SH3CMxHfhJw/Q9xP
WHFif1sH0N+p7WDyHH71C1T8POOgXY73BsD2AC0se3lRYZ9SVkOVy9ECGUucx8F6
LDAuFelwRvW2Dr9kh38+5f8Xp155E+eZ6zRWW9/JlrUKVEtHhOFhtrRfDNKHuDCt
B9ETrxDiSUFAdQ2weye9BK6aXK0CHF6YP3PCbvK77qFUUsN8csFSKktanKrFAbML
CdjkVkEoeLHw+aXzyDg0pSBRZMQ24dTQDh7YqOFZGuEjCLPXOEQ8nitf0IzBB0KI
4QetT/QK3bKkgtVKTwPP+s9f4g+fA/oiwJ21ZTV9hi/9upywTa/umCUvH9Fmb8Fp
VZeTzugSht0+ioXpaF/6/KO0Ccp/t5uAHYeuBBMqiHX7ks8DdnfPCwbWNRKkg25O
Qy7Y8VnnOtesRCAqBq5y/hHlLUluMkjYpEblYFiD6HBWcjUh6xE6LlIO4mwnjqWI
zjB3w7+0GOrvS7dBSx0N
=f4c3
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"These are fixes for things we merged for v4.3 (VPD, MSI, and bridge
window management), and a new Renesas R8A7794 SoC device ID.
Details:
Resource management:
- Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
- Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn
Helgaas)
MSI:
- Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
Renesas R-Car host bridge driver:
- Add R8A7794 support (Sergei Shtylyov)
Miscellaneous:
- Fix devfn for VPD access through function 0 (Alex Williamson)
- Use function 0 VPD only for identical functions (Alex Williamson)"
* tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: rcar: Add R8A7794 support
PCI: Use function 0 VPD for identical functions, regular VPD for others
PCI: Fix devfn for VPD access through function 0
PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses
PCI: Clear IORESOURCE_UNSET when clipping a bridge window
PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
Section 3.2 "Device Runtime Power Management" of pci.txt has become
outdated, so update it to correctly reflect the current code flow.
Also update the comment in local_pci_probe() to document the fact
that pm_runtime_put_noidle() is not the only runtime PM helper
function that can be used to decrement the device's runtime PM
usage counter in .probe().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Add a #define for PCIE_PHY_RX_ASIC_OUT_VALID and use it instead of a
hardcoded value.
[bhelgaas: drop PCIE_PHY_DEBUG_R0_LTSSM_MASK; updated in future patch]
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
When devm_request_irq() fails, imx6_add_pcie_port() should return the real
error code instead of always returning -ENODEV.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Add Renesas R8A7794 SoC support to the Renesas R-Car gen2 PCI driver.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
932c435cab ("PCI: Add dev_flags bit to access VPD through function 0")
added PCI_DEV_FLAGS_VPD_REF_F0. Previously, we set the flag on every
non-zero function of quirked devices. If a function turned out to be
different from function 0, i.e., it had a different class, vendor ID, or
device ID, the flag remained set but we didn't make VPD accessible at all.
Flip this around so we only set PCI_DEV_FLAGS_VPD_REF_F0 for functions that
are identical to function 0, and allow regular VPD access for any other
functions.
[bhelgaas: changelog, stable tag]
Fixes: 932c435cab ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
CC: stable@vger.kernel.org
Commit 932c435cab ("PCI: Add dev_flags bit to access VPD through function
0") passes PCI_SLOT(devfn) for the devfn parameter of pci_get_slot().
Generally this works because we're fairly well guaranteed that a PCIe
device is at slot address 0, but for the general case, including
conventional PCI, it's incorrect. We need to get the slot and then convert
it back into a devfn.
Fixes: 932c435cab ("PCI: Add dev_flags bit to access VPD through function 0")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Myron Stowe <myron.stowe@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
CC: stable@vger.kernel.org
SR-IOV creates a virtual bus where bus->self is NULL. When we add VFs and
scan for an MSI domain, pci_set_bus_msi_domain() dereferences bus->self,
which causes a kernel NULL pointer dereference oops.
Scan up to the parent bus until we find a real bridge where we can get the
MSI domain.
[bhelgaas: changelog]
Fixes: 44aa0c657e ("PCI/MSI: Add hooks to populate the msi_domain field")
Tested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
MSI is broken on SiS 761 chipset at least on PC Chips A31G board. No
interrupts are delivered once MSI is enabled for a device. This causes
hang on X11 start with a nVidia card installed (with nouveau driver).
Disable MSI completely for this chipset.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
If pci_assign_resource() fails to assign space for a BAR, we may restore
the BAR to whatever firmware left there at boot-time (this depends on
whether the arch implements pcibios_retrieve_fw_addr()). The messages we
print are not as useful as they could be:
pci 0000:00:01.0: BAR 15: assigned [mem 0xc0000000-0xc01fffff 64bit pref]
pci 0000:01:00.0: BAR 0: no space for [mem size 0x10000000 pref]
pci 0000:01:00.0: BAR 0: trying firmware assignment [mem size 0x10000000 pref]
pci 0000:01:00.0: BAR 0: [mem size 0x10000000 pref] conflicts with PCI Bus 0000:00 [mem 0xc0000000-0xffffffff window]
The last two lines should contain the actual BAR address, not the size.
Clear IORESOURCE_UNSET so we print the address. If requesting the
firmware-assigned resource fails, mark it IORESOURCE_UNSET again.
This is a cosmetic change to clarify the message: previously, if
pci_revert_fw_address() succeeded, pci_assign_resource() cleared
IORESOURCE_UNSET anyway, so this isn't really a functional change.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491#c50
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
c770cb4cb5 ("PCI: Mark invalid BARs as unassigned") sets IORESOURCE_UNSET
if we fail to claim a resource. If we tried to claim a bridge window,
failed, clipped the window, and tried to claim the clipped window, we
failed again because of IORESOURCE_UNSET:
pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff window]
pci 0000:00:01.0: can't claim BAR 15 [mem 0xbdf00000-0xddefffff 64bit pref]: no compatible bridge window
pci 0000:00:01.0: [mem size 0x20000000 64bit pref] clipped to [mem size 0x1df00000 64bit pref]
pci 0000:00:01.0: bridge window [mem size 0x1df00000 64bit pref]
pci 0000:00:01.0: can't claim BAR 15 [mem size 0x1df00000 64bit pref]: no address assigned
The 00:01.0 window started as [mem 0xbdf00000-0xddefffff 64bit pref]. That
starts before the host bridge window [mem 0xc0000000-0xffffffff window], so
we clipped the 00:01.0 window to [mem 0xc0000000-0xddefffff 64bit pref].
But we left it marked IORESOURCE_UNSET, so the second claim failed when it
should have succeeded.
This means downstream devices will also fail for lack of resources, e.g.,
in the bugzilla below,
radeon 0000:01:00.0: Fatal error during GPU init
Clear IORESOURCE_UNSET when we clip a bridge window. Also clear
IORESOURCE_UNSET in our copy of the unclipped window so we can see exactly
what the original window was and how it now fits inside the upstream
window.
Fixes: c770cb4cb5 ("PCI: Mark invalid BARs as unassigned")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491#c47
Based-on-patch-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Based-on-patch-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
CC: stable@vger.kernel.org # v4.1+
In store_remove_id(), set the default return value to -ENODEV, and
overwrite it with the input buffer size if we find a matching list entry.
Then we don't need to test whether to return an error or the count.
No functional change.
[bhelgaas: changelog]
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make get_msi_addr() return phys_addr_t, not u32. This allows the MSI
target address to be above 4GB for 64bit or PAE systems.
No functional change for the current 32bit platform users as phys_addr_t
maps to u32 for them.
[bhelgaas: changelog]
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Implement multivector MSI IRQ setup. This allows to set up and use multiple
MSI IRQs per device.
[bhelgaas: changelog, use -EINVAL instead of -ENOSYS]
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Factor out the PCI MSI message setup from the single MSI setup function.
This will be reused by the multivector MSI setup.
No functional change yet.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Pratyush Anand <pratyush.anand@gmail.com>
Add a msi_controller setup_irqs() method so MSI chip providers can
implement their own multivector MSI setup.
[bhelgaas: changelog]
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pratyush Anand <pratyush.anand@gmail.com>
The value under PORT_LOGIC_LINK_WIDTH_MASK is 0x1, 0x2, 0x4, 0x8. In IP
v4.2, bits [16:8] are defined for NUM_OF_LANES. But in IP v4.4, bits[12:8]
are defined for NUM_OF_LANES, bits [16:13] are for other usages (bit 16 is
AUTO_LANE_FLIP_CTRL_EN, bits [15:13] are PRE_DET_LANE).
As there is no conflict about NUM_OF_LANES between v4.2 and v4.4, change
the mask value to avoid future problems.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
When pci-host-generic looks for the probe-only property, it seems to trust
the DT to be correctly written, and assumes that there is a parameter to
the property.
Unfortunately, this is not always the case, and some firmware expose this
property naked. The driver ends up making a decision based on whatever the
property pointer points to, which is likely to be junk.
Switch to the common of_pci.c implementation that doesn't suffer from this
problem.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
AER errors might be recorded when powering-on devices. These errors can be
ignored, so firmware usually clears them before the OS enumerates devices.
However, firmware is not involved when devices are added via hotplug, so
the OS may discover power-up errors that should be ignored. The same may
happen when powering up devices when resuming after suspend.
Clear the AER error status registers during enumeration and resume.
[bhelgaas: changelog, remove repetitive comments]
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.
Remove the argument.
Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Revert dff22d2054 ("PCI: Call pci_read_bridge_bases() from core instead
of arch code").
Reading PCI bridge windows is not arch-specific in itself, but there is PCI
core code that doesn't work correctly if we read them too early. For
example, Hannes found this case on an ARM Freescale i.mx6 board:
pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
0x204100 of space, and mem windows are megabyte-aligned.
Bus sizing can increase a bridge window size, but never *decrease* it (see
d65245c329 ("PCI: don't shrink bridge resources")). Prior to
dff22d2054, ARM didn't read bridge windows at all, so the "original size"
was zero, and we assigned a 3MB window.
After dff22d2054, we read the bridge windows before sizing the bus. The
firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
we never decrease the size, we kept 16MB even though we only needed 3MB.
But 16MB doesn't fit in the host bridge aperture, so we failed to assign
space for the window and the downstream devices.
I think this is a defect in the PCI core: we shouldn't rely on the firmware
to assign sensible windows.
Ray reported a similar problem, also on ARM, with Broadcom iProc.
Issues like this are too hard to fix right now, so revert dff22d2054.
Reported-by: Hannes <oe5hpm@gmail.com>
Reported-by: Ray Jui <rjui@broadcom.com>
Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
Don't update VF BARs in pci_restore_bars().
This avoids spurious "BAR %d: error updating" messages that we see when
doing vfio pass-through after 6eb7018705 ("vfio-pci: Move idle devices to
D3hot power state").
[bhelgaas: changelog, fix whitespace]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are two kexec load syscalls, kexec_load another and kexec_file_load.
kexec_file_load has been splited as kernel/kexec_file.c. In this patch I
split kexec_load syscall code to kernel/kexec.c.
And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
use kexec_file_load only, or vice verse.
The original requirement is from Ted Ts'o, he want kexec kernel signature
being checked with CONFIG_KEXEC_VERIFY_SIG enabled. But kexec-tools use
kexec_load syscall can bypass the checking.
Vivek Goyal proposed to create a common kconfig option so user can compile
in only one syscall for loading kexec kernel. KEXEC/KEXEC_FILE selects
KEXEC_CORE so that old config files still work.
Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
KEXEC_CORE in arch Kconfig. Also updated general kernel code with to
kexec_load syscall.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull parisc updates from Helge Deller:
"The most important changes in this patchset are:
- re-enable 64bit PCI bus addresses which were temporarily disabled
for PA-RISC in kernel 4.2
- fix the 64bit CAS operation in the LWS path which now enables us to
enable the 64bit gcc atomic builtins even on 32bit userspace with
64bit kernel
- fix a long-standing bug which sometimes crashed kernel at bootup
while serial interrupt wasn't registered yet"
* 'parisc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Use platform_device_register_simple("rtc-generic")
parisc: Drop CONFIG_SMP around update_cr16_clocksource()
parisc: Use double word condition in 64bit CAS operation
parisc: Filter out spurious interrupts in PA-RISC irq handler
parisc: Additionally check for in_atomic() in page fault handler
PCI,parisc: Enable 64-bit bus addresses on PA-RISC
parisc: Define ioremap_uc and ioremap_wc
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map. This facility is used by the pmem driver to
enable pfn_to_page() operations on the page frames returned by DAX
('direct_access' in 'struct block_device_operations'). For now, the
'memmap' allocation for these "device" pages comes from "System
RAM". Support for allocating the memmap from device memory will
arrive in a later kernel.
2/ Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3. Completion of
the conversion is targeted for v4.4.
3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
5/ Miscellaneous updates and fixes to libnvdimm including support
for issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
slsw6DkrWT60CRE42nbK
=o57/
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
Commit 3a9ad0b ("PCI: Add pci_bus_addr_t") unconditionally introduced usage of
64-bit PCI bus addresses on all 64-bit platforms which broke PA-RISC.
It turned out that due to enabling the 64-bit addresses, the PCI logic decided
to use the GMMIO instead of the LMMIO region. This commit simply disables
registering the GMMIO and thus we fall back to use the LMMIO region as before.
Reverts commit 45ea2a5fed
("PCI: Don't use 64-bit bus addresses on PA-RISC")
To: linux-parisc@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Meelis Roos <mroos@linux.ee>
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Helge Deller <deller@gmx.de>
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq departement provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
Here's our branch of ARM64 contents for this merge window.
Most of this is DT contents for new SoCs (or those who have seen new
device support added). Maybe we should stop separating out the arm64
contents here to avoid the kind of internal conflicts as we got this
time around, where 32- and 64-bit contents conflicted.
Anyhow, on the actual contents:
New SoCs:
- Broadcom North Star 2 (ns2)
- Marvell Berlin4CT
- Mediatek MT6795
- Rockchip RK3368
In addition, there are enhancements for the following platforms:
- Mediatek MT8173: cpuidle-dt updates, misc other additions
- ZyncMP: A bunch of devices added to the existing DTSI
- Qualcomm MSM8916 and APQ8016 updates for USB, etc.
+ A handful of other updates for various platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV5N8IAAoJEIwa5zzehBx3qt0P/ju6RvgS2vUDk6QxjBvzp2oN
daXOj73Nefe4Qef7GT5k5K1hLFVflKmesIDqG/eTfGTDSjMjQaJufZTKGh+5arLQ
28XAm897gsjt+NVzEgdkG1QD8GIlfth+S3ylh/nLx+zOMTz353dMgpNY0o3eWVFr
TYq+x6u4UFEmBFvkYm7P9ONfqKKiaO2pqjsA41SfKao0c8+kwrB5jWO2m4DaYqhC
Mj6ug8HoQLJk+ZpbhcsUDD4tAMIL0VJWl446LLCsd/tk81J05K95vPCTEdyGRu0U
dwe0W0PgODPMqyYTXteVdOuT1n7CYFlySAKfJUx4yW4w9Y5hHbB6sHYRHHt6unny
Ynn5u2xIUHKaqiOeShDHV433ssCB+1tgBefoAVPqY2kYC8eCP46zrieVwZip4/KA
rEldc1c+tCUYROe39F4ljg3I4wFNtBhfn3JCoocvbljmMSro5MD3zf1n8p21C6GV
vCidRbIXHVd89FALFL/1r1vuMAlaF6TD/FyxYVrJXgr8LTj5gak93/O5WpsYam6c
GwiTnykz+3eHvDi7xrhsohCHptvE7TnBSNhfkxsngFoJr3evAbU3B4qJKLzrZCvf
KHVIoSylEl+4+M6VISowid6NL4ADc8mNeHXfGxFHVBEfnK/tH3VySM4hV0reAh4m
NR+TX15zqfF4uudoAqlC
=SZR9
-----END PGP SIGNATURE-----
Merge tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC 64-bit changes from Olof Johansson:
"Here's our branch of ARM64 contents for this merge window.
Most of this is DT contents for new SoCs (or those who have seen new
device support added). Maybe we should stop separating out the arm64
contents here to avoid the kind of internal conflicts as we got this
time around, where 32- and 64-bit contents conflicted.
Anyhow, on the actual contents:
New SoCs:
- Broadcom North Star 2 (ns2)
- Marvell Berlin4CT
- Mediatek MT6795
- Rockchip RK3368
In addition, there are enhancements for the following platforms:
- Mediatek MT8173: cpuidle-dt updates, misc other additions
- ZyncMP: A bunch of devices added to the existing DTSI
- Qualcomm MSM8916 and APQ8016 updates for USB, etc.
+ a handful of other updates for various platforms"
* tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (47 commits)
ARM64: dts: vexpress: Use assigned-clock-parents for sp810
ARM64: dts: mt6795: enable basic SMP bringup for MT6795
arm64: Enable Marvell Berlin SoC family in defconfig
arm64: Enable Marvell Berlin SoC family in Kconfig
arm64: dts: Add dts files for Marvell Berlin4CT SoC
ARM64: zynqmp: Move SPI nodes to the right location
ARM64: zynqmp: Move uart and ttcs to the right location
ARM64: zynqmp: Enable spi flashes on ep108
ARM64: zynqmp: Add eeprom memories on i2c bus
ARM64: zynqmp: Enable sdhci on ep108
ARM64: zynqmp: Enable watchdog on ep108
ARM64: zynqmp: Add DWC3 usb support
ARM64: zynqmp: Add SMMU support
ARM64: zynqmp: Add CANs node for platform
ARM64: zynqmp: Use zynqmp specific compatible string for gpio
devicetree: xilinx: zynqmp: add sata node
PCI: iproc: Fix BCMA dependency in Kconfig
arm64: dts: Add Broadcom North Star 2 support
arm64: Add Broadcom iProc family support
PCI: iproc: Fix ARM64 dependency in Kconfig
...
Pull x86 mm updates from Ingo Molnar:
"The dominant change in this cycle was the continued work to isolate
kernel drivers from MTRR legacies: this tree gets rid of all kernel
internal driver interfaces to MTRRs (mostly by rewriting it to proper
PAT interfaces), the only access left is the /proc/mtrr ABI.
This work was done by Luis R Rodriguez.
There's also some related PCI interface additions for which I've
Cc:-ed Bjorn"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()
drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style
drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc()
PCI: Add pci_iomap_wc() variants
drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer
drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
PCI: Add pci_ioremap_wc_bar()
x86/mm: Make kernel/check.c explicitly non-modular
x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
x86/mm/pat: Add comments to cachemode translation tables
arch/*/io.h: Add ioremap_uc() to all architectures
drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc()
drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC
drivers/video/fbdev/atyfb: Clarify ioremap() base and length used
drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper
x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default
...
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI") changed the location of the code that initialises
dev->msi_cap/msix_cap and then disables MSI/MSI-X interrupts at PCI
probe time in devices that have this flag set. It moved the code from
pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(),
called by pci_setup_device().
The pseries PCI probing code does not call pci_setup_device(), so since
the aforementioned commit the function pci_msi_setup_pci_dev() is not
called and MSI/MSI-X interrupts are left enabled. Additionally because
dev->msi_cap/msix_cap are not initialised no driver can ever enable
MSI/MSI-X.
To fix this, the pseries PCI probe should manually call
pci_msi_setup_pci_dev(), so this patch makes it non-static.
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Update change log to mention dev->msi_cap/msix_cap]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This lets drivers take advantage of PAT when available. It
should help with the transition of converting video drivers over
to ioremap_wc() to help with the goal of eventually using
_PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
ioremap_nocache(), see:
de33c442ed ("x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: <syrjala@sci.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: airlied@linux.ie
Cc: benh@kernel.crashing.org
Cc: dan.j.williams@intel.com
Cc: konrad.wilk@oracle.com
Cc: linux-fbdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: mst@redhat.com
Cc: vinod.koul@intel.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1440443613-13696-2-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On multi-function JMicron SATA/PATA/AHCI devices, the PATA controller at
function 1 doesn't work if it is powered on before the SATA controller at
function 0. The result is that PATA doesn't work after resume, and we
print messages like this:
pata_jmicron 0000:02:00.1: Refused to change power state, currently in D3
irq 17: nobody cared (try booting with the "irqpoll" option)
Async resume was introduced in v3.15 by 76569faa62 ("PM / sleep:
Asynchronous threads for resume_noirq"). Prior to that, we powered on
the functions in order, so this problem shouldn't happen.
e6b7e41cdd ("ata: Disabling the async PM for JMicron chip 363/361")
solved the problem for JMicron 361 and 363 devices. With async suspend
disabled, we always power on function 0 before function 1.
Barto then reported the same problem with a JMicron 368 (see comment #57 in
the bugzilla).
Rather than extending the blacklist piecemeal, disable async suspend for
all JMicron multi-function SATA/PATA/AHCI devices.
This quirk could stay in the ahci and pata_jmicron drivers, but it's likely
the problem will occur even if pata_jmicron isn't loaded until after the
suspend/resume. Making it a PCI quirk ensures that we'll preserve the
power-on order even if the drivers aren't loaded.
[bhelgaas: changelog, limit to multi-function, limit to IDE/ATA]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=81551
Reported-and-tested-by: Barto <mister.freeman@laposte.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+
* pci/host-dra7xx:
PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE()
* pci/host-imx6:
PCI: imx6: Simplify a trivial if-return sequence
* pci/host-spear:
PCI: spear: Use BUG_ON() instead of condition followed by BUG()
Firmware typically configures the PCIe fabric with a consistent Max Payload
Size setting based on the devices present at boot. A hot-added device
typically has the power-on default MPS setting (128 bytes), which may not
match the fabric.
The previous Linux default, in the absence of any "pci=pcie_bus_*" options,
was PCIE_BUS_TUNE_OFF, in which we never touch MPS, even for hot-added
devices.
Add a new default setting, PCIE_BUS_DEFAULT, in which we make sure every
device's MPS setting matches the upstream bridge. This makes it more
likely that a hot-added device will work in a system with optimized MPS
configuration.
Note that if we hot-add a device that only supports 128-byte MPS, it still
likely won't work because we don't reconfigure the rest of the fabric.
Booting with "pci=pcie_bus_peer2peer" is a workaround for this because it
sets MPS to 128 for everything.
[bhelgaas: changelog, new default, rework for pci_configure_device() path]
Tested-by: Keith Busch <keith.busch@intel.com>
Tested-by: Jordan Hargrave <jharg93@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Meelis and Helge reported that 3a9ad0b4fd ("PCI: Add pci_bus_addr_t")
caused HPMCs on A500 and hangs on rp5470.
PA-RISC does not set ARCH_DMA_ADDR_T_64BIT, even for 64-bit kernels, so
prior to 3a9ad0b4fd, we always used 32-bit PCI addresses. After
3a9ad0b4fd, we do use 64-bit PCI addresses in 64-bit kernels, and
apparently there's some PA-RISC problem related to them.
Fixes: 3a9ad0b4fd ("PCI: Add pci_bus_addr_t")
Link: http://lkml.kernel.org/r/alpine.LRH.2.11.1507260929000.30065@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Based-on-idea-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
CC: stable@vger.kernel.org # v3.19+
Previously we checked for invalid MPS settings, i.e., a device with MPS
different than its upstream bridge, in pcie_bus_detect_mps(). We only did
this if the arch or hotplug driver called pcie_bus_configure_settings(),
and then only if PCIe bus tuning was disabled (PCIE_BUS_TUNE_OFF).
Move the MPS checking code to pci_configure_device(), so we do it in the
pci_device_add() path for every device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
of_parse_phandle() returns a device_node pointer with the refcount
incremented. We should dispose of this reference when we're finished.
Drop the reference acquired by of_parse_phandle().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The pcibios_msi_controller() hook was only implemented by ARM, and it sets
pci_bus->msi now, so it doesn't need this hook anymore.
Remove the unused pcibios_msi_controller() hook.
[bhelgaas: changelog, split into separate patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ARM previously stored the msi_controller pointer in its sysdata, struct
pci_sys_data, and implemented pcibios_msi_controller() to retrieve it.
That made PCI host controller drivers specific to ARM because they had to
put the msi_controller pointer in the ARM-specific pci_sys_data.
There is now a generic mechanism, pci_scan_root_bus_msi(), for giving the
msi_controller pointer to the PCI core. Use this for all ARM systems and
for the DesignWare and Xilinx PCI host controller drivers.
This removes an ARM dependency from the DesignWare, DRA7xx, EXYNOS, i.MX6,
Keystone, Layerscape, SPEAr13xx, and Xilinx drivers.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Simon Horman <horms@verge.net.au>
CC: Russell King <linux@arm.linux.org.uk>
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Michal Simek <michal.simek@xilinx.com>
CC: Marc Zyngier <marc.zyngier@arm.com>
Add a pci_scan_root_bus_msi() interface so an arch can specify the MSI
controller up front. This removes the need for a pcibios callback to set
the MSI controller later.
This is not exported because I'd like to replace the variety of "scan root
bus" interfaces with a single, more extensible interface that can handle
the MSI controller, domain, pci_ops, resources, etc. I hope this interface
is temporary.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Make pci-host-generic driver (kernel option PCI_HOST_GENERIC) available on
arm64.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ARM64 requires setup-irq.o to provide pci_fixup_irqs() implementation. We
are adding this now to support the pci-host-generic host controller, but we
enable it for ARM64 PCI so that other host controllers can use this as
well.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The generic OF-based host controller driver uses pci_common_init_dev(),
which is ARM-specific and requires the ARM struct hw_pci. The part of
pci_common_init_dev() that is needed is limited and can be done here
without using hw_pci.
Note that the ARM pcibios functions expect the PCI sysdata to be a pointer
to a struct pci_sys_data. Add a struct pci_sys_data as the first element
in struct gen_pci so that when we use a gen_pci pointer as sysdata, it is
also a pointer to a struct pci_sys_data.
Create and scan the root bus directly without using the ARM
pci_common_init_dev() interface.
[bhelgaas: changelog, move pcie_bus_configure_settings() before
pci_bus_add_devices(), combine !PCI_PROBE_ONLY blocks]
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Simplify a trivial if-return sequence by combining it with a preceding
function call.
The semantic patch that makes this change is available in
scripts/coccinelle/misc/simple_return.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Lucas Stach <l.stach@pengutronix.de>
Use BUG_ON() instead of an if condition followed by BUG().
The semantic patch that makes this change is available in
scripts/coccinelle/misc/bugon.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
There is no need to use the IS_ERR_VALUE() macro for checking the return
value from pm_runtime_* functions.
Test for a negative pm_runtime_get_sync() return value instead of using
IS_ERR_VALUE().
The semantic patch that makes this change is available in
scripts/coccinelle/api/pm_runtime.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Kishon Vijay Abraham I <kishon@ti.com>
We should not assume any particular hardware topology. Commit d0751b98df
("PCI: Add dev->has_secondary_link to track downstream PCIe links") relied
on the assumption that every PCIe hierarchy is rooted at a Root Port. But
we can't rely on any assumption about what hardware we will find; we just
have to deal with the world as it is.
On some platforms, PCIe devices (endpoints, switch upstream ports, etc.)
appear directly on the root bus, and there is no Root Port in the PCI bus
hierarchy. For example, Meelis observed these top-level devices on a
Sparc V245:
0000:02:00.0 PCI bridge to [bus 03-0d] Switch Upstream Port
0001:02:00.0 PCI bridge to [bus 03] PCIe to PCI/PCI-X Bridge
These devices *look* like they have links going upstream, but there really
are no upstream devices.
In set_pcie_port_type(), we used the parent device to figure out which side
of a switch port has a link, so if the parent device did not exist, we
dereferenced a NULL parent pointer.
Check whether the parent device exists before dereferencing it.
Meelis observed this oops on Sparc V245 and T2000. Ben Herrenschmidt says
this is also possible on IBM PowerVM guests on PowerPC.
[bhelgaas: changelog, comment]
Link: http://lkml.kernel.org/r/alpine.LRH.2.20.1508122118210.18637@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
There's a typo in commit e39758e0ea in linux-next, which incorrectly
spells "msi_desc_to_pci_sysdata()" as "msi_desc_to_pci_sys_data()" and
causes build failure:
> ../drivers/pci/host/pcie-xilinx.c:235:3: error: implicit declaration
of function 'msi_desc_to_pci_sys_data' [-Werror=implicit-function-declaration]
Fixes: e39758e0ea "PCI: Use helper functions to access fields in struct msi_desc"
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Brown <broonie@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Sören Brinkmann <soren.brinkmann@xilinx.com>
Cc: Srikanth Thokala <sthokal@xilinx.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Link: http://lkml.kernel.org/r/1439912763-10645-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* pci/host-dra7xx:
ARM: dts: am57xx-evm: Add 'gpios' property with gpio2_8
PCI: dra7xx: Add support to make GPIO drive PERST# line
PCI: dra7xx: Clear MSE bit during suspend so clocks will idle
PCI: dra7xx: Add PM support
PCI: dra7xx: Disable pm_runtime on get_sync failure
* pci/host-iproc:
PCI: iproc: Allow BCMA bus driver to be built as module
PCI: iproc: Add arm64 support
PCI: iproc: Delete unnecessary checks before phy calls
* pci/hotplug:
PCI: pciehp: Remove ignored MRL sensor interrupt events
PCI: pciehp: Remove unused interrupt events
PCI: pciehp: Handle invalid data when reading from non-existent devices
PCI: Hold pci_slot_mutex while searching bus->slots list
PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
PCI: pciehp: Simplify pcie_poll_cmd()
PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot
* pci/iommu:
PCI: Remove pci_ats_enabled()
PCI: Stop caching ATS Invalidate Queue Depth
PCI: Move ATS declarations to linux/pci.h so they're all together
PCI: Clean up ATS error handling
PCI: Use pci_physfn() rather than looking up physfn by hand
PCI: Inline the ATS setup code into pci_ats_init()
PCI: Rationalize pci_ats_queue_depth() error checking
PCI: Reduce size of ATS structure elements
PCI: Embed ATS info directly into struct pci_dev
PCI: Allocate ATS struct during enumeration
iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth
* pci/irq:
PCI: Kill off set_irq_flags() usage
* pci/virtualization:
PCI: Add ACS quirks for Intel I219-LM/V
Remove pci_ats_enabled(). There are no callers outside the ATS code
itself. We don't need to check ats_cap, because if we don't find an ATS
capability, we'll never set ats_enabled.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Stop caching the Invalidate Queue Depth in struct pci_dev.
pci_ats_queue_depth() is typically called only once per device, and it
returns a fixed value per-device, so callers who need the value frequently
can cache it themselves.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
There's no need to BUG() if we enable ATS when it's already enabled. We
don't need to BUG() when disabling ATS on a device that doesn't support ATS
or if it's already disabled. If ATS is enabled, certainly we found an ATS
capability in the past, so it should still be there now.
Clean up these error paths.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Use the pci_physfn() helper rather than looking up physfn by hand.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The ATS setup code in ats_alloc_one() is only used by pci_ats_init(), so
inline it there. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
We previously returned -ENODEV for devices that don't support ATS (except
that we always returned 0 for VFs, whether or not they support ATS).
For consistency, always return -EINVAL (not -ENODEV) if the device doesn't
support ATS. Return zero for VFs that support ATS.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The pci_ats struct is small and will get smaller, so I don't think it's
worth allocating it separately from the pci_dev struct.
Embed the ATS fields directly into struct pci_dev.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Previously, we allocated pci_ats structures when an IOMMU driver called
pci_enable_ats(). An SR-IOV VF shares the STU setting with its PF, so when
enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
didn't already have one. We held the sriov->lock to serialize threads
concurrently enabling ATS on several VFS so only one would allocate the PF
pci_ats.
Gregor reported a deadlock here:
pci_enable_sriov
sriov_enable
virtfn_add
mutex_lock(dev->sriov->lock) # acquire sriov->lock
pci_device_add
device_add
BUS_NOTIFY_ADD_DEVICE notifier chain
iommu_bus_notifier
amd_iommu_add_device # iommu_ops.add_device
init_iommu_group
iommu_group_get_for_dev
iommu_group_add_device
__iommu_attach_device
amd_iommu_attach_device # iommu_ops.attach_device
attach_device
pci_enable_ats
mutex_lock(dev->sriov->lock) # deadlock
There's no reason to delay allocating the pci_ats struct, and if we
allocate it for each device at enumeration-time, there's no need for
locking in pci_enable_ats().
Allocate pci_ats struct during enumeration, when we initialize other
capabilities.
Note that this implementation requires ATS to be enabled on the PF first,
before on any of the VFs because the PF controls the STU for all the VFs.
Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
Reported-by: Gregor Dick <gdick@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The PERST# line in am57x-evm is connected to a GPIO line and PERST# should
be driven high to indicate the clocks are stable (As per Figure 2-10: Power
Up of the PCIe CEM spec 3.0).
Add support to make GPIO drive PERST# line.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
DRA7xx requires the MSE bit to be cleared to set the master in standby
mode. (In DRA7xx TRM_vE, section 24.9.4.5.2.2.1 PCIe Controller Master
Standby Behavior advises to use the clearing of the local MSE bit to set
the master in standby. Without this some of the clocks do not idle).
Clear the MSE bit on suspend and enable it on resume. Clearing MSE bit is
required to get clocks to be idled after suspend.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Add PM support to pci-dra7xx so PCI clocks can be disabled during suspend
and enabled during resume without affecting PCI functionality.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Fix the error handling when pm_runtime_get_sync() fails.
If pm_runtime_get_sync() fails, call pm_runtime_disable() so there are no
unbalanced pm_runtime_enable() calls.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Change CONFIG_PCIE_IPROC_BCMA to tristate to make it possible to build this
driver as a module.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ray Jui <rjui@broadcom.com>
The Intel 100-series chipset now includes the integrated Ethernet as part
of a multifunction package. The Ethernet function does not include native
ACS support, but Intel confirms that the device is not capable of peer-to-
peer within the package. We can therefore quirk it to expose the
isolation.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: John Ronciak <john.ronciak@gmail.com>
set_irq_flags is ARM-specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:
IRQF_VALID -> !IRQ_NOREQUEST
IRQF_PROBE -> !IRQ_NOPROBE
IRQF_NOAUTOEN -> IRQ_NOAUTOEN
For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in .map()
functions, and we can simply remove the set_irq_flags calls. Some users
also modify IRQ_NOPROBE, and this has been maintained although it is not
clear that is really needed. There appears to be a great deal of blind
copy and paste of this code.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
CC: Kishon Vijay Abraham I <kishon@ti.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Stephen Warren <swarren@wwwdotorg.org>
CC: Alexandre Courbot <gnurou@gmail.com>
CC: Jingoo Han <jingoohan1@gmail.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
CC: Simon Horman <horms@verge.net.au>
CC: Michal Simek <michal.simek@xilinx.com>
CC: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
Quoting Arnd:
I was thinking the opposite approach and basically removing all uses
of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
them.and we can probably replace them all with hardcoded
ioremap_cached() calls in the cases they are actually useful.
All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable, however ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().
Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We queued interrupt events for the MRL being opened or closed, but the code
in interrupt_event_handler() that handles these events ignored them.
Stop enabling MRL interrupts and remove the ignored events.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The list of interrupt events (INT_BUTTON_IGNORE, INT_PRESENCE_ON, etc.) was
copied from other hotplug drivers, but pciehp doesn't use them all.
Remove the interrupt events that aren't used by pciehp.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set. This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.
One example, reported in the bugzilla below, involved this hierarchy:
pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port
pci 0000:07:00.0: BCM57762 NIC
Unplugging the Thunderbolt switch and the NIC below it resulted in this:
pciehp 0000:03:03.0: Surprise Removal
tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
pciehp 0000:06:00.0: unloading service driver pciehp
pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
pciehp 0000:06:00.0: Switch interrupt received
pciehp 0000:06:00.0: Latch open on Slot
pciehp 0000:06:00.0: Attention button interrupt received
pciehp 0000:06:00.0: Button pressed on Slot
pciehp 0000:06:00.0: Presence/Notify input change
pciehp 0000:06:00.0: Card present on Slot
pciehp 0000:06:00.0: Power fault interrupt received
pciehp 0000:06:00.0: Data Link Layer State change
pciehp 0000:06:00.0: Link Up event
The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.
Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device. The resulting timeout is a tg3 issue and not of
interest here.
Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up. These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:
pciehp 0000:06:00.0: PCI slot - powering on due to button press
pciehp 0000:06:00.0: Surprise Insertion
pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI capabilities list for Intel DH895xCC VFs (device id 0x0443) with
QuickAssist Technology is prematurely terminated in hardware.
Workaround the issue by hard-coding the known expected next capability
pointer and saving the PCIE cap into internal buffer.
Patch generated against cryptodev-2.6
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>