Commit Graph

121 Commits

Author SHA1 Message Date
Bjorn Helgaas
6b13672469 PCI: Use spec names for SR-IOV capability fields
Use the same names (almost) as the spec for TotalVFs, InitialVFs, NumVFs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 21:38:53 -07:00
Donald Dutile
bff73156d3 PCI: Provide method to reduce the number of total VFs supported
Some implementations of SRIOV provide a capability structure
value of TotalVFs that is greater than what the software can support.
Provide a method to reduce the capability structure reported value
to the value the driver can support.
This ensures sysfs reports the current capability of the system,
hardware and software.
Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
but drivers only support 7 & 63 maximum.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 21:37:39 -07:00
Yinghai Lu
4e15c46bdc PCI: Add pci_device_type to pdev's device struct
Need type filled in device structure so it can be used for visible
attribute control in sysfs for pci_dev.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 20:07:31 -07:00
Bjorn Helgaas
a7711ba109 Merge branch 'pci/rafael-pci_set_power_state-rebase' into next
* pci/rafael-pci_set_power_state-rebase:
  PCI / PM: restore the original behavior of pci_set_power_state()
2012-07-05 16:29:52 -06:00
Rafael J. Wysocki
db288c9c5f PCI / PM: restore the original behavior of pci_set_power_state()
Commit cc2893b6 (PCI: Ensure we re-enable devices on resume)
addressed the problem with USB not being powered after resume on
recent Lenovo machines, but it did that in a suboptimal way.
Namely, it should have changed the relevant code paths only,
which are pci_pm_resume_noirq() and pci_pm_restore_noirq() supposed
to restore the device's power and standard configuration registers
after system resume from suspend or hibernation.  Instead, however,
it modified pci_set_power_state() which is executed in several
other situations too.  That resulted in some undesirable effects,
like attempting to change a device's power state in the same way
multiple times in a row (up to as many as 4 times in a row in the
snd_hda_intel driver).

Fix the bug addressed by commit cc2893b6 in an alternative way,
by forcibly powering up all devices in pci_pm_default_resume_early(),
which is called by pci_pm_resume_noirq() and pci_pm_restore_noirq()
to restore the device's power and standard configuration registers,
and modifying pci_pm_runtime_resume() to avoid the forcible power-up
if not necessary.  Then, revert the changes made by commit cc2893b6
to make the confusion introduced by it go away.

Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-07-05 15:20:00 -06:00
Bjorn Helgaas
35e7f73c32 Merge branch 'topic/huang-d3cold-v7' into next
* topic/huang-d3cold-v7:
  PCI/PM: add PCIe runtime D3cold support
  PCI: do not call pci_set_power_state with PCI_D3cold
  PCI/PM: add runtime PM support to PCIe port
  ACPI/PM: specify lowest allowed state for device sleep state
2012-06-23 11:59:43 -06:00
Huang Ying
448bd857d4 PCI/PM: add PCIe runtime D3cold support
This patch adds runtime D3cold support and corresponding ACPI platform
support.  This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.

D3cold is the deepest power saving state for a PCIe device, where its main
power is removed.  While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.

To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal.  WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.

Some devices, such as those in plug-in slots, have no direct platform
logic.  For example, there is usually no ACPI _PR3 for them.  D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device.  When the PCIe port is powered on/off, the device is powered
on/off too.  Wakeup events from the device will be notified to the
corresponding PCIe port.

For more information about PCIe D3cold and corresponding ACPI support,
please refer to:

- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0

[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-23 10:50:59 -06:00
Bjorn Helgaas
cc2fa3fa32 Merge branch 'topic/alex-vfio-prep' into next
* topic/alex-vfio-prep:
  PCI: misc pci_reg additions
  PCI: create common pcibios_err_to_errno
  PCI: export pci_user functions for use by other drivers
  PCI: add ACS validation utility
  PCI: add PCI DMA source ID quirk
2012-06-13 17:04:54 -06:00
Yinghai Lu
06aef8cec7 PCI: hotplug: remove pci_do_scan_bus()
All callers of pci_do_scan_bus() are gone, so remove it.

Note that pci_do_scan_bus() was exported, so out-of-tree modules could
depend on it.

[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-13 15:42:27 -06:00
Yinghai Lu
a8e4b9c101 PCI: add generic pci_hp_add_bridge()
This creates a generic pci_hp_add_bridge() that can be used by several
hotplug drivers.

[bhelgaas: split out from pciehp patch]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-13 15:42:26 -06:00
Alex Williamson
c63587d7f5 PCI: export pci_user functions for use by other drivers
VFIO PCI support will make use of these for user-initiated
PCI config accesses.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-12 09:21:42 -06:00
Yinghai Lu
2069ecfbe1 PCI: Move "pci reassigndev resource alignment" out of quirks.c
This isn't really a quirk; calling it directly from pci_add_device makes
more sense.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:37:26 -08:00
Yinghai Lu
b55438fdd5 PCI: prepare pci=realloc for multiple options
Let the user could enable and disable with pci=realloc=on or pci=realloc=off

Also
1. move variable and functions near the place they are used.
2. change macro to function
3. change related functions and variable to static and _init
4. update parameter description accordingly.

This will let us add a config option to control default behavior, and
still allow the user to turn off automatic reallocation if it fails on
their platform until a permanent solution is found.

-v2: still honor pci=realloc, and treat it as pci=realloc=on
     also use enum instead of ...
-v3: update kernel-paramenters.txt according to Jesse.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 08:47:42 -08:00
Yinghai Lu
f796841e49 PCI: fix memleak for pci dev removing during hotplug
unreferenced object 0xffff880276d17700 (size 64):
  comm "swapper/0", pid 1, jiffies 4294897182 (age 3976.028s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 18 f9 de 76 02 88 ff ff  ...........v....
    10 00 00 00 0e 00 00 00 0f 28 40 00 00 00 00 00  .........(@.....
  backtrace:
    [<ffffffff81c8aede>] kmemleak_alloc+0x26/0x43
    [<ffffffff811385f0>] __kmalloc+0x121/0x183
    [<ffffffff813cf821>] pci_add_cap_save_buffer+0x35/0x7c
    [<ffffffff813d12b7>] pci_allocate_cap_save_buffers+0x1d/0x65
    [<ffffffff813cdb52>] pci_device_add+0x92/0xf1
    [<ffffffff81c8afe6>] pci_scan_single_device+0x9f/0xa1
    [<ffffffff813cdbd2>] pci_scan_slot.part.20+0x21/0x106
    [<ffffffff813cdce2>] pci_scan_slot+0x2b/0x35
    [<ffffffff81c8dae4>] __pci_scan_child_bus+0x51/0x107
    [<ffffffff81c8d75b>] pci_scan_bridge+0x376/0x6ae
    [<ffffffff81c8db60>] __pci_scan_child_bus+0xcd/0x107
    [<ffffffff81c8dbab>] pci_scan_child_bus+0x11/0x2a
    [<ffffffff81cca58c>] pci_acpi_scan_root+0x18b/0x21c
    [<ffffffff81c916be>] acpi_pci_root_add+0x1e1/0x42a
    [<ffffffff81406210>] acpi_device_probe+0x50/0x190
    [<ffffffff814a0227>] really_probe+0x99/0x126

Need to free saved_buffer for capabilities.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:08:53 -08:00
Yinghai Lu
efdc87dab1 PCI: Separate pci_bus_read_dev_vendor_id from pci_scan_device
We can reuse it for pciehp probing.

-v2: according to Kenji, fix crs timeout checking, and export the function
     for later use when pciehp is compiled as a module.

Suggested-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:59 -08:00
Hao, Xudong
1900ca132f PCI: Enable ATS at the device state restore
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.

Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:18 -08:00
Jan Kiszka
a2e27787f8 PCI: Introduce INTx check & mask API
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.

This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:34 -08:00
Ram Pai
0a2daa1cf3 PCI: make cardbus-bridge resources optional
Allocate resources to cardbus bridge only after all other genuine
resources requests are satisfied. Dont retry if resource allocation
for cardbus-bridges fail.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:50:40 -07:00
Linus Torvalds
6d16d6d9bb Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n
  iommu/amd: Don't use MSI address range for DMA addresses
  iommu/amd: Move missing parts to drivers/iommu
  iommu: Move iommu Kconfig entries to submenu
  x86/ia64: intel-iommu: move to drivers/iommu/
  x86: amd_iommu: move to drivers/iommu/
  msm: iommu: move to drivers/iommu/
  drivers: iommu: move to a dedicated folder
  x86/amd-iommu: Store device alias as dev_data pointer
  x86/amd-iommu: Search for existind dev_data before allocting a new one
  x86/amd-iommu: Allow dev_data->alias to be NULL
  x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions
  x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines
  x86/amd-iommu: Store ATS state in dev_data
  x86/amd-iommu: Store devid in dev_data
  x86/amd-iommu: Introduce global dev_data_list
  x86/amd-iommu: Remove redundant device_flush_dte() calls
  iommu-api: Add missing header file

Fix up trivial conflicts (independent additions close to each other) in
drivers/Makefile and include/linux/pci.h
2011-07-22 16:39:42 -07:00
Ram Pai
f483d3923d PCI: conditional resource-reallocation through kernel parameter pci=realloc
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
    
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
    
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-08 15:49:20 -07:00
Ohad Ben-Cohen
166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
Linus Torvalds
5e152b4c9e Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
  PCI: Don't use dmi_name_in_vendors in quirk
  PCI: remove unused AER functions
  PCI/sysfs: move bus cpuaffinity to class dev_attrs
  PCI: add rescan to /sys/.../pci_bus/.../
  PCI: update bridge resources to get more big ranges when allocating space (again)
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state
  PCI: Track the size of each saved capability data area
  PCI/e1000e: Add and use pci_disable_link_state_locked()
  x86/PCI: derive pcibios_last_bus from ACPI MCFG
  PCI: add latency tolerance reporting enable/disable support
  PCI: add OBFF enable/disable support
  PCI: add ID-based ordering enable/disable support
  PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
  PCI: Set PCIE maxpayload for card during hotplug insertion
  PCI/ACPI: Report _OSC control mask returned on failure to get control
  x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
  PCI: handle positive error codes
  PCI: check pci_vpd_pci22_wait() return
  PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
  ...

Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be44
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
2011-05-23 15:39:34 -07:00
Yinghai Lu
dc2c2c9dd5 PCI/sysfs: move bus cpuaffinity to class dev_attrs
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:13 -07:00
Yinghai Lu
b9d320fcb6 PCI: add rescan to /sys/.../pci_bus/.../
After remove the device from /sys, we have to rescan all or
find out the bridge and access /sys../device/rescan there.

this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy.
that is more clean and easy to understand.

like after remove 0000:c4:00.0, you can rescan 0000:c4 directly.

-v2: According to Jesse, use function instead of exposing attr, so could hide
	#ifdef in header file.
     also add code to remove rescan file in remove path.
-v3: GregKH pointed out that we should use dev_attrs to avoid racing.
     So add pcibus_attrs and make it to be member of pcibus_attrs.
-v4: Change name to pcibus_dev_attrs according to GregKH

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:12 -07:00
Joerg Roedel
5cdede2408 PCI: Move ATS declarations in seperate header file
This patch moves the relevant declarations from the local
header file in drivers/pci to a more accessible locations so
that it can be used by the AMD IOMMU driver too.
The file is named pci-ats.h because support for the PCI PRI
capability will also be added there in a later patch-set.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-04-11 09:01:41 +02:00
Narendra_K@Dell.com
6058989bad PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs
This patch exports ACPI _DSM (Device Specific Method) provided firmware
instance number and string name of PCI devices as defined by 'PCI
Firmware Specification Revision 3.1' section 4.6.7.( DSM for Naming a
PCI or PCI Express Device Under Operating Systems) to sysfs.

New files created are:
  /sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
  /sys/bus/pci/devices/.../acpi_index which contains the firmware device type
instance for the given device.

cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/acpi_index
1
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/label
Embedded Broadcom 5709C NIC 1

cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/acpi_index
2
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/label
Embedded Broadcom 5709C NIC 2

The ACPI _DSM provided firmware 'instance number' and 'string name' will
be given priority if the firmware also provides 'SMBIOS type 41 device
type instance and string'.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-03-04 10:41:56 -08:00
Rafael J. Wysocki
b6e335aeeb PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events
After recent changes related to wakeup events pm_wakeup_event()
automatically checks if the given device is configured to signal wakeup,
so pci_wakeup_event() may be a static inline function calling
pm_wakeup_event() directly.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14 08:55:43 -08:00
Rafael J. Wysocki
415e12b237 PCI/ACPI: Request _OSC control once for each root bridge (v3)
Move the evaluation of acpi_pci_osc_control_set() (to request control of
PCI Express native features) into acpi_pci_root_add() to avoid calling
it many times for the same root complex with the same arguments.
Additionally, check if all of the requisite _OSC support bits are set
before calling acpi_pci_osc_control_set() for a given root complex.

References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-by: Ozan Caglayan <ozan@pardus.org.tr>
Tested-by: Ozan Caglayan <ozan@pardus.org.tr>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14 08:55:41 -08:00
Martin Wilck
3b519e4ea6 PCI: fix size checks for mmap() on /proc/bus/pci files
The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e204 have several
problems:

1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.

2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).

3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.

On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.

This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().

Cc: stable@kernel.org
Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-11 09:34:32 -08:00
Matthew Garrett
bf4d290869 PCI: Export some PCI PM functionality
It's helpful to have some extra PCI power management functions available to
platform code, so move the declarations to an exported header.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-17 20:03:06 -07:00
Cam Macdonell
0e52247a2e PCI: fix pci_resource_alignment prototype
This fixes the prototype for both pci_resource_alignment() and
pci_sriov_resource_alignment().

Patch started as debugging effort from Cam Macdonell.

Cc: Cam Macdonell <cam@cs.ualberta.ca>
Cc: Avi Kivity <avi@redhat.com>
[chrisw: add iov bits]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-09-09 13:41:25 -07:00
Rafael J. Wysocki
f1a7bfaf6b PCI: PCIe AER: Introduce pci_aer_available()
Introduce a function allowing the caller to check whether to try to
enable PCIe AER.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-08-24 13:43:08 -07:00
Linus Torvalds
1cfd2bda8c Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits)
  PCI: update for owner removal from struct device_attribute
  PCI: Fix warnings when CONFIG_DMI unset
  PCI: Do not run NVidia quirks related to MSI with MSI disabled
  x86/PCI: use for_each_pci_dev()
  PCI: use for_each_pci_dev()
  PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
  PCI: export SMBIOS provided firmware instance and label to sysfs
  PCI: Allow read/write access to sysfs I/O port resources
  x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN
  PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY}
  PCI: disable mmio during bar sizing
  PCI: MSI: Remove unsafe and unnecessary hardware access
  PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable
  PCI: kernel oops on access to pci proc file while hot-removal
  PCI: pci-sysfs: remove casts from void*
  ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe
  PCI hotplug: make sure child bridges are enabled at hotplug time
  PCI hotplug: shpchp: Removed check for hotplug of display devices
  PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device
  PCI: Don't enable aspm before drivers have had a chance to veto it
  ...
2010-08-06 11:44:36 -07:00
Narendra K
b879743f26 PCI: Fix warnings when CONFIG_DMI unset
This patch fixes the below warnings introduced by the commit
911e1c9b05 ("PCI:
export SMBIOS provided firmware instance and label to sysfs").

drivers/pci/pci.h: In function ‘pci_create_firmware_label_files’:
drivers/pci/pci.h:16: warning: ‘return’ with a value, in function returning void
drivers/pci/pci.h: In function ‘pci_remove_firmware_label_files’:
drivers/pci/pci.h:18: warning: ‘return’ with a value, in function returning void

The warnings are seen because of the below code, doing a retun 0
from the functions 'pci_create_firmware_label_files' and
'pci_remove_firmware_label_files' defined as void.

+#ifndef CONFIG_DMI
+static inline void pci_create_firmware_label_files(struct pci_dev *pdev)
+{ return 0; }
+static inline void pci_remove_firmware_label_files(struct pci_dev *pdev)
+{ return 0; }

Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-08-02 09:11:10 -07:00
Narendra K
911e1c9b05 PCI: export SMBIOS provided firmware instance and label to sysfs
This patch exports SMBIOS provided firmware instance and label of
onboard PCI devices to sysfs.  New files are:
  /sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
  /sys/bus/pci/devices/.../index which contains the firmware device type
instance for the given device.

Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:36:01 -07:00
Rafael J. Wysocki
c125e96f04 PM: Make it possible to avoid races between wakeup and system sleep
One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.

Generally, there are two problems in that area.  First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended.  Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.

To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.

The /sys/power/wakeup_count file may be read from or written to by
user space.  Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter.  If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.

[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count.  Next, user space
consumers of wakeup events will have a chance to acknowledge or
veto the upcoming system transition to a sleep state.  Finally, if
the transition is allowed to proceed, /sys/power/wakeup_count will
be written to and if that succeeds, /sys/power/state will be written
to as well.  Still, if any wakeup events are reported to the PM core
by kernel subsystems after that point, the transition will be
aborted.]

Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.

To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
2010-07-19 01:58:48 +02:00
Bill Pemberton
8356dda2a5 PCI: make bitfield unsigned
Fix sparse warning:

drivers/pci/pci.h:247:25: error: dubious one-bit signed bitfield

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-05-11 10:07:20 +02:00
Rafael J. Wysocki
6cbf82148f PCI PM: Run-time callbacks for PCI bus type
Introduce run-time PM callbacks for the PCI bus type.  Make the new
callbacks work in analogy with the existing system sleep PM
callbacks, so that the drivers already converted to struct dev_pm_ops
can use their suspend and resume routines for run-time PM without
modifications.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:19 -08:00
Rafael J. Wysocki
b67ea76172 PCI / ACPI / PM: Platform support for PCI PME wake-up
Although the majority of PCI devices can generate PMEs that in
principle may be used to wake up devices suspended at run time,
platform support is generally necessary to convert PMEs into wake-up
events that can be delivered to the kernel.  If ACPI is used for this
purpose, PME signals generated by a PCI device will trigger the ACPI
GPE associated with the device to generate an ACPI wake-up event that
we can set up a handler for, provided that everything is configured
correctly.

Unfortunately, the subset of PCI devices that have GPEs associated
with them is quite limited.  The devices without dedicated GPEs have
to rely on the GPEs associated with other devices (in the majority of
cases their upstream bridges and, possibly, the root bridge) to
generate ACPI wake-up events in response to PME signals from them.

Add ACPI platform support for PCI PME wake-up:
o Add a framework making is possible to use ACPI system notify
  handlers for run-time PM.
o Add new PCI platform callback ->run_wake() to struct
  pci_platform_pm_ops allowing us to enable/disable the platform to
  generate wake-up events for given device.  Implemet this callback
  for the ACPI platform.
o Define ACPI wake-up handlers for PCI devices and PCI root buses and
  make the PCI-ACPI binding code register wake-up notifiers for all
  PCI devices present in the ACPI tables.
o Add function pci_dev_run_wake() which can be used by PCI drivers to
  check if given device is capable of generating wake-up events at
  run time.

Developed in cooperation with Matthew Garrett <mjg@redhat.com>.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:02 -08:00
Rafael J. Wysocki
58ff463396 PCI PM: Add function for checking PME status of devices
Add function pci_check_pme_status() that will check the PME status
bit of given device and clear it along with the PME enable bit.  It
will be necessary for PCI run-time power management.

Based on a patch from Shaohua Li <shaohua.li@intel.com>

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:20:24 -08:00
Rafael J. Wysocki
93177a748b PCI: Clean up build for CONFIG_PCI_QUIRKS unset
Currently, drivers/pci/quirks.c is built unconditionally, but if
CONFIG_PCI_QUIRKS is unset, the only things actually built in this
file are definitions of global variables and empty functions (due to
the #ifdef CONFIG_PCI_QUIRKS embracing all of the code inside the
file).  This is not particularly nice and if someone overlooks
the #ifdef CONFIG_PCI_QUIRKS, build errors are introduced.

To clean that up, move the definitions of the global variables in
quirks.c that are always built to pci.c, move the definitions of
the empty functions (compiled when CONFIG_PCI_QUIRKS is unset) to
headers (additionally make these functions static inline) and modify
drivers/pci/Makefile so that quirks.c is only built if
CONFIG_PCI_QUIRKS is set.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:21 -08:00
Rafael J. Wysocki
5b889bf237 PCI: Fix build if quirks are not enabled
After commit b9c3b26641 ("PCI: support
device-specific reset methods") the kernel build is broken if
CONFIG_PCI_QUIRKS is unset.

Fix this by moving pci_dev_specific_reset() to drivers/pci/quirks.c and
providing an empty replacement for !CONFIG_PCI_QUIRKS builds.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-31 12:00:45 -08:00
Dexuan Cui
b9c3b26641 PCI: support device-specific reset methods
Add a new type of quirk for resetting devices at pci_dev_reset time.
This is necessary to handle device with nonstandard reset procedures,
especially useful for guest drivers.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:50 -08:00
Allen Kay
ae21ee65e8 PCI: acs p2p upsteram forwarding enabling
Note: dom0 checking in v4 has been separated out into 2/2.

This patch enables P2P upstream forwarding in ACS capable PCIe switches.
It solves two potential problems in virtualization environment where a PCIe
device is assigned to a guest domain using a HW iommu such as VT-d:

1) Unintentional failure caused by guest physical address programmed
   into the device's DMA that happens to match the memory address range
   of other downstream ports in the same PCIe switch.  This causes the PCI
   transaction to go to the matching downstream port instead of go to the
   root complex to get translated by VT-d as it should be.

2) Malicious guest software intentionally attacks another downstream
   PCIe device by programming the DMA address into the assigned device
   that matches memory address range of the downstream PCIe port.

We are in process of implementing device filtering software in KVM/XEN
management software to allow device assignment of PCIe devices behind a PCIe
switch only if it has ACS capability and with the P2P upstream forwarding bits
enabled.  This patch is intended to work for both KVM and Xen environments.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Reviewed-by: Mathew Wilcox <willy@linux.intel.com>
Reviewed-by: Chris Wright <chris@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-04 08:47:25 -08:00
Eric W. Biederman
0ba379ec0f PCI: Simplify hotplug mch quirk.
There is a very old quirk for the intel E7502 E7320 and E7525 memory
controller hubs that disables usage of msi interrupts on pcie hotplug
bridges of those devices, and disables changing the affinity of irqs.

Today all we have to do to disable msi on a specific device is to set
dev->no_msi, which is much more straightforward than the previous
logic.

The re-running of this fixup after pci hotplug happens below these
devices is totally bogus.  All of the state we change is pure software
state and we don't change the hardware at all.  Which means hotplug on
the lower devices doesn't have a chance to change this state.  So we
can safely remove the special case from the pciehp driver and the pcie
portdriver.

I suspect the special case was someone's expermental debug code that
slipped in. Certainly it isn't mentioned in commit
6fb8880a61510295aece04a542767161f624dffe aka BKrev:
41966101LJ_ogfOU0m2aE6teZfQnuQ where the code first appears.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 14:06:49 -07:00
Michael S. Tsirkin
711d57796f PCI: expose function reset capability in sysfs
Some devices allow an individual function to be reset without affecting
other functions in the same device: that's what pci_reset_function does.
For devices that have this support, expose reset attribite in sysfs.

This is useful e.g. for virtualization, where a qemu userspace
process wants to reset the device when the guest is reset,
to emulate machine reboot as closely as possible.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:24 -07:00
Chris Wright
6faf17f6f1 PCI SR-IOV: correct broken resource alignment calculations
An SR-IOV capable device includes an SR-IOV PCIe capability which
describes the Virtual Function (VF) BAR requirements.  A typical SR-IOV
device can support multiple VFs whose BARs must be in a contiguous region,
effectively an array of VF BARs.  The BAR reports the size requirement
for a single VF.  We calculate the full range needed by simply multiplying
the VF BAR size with the number of possible VFs and create a resource
spanning the full range.

This all seems sane enough except it artificially inflates the alignment
requirement for the VF BAR.  The VF BAR need only be aligned to the size
of a single BAR not the contiguous range of VF BARs.  This can cause us
to fail to allocate resources for the BAR despite the fact that we
actually have enough space.

This patch adds a thin PCI specific layer over the generic
resource_alignment() function which is aware of the special nature of
VF BARs and does sorting and allocation based on the smaller alignment
requirement.

I recognize that while resource_alignment is generic, it's basically a
PCI helper.  An alternative to this patch is to add PCI VF BAR specific
information to struct resource.  I opted for the extra layer rather than
adding such PCI specific information to struct resource.  This does
have the slight downside that we don't cache the BAR size and re-read
for each alignment query (happens a small handful of times during boot
for each VF BAR).

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Yu Zhao <yu.zhao@intel.com>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-30 08:37:25 -07:00
Yu Zhao
e277d2fc79 PCI: handle Virtual Function ATS enabling
The SR-IOV spec requires that the Smallest Translation Unit and
the Invalidate Queue Depth fields in the Virtual Function ATS
capability are hardwired to 0. If a function is a Virtual Function,
then and set its Physical Function's STU before enabling the ATS.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-18 11:25:58 +01:00
Yu Zhao
302b4215da PCI: support the ATS capability
The PCIe ATS capability makes the Endpoint be able to request the
DMA address translation from the IOMMU and cache the translation
in the device side, thus alleviate IOMMU pressure and improve the
hardware performance in the I/O virtualization environment.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-18 11:25:54 +01:00
Linus Torvalds
e76e5b2c66 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (88 commits)
  PCI: fix HT MSI mapping fix
  PCI: don't enable too much HT MSI mapping
  x86/PCI: make pci=lastbus=255 work when acpi is on
  PCI: save and restore PCIe 2.0 registers
  PCI: update fakephp for bus_id removal
  PCI: fix kernel oops on bridge removal
  PCI: fix conflict between SR-IOV and config space sizing
  powerpc/PCI: include pci.h in powerpc MSI implementation
  PCI Hotplug: schedule fakephp for feature removal
  PCI Hotplug: rename legacy_fakephp to fakephp
  PCI Hotplug: restore fakephp interface with complete reimplementation
  PCI: Introduce /sys/bus/pci/devices/.../rescan
  PCI: Introduce /sys/bus/pci/devices/.../remove
  PCI: Introduce /sys/bus/pci/rescan
  PCI: Introduce pci_rescan_bus()
  PCI: do not enable bridges more than once
  PCI: do not initialize bridges more than once
  PCI: always scan child buses
  PCI: pci_scan_slot() returns newly found devices
  PCI: don't scan existing devices
  ...

Fix trivial append-only conflict in Documentation/feature-removal-schedule.txt
2009-04-01 09:47:12 -07:00