Check dma_pte_present() and only free the page if there _is_ one.
Kind of surprising that there was no warning about this.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
On Wed, 2009-07-01 at 16:59 -0700, Linus Torvalds wrote:
> I also _really_ hate how you do
>
> (unsigned long)pte >> VTD_PAGE_SHIFT ==
> (unsigned long)first_pte >> VTD_PAGE_SHIFT
Kill this, in favour of just looking to see if the incremented pte
pointer has 'wrapped' onto the next page. Which means we have to check
it _after_ incrementing it, not before.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
For many purposes, including interrupt-swizzling, devices with ARI
enabled behave as if they have one device (number 0) and 256 functions.
This probably hasn't bitten us in practice because all ARI devices I've
seen are also IOV devices, and IOV devices are required to use MSI.
This isn't guaranteed, and there are legitimate reasons to use ARI
without IOV, and hence potentially use pin-based interrupts.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This would have found the bug in i386 pci_unmap_addr() a long time ago.
We shouldn't just silently return without doing anything.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Since we're using cmpxchg64() anyway (because that's the only way to do
an atomic 64-bit store on i386), we might as well ditch the extra
locking and just use cmpxchg64() to ensure that we don't add the page
twice.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This fixes kernel.org bug #13584. The IOVA code attempted to optimise
the insertion of new ranges into the rbtree, with the unfortunate result
that some ranges just didn't get inserted into the tree at all. Then
those ranges would be handed out more than once, and things kind of go
downhill from there.
Introduced after 2.6.25 by ddf02886cb
("PCI: iova RB tree setup tweak").
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: mark gross <mgross@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As with other functions, batch the CPU data cache flushes and don't keep
recalculating PTE addresses.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The loop condition was wrong -- we should free a PMD only if its
_entire_ range is within the range we're intending to clear. The
early-termination condition was right, but not the loop.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Instead of calling domain_pfn_mapping() repeatedly with single or
small numbers of pages, just pass the sglist in. It can optimise the
number of cache flushes like domain_pfn_mapping() does, and gives a huge
speedup for large scatterlists.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There are 2 problems on mask states in suspend/resume.
[1]:
It is better to restore the mask states of MSI/MSI-X to initial states
(MSI is unmasked, MSI-X is masked) when we release the device.
The pci_msi_shutdown() does the restoration of mask states for MSI,
while the msi_free_irqs() does it for MSI-X. In other words, in the
"disable" path both of MSI and MSI-X are handled, but in the "shutdown"
path only MSI is handled.
MSI:
pci_disable_msi()
=> pci_msi_shutdown()
[ mask states for MSI restored ]
=> msi_set_enable(dev, pos, 0);
=> msi_free_irqs()
MSI-X:
pci_disable_msix()
=> pci_msix_shutdown()
=> msix_set_enable(dev, 0);
=> msix_free_all_irqs
=> msi_free_irqs()
[ mask states for MSI-X restored ]
This patch moves the masking for MSI-X from msi_free_irqs() to
pci_msix_shutdown().
This change has some positive side effects:
- It prevents OS from touching mask states before reading preserved
bits in the register, which can be happen if msi_free_irqs() is
called from error path in msix_capability_init().
- It also prevents touching the register after turning off MSI-X in
"disable" path, which can be a problem on some devices.
[2]:
We have cache of the mask state in msi_desc, which is automatically
updated when msi/msix_mask_irq() is called. This cached states are
used for the resume.
But since what need to be restored in the resume is the states before
the shutdown on the suspend, calling msi/msix_mask_irq() from
pci_msi/msix_shutdown() is not appropriate.
This patch introduces __msi/msix_mask_irq() that do mask as same
as msi/msix_mask_irq() but does not update cached state, for use
in pci_msi/msix_shutdown().
[updated: get rid of msi/msix_mask_irq_nocache() (proposed by Matthew Wilcox)]
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The initial state of mask register of MSI is unmasked. We set it
masked before calling arch_setup_msi_irqs(). If arch_setup_msi_irq()
fails, it is better to restore the state of the mask register.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These names are too long! Drop _OFFSET to save some bytes/lines.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The ALi loses some state if it goes into D3. Unfortunately even with the
chipset documents I can't figure out how to restore some bits of it.
The VIA one saves/restores apparently fine but the ACPI _GTM methods break
on some platforms if we do this and this causes cable misdetections.
These are both effectively regressions as historically nothing matched the
devices and then decided not to bind to them. Nowdays something is binding
to all sorts of devices and a result they get dumped into D3.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For devices attached to the root bus, we can't trigger Secondary Bus
Reset because there is no bridge device associated with the bus. So
need to check bus->self again NULL first before using it.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In current code it continues setup even if alloc_msi_entry() for MSI-X
is failed due to lack of memory. It means arch_setup_msi_irqs() might
be called with msi_desc entries less than its argument nvec.
At least x86's arch_setup_msi_irqs() uses list_for_each_entry() for
dev->msi_list that suspected to have entries same numbers as nvec, and
it doesn't check the number of allocated vectors and passed arg nvec.
Therefore it will result in success of pci_enable_msix(), with less
vectors allocated than requested.
This patch fixes the error route to return -ENOMEM, instead of continuing
the setup (proposed by Matthew Wilcox).
Note that there is no iounmap in msi_free_irqs() if no msi_disc is
allocated.
Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There's no need for the separate iommu_alloc_iova() function, and
certainly not for it to be global. Remove the underscores while we're at
it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
As with dma_pte_clear_range(), don't keep flushing a single PTE at a
time. And also micro-optimise the setting of PTE values rather than
using the helper functions to do all the masking.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
It's a bit silly to repeatedly call domain_flush_cache() for each PTE
individually, as we clear it. Instead, batch them up and flush a whole
range at a time. We might as well refrain from recalculating the PTE
address from scratch each time round the loop too.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This is fairly broken anyway -- it doesn't take hotplug into account.
We should probably be checking page_is_ram() instead.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Most of its callers are having to shift for themselves anyway, so we might
as well do it in iommu_flush_iotlb_psi().
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
... and use it in the trivial cases; the other callers want individual
(and bisectable) attention, since I screwed them up the first time...
Make the BUG_ON() happen on too-large virtual address rather than
physical address, too. That's the one we care about.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Use unaligned address for domain->max_addr. That algorithm isn't ideal
anyway -- we should probably just look at the last iova in the tree.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
With some cleanup of intel_unmap_page(), intel_unmap_sg() and
vm_domain_exit() to no longer play with 64-bit addresses.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add some helpers for converting between VT-d and normal system pfns,
since system pages can be larger than VT-d pages.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There's no need for the GFX workaround now we have 'iommu=pt' for the
cases where people really care about performance. There's no need to
have a special case for just one type of device.
This also speeds up the iommu=pt path and reduces memory usage by
setting up the si_domain _once_ and then using it for all devices,
rather than giving each device its own private page tables.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In caching mode, domain ID 0 is reserved for non-present to present
mapping flush. Device IOTLB doesn't need to be flushed in this case.
Previously we were avoiding the flush for domain zero, even if the IOMMU
wasn't in caching mode and domain zero wasn't special.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Drop the e820 scanning and use existing function for finding valid
RAM regions to add to 1:1 mapping.
Signed-off-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (72 commits)
asus-laptop: remove EXPERIMENTAL dependency
asus-laptop: use pr_fmt and pr_<level>
eeepc-laptop: cpufv updates
eeepc-laptop: sync eeepc-laptop with asus_acpi
asus_acpi: Deprecate in favor of asus-laptop
acpi4asus: update MAINTAINER and KConfig links
asus-laptop: platform dev as parent for led and backlight
eeepc-laptop: enable camera by default
ACPI: Rename ACPI processor device bus ID
acerhdf: Acer Aspire One fan control
ACPI: video: DMI workaround broken Acer 7720 BIOS enabling display brightness
ACPI: run ACPI device hot removal in kacpi_hotplug_wq
ACPI: Add the reference count to avoid unloading ACPI video bus twice
ACPI: DMI to disable Vista compatibility on some Sony laptops
ACPI: fix a deadlock in hotplug case
Show the physical device node of backlight class device.
ACPI: pdc init related memory leak with physical CPU hotplug
ACPI: pci_root: remove unused dev/fn information
ACPI: pci_root: simplify list traversals
ACPI: pci_root: use driver data rather than list lookup
...
To support domain-isolation usages, the platform hardware must be
capable of uniquely identifying the requestor (source-id) for each
interrupt message. Without source-id checking for interrupt remapping
, a rouge guest/VM with assigned devices can launch interrupt attacks
to bring down anothe guest/VM or the VMM itself.
This patch adds source-id checking for interrupt remapping, and then
really isolates interrupts for guests/VMs with assigned devices.
Because PCI subsystem is not initialized yet when set up IOAPIC
entries, use read_pci_config_byte to access PCI config space directly.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Interrupt remapping table entry is 128bits. Currently, it only sets low
64bits of irte in modify_irte and free_irte. This ignores high 64bits
setting of irte, that means source-id setting will be ignored. This patch
sets the whole 128bits of irte when modify/free it. Following source-id
checking patch depends on this.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Identity mapping for IOMMU defines a single domain to 1:1 map all PCI
devices to all usable memory.
This reduces map/unmap overhead in DMA API's and improve IOMMU
performance. On 10Gb network cards, Netperf shows no performance
degradation compared to non-IOMMU performance.
This method may lose some of DMA remapping benefits like isolation.
The patch sets up identity mapping for all PCI devices to all usable
memory. In the DMA API, there is no overhead to maintain page tables,
invalidate iotlb, flush cache etc.
32 bit DMA devices don't use identity mapping domain, in order to access
memory beyond 4GiB.
When kernel option iommu=pt, pass through is first tried. If pass
through succeeds, IOMMU goes to pass through. If pass through is not
supported in hw or fail for whatever reason, IOMMU goes to identity
mapping.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* git://git.infradead.org/~dwmw2/iommu-2.6.31:
intel-iommu: Fix one last ia64 build problem in Pass Through Support
VT-d: support the device IOTLB
VT-d: cleanup iommu_flush_iotlb_psi and flush_unmaps
VT-d: add device IOTLB invalidation support
VT-d: parse ATSR in DMA Remapping Reporting Structure
PCI: handle Virtual Function ATS enabling
PCI: support the ATS capability
intel-iommu: dmar_set_interrupt return error value
intel-iommu: Tidy up iommu->gcmd handling
intel-iommu: Fix tiny theoretical race in write-buffer flush.
intel-iommu: Clean up handling of "caching mode" vs. IOTLB flushing.
intel-iommu: Clean up handling of "caching mode" vs. context flushing.
VT-d: fix invalid domain id for KVM context flush
Fix !CONFIG_DMAR build failure introduced by Intel IOMMU Pass Through Support
Intel IOMMU Pass Through Support
Fix up trivial conflicts in drivers/pci/{intel-iommu.c,intr_remapping.c}
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (74 commits)
PCI: make msi_free_irqs() to use msix_mask_irq() instead of open coded write
PCI: Fix the NIU MSI-X problem in a better way
PCI ASPM: remove get_root_port_link
PCI ASPM: cleanup pcie_aspm_sanity_check
PCI ASPM: remove has_switch field
PCI ASPM: cleanup calc_Lx_latency
PCI ASPM: cleanup pcie_aspm_get_cap_device
PCI ASPM: cleanup clkpm checks
PCI ASPM: cleanup __pcie_aspm_check_state_one
PCI ASPM: cleanup initialization
PCI ASPM: cleanup change input argument of aspm functions
PCI ASPM: cleanup misc in struct pcie_link_state
PCI ASPM: cleanup clkpm state in struct pcie_link_state
PCI ASPM: cleanup latency field in struct pcie_link_state
PCI ASPM: cleanup aspm state field in struct pcie_link_state
PCI ASPM: fix typo in struct pcie_link_state
PCI: drivers/pci/slot.c should depend on CONFIG_SYSFS
PCI: remove redundant __msi_set_enable()
PCI PM: consistently use type bool for wake enable variable
x86/ACPI: Correct maximum allowed _CRS returned resources and warn if exceeded
...
Use msix_mask_irq() instead of direct use of writel, so as not to clear
preserved bits in the Vector Control register [31:1].
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The previous MSI-X fix (8d18101853) had
three bugs. First, it didn't move the write that disabled the vector.
This led to writing garbage to the MSI-X vector (spotted by Michael
Ellerman). It didn't fix the PCI resume case, and it had a race window
where the device could generate an interrupt before the MSI-X registers
were programmed (leading to a DMA to random addresses).
Fortunately, the MSI-X capability has a bit to mask all the vectors.
By setting this bit instead of clearing the enable bit, we can ensure
the device will not generate spurious interrupts. Since the capability
is now enabled, the NIU device will not have a problem with the reads
and writes to the MSI-X registers being in the original order in the code.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
By having a pointer to the root port link, we can remove loops in
get_root_port_link() to search the root port link.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We don't need the 'has_switch' field in the struct pcie_link_state.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for calc_L0S_latency() and calc_L1_latency().
- Separate exit latency and acceptable latency calculation.
- Some minor cleanups.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, callers of pcie_set_clock_pm() check
Clock PM capability of the link or current Clock PM state of the link.
This check should be done in pcie_set_clock_pm() itself.
This patch moves those checks into pcie_set_clock_pm(). It also
introduces pcie_set_clkpm_nocheck() that is equivalent to old
pcie_set_clock_pm(), for the caller who wants to change Clocl PM state
regardless of the Clock PM capability or current Clock PM state. In
addition, this patch changes the function name from
pcie_set_clock_pm() to pcie_set_clkpm() for consistency.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up ASPM initialization by refactoring some functionality, renaming
functions, and moving things around.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, there are many functions that
take a pointer to struct pci_dev corresponding to the upstream component
of the link as a parameter. But, since those functions handle PCI
express link state, a pointer to struct pcie_link_state is more
suitable than a pointer to struct pci_dev. Changing a parameter to a
pointer to struct pcie_link_state makes ASPM code much simpler and
easier to read. This patch also contains some minor cleanups. This patch
doesn't have any functional change.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for some fields in pcie_link_state.
- Add comments.
- make "downstream_has_switch" field 1-bit.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "clk_pm_capable", "clk_pm_enable" and "bios_clk_state" fields in
the struct pcie_link_state only take 1-bit value. So those fields
don't need to be defined as unsigned int. This patch makes those
fields 1-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up latency related data structures for ASPM.
- Introduce struct acpi_latency for exit latency and acceptable
latency management. With this change, struct endpoint_state is no
longer needed.
- We don't need to hold both upstream latency and downstream latency
in the current implementation.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "support_state", "enabled_state" and "bios_aspm_state" fields in
the struct pcie_link_state take 2-bit value. So those fields don't
need to be defined as unsigned int. This patch makes those fields
2-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix a typo in struct pcie_link_state.
The "sibiling" field in the struct pcie_link_state should be
"sibling".
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There is no way to interact with a physical PCI slot without
sysfs, so encode the dependency and prevent this build error:
drivers/pci/slot.c: In function 'pci_hp_create_module_link':
drivers/pci/slot.c:327: error: 'module_kset' undeclared
This patch _should_ make pci-sysfs.o depend on CONFIG_SYSFS too,
but we cannot (yet) because the PCI core merrily assumes the
existence of sysfs:
drivers/built-in.o: In function `pci_bus_add_device':
drivers/pci/bus.c:89: undefined reference to `pci_create_sysfs_dev_files'
drivers/built-in.o: In function `pci_stop_dev':
drivers/pci/remove.c:24: undefined reference to `pci_remove_sysfs_dev_files'
So do the minimal bit for now and figure out how to untangle it
later.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fix-suggested-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We have the 'pos' of the MSI capability at all locations which call
msi_set_enable(), so pass it to msi_set_enable() instead of making it
find the capability every time.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that acpi_get_pci_dev is available, let's use it instead of
acpi_get_pci_id.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Returns whether an ACPI CA node is a PCI root bridge or not.
This API is generically useful, and shouldn't just be a hotplug function.
The implementation becomes much simpler as well.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Instead of starting from the iomem or ioport roots, start from the
parent bus' resources. This fixes a bug where child resources would
appear above their parents resources if they had the same size.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Other functions use type bool, so use that for pci_enable_wake as well.
Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If a PCI device is not power-manageable either by the platform, or
with the help of the native PCI PM interface, pci_target_state() will
return either PCI_D3hot, or PCI_POWER_ERROR for it, depending on
whether or not the device is configured to wake up the system. Alas,
none of these return values is correct, because each of them causes
pci_prepare_to_sleep() to return error code, although it should
complete successfully in such a case.
Fix this problem by making pci_target_state() always return PCI_D0
for devices that cannot be power managed.
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI-to-PCI Bridge 1.2 specifies that the Secondary Bus Reset bit can
force the assertion of RST# on the secondary interface, which can be
used to reset all devices including subordinates under this bus. This
can be used to reset a function if this function is the only device
under this bus.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI PM 1.2 specifies that the device will perform an internal reset upon
transitioning from D3hot to D0 when the NO_SOFT_RESET bit is clear. This
method can be used to reset a function if neither PCIe FLR nor PCI AF FLR
are supported.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch enhances the FLR functions:
1) remove disable_irq() so the shared IRQ won't be disabled.
2) replace the 1s wait with 100, 200 and 400ms wait intervals
for the Pending Transaction.
3) replace mdelay() with msleep().
4) add might_sleep().
5) lock the device to prevent PM suspend from accessing the CSRs
during the reset.
6) coding style fixes.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Debugging PCIE AER code can be very difficult because it is hard
to trigger various real hardware errors. This patch provide a
software based error injection tool, which can fake various PCIE
errors with a user space helper tool named "aer-inject". Which
can be gotten from:
http://www.kernel.org/pub/linux/kernel/people/yhuang/
The patch fakes AER error by faking some PCIE AER related
registers and an AER interrupt for specified the PCIE device.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When a root port receives the same errors more than once before the
kernel process them, the Multiple Error Messages Received flags are set
by hardware. Because the root port could only save one kind of
correctable error source id and another uncorrectable error source id at
the same time, the second message sender id is lost if the 2 messages
are sent from 2 different devices. This patch makes the kernel search
all devices under the root port when multiple messages are received.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When the bus id part of error source id is equal to 0 or nosourceid=1,
make the kernel probe the AER status registers of all devices under the
root port to find the initial error reporter.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Based on PCI Express AER specs, a root port might receive multiple
TLP errors while it could only save a correctable error source id
and an uncorrectable error source id at the same time. In addition,
some root port hardware might be unable to provide a correct source
id, i.e., the source id, or the bus id part of the source id provided
by root port might be equal to 0.
The patchset implements the support in kernel by searching the device
tree under the root port.
Patch 1 changes parameter cb of function pci_walk_bus to return a value.
When cb return non-zero, pci_walk_bus stops more searching on the
device tree.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "owner" field in struct hotplug_slot_ops is initialized by PCI
hotplug core. So each hotplug controller driver doesn't need to
initialize it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Create symbolic link to hotplug driver module in the PCI slot
directory (/sys/bus/pci/slots/<SLOT#>). In the past, we need to load
hotplug drivers one by one to identify the hotplug driver that handles
the slot, and it was very inconvenient especially for trouble shooting.
With this change, we can easily identify the hotplug driver.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current has_foo() functions in pci_hotplug_core.c returns 0 if the
"foo" property is true. It would cause misunderstanding. In addition,
the error code of those functions is never checked, so this patch
changes those functions' error code to 'bool' and return true if the
property "foo" is true.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The EMI support in pciehp is obviously broken. It is implemented using
struct hotplug_slot_attribute, but sysfs_ops for pci_slot_ktype is NOT
for struct hotplug_slot_attribute, but for struct pci_slot_attribute.
This bug had been there for a long time, maybe it was introduced when
PCI slot framework was introduced. The reason why this bug didn't
cause any problem is maybe the EMI support is not tested at all
because of lack of test environment.
As described above, the EMI support in pciehp seems not to be tested
at all. So this patch removes EMI support from pciehp, instead of
fixing the bug.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This is used by PCIE AER error injection to fake an PCI AER interrupt.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_bus_set_ops changes pci_ops associated with a pci_bus. This can be
used by debug tools such as PCIE AER error injection to fake some PCI
configuration registers.
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in pci_common_swizzle() for checking if the pci
bus is root, for code consistency.
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in pci_get_interrupt_pin() for checking if the
pci bus is root, for code consistency.
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in pci_read_bridge_bases() to check if the pci
bus is root, for code consistency.
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in pci_find_upstream_pcie_bridge() to check if
the pci bus is root, for code consistency.
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I found no references to SMBus in ACPI DSDT disassembly on my laptop
so this should be safe.
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
The same happens for i82865P/PE. Add a quirk to enable this device.
This allows i82875 EDAC driver to bind to chipset's dev #6 and not
dev #0 as the latter is used by AGP driver.
On my laptop (i82865P based) ACPI code is disabling this device
again in \_SB.PCI0._CRS method (called at least at PNP init time).
This can be easily worked around by patching DSDT.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Acked-by: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch (as1235) adds an array of PCI power-state names, together
with a simple inline accessor routine.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Cc: linux-pci@vger.kernel.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (103 commits)
powerpc: Fix bug in move of altivec code to vector.S
powerpc: Add support for swiotlb on 32-bit
powerpc/spufs: Remove unused error path
powerpc: Fix warning when printing a resource_size_t
powerpc/xmon: Remove unused variable in xmon.c
powerpc/pseries: Fix warnings when printing resource_size_t
powerpc: Shield code specific to 64-bit server processors
powerpc: Separate PACA fields for server CPUs
powerpc: Split exception handling out of head_64.S
powerpc: Introduce CONFIG_PPC_BOOK3S
powerpc: Move VMX and VSX asm code to vector.S
powerpc: Set init_bootmem_done on NUMA platforms as well
powerpc/mm: Fix a AB->BA deadlock scenario with nohash MMU context lock
powerpc/mm: Fix some SMP issues with MMU context handling
powerpc: Add PTRACE_SINGLEBLOCK support
fbdev: Add PLB support and cleanup DCR in xilinxfb driver.
powerpc/virtex: Add ml510 reference design device tree
powerpc/virtex: Add Xilinx ML510 reference design support
powerpc/virtex: refactor intc driver and add support for i8259 cascading
powerpc/virtex: Add support for Xilinx PCI host bridge
...
Trivial patch which adds the __init and __exit macros to the module_init /
module_exit functions from drivers/pci/hotplug/sgi_hotplug.c
linux version 2.6.30-rc8
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Addition of one unknown subsystem identifier to the quirks handler for
chipset i82855GM_HB on notebook Asus A6L. This exposes the otherwise
hidden SMBus controller within the south bridge ICH4-M.
Signed-off-by: Mats Erik Andersson <mats.andersson@gisladisker.se>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Adds support for PCI Express transaction layer end-to-end CRC checking
(ECRC). This patch will enable/disable ECRC checking by setting/clearing
the ECRC Check Enable and/or ECRC Generation Enable bits for devices that
support ECRC.
The ECRC setting is controlled by the "pci=ecrc=<policy>" command-line
option. If this option is not set or is set to 'bios", the enable and
generation bits are left in whatever state that firmware/BIOS set them to.
The "off" setting turns them off, and the "on" option turns them on (if the
device supports it).
Turning ECRC on or off can be a data integrity versus performance
tradeoff. In theory, turning it on will catch more data errors, turning
it off means possibly better performance since CRC does not need to be
calculated by the PCIe hardware and packet sizes are reduced.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
According to the PCI PM specification (PCI Bus Power Management
Interface Specification, Rev. 1.2, Section 5.4.1) we are supposed to
reinitialize devices that have PCI_PM_CTRL_NO_SOFT_RESET clear during
all transitions from PCI_D3hot to PCI_D0, but we only do it if the
device's current_state field is equal to PCI_UNKNOWN.
This may lead to problems if a device with PCI_PM_CTRL_NO_SOFT_RESET
unset is put into PCI_D3hot at run time by its driver and
pci_set_power_state() is used to put it back into PCI_D0, because in
that case the device will remain uninitialized after
pci_set_power_state() has returned. Prevent that from happening by
modifying pci_raw_set_power_state() to reinitialize devices with
PCI_PM_CTRL_NO_SOFT_RESET unset during all transitions from D3 to D0.
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCIe root complex integrated endpoint does not implement ARI, so this
kind of endpoint uses 3-bit function number. The function dependency
link of the integrated endpoint should be calculated using the device
number plus the value from function dependency link register.
Normal endpoint always implements ARI and the function dependency link
register contains 8-bit function number (i.e. `devfn' from software's
perspective).
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We always call pci_stop_bus_device before calling pci_destroy_dev.
Since pci_stop_bus_device calls pci_stop_dev, there is no need
for pci_destroy_dev to repeat the call.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_enable_msix currently returns -EINVAL if you ask
for more vectors than supported by the device, which would
typically cause fallback to regular interrupts.
It's better to return the table size, making the driver retry
MSI-X with less vectors.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
VIA has a strange chipset, it has root port under a bridge. Disable ASPM
for such strange chipset.
Cc: stable@kernel.org
Tested-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The last in-tree caller of pci_find_slot has been converted, so
let's get rid of this deprecated interface.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Convert uses of pci_find_slot to modern API.
In the conversion sites, we end up calling pci_dev_put() right away.
This may seem like it misses the entire point of doing something like
pci_get_bus_and_slot(), since we drop the reference so soon, but it turns
out we don't actually do much with the returned pci_dev.
I plan on untangling cpqphp further, but clearly cpqphp never worried too
much about a properly refcounted pci_dev anyway. For now, this conversion
seems reasonable, as it gets rid of the last in-tree caller of pci_find_slot.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Eliminate this warning:
warning: return discards qualifiers from pointer target type
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I have no clue what the original intent here was, but the code as
written is useless.
The old dbg() statement above the old callsite might lead one to think
that at one point, there was supposed to be some recursion, but any
sense of sanity here has been lost to the ravages of time.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Instead of making multiple calls to pcibios_get_irq_routing_table, let's
just do it once and save the answer.
The reason we were making multiple calls is because we liked to calculate
its length and perform some loop over it. Instead of open-coding the length
calculation every time, provide it in an inline helper function.
Finally, since pci_print_IRQ_route() is used only for debug, let's only
do it when cpqhp_debug is set.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Handle an empty slot at the top of the loop, and continue early.
This allows us to un-indent the rest of the function by one level.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Check for an empty slot, and return early if so.
This allows us to un-indent the rest of the function by one level.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Style and whitespace cleanups, no functional change.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Apply DeMorgan's theorem:
if ((pdev->revision > 2) || (vendor_id == PCI_VENDOR_ID_INTEL))
turns into
if ((pdev->revision <= 2) && (vendor_id != PCI_VENDOR_ID_INTEL))
Now we can bail out early from the function if the controller is not
supported.
This allows us to un-indent the remainder of the function quite a bit and
make it much more readable.
Fix up some extra braces, and un-indent the 'case' labels in the switch
statement as per CodingStyle.
No functional change.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up style and eliminate superfluous braces and parens.
No functional change.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: refactor
Refactor code to follow convention more closely and eliminate the need
for some useless prototypes.
No functional change.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix up comments from C++ to C-style, wrapping if necessary, etc.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up all stray whitespace issues, such as trailing whitespace,
spaces before tabs, etc. and whatever else vim's c_space_errors
highlights in red.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We could run out of space under under 4g, but devices under transparent
bridges can use 64bit resources, so keep trying on the parent bus until
we hit a non-transparent bridge.
Impact: better support for assigning unassigned resources
Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We should not assign 64bit ranges to PCI devices that only take 32bit
prefetchable addresses.
Try to set IORESOURCE_MEM_64 in 64bit resource of pci_device/pci_bridge
and make the bus resource only have that bit set when all devices under
it support 64bit prefetchable memory. Use that flag to allocate
resources from that range.
Reported-by: Yannick <yannick.roehlly@free.fr>
Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
drivers/pci/hotplug/ibmphp_core.c:1414: warning: `ibmphp_exit' defined but not used
Signed-off-by: Zhenwen Xu <helight.xu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: cleanup, spec compliance
This patch does:
- Remove unused msi/msix_enable/disable macros.
User should use msi/msix_set_enable() functions instead.
- Remove unused msix_mask/unmask/pending macros.
These macros are useless because they are not based on any of
the PCI Local Bus Specifications properly.
It seems that they were written based on a draft of PCI spec,
and that the draft was the MSI-X ECN that underwent membership
review in September 2002.
(* In the draft, the size of a entry in MSI-X table was 64bit,
containing 32bit message data and DWORD aligned lower address
plus a pending bit and a mask bit.(30+1+1bit) The higher
address was placed in MSI-X capability structure and shared
by all entries.)
- Remove PCI_MSIX_FLAGS_BITMASK.
This definition also come from the draft ECN.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
x86, apic: Fix dummy apic read operation together with broken MP handling
x86, apic: Restore irqs on fail paths
x86: Print real IOAPIC version for x86-64
x86: enable_update_mptable should be a macro
sparseirq: Allow early irq_desc allocation
x86, io-apic: Don't mark pin_programmed early
x86, irq: don't call mp_config_acpi_gsi() if update_mptable is not enabled
x86, irq: update_mptable needs pci_routeirq
x86: don't call read_apic_id if !cpu_has_apic
x86, apic: introduce io_apic_irq_attr
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector(), fix
x86: read apic ID in the !acpi_lapic case
x86: apic: Fixmap apic address even if apic disabled
x86: display extended apic registers with print_local_APIC and cpu_debug code
x86: read apic ID in the !acpi_lapic case
x86: clean up and fix setup_clear/force_cpu_cap handling
x86: apic: Check rev 3 fadt correctly for physical_apic bit
x86/pci: update pirq_enable_irq() to setup io apic routing
x86/acpi: move setup io apic routing out of CONFIG_ACPI scope
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()
...
The device class may be changed after the fixup, so re-read the class
value from pci_dev when configuring the device. Otherwise some devices
such as JMicron SATA controller won't work.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Conflicts:
arch/mips/sibyte/bcm1480/irq.c
arch/mips/sibyte/sb1250/irq.c
Merge reason: we gathered a few conflicts plus update to latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
An oops can occur if a user attempts to use both PCI logical
hotplug and the ACPI physical hotplug driver (acpiphp) in this
sequence, where $slot/address == $device.
In other words, if acpiphp has claimed a PCI device, and that
device is logically removed, then acpiphp may oops when it
attempts to access it again.
# echo 1 > /sys/bus/pci/devices/$device/remove
# echo 0 > /sys/bus/pci/slots/$slot/power
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
Call Trace:
[<a000000100016390>] show_stack+0x50/0xa0
[<a000000100016c60>] show_regs+0x820/0x860
[<a00000010003b390>] die+0x190/0x2a0
[<a000000100066a40>] ia64_do_page_fault+0x8e0/0xa40
[<a00000010000c7a0>] ia64_native_leave_kernel+0x0/0x270
[<a0000001003b2660>] pci_remove_bus_device+0x120/0x260
[<a0000002060549f0>] acpiphp_disable_slot+0x410/0x540 [acpiphp]
[<a0000002060505c0>] disable_slot+0xc0/0x120 [acpiphp]
[<a0000002040d21c0>] power_write_file+0x1e0/0x2a0 [pci_hotplug]
[<a0000001003bb820>] pci_slot_attr_store+0x60/0xa0
[<a000000100240f70>] sysfs_write_file+0x230/0x2c0
[<a000000100195750>] vfs_write+0x190/0x2e0
[<a0000001001961a0>] sys_write+0x80/0x100
[<a00000010000c600>] ia64_ret_from_syscall+0x0/0x20
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
The root cause of this oops is that the logical remove ("echo 1 >
/sys/bus/pci/devices/$device/remove") destroyed the pci_dev. The
pci_dev struct itself wasn't deallocated because acpiphp kept a
reference, but some of its fields became invalid.
acpiphp doesn't have any real reason to keep a pointer to a
pci_dev around. It can always derive it using pci_get_slot().
If a logical remove destroys the pci_dev, acpiphp won't find it
and is thus prevented from causing mischief.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* Removed building setup-irq on ppc32, we don't use it anymore
* Remove duplicate prototype for setup_grackle() code that needs it
gets it from <asm/grackle.h>
* Removed gratuitous pci_io_size type differences between ppc32/ppc64
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Recent PCI PM changes introduced a bug that causes some devices to be
mishandled after kexec and during early initialization. The failure
scenario in the kexec case is the following:
* Assume a PCI device is not power-manageable by the platform and has
PCI_PM_CTRL_NO_SOFT_RESET set in PMCSR.
* The device is put into D3 before kexec (using the native PCI PM).
* After kexec, pci_setup_device() sets the device's power state to
PCI_UNKNOWN.
* pci_set_power_state(dev, PCI_D0) is called by the device's driver.
* __pci_start_power_transition(dev, PCI_D0) is called and since the
device is not power-manageable by the platform, it causes
pci_update_current_state(dev, PCI_D0) to be called. As a result
the device's current_state field is updated to PCI_D3, in
accordance with the contents of its PCI PM registers.
* pci_raw_set_power_state() is called and it changes the device power
state to D0. *However*, it should also call pci_restore_bars() to
reinitialize the device, but it doesn't, because the device's
current_state field has been modified earlier.
To prevent this from happening, modify pci_platform_power_transition()
so that it doesn't use pci_update_current_state() to update the
current_state field for devices that aren't power-manageable by the
platform. Instead, this field should be updated directly for devices
that don't support the native PCI PM.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Enable the device IOTLB (i.e. ATS) for both the bare metal and KVM
environments.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Make iommu_flush_iotlb_psi() and flush_unmaps() more readable.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Support device IOTLB invalidation to flush the translation cached
in the Endpoint.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Parse the Root Port ATS Capability Reporting Structure in the DMA
Remapping Reporting Structure ACPI table.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The SR-IOV spec requires that the Smallest Translation Unit and
the Invalidate Queue Depth fields in the Virtual Function ATS
capability are hardwired to 0. If a function is a Virtual Function,
then and set its Physical Function's STU before enabling the ATS.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The PCIe ATS capability makes the Endpoint be able to request the
DMA address translation from the IOMMU and cache the translation
in the device side, thus alleviate IOMMU pressure and improve the
hardware performance in the I/O virtualization environment.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
according to Ingo, io_apic irq-setup related functions have too many
parameters with a repetitive signature.
So reduce related funcs to get less params by passing a pointer
to a newly defined io_apic_irq_attr structure.
v2: io_apic_irq ==> irq_attr
triggering ==> trigger
v3: add set_io_apic_irq_attr
[ Impact: cleanup ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4A08ACD3.2070401@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI MSI: Fix MSI-X with NIU cards
PCI: Fix pci-e port driver slot_reset bad default return value