We return 1 if the IOMMU has been detected. Zero or an error number
if we failed to find it. This is in preperation of using the IOMMU_INIT
so that we can detect whether an IOMMU is present. I have not
tested this for regression on Calgary, nor on AMD Vi chipsets as
I don't have that hardware.
CC: Muli Ben-Yehuda <muli@il.ibm.com>
CC: "Jon D. Mason" <jdmason@kudzu.us>
CC: "Darrick J. Wong" <djwong@us.ibm.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: David Woodhouse <David.Woodhouse@intel.com>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Yinghai Lu <yinghai@kernel.org>
CC: Joerg Roedel <joerg.roedel@amd.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
LKML-Reference: <1282845485-8991-3-git-send-email-konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* git://git.infradead.org/iommu-2.6:
intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
intel-iommu: Combine the BIOS DMAR table warning messages
panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
panic: Allow warnings to set different taint flags
intel-iommu: intel_iommu_map_range failed at very end of address space
intel-iommu: errors with smaller iommu widths
intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
intel-iommu: use physfn to search drhd for VF
intel-iommu: Print out iommu seq_id
intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
intel-iommu: Avoid global flushes with caching mode.
intel-iommu: Use correct domain ID when caching mode is enabled
intel-iommu mistakenly uses offset_pfn when caching mode is enabled
intel-iommu: use for_each_set_bit()
intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
We now know how to deal with these tables so that they are harmless.
Set TAINT_FIRMWARE_WORKAROUND instead of the default TAINT_WARN.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We have nearly the same code for warnings repeated four times. Move
it into a separate function.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit 074835f014 ("intel-iommu: Fix
kernel hand if interrupt remapping disabled in BIOS") is adding a check
for interrupt remapping disabled and is dereferencing the dmar_tbl
pointer without checking its value.
Unfortunately, this value is null when booting inside a 64bit virtual
box guest with io-apic disabled, leading to a crash. With a check on it,
the guest is now booting. It's triggering a WARN() in
clockevent_delta2ns but it's better than not booting at all and allows
the user to see there's something wrong on their io-apic setup.
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When virtfn is used, we should use physfn to find correct drhd
-v2: add pci_physfn() Suggested by Roland Dreier <rdreier@cisco.com>
do can remove ifdef in dmar.c
-v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
also change dmar_pci_device_match() static
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
dmar_tbl is declared as __initdata, but dmar_ir_support() is not declared
as an __init function. Fix is simple since the only caller of dmar_ir_support
(intr_remapping_supported) is an __init function.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* git://git.infradead.org/iommu-2.6:
implement early_io{re,un}map for ia64
Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
intel-iommu: ignore page table validation in pass through mode
intel-iommu: Fix oops with intel_iommu=igfx_off
intel-iommu: Check for an RMRR which ends before it starts.
intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
intel-iommu: Detect DMAR in hyperspace at probe time.
dmar: Fix build failure without NUMA, warn on bogus RHSA tables and don't abort
iommu: Allocate dma-remapping structures using numa locality info
intr_remap: Allocate intr-remapping table using numa locality info
dmar: Allocate queued invalidation structure using numa locality info
dmar: support for parsing Remapping Hardware Static Affinity structure
The BIOS errors where an IOMMU is reported either at zero or a bogus
address are causing problems even when the IOMMU is disabled -- because
interrupt remapping uses the same hardware. Ensure that the checks get
applied for the interrupt remapping initialisation too.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Many BIOSes will lie to us about the existence of an IOMMU, and claim
that there is one at an address which actually returns all 0xFF.
We need to detect this early, so that we know we don't have a viable
IOMMU and can set up swiotlb before it's too late.
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
x86/amd-iommu: Remove amd_iommu_pd_table
x86/amd-iommu: Move reset_iommu_command_buffer out of locked code
x86/amd-iommu: Cleanup DTE flushing code
x86/amd-iommu: Introduce iommu_flush_device() function
x86/amd-iommu: Cleanup attach/detach_device code
x86/amd-iommu: Keep devices per domain in a list
x86/amd-iommu: Add device bind reference counting
x86/amd-iommu: Use dev->arch->iommu to store iommu related information
x86/amd-iommu: Remove support for domain sharing
x86/amd-iommu: Rearrange dma_ops related functions
x86/amd-iommu: Move some pte allocation functions in the right section
x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc
x86/amd-iommu: Use get_device_id and check_device where appropriate
x86/amd-iommu: Move find_protection_domain to helper functions
x86/amd-iommu: Simplify get_device_resources()
x86/amd-iommu: Let domain_for_device handle aliases
x86/amd-iommu: Remove iommu specific handling from dma_ops path
x86/amd-iommu: Remove iommu parameter from __(un)map_single
x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs
...
Commit ae21ee65e8 "PCI: acs p2p upsteram
forwarding enabling" doesn't actually enable ACS.
Add a function to pci core to allow an IOMMU to request that ACS
be enabled. The existing mechanism of using iommu_found() in the pci
core to know when ACS should be enabled doesn't actually work due to
initialization order; iommu has only been detected not initialized.
Have Intel and AMD IOMMUs request ACS, and Xen does as well during early
init of dom0.
Cc: Allen Kay <allen.m.kay@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Change for PCI core to use pci_is_pcie() instead of checking
pci_dev->is_pcie.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 86cf898e1d ("intel-iommu: Check for
'DMAR at zero' BIOS error earlier.") was supposed to work by pretending
not to detect an IOMMU if it was actually being reported by the BIOS at
physical address zero.
However, the intel_iommu_init() function is called unconditionally, as
are the corresponding functions for other IOMMU hardware.
So the patch only worked if you have RAM above the 4GiB boundary. It
caused swiotlb to be initialised when no IOMMU was detected during early
boot, and thus the later IOMMU init would refuse to run.
But if you have less RAM than that, swiotlb wouldn't get set up and the
IOMMU _would_ still end up being initialised, even though we never
claimed to detect it.
This patch also sets the dmar_disabled flag when the error is detected
during the initial detection phase -- so that the later call to
intel_iommu_init() will return without doing anything, regardless of
whether swiotlb is used or not.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If HW IOMMU initialization fails (Intel VT-d often does this,
typically due to BIOS bugs), we fall back to nommu. It doesn't
work for the majority since nowadays we have more than 4GB
memory so we must use swiotlb instead of nommu.
The problem is that it's too late to initialize swiotlb when HW
IOMMU initialization fails. We need to allocate swiotlb memory
earlier from bootmem allocator. Chris explained the issue in
detail:
http://marc.info/?l=linux-kernel&m=125657444317079&w=2
The current x86 IOMMU initialization sequence is too complicated
and handling the above issue makes it more hacky.
This patch changes x86 IOMMU initialization sequence to handle
the above issue cleanly.
The new x86 IOMMU initialization sequence are:
1. we initialize the swiotlb (and setting swiotlb to 1) in the case
of (max_pfn > MAX_DMA32_PFN && !no_iommu). dma_ops is set to
swiotlb_dma_ops or nommu_dma_ops. if swiotlb usage is forced by
the boot option, we finish here.
2. we call the detection functions of all the IOMMUs
3. the detection function sets x86_init.iommu.iommu_init to the
IOMMU initialization function (so we can avoid calling the
initialization functions of all the IOMMUs needlessly).
4. if the IOMMU initialization function doesn't need to swiotlb
then sets swiotlb to zero (e.g. the initialization is
sucessful).
5. if we find that swiotlb is set to zero, we free swiotlb
resource.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-10-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This changes detect_intel_iommu() to set intel_iommu_init() to
iommu_init hook if detect_intel_iommu() finds the IOMMU.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-6-git-send-email-fujita.tomonori@lab.ntt.co.jp>
[ -v2: build fix for the !CONFIG_DMAR case ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Chris Wright has some patches which let us fall back to swiotlb nicely
if IOMMU initialisation fails. But those are a bit much for 2.6.32.
Instead, let's shift the check for the biggest problem, the HP and Acer
BIOS bug which reports a DMAR at physical address zero. That one can
actually be checked much earlier -- before we even admit to having
detected an IOMMU in the first place. So the swiotlb init goes ahead as
we want.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add support for parsing Remapping Hardware Static Affinity (RHSA) structure.
This enables identifying the association between remapping hardware units and
the corresponding proximity domain. This enables to allocate transalation
structures closer to the remapping hardware unit.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
I recently got a system where the DMAR table included a couple of RHSA
(remapping hardware static affinity) entries. Rather than printing a
message about an "Unknown DMAR structure," it would probably be more
useful to dump the RHSA structure (as other DMAR structures are dumped).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We might as well print the type of the DMAR structure we don't know how
to handle when skipping it. Then someone getting this message has a
chance of telling whether the structure is just bogus, or if there
really is something valid that the kernel doesn't know how to handle.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* git://git.infradead.org/iommu-2.6: (23 commits)
intel-iommu: Disable PMRs after we enable translation, not before
intel-iommu: Kill DMAR_BROKEN_GFX_WA option.
intel-iommu: Fix integer wrap on 32 bit kernels
intel-iommu: Fix integer overflow in dma_pte_{clear_range,free_pagetable}()
intel-iommu: Limit DOMAIN_MAX_PFN to fit in an 'unsigned long'
intel-iommu: Fix kernel hang if interrupt remapping disabled in BIOS
intel-iommu: Disallow interrupt remapping if not all ioapics covered
intel-iommu: include linux/dmi.h to use dmi_ routines
pci/dmar: correct off-by-one error in dmar_fault()
intel-iommu: Cope with yet another BIOS screwup causing crashes
intel-iommu: iommu init error path bug fixes
intel-iommu: Mark functions with __init
USB: Work around BIOS bugs by quiescing USB controllers earlier
ia64: IOMMU passthrough mode shouldn't trigger swiotlb init
intel-iommu: make domain_add_dev_info() call domain_context_mapping()
intel-iommu: Unify hardware and software passthrough support
intel-iommu: Cope with broken HP DC7900 BIOS
iommu=pt is a valid early param
intel-iommu: double kfree()
intel-iommu: Kill pointless intel_unmap_single() function
...
Fixed up trivial include lines conflict in drivers/pci/intel-iommu.c
BIOS clear DMAR table INTR_REMAP flag to disable interrupt remapping. Current
kernel only check interrupt remapping(IR) flag in DRHD's extended capability
register to decide interrupt remapping support or not. But IR flag will not
change when BIOS disable/enable interrupt remapping.
When user disable interrupt remapping in BIOS or BIOS often defaultly disable
interrupt remapping feature when BIOS is not mature.Though BIOS disable
interrupt remapping but intr_remapping_supported function will always report
to OS support interrupt remapping if VT-d2 chipset populated. On this
cases, kernel will continue enable interrupt remapping and result kernel panic.
This bug exist on almost all platforms with interrupt remapping support.
This patch add DMAR table INTR_REMAP flag check before enable interrupt
remapping.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Current kernel enable interrupt remapping only when all the vt-d unit support
interrupt remapping. So it is reasonable we should also disallow enabling
intr-remapping if there any io-apics that are not listed under vt-d units.
Otherwise we can run into issues.
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Move tboot.h from asm to linux to fix the build errors of intel_txt
patch on non-X86 platforms. Remove the tboot code from generic code
init/main.c and kernel/cpu.c.
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
DMAR faults are recorded into a ring of "fault recording registers".
fault_index is a 0-based index into the ring. The code allows the
0-based fault_index to be equal to the total number of fault registers
available from the cap_num_fault_regs() macro, which causes access
beyond the last available register.
Signed-off-by Troy Heber <troy.heber@hp.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Linux/ACPI core files using internal.h all PREFIX "ACPI: ",
however, not all ACPI drivers use/want it -- and they
should not have to #undef PREFIX to define their own.
Add GPL commment to internal.h while we are there.
This does not change any actual console output,
asside from a whitespace fix.
Signed-off-by: Len Brown <len.brown@intel.com>
Yet another reason why trusting this stuff to the BIOS was a bad idea.
The HP DC7900 BIOS reports an iommu at an address which just returns all
ones, when VT-d is disabled in the BIOS.
Fix up the missing iounmap in the error paths while we're at it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The tboot module will DMA protect all of memory in order to ensure the that
kernel will be able to initialize without compromise (from DMA). Consequently,
the kernel must enable Intel Virtualization Technology for Directed I/O
(VT-d or Intel IOMMU) in order to replace this broad protection with the
appropriate page-granular protection. Otherwise DMA devices will be unable
to read or write from memory and the kernel will eventually panic.
Because runtime IOMMU support is configurable by command line options, this
patch will force it to be enabled regardless of the options specified, and will
log a message if it was required to force it on.
dmar.c | 7 +++++++
intel-iommu.c | 17 +++++++++++++++--
2 files changed, 22 insertions(+), 2 deletions(-)
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Shane Wang <shane.wang@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Support device IOTLB invalidation to flush the translation cached
in the Endpoint.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Parse the Root Port ATS Capability Reporting Structure in the DMA
Remapping Reporting Structure ACPI table.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
As we just did for context cache flushing, clean up the logic around
whether we need to flush the iotlb or just the write-buffer, depending
on caching mode.
Fix the same bug in qi_flush_iotlb() that qi_flush_context() had -- it
isn't supposed to be returning an error; it's supposed to be returning a
flag which triggers a write-buffer flush.
Remove some superfluous conditional write-buffer flushes which could
never have happened because they weren't for non-present-to-present
mapping changes anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
It really doesn't make a lot of sense to have some of the logic to
handle caching vs. non-caching mode duplicated in qi_flush_context() and
__iommu_flush_context(), while the return value indicates whether the
caller should take other action which depends on the same thing.
Especially since qi_flush_context() thought it was returning something
entirely different anyway.
This patch makes qi_flush_context() and __iommu_flush_context() both
return void, removes the 'non_present_entry_flush' argument and makes
the only call site which _set_ that argument to 1 do the right thing.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The patch adds kernel parameter intel_iommu=pt to set up pass through
mode in context mapping entry. This disables DMAR in linux kernel; but
KVM still runs on VT-d and interrupt remapping still works.
In this mode, kernel uses swiotlb for DMA API functions but other VT-d
functionalities are enabled for KVM. KVM always uses multi level
translation page table in VT-d. By default, pass though mode is disabled
in kernel.
This is useful when people don't want to enable VT-d DMAR in kernel but
still want to use KVM and interrupt remapping for reasons like DMAR
performance concern or debug purpose.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Weidong Han <weidong@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
If the BIOS does something obviously stupid, like claiming that the
registers for the IOMMU are at physical address zero, then print a nasty
message and abort, rather than trying to set up the IOMMU and then later
panicking.
It's becoming more and more obvious that trusting this stuff to the BIOS
was a mistake.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We were comparing {bus,devfn} and assuming that a match meant it was the
same device. It doesn't -- the same {bus,devfn} can exist in
multiple PCI domains. Include domain number in device identification
(and call it 'segment' in most places, because there's already a lot of
references to 'domain' which means something else, and this code is
infected with ACPI thinking already).
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>