Allocate irq descs on any NUMA node (we don't care) rather than
specifically node 0, which may not exist.
(At the moment NUMA is meaningless within a domain, so any info
the kernel has is just from an SRAT table we haven't suppressed/disabled.)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
If this is a non-HIGHMEM 32-bit kernel, then the page structures only go
up to the limit of addressable memory, even if more memory is physically
present. Don't try to add that extra memory to the balloon.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
When we allocate a vector for MSI/MSI-X we save away the PIRQ, and the
vector value. When we unmap (de-allocate) the MSI/MSI-X vector(s) we
need to provide the PIRQ and the vector value. What we did instead
was to provide the GSI (which was zero) and the vector value, and we
got these unhappy error messages:
(XEN) irq.c:1575: dom0: pirq 0 not mapped
[ 7.733415] unmap irq failed -22
This patches fixes this and we use the PIRQ value instead of the GSI
value.
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the user specifies mem= on the kernel command line, some or all
of the extra memory E820 region may be clipped away, so make sure
we don't try to add more extra memory than exists in E820.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Add extra pages in the pseudo-physical address space to the balloon
so we can extend into them later.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The per-cpu event channel masks can be updated unlocked from multiple
CPUs, so use the locked variant.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
To bind all event channels to CPU#0, it is not sufficient to set all
of its cpu_evtchn_mask[] bits; all other CPUs also need to get their
bits cleared. Otherwise, evtchn_do_upcall() will start handling
interrupts on CPUs they're not intended to run on, which can be
particularly bad for per-CPU ones.
[ linux-2.6.18-xen.hg 7de7453dee36 ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
xen: register xen pci notifier
xen: initialize cpu masks for pv guests in xen_smp_init
xen: add a missing #include to arch/x86/pci/xen.c
xen: mask the MTRR feature from the cpuid
xen: make hvc_xen console work for dom0.
xen: add the direct mapping area for ISA bus access
xen: Initialize xenbus for dom0.
xen: use vcpu_ops to setup cpu masks
xen: map a dummy page for local apic and ioapic in xen_set_fixmap
xen: remap MSIs into pirqs when running as initial domain
xen: remap GSIs as pirqs when running as initial domain
xen: introduce XEN_DOM0 as a silent option
xen: map MSIs into pirqs
xen: support GSI -> pirq remapping in PV on HVM guests
xen: add xen hvm acpi_register_gsi variant
acpi: use indirect call to register gsi in different modes
xen: implement xen_hvm_register_pirq
xen: get the maximum number of pirqs from xen
xen: support pirq != irq
* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
X86/PCI: Remove the dependency on isapnp_disable.
xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
x86: xen: Sanitse irq handling (part two)
swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
xen/pci: Request ACS when Xen-SWIOTLB is activated.
xen-pcifront: Xen PCI frontend driver.
xenbus: prevent warnings on unhandled enumeration values
xenbus: Xen paravirtualised PCI hotplug support.
xen/x86/PCI: Add support for the Xen PCI subsystem
x86: Introduce x86_msi_ops
msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
x86/PCI: Export pci_walk_bus function.
x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
x86/PCI: Clean up pci_cache_line_size
xen: fix shared irq device passthrough
xen: Provide a variant of xen_poll_irq with timeout.
xen: Find an unbound irq number in reverse order (high to low).
xen: statically initialize cpu_evtchn_mask_p
...
Fix up trivial conflicts in drivers/pci/Makefile
Register a pci notifier to add (or remove) pci devices to Xen via
hypercalls. Xen needs to know the pci devices present in the system to
handle pci passthrough and even MSI remapping in the initial domain.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* 'upstream/xenfs' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
xen/privcmd: make privcmd visible in domU
xen/privcmd: move remap_domain_mfn_range() to core xen code and export.
privcmd: MMAPBATCH: Fix error handling/reporting
xenbus: export xen_store_interface for xenfs
xen/privcmd: make sure vma is ours before doing anything to it
xen/privcmd: print SIGBUS faults
xen/xenfs: set_page_dirty is supposed to return true if it dirties
xen/privcmd: create address space to allow writable mmaps
xen: add privcmd driver
xen: add variable hypercall caller
xen: add xen_set_domain_pte()
xen: add /proc/xen/xsd_{kva,port} to xenfs
* 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (29 commits)
xen: include xen/xen.h for definition of xen_initial_domain()
xen: use host E820 map for dom0
xen: correctly rebuild mfn list list after migration.
xen: improvements to VIRQ_DEBUG output
xen: set up IRQ before binding virq to evtchn
xen: ensure that all event channels start off bound to VCPU 0
xen/hvc: only notify if we actually sent something
xen: don't add extra_pages for RAM after mem_end
xen: add support for PAT
xen: make sure xen_max_p2m_pfn is up to date
xen: limit extra memory to a certain ratio of base
xen: add extra pages for E820 RAM regions, even if beyond mem_end
xen: make sure xen_extra_mem_start is beyond all non-RAM e820
xen: implement "extra" memory to reserve space for pages not present at boot
xen: Use host-provided E820 map
xen: don't map missing memory
xen: defer building p2m mfn structures until kernel is mapped
xen: add return value to set_phys_to_machine()
xen: convert p2m to a 3 level tree
xen: make install_p2mtop_page() static
...
Fix up trivial conflict in arch/x86/xen/mmu.c, and fix the use of
'reserve_early()' - in the new memblock world order it is now
'memblock_x86_reserve_range()' instead. Pointed out by Jeremy.
Do initial xenbus/xenstore setup in dom0. In dom0 we need to actually
allocate the xenstore resources, rather than being given them from
outside.
[ Impact: initialize Xenbus ]
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Implement xen_create_msi_irq to create an msi and remap it as pirq.
Use xen_create_msi_irq to implement an initial domain specific version
of setup_msi_irqs.
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Implement xen_register_gsi to setup the correct triggering and polarity
properties of a gsi.
Implement xen_register_pirq to register a particular gsi as pirq and
receive interrupts as events.
Call xen_setup_pirqs to register all the legacy ISA irqs as pirqs.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Map MSIs into pirqs, writing 0 in the MSI vector data field and the pirq
number in the MSI destination id field.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Disable pcifront when running on HVM: it is meant to be used with pv
guests that don't have PCI bus.
Use acpi_register_gsi_xen_hvm to remap GSIs into pirqs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen_hvm_register_pirq allows the kernel to map a GSI into a Xen pirq and
receive the interrupt as an event channel from that point on.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use PHYSDEVOP_get_nr_pirqs to get the maximum number of pirqs from xen.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
PHYSDEVOP_map_pirq might return a pirq different from what we asked if
we are running as an HVM guest, so we need to be able to support pirqs
that are different from linux irqs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Fix bitmask formatting on 64 bit by specifying correct field widths.
* Output both global and local masked and pending information.
* Indicate in list of pending interrupts whether they are pending in
the L2, masked globally and/or masked locally.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
All event channels startbound to VCPU 0 so ensure that cpu_evtchn_mask
is initialised to reflect this. Otherwise there is a race after registering an
event channel but before the affinity is explicitly set where the event channel
can be delivered. If this happens then the event channel remains pending in the
L1 (evtchn_pending) array but is cleared in L2 (evtchn_pending_sel), this means
the event channel cannot be reraised until another event channel happens to
trigger the same L2 entry on that VCPU.
sizeof(cpu_evtchn_mask(0))==sizeof(unsigned long*) which is not correct, and
causes only the first 32 or 64 event channels (depending on architecture) to be
initially bound to VCPU0. Use sizeof(struct cpu_evtchn_s) instead.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: stable@kernel.org
Change event delivery to:
- mask+clear event in the upcall function
- use handle_fasteoi_irq as the handler
- unmask in the eoi function (and handle migration)
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
It has its uses in a domU as well as dom0. Xen will prevent an
unprivileged domain from doing anything untoward.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
This allows xenfs to be built as a module, previously it required flush_tlb_all
and arbitrary_virt_to_machine to be exported.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
On error IOCTL_PRIVCMD_MMAPBATCH is expected to set the top nibble of
the effected MFN and return 0. Currently it leaves the MFN unmodified
and returns the number of failures. Therefore:
- reimplement remap_domain_mfn_range() using direct
HYPERVISOR_mmu_update() calls and small batches. The xen_set_domain_pte()
interface does not report errors and since some failures are
expected/normal using the multicall infrastructure is too noisy.
- return 0 as expected
- writeback the updated MFN list to mmapbatch->arr not over mmapbatch,
smashing the caller's stack.
- remap_domain_mfn_range can be static.
With this change I am able to start an HVM domain.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
xen_store_interface is needed by xenfs, and xenfs may be a module.
[ Impact: build fix for modular xenfs ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Test vma->vm_ops is our operations to make sure we created it.
We don't want to stomp on other random vmas.
[ Impact: bugfix; prevent ioctl from affecting other mappings ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
I don't think it matters at all in this case (there's only one caller
which checks the return value), but may as well be strictly correct.
[ Impact: cleanup ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
These are necessary to allow writeable mmap of the privcmd node to
succeed without being marked read-only for writenotify purposes. Which
in turn is necessary to allow mappings of foreign guest pages
[ Impact: bugfix: allow writable mappings ]
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The privcmd interface in xenfs allows the tool stack in the privileged
domain to get fairly direct access to the hypervisor in order to do
various management things such as domain construction.
[ Impact: new xenfs interface for privileged operations ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
These are used by the userspace xenstore daemon, which runs in dom0.
Xenstored is what's behind the xenfs "xenbus" filesystem.
[ Impact: provide mapping and port to usermode for xenstore ]
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Without this dependency we get these compile errors:
linux-next-20101020/drivers/xen/biomerge.c: In function 'xen_biovec_phys_mergeable':
linux-next-20101020/drivers/xen/biomerge.c:8: error: dereferencing pointer to incomplete type
linux-next-20101020/drivers/xen/biomerge.c:9: error: dereferencing pointer to incomplete type
linux-next-20101020/drivers/xen/biomerge.c:11: error: implicit declaration of function '__BIOVEC_PHYS_MERGEABLE'
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Thomas Gleixner cleaned up event handling to use the
sparse_irq handling, but the xen-pcifront patches utilized the
old mechanism. This fixes them to work with sparse_irq handling.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We used to depend on CONFIG_SWIOTLB, but that is disabled by default.
So when compiling we get this compile error:
arch/x86/xen/pci-swiotlb-xen.c: In function 'pci_xen_swiotlb_detect':
arch/x86/xen/pci-swiotlb-xen.c:48: error: lvalue required as left operand of assignment
Fix it by actually activating the SWIOTLB library.
Reported-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
It used to done in the Xen startup code but that is not really
appropiate.
[v2: Update Kconfig with PCI requirement]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The Xen PCI front driver adds two new states that are utilizez
for PCI hotplug support. This is a patch pulled from the
linux-2.6-xen-sparse tree.
Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@ab.jp.nec.com>
The frontend stub lives in arch/x86/pci/xen.c, alongside other
sub-arch PCI init code (e.g. olpc.c).
It provides a mechanism for Xen PCI frontend to setup/destroy
legacy interrupts, MSI/MSI-X, and PCI configuration operations.
[ Impact: add core of Xen PCI support ]
[ v2: Removed the IOMMU code and only focusing on PCI.]
[ v3: removed usage of pci_scan_all_fns as that does not exist]
[ v4: introduced pci_xen value to fix compile warnings]
[ v5: squished fixes+features in one patch, changed Reviewed-by to Ccs]
[ v7: added Acked-by]
Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Qing He <qing.he@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
In driver/xen/events.c, whether bind_pirq is shareable or not is
determined by desc->action is NULL or not. But in __setup_irq,
startup(irq) is invoked before desc->action is assigned with
new action. So desc->action in startup_irq is always NULL, and
bind_pirq is always not shareable. This results in pt_irq_create_bind
failure when passthrough a device which shares irq to other devices.
This patch doesn't use probing_irq to determine if pirq is shareable
or not, instead set shareable flag in irq_info according to trigger
mode in xen_allocate_pirq. Set level triggered interrupts shareable.
Thus use this flag to set bind_pirq flag accordingly.
[v2: arch/x86/xen/pci.c no more, so file skipped]
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The 'xen_poll_irq_timeout' provides a method to pass in
the poll timeout for IRQs if requested. We also export
those two poll functions as Xen PCI fronted uses them.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
In earlier Xen Linux kernels, the IRQ mapping was a straight 1:1 and the
find_unbound_irq started looking around 256 for open IRQs and up. IRQs
from 0 to 255 were reserved for PCI devices. Previous to this patch,
the 'find_unbound_irq' started looking at get_nr_hw_irqs() number.
For privileged domain where the ACPI information is available that
returns the upper-bound of what the GSIs. For non-privileged PV domains,
where ACPI is no-existent the get_nr_hw_irqs() reports the IRQ_LEGACY (16).
With PCI passthrough enabled, and with PCI cards that have IRQs pinned
to a higher number than 16 we collide with previously allocated IRQs.
Specifically the PCI IRQs collide with the IPI's for Xen functions
(as they are allocated earlier).
For example:
00:00.11 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller (prog-if 10 [OHCI])
...
Interrupt: pin A routed to IRQ 18
[root@localhost ~]# cat /proc/interrupts | head
CPU0 CPU1 CPU2
16: 38186 0 0 xen-dyn-virq timer0
17: 149 0 0 xen-dyn-ipi spinlock0
18: 962 0 0 xen-dyn-ipi resched0
and when the USB controller is loaded, the kernel reports:
IRQ handler type mismatch for IRQ 18
current handler: resched0
One way to fix this is to reverse the logic when looking for un-used
IRQ numbers and start with the highest available number. With that,
we would get:
CPU0 CPU1 CPU2
... snip ..
292: 35 0 0 xen-dyn-ipi callfunc0
293: 3992 0 0 xen-dyn-ipi resched0
294: 224 0 0 xen-dyn-ipi spinlock0
295: 57183 0 0 xen-dyn-virq timer0
NMI: 0 0 0 Non-maskable interrupts
.. snip ..
And interrupts for PCI cards are now accessible.
This patch also includes the fix, found by Ian Campbell, titled
"xen: fix off-by-one error in find_unbound_irq."
[v2: Added an explanation in the code]
[v3: Rebased on top of tip/irq/core]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Sometimes cpu_evtchn_mask_p can get used early, before it has been
allocated. Statically initialize it with an initdata version to catch
any early references.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Impact: cleanup
Make pirq show useful information in /proc/interrupts
[v2: Removed the parts for arch/x86/xen/pci.c ]
Signed-off-by: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Dynamically allocate the irq_info and evtchn_to_irq arrays, so that
1) the irq_info array scales to the actual number of possible irqs,
and 2) we don't needlessly increase the static size of the kernel
when we aren't running under Xen.
Derived on patch from Mike Travis <travis@sgi.com>.
[Impact: reduce memory usage ]
[v2: Conflict in drivers/xen/events.c: Replaced alloc_bootmen with kcalloc ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>