.. compile options. This way the user can decide during runtime whether they
want the default 'vpci' (virtual pci passthrough) or where the PCI devices
are passed in without any BDF renumbering. The option 'passthrough' allows
the user to toggle the it from 0 (vpci) to 1 (passthrough).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The latter is easily fixed - by the developer compiling the
module with -DDEBUG. And during runtime - the loglvl provides
quite a lot of useful data.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
- Remove the slot and controller controller backend as they
are not used.
- Document the find pciback_[read|write]_config_[byte|word|dword]
to make it easier to find.
- Collapse the code from conf_space_capability_msi into pciback_ops.c
- Collapse conf_space_capability_[pm|vpd].c in conf_space_capability.c
[and remove the conf_space_capability.h file]
- Rename all visible functions from pciback to xen_pcibk.
- Rename all the printk/pr_info, etc that use the "pciback" to say
"xen-pciback".
- Convert functions that are not referenced outside the code to be
static to save on name space.
- Do the same thing for structures that are internal to the driver.
- Run checkpatch.pl after the renames and fixup its warnings and
fix any compile errors caused by the variable rename
- Cleanup any structs that checkpath.pl commented about or just
look odd.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the verbose_request is set (and loglevel high enough), print out
the MSI/MSI-X values that are sent to the guest. This should aid in
debugging issues.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If we try to setup an fake IRQ handler for legacy interrupts
for devices that only have MSI-X (most if not all SR-IOV cards),
we will fail with this:
pciback[0000:01:10.0]: failed to install fake IRQ handler for IRQ 0! (rc:-38)
Since those cards don't have anything in dev->irq.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
pciback is rather generic for a modular distro style kernel.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
We were using coarse spinlocks that could end up with a deadlock.
This patch fixes that and makes the spinlocks much more fine-grained.
We also drop be->watchding state spinlocks as they are already
guarded by the xenwatch_thread against multiple customers. Without
that we would trigger the BUG: scheduling while atomic.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the device that is to be shared with a guest is a level device and
the IRQ is shared with the initial domain we need to take actions.
Mainly we install a dummy IRQ handler that will ACK on the interrupt
line so as to not have the initial domain disable the interrupt line.
This dummy IRQ handler is not enabled when the device MSI/MSI-X lines
are set, nor for edge interrupts. And also not for level interrupts
that are not shared amongst devices. Lastly, if the user passes
to the guest all of the PCI devices on the shared line the we won't
install the dummy handler either.
There is also SysFS instrumentation to check its state and turn
IRQ ACKing on/off if necessary.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In cases where the guest is abruptly killed and has not disabled
MSI/MSI-X interrupts we want to do it for it.
Otherwise when the guest is started up and enables MSI, we would
get a WARN() that the device already had been enabled.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
These changes are for PV guest to use Virtual Function. Because the VF's
vendor, device registers in cfg space are 0xffff, which are invalid and
ignored by PCI device scan. Values in 'struct pci_dev' are fixed up by
SR-IOV code, and using these values will present correct VID and DID to
PV guest kernel.
And command registers in the cfg space are read only 0, which means we
have to emulate MMIO enable bit (VF only uses MMIO resource) so PV
kernel can work properly.
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When the front-end and back-end start negotiating we register
the domain that will use the PCI device. Furthermore during shutdown
of guest or unbinding of the PCI device (and unloading of module)
from pciback we unregister the domain owner.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Checkpatch found some extra warnings and errors. This mega
patch fixes them all in one big swoop. We also spruce
up the pcistub_ids to use DEFINE_PCI_DEVICE_TABLE macro
(suggested by Jan Beulich).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This is the host side counterpart to the frontend driver in
drivers/pci/xen-pcifront.c. The PV protocol is also implemented by
frontend drivers in other OSes too, such as the BSDs.
The PV protocol is rather simple. There is page shared with the guest,
which has the 'struct xen_pci_sharedinfo' embossed in it. The backend
has a thread that is kicked every-time the structure is changed and
based on the operation field it performs specific tasks:
XEN_PCI_OP_conf_[read|write]:
Read/Write 0xCF8/0xCFC filtered data. (conf_space*.c)
Based on which field is probed, we either enable/disable the PCI
device, change power state, read VPD, etc. The major goal of this
call is to provide a Physical IRQ (PIRQ) to the guest.
The PIRQ is Xen hypervisor global IRQ value irrespective of the IRQ
is tied in to the IO-APIC, or is a vector. For GSI type
interrupts, the PIRQ==GSI holds. For MSI/MSI-X the
PIRQ value != Linux IRQ number (thought PIRQ==vector).
Please note, that with Xen, all interrupts (except those level shared ones)
are injected directly to the guest - there is no host interaction.
XEN_PCI_OP_[enable|disable]_msi[|x] (pciback_ops.c)
Enables/disables the MSI/MSI-X capability of the device. These operations
setup the MSI/MSI-X vectors for the guest and pass them to the frontend.
When the device is activated, the interrupts are directly injected in the
guest without involving the host.
XEN_PCI_OP_aer_[detected|resume|mmio|slotreset]: In case of failure,
perform the appropriate AER commands on the guest. Right now that is
a cop-out - we just kill the guest.
Besides implementing those commands, it can also
- hide a PCI device from the host. When booting up, the user can specify
xen-pciback.hide=(1:0:0)(BDF..) so that host does not try to use the
device.
The driver was lifted from linux-2.6.18.hg tree and fixed up
so that it could compile under v3.0. Per suggestion from Jesse Barnes
moved the driver to drivers/xen/xen-pciback.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Commit 13e12d14e2 ("vfs: reorganize 'struct inode' layout a bit")
moved things around a bit changed i_state to be unsigned int instead of
unsigned long. That was to help structure layout for the 64-bit case,
and shrink 'struct inode' a bit (admittedly that only happened when
spinlock debugging was on and i_flags didn't pack with i_lock).
However, Meelis Roos reports that this results in unaligned exceptions
on sprc, and it turns out that the bit-locking primitives that we use
for the I_NEW bit want to use the bitops. Which want 'unsigned long',
not 'unsigned int'.
We really should fix the bit locking code to not have that kind of
requirement, but that's a much bigger change. So for now, revert that
field back to 'unsigned long' (but keep the other re-ordering changes
from the commit that caused this).
Andi points out that we have played games with this in 'struct page', so
it's solvable with other hacks too, but since right now the struct inode
size advantage only happens with some rare config options, it's not
worth fighting.
It _would_ be worth fixing the bitlocking code, though. Especially
since there is no type safety in the bitlocking code (this never caused
any warnings, and worked fine on x86-64, because the bitlocks take a
'void *' and x86-64 doesn't care that deeply about alignment). So it's
currently a very easy problem to trigger by mistake and never notice.
Reported-by: Meelis Roos <mroos@linux.ee>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms/r6xx+: voltage fixes
drm/nouveau: drop leftover debugging
drm/radeon: avoid warnings from r600/eg irq handlers on powered off card.
drm/radeon/kms: add missing param for dce3.2 DP transmitter setup
drm/radeon/kms/atom: fix duallink on some early DCE3.2 cards
drm/nouveau: fix assumption that semaphore dmaobj is valid in x-chan sync
drm/nv50/disp: fix gamma with page flipping overlay turned on
drm/nouveau/pm: Prevent overflow in nouveau_perf_init()
drm/nouveau: fix big-endian switch
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux:
nfsd4: fix break_lease flags on nfsd open
nfsd: link returns nfserr_delay when breaking lease
nfsd: v4 support requires CRYPTO
nfsd: fix dependency of nfsd on auth_rpcgss
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
pxa168_eth: fix race in transmit path.
ipv4, ping: Remove duplicate icmp.h include
netxen: fix race in skb->len access
sgi-xp: fix a use after free
hp100: fix an skb->len race
netpoll: copy dev name of slaves to struct netpoll
ipv4: fix multicast losses
r8169: fix static initializers.
inet_diag: fix inet_diag_bc_audit()
gigaset: call module_put before restart of if_open()
farsync: add module_put to error path in fst_open()
net: rfs: enable RFS before first data packet is received
fs_enet: fix freescale FCC ethernet dp buffer alignment
netdev: bfin_mac: fix memory leak when freeing dma descriptors
vlan: don't call ndo_vlan_rx_register on hardware that doesn't have vlan support
caif: Bugfix - XOFF removed channel from caif-mux
tun: teach the tun/tap driver to support netpoll
dp83640: drop PHY status frames in the driver.
dp83640: fix phy status frame event parsing
phylib: Allow BCM63XX PHY to be selected only on BCM63XX.
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
fix comment in generic_permission()
kill obsolete comment for follow_down()
proc_sys_permission() is OK in RCU mode
reiserfs_permission() doesn't need to bail out in RCU mode
proc_fd_permission() is doesn't need to bail out in RCU mode
nilfs2_permission() doesn't need to bail out in RCU mode
logfs doesn't need ->permission() at all
coda_ioctl_permission() is safe in RCU mode
cifs_permission() doesn't need to bail out in RCU mode
bad_inode_permission() is safe from RCU mode
ubifs: dereferencing an ERR_PTR in ubifs_mount()
0xff01 is not an actual voltage value, but a flag
for the driver. If the power state as that value,
skip setting the voltage.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The DGT runs at 27 MHz divided by 4 on 8660 and 8960.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David Brown <davidb@codeaurora.org>
Because the socket buffer is freed in the completion interrupt, it is not
safe to access it after submitting it to the hardware.
Cc: stable@kernel.org
Cc: Sachin Sanap <ssanap@marvell.com>
Cc: Zhangfei Gao <zgao6@marvell.com>
Cc: Philip Rakity <prakity@marvell.com>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the duplicate inclusion of net/icmp.h from net/ipv4/ping.c
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As soon as skb is given to hardware, TX completion can free skb under
us.
Therefore, we should update dev stats before kicking the device.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/setup: Fix for incorrect xen_extra_mem_start.
xen: When calling power_off, don't call the halt function.
xen: Fix compile warning when CONFIG_SMP is not defined.
xen: support CONFIG_MAXSMP
xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: sh_keysc - 8x8 MODE_6 fix
Input: omap-keypad - add missing input_sync()
Input: evdev - try to wake up readers only if we have full packet
Input: properly assign return value of clamp() macro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: avoid delayed metadata items during commits
btrfs: fix uninitialized return value
btrfs: fix wrong reservation when doing delayed inode operations
btrfs: Remove unused sysfs code
btrfs: fix dereference of ERR_PTR value
Btrfs: fix relocation races
Btrfs: set no_trans_join after trying to expand the transaction
Btrfs: protect the pending_snapshots list with trans_lock
Btrfs: fix path leakage on subvol deletion
Btrfs: drop the delalloc_bytes check in shrink_delalloc
Btrfs: check the return value from set_anon_super
* 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Fix register corruption in pvclock_scale_delta
KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap
KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGS
KVM: MMU: Fix build warnings in walk_addr_generic()
inode_permission() calls devcgroup_inode_permission() and almost all such
calls are _not_ for device nodes; let's at least keep the common path
straight...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
nothing blocking there, since all instances of sysctl
->permissions() method are non-blocking - both of them,
that is.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... and never did, what with its ->permission() being what we do by default
when ->permission is NULL...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
return -EIO; is *not* a blocking operation, thank you very much.
Nick, what the hell have you been smoking?
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
d251ed271d "ubifs: fix sget races" left out the goto from this
error path so the static checkers complain that we're dereferencing
"sb" when it's an ERR_PTR.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Thanks to Casey Bodley for pointing out that on a read open we pass 0,
instead of O_RDONLY, to break_lease, with the result that a read open is
treated like a write open for the purposes of lease breaking!
Reported-by: Casey Bodley <cbodley@citi.umich.edu>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau: fix assumption that semaphore dmaobj is valid in x-chan sync
drm/nv50/disp: fix gamma with page flipping overlay turned on
drm/nouveau/pm: Prevent overflow in nouveau_perf_init()
drm/nouveau: fix big-endian switch
Since we were calling the wptr function before checking if the IH was
even enabled, or the GPU wasn't shutdown, we'd get spam in the logs when
the GPU readback 0xffffffff. This reorders things so we return early
in the no IH and GPU shutdown cases.
Reported-and-tested-by: ManDay on #radeon
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is used during phy init to set up the phy for DP. This may
fix DP problems on DCE3.2 cards.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Certain revisions of the vbios on DCE3.2 cards have a bug
in the transmitter control table which prevents duallink from
being enabled properly on some cards. The action switch statement
jumps to the wrong offset for the OUTPUT_ENABLE action. The fix
is to use the ENABLE action rather than the OUTPUT_ENABLE action
on the affected cards. In fixed version of the vbios, both
actions jump to the same offset, so the change should be safe.
Reported-and-tested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Its illegal to dereference skb after dev_kfree_skb(skb)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Robin Holt <holt@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As soon as skb is given to hardware and spinlock released, TX completion
can free skb under us. Therefore, we should update netdev stats before
spinlock release.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>