Due to problems at cam.org, my nico@cam.org email address is no longer
valid. FRom now on, nico@fluxnic.net should be used instead.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, pat: Fix cacheflush address in change_page_attr_set_clr()
mm: remove !NUMA condition from PAGEFLAGS_EXTENDED condition set
x86: Fix earlyprintk=dbgp for machines without NX
x86, pat: Sanity check remap_pfn_range for RAM region
x86, pat: Lookup the protection from memtype list on vm_insert_pfn()
x86, pat: Add lookup_memtype to get the current memtype of a paddr
x86, pat: Use page flags to track memtypes of RAM pages
x86, pat: Generalize the use of page flag PG_uncached
x86, pat: Add rbtree to do quick lookup in memtype tracking
x86, pat: Add PAT reserve free to io_mapping* APIs
x86, pat: New i/f for driver to request memtype for IO regions
x86, pat: ioremap to follow same PAT restrictions as other PAT users
x86, pat: Keep identity maps consistent with mmaps even when pat_disabled
x86, mtrr: make mtrr_aps_delayed_init static bool
x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init
generic-ipi: Allow cpus not yet online to call smp_call_function with irqs disabled
x86: Fix an incorrect argument of reserve_bootmem()
x86: Fix system crash when loading with "reservetop" parameter
* 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, intel_txt: clean up the impact on generic code, unbreak non-x86
x86, intel_txt: Handle ACPI_SLEEP without X86_TRAMPOLINE
x86, intel_txt: Fix typos in Kconfig help
x86, intel_txt: Factor out the code for S3 setup
x86, intel_txt: tboot.c needs <asm/fixmap.h>
intel_txt: Force IOMMU on for Intel TXT launch
x86, intel_txt: Intel TXT Sx shutdown support
x86, intel_txt: Intel TXT reboot/halt shutdown support
x86, intel_txt: Intel TXT boot support
* 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp/intel: remove restore in resume
agp: fix uninorth build
intel-agp: Set dma mask for i915
agp: kill phys_to_gart() and gart_to_phys()
intel-agp: fix sglist allocation to avoid vmalloc()
intel-agp: Move repeated sglist free into separate function
agp: Switch agp_{un,}map_page() to take struct page * argument
agp: tidy up handling of scratch pages w.r.t. DMA API
intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU
agp: Add generic support for graphics dma remapping
agp: Switch mask_memory() method to take address argument again, not page
Avoid the cache buddies from biasing the time distribution away
from fork()ers. Normally the next buddy will be the preferred
scheduling target, but this makes fork()s prefer to run the new
child, whereas we prefer to run the parent, since that will
generate more work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to extend the functions to have more than 1 flag (sync),
rename the argument to flags, and explicitly define a WF_ space for
individual flags.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to be able to rename the sync argument, we need to rename
the current flag argument.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
APERF/MPERF support for cpu_power.
APERF/MPERF is arch defined to be a relative scale of work capacity
per logical cpu, this is assumed to include SMT and Turbo mode.
APERF/MPERF are specified to both reset to 0 when either counter
wraps, which is highly inconvenient, since that'll give a blimp
when that happens. The manual specifies writing 0 to the counters
after each read, but that's 1) too expensive, and 2) destroys the
possibility of sharing these counters with other users, so we live
with the blimp - the other existing user does too.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If we're looking to place a new task, we might as well find the
idlest position _now_, not 1 tick ago.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make the idle balancer more agressive, to improve a
x264 encoding workload provided by Jason Garrett-Glaser:
NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 252.82 fps, 22096.60 kb/s
encoded 600 frames, 250.69 fps, 22096.60 kb/s
encoded 600 frames, 245.76 fps, 22096.60 kb/s
NO_NEXT_BUDDY LB_BIAS
encoded 600 frames, 344.44 fps, 22096.60 kb/s
encoded 600 frames, 346.66 fps, 22096.60 kb/s
encoded 600 frames, 352.59 fps, 22096.60 kb/s
NO_NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 425.75 fps, 22096.60 kb/s
encoded 600 frames, 425.45 fps, 22096.60 kb/s
encoded 600 frames, 422.49 fps, 22096.60 kb/s
Peter pointed out that this is better done via newidle_idx,
not via LB_BIAS, newidle balancing should look for where
there is load _now_, not where there was load 2 ticks ago.
Worst-case latencies are improved as well as no buddies
means less vruntime spread. (as per prior lkml discussions)
This change improves kbuild-peak parallelism as well.
Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1253011667.9128.16.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CPU level should have WAKE_AFFINE, whereas ALLNODES is dubious.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When merging select_task_rq_fair() and sched_balance_self() we lost
the use of wake_idx, restore that and set them to 0 to make wake
balancing more aggressive.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem with wake_idle() is that is doesn't respect things like
cpu_power, which means it doesn't deal well with SMT nor the recent
RT interaction.
To cure this, it needs to do what sched_balance_self() does, which
leads to the possibility of merging select_task_rq_fair() and
sched_balance_self().
Modify sched_balance_self() to:
- update_shares() when walking up the domain tree,
(it only called it for the top domain, but it should
have done this anyway), which allows us to remove
this ugly bit from try_to_wake_up().
- do wake_affine() on the smallest domain that contains
both this (the waking) and the prev (the wakee) cpu for
WAKE invocations.
Then use the top-down balance steps it had to replace wake_idle().
This leads to the dissapearance of SD_WAKE_BALANCE and
SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE.
SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective.
Touch all topology bits to replace the old with new SD flags --
platforms might need re-tuning, enabling SD_BALANCE_WAKE
conditionally on a NUMA distance seems like a good additional
feature, magny-core and small nehalem systems would want this
enabled, systems with slow interconnects would not.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We're going to want to drop rq->lock in try_to_wake_up() for a
longer period of time, however we also want to deal with concurrent
waking of the same task, which is currently handled by holding
rq->lock.
So introduce a new TASK state, namely TASK_WAKING, which indicates
someone is already waking the task (other wakers will fail p->state
& state).
We also keep preemption disabled over the whole ttwu().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rather ugly patch to fully place the sched_balance_self() code
inside the fair class.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After the recent mq change there is the new select_queue qdisc class
method used in tc_modify_qdisc, but it works OK only for direct child
qdiscs of mq qdisc. Grandchildren always get the first tx queue, which
would give wrong qdisc_root etc. results (e.g. for sch_htb as child of
sch_prio). This patch fixes it by using parent's dev_queue for such
grandchildren qdiscs. The select_queue method's return type is changed
BTW.
With feedback from: Patrick McHardy <kaber@trash.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parse RxRPC security index 5 type keys (Kerberos 5 tokens).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow add_key() and KEYCTL_INSTANTIATE to accept key payloads in XDR form as
described by openafs-1.4.10/src/auth/afs_token.xg. This provides a way of
passing kaserver, Kerberos 4, Kerberos 5 and GSSAPI keys from userspace, and
allows for future expansion.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declare the security index constants symbolically rather than just referring
to them numerically.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct socket has a 16 bit hole that triggers kmemcheck warnings.
As suggested by Ingo, use kmemcheck annotations
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes commit e36b9d16c6. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:
*. It assumes tha the device is UP when calling dev_close(), or otherwise
dev_close() has no affect. It is worth to mention that initscripts (Redhat)
and sysconfig (Suse) doesn't act the same in this matter.
*. dev_close() has other side affects, like deleting entries from the routing
table, which might be unnecessary.
The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was once upon time so that snd_sthresh was a 16-bit quantity.
...That has not been true for long period of time. I run across
some ancient compares which still seem to trust such legacy.
Put all that magic into a single place, I hopefully found all
of them.
Compile tested, though linking of allyesconfig is ridiculous
nowadays it seems.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
set_normalized_timespec() nsec argument is of type long. The recent
timekeeping changes of ktime_get_ts() feed
ts->tv_nsec + tomono.tv_nsec + nsecs
to set_normalized_timespec(). On 32 bit machines that sum can be
larger than (1 << 31) and therefor result in a negative value which
screws up the result completely.
Make the nsec argument of set_normalized_timespec() s64 to fix the
problem at hand. This also prevents similar problems for future users
of set_normalized_timespec().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Carsten Emde <carsten.emde@osadl.org>
LKML-Reference: <new-submission>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: John Stultz <johnstul@us.ibm.com>
* 'for-linus3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
SELinux: inline selinux_is_enabled in !CONFIG_SECURITY_SELINUX
KEYS: Fix garbage collector
KEYS: Unlock tasklist when exiting early from keyctl_session_to_parent
CRED: Allow put_cred() to cope with a NULL groups list
SELinux: flush the avc before disabling SELinux
SELinux: seperate avc_cache flushing
Creds: creds->security can be NULL is selinux is disabled
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (23 commits)
at_hdmac: Rework suspend_late()/resume_early()
PM: Reset transition_started at dpm_resume_noirq
PM: Update kerneldoc comments in drivers/base/power/main.c
PM: Add convenience macro to make switching to dev_pm_ops less error-prone
hp-wmi: Switch driver to dev_pm_ops
floppy: Switch driver to dev_pm_ops
PM: Trivial fixes
PM / Hibernate / Memory hotplug: Always use for_each_populated_zone()
PM/Hibernate: Do not try to allocate too much memory too hard (rev. 2)
PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
PM/Hibernate: Rework shrinking of memory
PM: Fix typo in label name s/Platofrm_finish/Platform_finish/
PM: Run-time PM platform device bus support
PM: Introduce core framework for run-time PM of I/O devices (rev. 17)
Driver Core: Make PM operations a const pointer
PM: Remove platform device suspend_late()/resume_early() V2
USB: Rework musb suspend()/resume_early()
I2C: Rework i2c-s3c2410 suspend_late()/resume() V2
I2C: Rework i2c-pxa suspend_late()/resume_early()
DMA: Rework txx9dmac suspend_late()/resume_early()
...
Fix trivial conflict in drivers/base/platform.c (due to same
constification patch being merged in both sides, along with some other
PM work in the PM branch)
Using relative pathnames in #include statements interacts badly with
SystemTap, since the fs/ext4/*.h header files are not packaged up as
part of a distribution kernel's header files. Since systemtap doesn't
use TP_fast_assign(), we can use a blind structure definition and then
make sure the needed header files are defined before the ext4 source
files #include the trace/events/ext4.h header file.
https://bugzilla.redhat.com/show_bug.cgi?id=512478
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Without this patch building a kernel emits millions of warning like:
include/linux/selinux.h:92: warning: ?selinux_is_enabled? defined but not used
When it is build without CONFIG_SECURITY_SELINUX. This is harmless, but
the function should be inlined, so it gets compiled out.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (52 commits)
Input: bcm5974 - silence uninitialized variables warnings
Input: wistron_btns - add keymap for AOpen 1557
Input: psmouse - use boolean type
Input: i8042 - use platform_driver_probe
Input: i8042 - use boolean type where it makes sense
Input: i8042 - try disabling and re-enabling AUX port at close
Input: pxa27x_keypad - allow modifying keymap from userspace
Input: sunkbd - fix formatting
Input: i8042 - bypass AUX IRQ delivery test on laptops
Input: wacom_w8001 - simplify querying logic
Input: atkbd - allow setting force-release bitmap via sysfs
Input: w90p910_keypad - move a dereference below a NULL test
Input: add twl4030_keypad driver
Input: matrix-keypad - add function to build device keymap
Input: tosakbd - fix cleaning up KEY_STROBEs after error
Input: joydev - validate axis/button maps before clobbering current ones
Input: xpad - add USB ID for the drumkit controller from Rock Band
Input: w90p910_keypad - rename driver name to match platform
Input: add new driver for Sentelic Finger Sensing Pad
Input: psmouse - allow defining read-only attributes
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: completely remove apple mightymouse from blacklist
HID: support larger reports than 64 bytes in hiddev
HID: local function should be static
HID: ignore Philips IEEE802.15.4 RF Dongle
HID: ignore all recent SoundGraph iMON devices
HID: fix memory leak on error patch in debug code
HID: fix overrun in quirks initialization
HID: Drop NULL test on list_entry result
HID: driver for Twinhan USB 6253:0100 remote control
HID: adding __init/__exit macros to module init/exit functions
HID: add rumble support for Thrustmaster Dual Trigger 3-in-1
HID: ntrig tool separation and pen usages
HID: Avoid double spin_lock_init on usbhid->lock
HID: add force feedback support for Logitech WingMan Formula Force GP
HID: Support new variants of Samsung USB IR receiver (0419:0001)
HID: fix memory leak on error path in debug code
HID: fix debugfs build with !CONFIG_DEBUG_FS
HID: use debugfs for events/reports dumping
HID: use debugfs for report dumping descriptor
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits)
block: use blkdev_issue_discard in blk_ioctl_discard
Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
block: don't assume device has a request list backing in nr_requests store
block: Optimal I/O limit wrapper
cfq: choose a new next_req when a request is dispatched
Seperate read and write statistics of in_flight requests
aoe: end barrier bios with EOPNOTSUPP
block: trace bio queueing trial only when it occurs
block: enable rq CPU completion affinity by default
cfq: fix the log message after dispatched a request
block: use printk_once
cciss: memory leak in cciss_init_one()
splice: update mtime and atime on files
block: make blk_iopoll_prep_sched() follow normal 0/1 return convention
cfq-iosched: get rid of must_alloc flag
block: use interrupts disabled version of raise_softirq_irqoff()
block: fix comment in blk-iopoll.c
block: adjust default budget for blk-iopoll
block: fix long lines in block/blk-iopoll.c
block: add blk-iopoll, a NAPI like approach for block devices
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (209 commits)
[SCSI] fix oops during scsi scanning
[SCSI] libsrp: fix memory leak in srp_ring_free()
[SCSI] libiscsi, bnx2i: make bound ep check common
[SCSI] libiscsi: add completion function for drivers that do not need pdu processing
[SCSI] scsi_dh_rdac: changes for rdac debug logging
[SCSI] scsi_dh_rdac: changes to collect the rdac debug information during the initialization
[SCSI] scsi_dh_rdac: move the init code from rdac_activate to rdac_bus_attach
[SCSI] sg: fix oops in the error path in sg_build_indirect()
[SCSI] mptsas : Bump version to 3.04.12
[SCSI] mptsas : FW event thread and scsi mid layer deadlock in SYNCHRONIZE CACHE command
[SCSI] mptsas : Send DID_NO_CONNECT for pending IOs of removed device
[SCSI] mptsas : PAE Kernel more than 4 GB kernel panic
[SCSI] mptsas : NULL pointer on big endian systems causing Expander not to tear off
[SCSI] mptsas : Sanity check for phyinfo is added
[SCSI] scsi_dh_rdac: Add support for Sun StorageTek ST2500, ST2510 and ST2530
[SCSI] pmcraid: PMC-Sierra MaxRAID driver to support 6Gb/s SAS RAID controller
[SCSI] qla2xxx: Update version number to 8.03.01-k6.
[SCSI] qla2xxx: Properly delete rports attached to a vport.
[SCSI] qla2xxx: Correct various NPIV issues.
[SCSI] qla2xxx: Correct qla2x00_eh_wait_on_command() to wait correctly.
...
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (257 commits)
[ARM] Update mach-types
ARM: 5636/1: Move vendor enum to AMBA include
ARM: Fix pfn_valid() for sparse memory
[ARM] orion5x: Add LaCie NAS 2Big Network support
[ARM] pxa/sharpsl_pm: zaurus c3000 aka spitz: fix resume
ARM: 5686/1: at91: Correct AC97 reset line in at91sam9263ek board
ARM: 5640/1: This patch modifies the support of AC97 on the at91sam9263 ek board
ARM: 5689/1: Update default config of HP Jornada 700-series machines
ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem
ARM: 5688/1: ks8695_serial: disable_irq() lockup
ARM: 5687/1: fix an oops with highmem
ARM: 5684/1: Add nuc960 platform to w90x900
ARM: 5683/1: Add nuc950 platform to w90x900
ARM: 5682/1: Add cpu.c and dev.c and modify some files of w90p910 platform
ARM: 5626/1: add suspend/resume functions to amba-pl011 serial driver
ARM: 5625/1: fix hard coded 4K resource size in amba bus detection
MMC: MMCI: convert realview MMC to use gpiolib
ARM: 5685/1: Make MMCI driver compile without gpiolib
ARM: implement highpte
ARM: Show FIQ in /proc/interrupts on CONFIG_FIQ
...
Fix up trivial conflict in arch/arm/kernel/signal.c.
It was due to the TIF_NOTIFY_RESUME addition in commit d0420c83f ("KEYS:
Extend TIF_NOTIFY_RESUME to (almost) all architectures") and follow-ups.
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (202 commits)
MAINTAINERS: update KVM entry
KVM: correct error-handling code
KVM: fix compile warnings on s390
KVM: VMX: Check cpl before emulating debug register access
KVM: fix misreporting of coalesced interrupts by kvm tracer
KVM: x86: drop duplicate kvm_flush_remote_tlb calls
KVM: VMX: call vmx_load_host_state() only if msr is cached
KVM: VMX: Conditionally reload debug register 6
KVM: Use thread debug register storage instead of kvm specific data
KVM guest: do not batch pte updates from interrupt context
KVM: Fix coalesced interrupt reporting in IOAPIC
KVM guest: fix bogus wallclock physical address calculation
KVM: VMX: Fix cr8 exiting control clobbering by EPT
KVM: Optimize kvm_mmu_unprotect_page_virt() for tdp
KVM: Document KVM_CAP_IRQCHIP
KVM: Protect update_cr8_intercept() when running without an apic
KVM: VMX: Fix EPT with WP bit change during paging
KVM: Use kvm_{read,write}_guest_virt() to read and write segment descriptors
KVM: x86 emulator: Add adc and sbb missing decoder flags
KVM: Add missing #include
...
console_print() is an old legacy interface mostly unused in the entire
kernel tree. It's best to clean up its existing use and let developers
use their own implementation of it as they feel fit.
Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the generic pci_configure_slot() rather than the acpiphp-specific
decode_hpp() and program_hpp().
Unlike the previous acpiphp-specific code, pci_configure_slot() programs
PCIe settings when an _HPX method provides them, so acpiphp-managed PCIe
devices can now be configured.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch adds a new pci_configure_slot() function that programs the
PCI bus characteristics for a newly-added device. This is based on
code in pciehp_pci.c, but should be generic enough to be used by pciehp,
shpchp, and acpiphp.
The hotplug_params struct and the program_hpp_typeX() functions are based
on the ACPI definitions, but they aren't really ACPI-specific, and there's
no alternate implementation, so I don't see the need to abstract them yet.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: fix slab_pad_check()
slub: release kobject if sysfs_create_group failed in sysfs_slab_add
SLUB: fix ARCH_KMALLOC_MINALIGN cases 64 and 256
SLUB: Fix some coding style issues
SLUB: Drop write permission to /proc/slabinfo
slab: remove duplicate kmem_cache_init_late() declarations
slub: change kmem_cache->align to record the real alignment
slub: use size and objsize orders to disable debug flags
slub: add option to disable higher order debugging slabs
This patch makes acpi_get_hp_params_from_firmware() take a
pci_dev rather than a pci_bus and makes it return a standard
int errno rather than acpi_status.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Remove long removed "inet_protocol_base" declaration.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since my commits introducing netns awareness into
genetlink we can get this problem:
BUG: scheduling while atomic: modprobe/1178/0x00000002
2 locks held by modprobe/1178:
#0: (genl_mutex){+.+.+.}, at: [<ffffffff8135ee1a>] genl_register_mc_grou
#1: (rcu_read_lock){.+.+..}, at: [<ffffffff8135eeb5>] genl_register_mc_g
Pid: 1178, comm: modprobe Not tainted 2.6.31-rc8-wl-34789-g95cb731-dirty #
Call Trace:
[<ffffffff8103e285>] __schedule_bug+0x85/0x90
[<ffffffff81403138>] schedule+0x108/0x588
[<ffffffff8135b131>] netlink_table_grab+0xa1/0xf0
[<ffffffff8135c3a7>] netlink_change_ngroups+0x47/0x100
[<ffffffff8135ef0f>] genl_register_mc_group+0x12f/0x290
because I overlooked that netlink_table_grab() will
schedule, thinking it was just the rwlock. However,
in the contention case, that isn't actually true.
Fix this by letting the code grab the netlink table
lock first and then the RCU for netns protection.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'osync_cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
fsync: wait for data writeout completion before calling ->fsync
vfs: Remove generic_osync_inode() and sync_page_range{_nolock}()
fat: Opencode sync_page_range_nolock()
pohmelfs: Use new syncing helper
xfs: Convert sync_page_range() to simple filemap_write_and_wait_range()
ocfs2: Update syncing after splicing to match generic version
ntfs: Use new syncing helpers and update comments
ext4: Remove syncing logic from ext4_file_write
ext3: Remove syncing logic from ext3_file_write
ext2: Update comment about generic_osync_inode
vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode
vfs: Rename generic_file_aio_write_nolock
ocfs2: Use __generic_file_aio_write instead of generic_file_aio_write_nolock
pohmelfs: Use __generic_file_aio_write instead of generic_file_aio_write_nolock
vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write()
vfs: Export __generic_file_aio_write() and add some comments
vfs: Introduce filemap_fdatawait_range
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
GFS2: Whitespace fixes
GFS2: Remove unused sysfs file
GFS2: Be extra careful about deallocating inodes
GFS2: Remove no_formal_ino generating code
GFS2: Rename eattr.[ch] as xattr.[ch]
GFS2: Clean up of extended attribute support
GFS2: Add explanation of extended attr on-disk format
GFS2: Add "-o errors=panic|withdraw" mount options
GFS2: jumping to wrong label?
GFS2: free disk inode which is deleted by remote node -V2
GFS2: Add a document explaining GFS2's uevents
GFS2: Add sysfs link to device
GFS2: Replace assertion with proper error handling
GFS2: Improve error handling in inode allocation
GFS2: Add some more info to uevents
GFS2: Add online uevent to GFS2
In a number of cases, the .suspend, .freeze, .poweroff and .resume,
.thaw, .restore functions are identical. However, they all need to be
assigned to avoid regressionsm as the previous code called .suspend
resp. .resume in all those cases. SIMPLE_DEV_PM_OPS helps to deal
with this case.
[rjw: Changed the name of the macro and added the comment explaining its
purpose.]
Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...
Fixed up trivial conflicts:
- arch/x86/include/asm/socket.h
converted to <asm-generic/socket.h> in the x86 tree. The generic
header has the same new #define's, so that works out fine.
- drivers/net/tun.c
fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.
Noted in 'next' by Stephen Rothwell.
acpi_pci_detect_ejectable() goes through effort to convert its
struct pci_bus arg to an acpi_handle, but every time we use this
interface, we already have the handle available.
So let's just use the handle instead of converting back and forth.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The generated functions of TRACE_EVENT uses "flags" in one of the
sub macros which shadows a parameter in the outside macro.
Simple fix is to make the submacro use __flags instead.
Discovered by sparse.
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Introduce new function for generic inode syncing (vfs_fsync_range) and use
it from fsync() path. Introduce also new helper for syncing after a sync
write (generic_write_sync) using the generic function.
Use these new helpers for syncing from generic VFS functions. This makes
O_SYNC writes to block devices acquire i_mutex for syncing. If we really
care about this, we can make block_fsync() drop the i_mutex and reacquire
it before it returns.
CC: Evgeniy Polyakov <zbr@ioremap.net>
CC: ocfs2-devel@oss.oracle.com
CC: Joel Becker <joel.becker@oracle.com>
CC: Felix Blyakher <felixb@sgi.com>
CC: xfs@oss.sgi.com
CC: Anton Altaparmakov <aia21@cantab.net>
CC: linux-ntfs-dev@lists.sourceforge.net
CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
generic_file_aio_write_nolock() is now used only by block devices and raw
character device. Filesystems should use __generic_file_aio_write() in case
generic_file_aio_write() doesn't suit them. So rename the function to
blkdev_aio_write() and move it to fs/blockdev.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Rename __generic_file_aio_write_nolock() to __generic_file_aio_write(), add
comments to write helpers explaining how they should be used and export
__generic_file_aio_write() since it will be used by some filesystems.
CC: ocfs2-devel@oss.oracle.com
CC: Joel Becker <joel.becker@oracle.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
This simple helper saves some filesystems conversion from byte offset
to page numbers and also makes the fdata* interface more complete.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
* 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, percpu: Collect hot percpu variables into one cacheline
x86, percpu: Fix DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED()
x86, percpu: Add 'percpu_read_stable()' interface for cacheable accesses
Some of the generated functions used in the TRACE_EVENT macros are
not declared static, but they are not global.
Discovered by sparse.
Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard,
the only difference between the two is that blkdev_issue_discard needs to
send a barrier discard request and blk_ioctl_discard a non-barrier one,
and blk_ioctl_discard needs to wait on the request. To facilitates this
add a flags argument to blkdev_issue_discard to control both aspects of the
behaviour. This will be very useful later on for using the waiting
funcitonality for other callers.
Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The commands are conceptually writes, and in the case of IDE and SCSI
commands actually are writes. They were only reads because we thought
that would interact better with the elevators. Now the elevators know
about discard requests, that advantage no longer exists.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper
around it. DM needs this to avoid poking at the queue_limits directly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently, there is a single in_flight counter measuring the number of
requests in the request_queue. But some monitoring tools would like to
know how many read requests and write requests are in progress. Split the
current in_flight counter into two seperate counters for read and write.
This information is exported as a sysfs attribute, as changing the
currently available stat files would break the existing tools.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
__validate_process_creds should check if selinux is actually enabled before
running tests on the selinux portion of the credentials struct.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
This allows more precise tracking of how the scheduler accounts
(and acts upon) a task having spent N nanoseconds of CPU time.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch increases the max string used by predicates to
handle KSYM_SYMBOL_LEN.
Also moves an include to look nicer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
__start_mcount_loc[] is unused after init, yet occupies RAM forever
as part of .rodata. 152kiB is typical on a 64-bit architecture. Instead,
__start_mcount_loc should be in the interval [__init_begin, __init_end)
so that the space is reclaimed after init.
__start_mcount_loc[] is generated during the load portion
of kernel build, and is used only by ftrace_init(). ftrace_init is declared
'__init' and is in .init.text, which is freed after init.
__start_mcount_loc is placed into .rodata by a call to MCOUNT_REC inside
the RO_DATA macro of include/asm-generic/vmlinux.lds.h. The array *is*
read-only, but more importantly it is not used after init. So the call to
MCOUNT_REC should be moved from RO_DATA to INIT_DATA.
This patch has been tested on x86_64 with CONFIG_DEBUG_PAGEALLOC=y
which verifies that the address range never is accessed after init.
Signed-off-by: John Reiser <jreiser@BitWagon.com>
LKML-Reference: <4A6DF0B6.7080402@bitwagon.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Booting 2.6.31 and executing
echo 1 >/sys/kernel/debug/tracing/events/enable
leads to
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c032a583>] ftrace_raw_event_block_bio_bounce+0x4b/0xb9
Apparently,
bio = bio_map_user(q, NULL, uaddr, len, reading, gfp_mask);
is called in block/blk-map.c:58 where bio->bi_bdev in set to NULL and
still is NULL when an attempt is made to evaluate bio->bi_bdev->bd_dev
in include/trace/events/block.h:189.
The tracepoint should ensure bio->bi_bdev is not dereferenced, if NULL.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
LKML-Reference: <4AAAC9B1.9060505@osadl.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
V4L2_FMT_FLAG_EMULATED 0x0002 This format is not native to the device but
emulated through software (usually libv4l2), where possible try to use a
native format instead for better performance.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Currently, V4L uses a scancode table whose index is the scancode and
the value is the keycode. While this works, it has some drawbacks:
1) It requires that the scancode to be at the range 00-7f;
2) keycodes should be masked on 7 bits in order for it to work;
3) due to the 7 bits approach, sometimes it is not possible to replace
the default keyboard to another one with a different encoding rule;
4) it is different than what is done with dvb-usb approach;
5) it requires a typedef for it to work. This is not a recommended
Linux CodingStyle.
This patch is part of a larger series of IR changes. It basically
replaces the IR_KEYTAB_TYPE tables by a structured table:
struct ir_scancode {
u16 scancode;
u32 keycode;
};
This is very close to what dvb does. So, a further integration with DVB
code will be easy.
While we've changed the tables, for now, the IR keycode handling is still
based on the old approach.
The only notable effect is the redution of about 35% of the ir-common
module size:
text data bss dec hex filename
6721 29208 4 35933 8c5d old/ir-common.ko
5756 18040 4 23800 5cf8 new/ir-common.ko
In thesis, we could be using above u8 for scancode, reducing even more the size
of the module, but defining it as u16 is more convenient, since, on dvb, each
scancode has up to 16 bits, and we currently have a few troubles with rc5, as their
scancodes are defined with more than 8 bits.
This patch itself shouldn't be doing any functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
[mchehab@redhat.com: Fix a few wrong IR keymaps]
Signed-off-by: Shine Liu <shinel@foxmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch adds files to control si4713 devices.
Internal functions to control device properties
and initialization procedures are into these files.
Also, a v4l2 subdev interface is also exported.
This way other drivers can use this as v4l2 i2c subdevice.
Signed-off-by: Eduardo Valentin <eduardo.valentin@nokia.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch adds files which creates the radio interface
for si4713 FM transmitter (modulator) devices.
In order to do the real access to device registers, this
driver uses the v4l2 subdev interface exported by si4713 i2c driver.
Signed-off-by: Eduardo Valentin <eduardo.valentin@nokia.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch adds a new class of extended controls. This class
is intended to support FM Radio Modulators properties such as:
rds, audio limiters, audio compression, pilot tone generation,
tuning power levels and preemphasis properties.
Signed-off-by: Eduardo Valentin <eduardo.valentin@nokia.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The upcoming RDS encoder needs support for string controls. This patch
implements the core implementation.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add support for the remote control that comes with the Cinergy Hybrid T USB XS
Thanks to Jelle de Jong for providing sample hardware to test with.
Cc: Jelle de Jong <jelledejong@powercraft.nl>
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch augments the init data passed by bridge drivers to
ir-kbd-i2c, so that the ir_type can be set explicitly, and so
ir-kbd-i2c internal get_key functions can be reused without
requiring symbols from ir-kbd-i2c in the bridge driver.
Signed-off-by: Andy Walls <awalls@radix.net>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add capabilities to describe an FM transmitter device.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
DMX_ADD_PID allows to add multiple PIDs to a transport stream filter
previously set up with DMX_SET_PES_FILTER and output=DMX_OUT_TSDEMUX_TAP.
DMX_REMOVE_PID is used to drop a PID from a filter.
These ioctls are to be used by readers of /dev/dvb/adapterX/demuxY. They
may be called at any time, i.e. before or after the first filter on the
shared file descriptor was started.
They make it possible to record multiple services without the need to de-
or re-multiplex TS packets.
To accomplish this, dmxdev_filter->feed.ts has been converted to a list
of struct dmxdev_feeds, each containing a PID value and a pointer to a
struct dmx_ts_feed.
Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
To make UVC constants accessible by a future UVC gadget driver, move them from
drivers/media/video/uvc/uvcvideo.h to include/linux/usb/video.h.
Signed-off-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
With the changes this file suffered along the time, things got a little disorganized.
In particular, V4L2_PIX_FMT_YVYU were shown as a device-specific format, instead of
yet another variant of YUV.
There's no functional change on this patch. It just adds some comments and reorder
fourcc formats to their proper places.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
bnx2i currently has a check for if a ep is properly bound, so if
iscsi_queuecommand/xmit_task is called while there is no ep
we will not queue IO.
be2iscsi sends IO from queuecommand/xmit_task like how bnx2i does
and needs a similar test. This patch has us just use the suspend_bit
test for this.
When ep_poll has succeeed iscsid will call conn_bind, the LLD will
then call iscsi_conn_bind which will clear the suspend bit.
When ep_disconnect is called (or if there is a conn error) we set
the suspend bit. For the ep_disconnect case I am adding a helper
in this patch that will take the session lock to make sure
iscsi_queuecommand/xmit_task is not running and it will set
the suspend bit.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
beiscsi does not need the iscsi scsi cmd processing. It does not
even get this info on the completion path. This adds a function
to just update the sequencing numbers and complete a task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This moves the primecell vendor enum definition inside vic.c
out to linux/amba/bus.h where it belongs and replace any
occurances of specific vendor ID:s with the respective enums
instead.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (87 commits)
NFSv4: Disallow 'mount -t nfs4 -overs=2' and 'mount -t nfs4 -overs=3'
NFS: Allow the "nfs" file system type to support NFSv4
NFS: Move details of nfs4_get_sb() to a helper
NFS: Refactor NFSv4 text-based mount option validation
NFS: Mount option parser should detect missing "port="
NFS: out of date comment regarding O_EXCL above nfs3_proc_create()
NFS: Handle a zero-length auth flavor list
SUNRPC: Ensure that sunrpc gets initialised before nfs, lockd, etc...
nfs: fix compile error in rpc_pipefs.h
nfs: Remove reference to generic_osync_inode from a comment
SUNRPC: cache must take a reference to the cache detail's module on open()
NFS: Use the DNS resolver in the mount code.
NFS: Add a dns resolver for use with NFSv4 referrals and migration
SUNRPC: Fix a typo in cache_pipefs_files
nfs: nfs4xdr: optimize low level decoding
nfs: nfs4xdr: get rid of READ_BUF
nfs: nfs4xdr: simplify decode_exchange_id by reusing decode_opaque_inline
nfs: nfs4xdr: get rid of COPYMEM
nfs: nfs4xdr: introduce decode_sessionid helper
nfs: nfs4xdr: introduce decode_verifier helper
...
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (25 commits)
pata_rz1000: use printk_once
ahci: kill @force_restart and refine CLO for ahci_kick_engine()
pata_cs5535: add pci id for AMD based CS5535 controllers
ahci: Add AMD SB900 SATA/IDE controller device IDs
drivers/ata: use resource_size
sata_fsl: Defer non-ncq commands when ncq commands active
libata: add SATA PMP revision information for spec 1.2
libata: fix off-by-one error in ata_tf_read_block()
ahci: Gigabyte GA-MA69VM-S2 can't do 64bit DMA
ahci: make ahci_asus_m2a_vm_32bit_only() quirk more generic
dmi: extend dmi_get_year() to dmi_get_date()
dmi: fix date handling in dmi_get_year()
libata: unbreak TPM filtering by reorganizing ata_scsi_pass_thru()
sata_sis: convert to slave_link
sata_sil24: always set protocol override for non-ATAPI data commands
libata: Export AHCI capabilities
libata: Delegate nonrot flag setting to SCSI
[libata] Add pata_rdc driver for RDC ATA devices
drivers/ata: Remove unnecessary semicolons
libata: remove spindown skipping and warning
...
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits)
ring-buffer: only enable ring_buffer_swap_cpu when needed
ring-buffer: check for swapped buffers in start of committing
tracing: report error in trace if we fail to swap latency buffer
tracing: add trace_array_printk for internal tracers to use
tracing: pass around ring buffer instead of tracer
tracing: make tracing_reset safe for external use
tracing: use timestamp to determine start of latency traces
tracing: Remove mentioning of legacy latency_trace file from documentation
tracing/filters: Defer pred allocation, fix memory leak
tracing: remove users of tracing_reset
tracing: disable buffers and synchronize_sched before resetting
tracing: disable update max tracer while reading trace
tracing: print out start and stop in latency traces
ring-buffer: disable all cpu buffers when one finds a problem
ring-buffer: do not count discarded events
ring-buffer: remove ring_buffer_event_discard
ring-buffer: fix ring_buffer_read crossing pages
ring-buffer: remove unnecessary cpu_relax
ring-buffer: do not swap buffers during a commit
ring-buffer: do not reset while in a commit
...
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits)
sched: Fix sched::sched_stat_wait tracepoint field
sched: Disable NEW_FAIR_SLEEPERS for now
sched: Keep kthreads at default priority
sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
sched: Turn off child_runs_first
sched: Ensure that a child can't gain time over it's parent after fork()
sched: enable SD_WAKE_IDLE
sched: Deal with low-load in wake_affine()
sched: Remove short cut from select_task_rq_fair()
sched: Turn on SD_BALANCE_NEWIDLE
sched: Clean up topology.h
sched: Fix dynamic power-balancing crash
sched: Remove reciprocal for cpu_power
sched: Try to deal with low capacity, fix update_sd_power_savings_stats()
sched: Try to deal with low capacity
sched: Scale down cpu_power due to RT tasks
sched: Implement dynamic cpu_power
sched: Add smt_gain
sched: Update the cpu_power sum during load-balance
sched: Add SD_PREFER_SIBLING
...
* 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
perf tools: Avoid unnecessary work in directory lookups
perf stat: Clean up statistics calculations a bit more
perf stat: More advanced variance computation
perf stat: Use stddev_mean in stead of stddev
perf stat: Remove the limit on repeat
perf stat: Change noise calculation to use stddev
x86, perf_counter, bts: Do not allow kernel BTS tracing for now
x86, perf_counter, bts: Correct pointer-to-u64 casts
x86, perf_counter, bts: Fail if BTS is not available
perf_counter: Fix output-sharing error path
perf trace: Fix read_string()
perf trace: Print out in nanoseconds
perf tools: Seek to the end of the header area
perf trace: Fix parsing of perf.data
perf trace: Sample timestamps as well
perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access
perf trace: Sample the CPU too
perf tools: Work around strict aliasing related warnings
perf tools: Clean up warnings list in the Makefile
perf tools: Complete support for dynamic strings
...
* 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Do not mask oneshot edge type interrupts
genirq: Support nested threaded irq handling
genirq: Add buslock support
genirq: Add oneshot support
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
rcu: Move end of special early-boot RCU operation earlier
rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
rcu: Remove lockdep annotations from RCU's _notrace() API members
rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
rcu: Add CPU-offline processing for single-node configurations
rcu: Add "notrace" to RCU function headers used by ftrace
rcu: Remove CONFIG_PREEMPT_RCU
rcu: Merge preemptable-RCU functionality into hierarchical RCU
rcu: Simplify rcu_pending()/rcu_check_callbacks() API
rcu: Use debugfs_remove_recursive() simplify code.
rcu: Merge per-RCU-flavor initialization into pre-existing macro
rcu: Fix online/offline indication for rcudata.csv trace file
rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
rcu: Renamings to increase RCU clarity
rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
rcu: Fix typo in rcu_irq_exit() comment header
rcu: Make rcupreempt_trace.c look at offline CPUs
...
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits)
x86/gart: Do not select AGP for GART_IOMMU
x86/amd-iommu: Initialize passthrough mode when requested
x86/amd-iommu: Don't detach device from pt domain on driver unbind
x86/amd-iommu: Make sure a device is assigned in passthrough mode
x86/amd-iommu: Align locking between attach_device and detach_device
x86/amd-iommu: Fix device table write order
x86/amd-iommu: Add passthrough mode initialization functions
x86/amd-iommu: Add core functions for pd allocation/freeing
x86/dma: Mark iommu_pass_through as __read_mostly
x86/amd-iommu: Change iommu_map_page to support multiple page sizes
x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
x86/amd-iommu: Remove old page table handling macros
x86/amd-iommu: Use 2-level page tables for dma_ops domains
x86/amd-iommu: Remove bus_addr check in iommu_map_page
x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
x86/amd-iommu: Change alloc_pte to support 64 bit address space
x86/amd-iommu: Introduce increase_address_space function
x86/amd-iommu: Flush domains if address space size was increased
x86/amd-iommu: Introduce set_dte_entry function
x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
...
In some cases, the network device driver knows what layer-3 address the
device should have. This adds support for the Phonet stack to
automatically request from the driver and add that address to the
network device.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFA_F_DADFAILED flag to denote an IPv6 address that has
failed Duplicate Address Detection, that way tools like
/sbin/ip can be more informative.
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:db8::1/64 scope global tentative dadfailed
valid_lft forever preferred_lft forever
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Ethernet framing is used for a lot of devices these days. Most
prominent are WiFi and WiMAX based devices. However for userspace
application it is important to classify these devices correctly and
not only see them as Ethernet devices. The daemons like HAL, DeviceKit
or even NetworkManager with udev support tries to do the classification
in userspace with a lot trickery and extra system calls. This is not
good and actually reaches its limitations. Especially since the kernel
does know the type of the Ethernet device it is pretty stupid.
To solve this problem the underlying device type needs to be set and
then the value will be exported as DEVTYPE via uevents and available
within udev.
# cat /sys/class/net/wlan0/uevent
DEVTYPE=wlan
INTERFACE=wlan0
IFINDEX=5
This is similar to subsystems like USB and SCSI that distinguish
between hosts, devices, disks, partitions etc.
The new SET_NETDEV_DEVTYPE() is a convenience helper to set the actual
device type. All device types are free form, but for convenience the
same strings as used with RFKILL are choosen.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (102 commits)
crypto: sha-s390 - Fix warnings in import function
crypto: vmac - New hash algorithm for intel_txt support
crypto: api - Do not displace newly registered algorithms
crypto: ansi_cprng - Fix module initialization
crypto: xcbc - Fix alignment calculation of xcbc_tfm_ctx
crypto: fips - Depend on ansi_cprng
crypto: blkcipher - Do not use eseqiv on stream ciphers
crypto: ctr - Use chainiv on raw counter mode
Revert crypto: fips - Select CPRNG
crypto: rng - Fix typo
crypto: talitos - add support for 36 bit addressing
crypto: talitos - align locks on cache lines
crypto: talitos - simplify hmac data size calculation
crypto: mv_cesa - Add support for Orion5X crypto engine
crypto: cryptd - Add support to access underlaying shash
crypto: gcm - Use GHASH digest algorithm
crypto: ghash - Add GHASH digest algorithm for GCM
crypto: authenc - Convert to ahash
crypto: api - Fix aligned ctx helper
crypto: hmac - Prehash ipad/opad
...
* 'writeback' of git://git.kernel.dk/linux-2.6-block:
writeback: check for registered bdi in flusher add and inode dirty
writeback: add name to backing_dev_info
writeback: add some debug inode list counters to bdi stats
writeback: get rid of pdflush completely
writeback: switch to per-bdi threads for flushing data
writeback: move dirty inodes from super_block to backing_dev_info
writeback: get rid of generic_sync_sb_inodes() export
* 'kmemleak' of git://linux-arm.org/linux-2.6:
kmemleak: Improve the "Early log buffer exceeded" error message
kmemleak: fix sparse warning for static declarations
kmemleak: fix sparse warning over overshadowed flags
kmemleak: move common painting code together
kmemleak: add clear command support
kmemleak: use bool for true/false questions
kmemleak: Do no create the clean-up thread during kmemleak_disable()
kmemleak: Scan all thread stacks
kmemleak: Don't scan uninitialized memory when kmemcheck is enabled
kmemleak: Ignore the aperture memory hole on x86_64
kmemleak: Printing of the objects hex dump
kmemleak: Do not report alloc_bootmem blocks as leaks
kmemleak: Save the stack trace for early allocations
kmemleak: Mark the early log buffer as __initdata
kmemleak: Dump object information on request
kmemleak: Allow rescheduling during an object scanning
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (57 commits)
binfmt_elf: fix PT_INTERP bss handling
TPM: Fixup boot probe timeout for tpm_tis driver
sysfs: Add labeling support for sysfs
LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.
VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx.
KEYS: Add missing linux/tracehook.h #inclusions
KEYS: Fix default security_session_to_parent()
Security/SELinux: includecheck fix kernel/sysctl.c
KEYS: security_cred_alloc_blank() should return int under all circumstances
IMA: open new file for read
KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
KEYS: Extend TIF_NOTIFY_RESUME to (almost) all architectures [try #6]
KEYS: Do some whitespace cleanups [try #6]
KEYS: Make /proc/keys use keyid not numread as file position [try #6]
KEYS: Add garbage collection for dead, revoked and expired keys. [try #6]
KEYS: Flag dead keys to induce EKEYREVOKED [try #6]
KEYS: Allow keyctl_revoke() on keys that have SETATTR but not WRITE perm [try #6]
KEYS: Deal with dead-type keys appropriately [try #6]
CRED: Add some configurable debugging [try #6]
selinux: Support for the new TUN LSM hooks
...
The userstack trace required the recording of the tgid entry.
Unfortunately, it was added to the generic entry where it wasted
4 bytes of every entry and was only used by one entry.
This patch moves it out of the generic field and moves it into the
only user (userstack_entry).
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Test results here look good, and on big OLTP runs it has also shown
to significantly increase cycles attributed to the database and
cause a performance boost.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This borrows some code from NAPI and implements a polled completion
mode for block devices. The idea is the same as NAPI - instead of
doing the command completion when the irq occurs, schedule a dedicated
softirq in the hopes that we will complete more IO when the iopoll
handler is invoked. Devices have a budget of commands assigned, and will
stay in polled mode as long as they continue to consume their budget
from the iopoll softirq handler. If they do not, the device is set back
to interrupt completion mode.
This patch holds the core bits for blk-iopoll, device driver support
sold separately.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Instead of just checking whether this device uses block layer
tagging, we can improve the detection by looking at the maximum
queue depth it has reached. If that crosses 4, then deem it a
queuing device.
This is important on high IOPS devices, since plugging hurts
the performance there (it can be as much as 10-15% of the sys
time).
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Failfast has characteristics from other attributes. When issuing,
executing and successuflly completing requests, failfast doesn't make
any difference. It only affects how a request is handled on failure.
Allowing requests with different failfast settings to be merged cause
normal IOs to fail prematurely while not allowing has performance
penalties as failfast is used for read aheads which are likely to be
located near in-flight or to-be-issued normal IOs.
This patch introduces the concept of 'mixed merge'. A request is a
mixed merge if it is merge of segments which require different
handling on failure. Currently the only mixable attributes are
failfast ones (or lack thereof).
When a bio with different failfast settings is added to an existing
request or requests of different failfast settings are merged, the
merged request is marked mixed. Each bio carries failfast settings
and the request always tracks failfast state of the first bio. When
the request fails, blk_rq_err_bytes() can be used to determine how
many bytes can be safely failed without crossing into an area which
requires further retrials.
This allows request merging regardless of failfast settings while
keeping the failure handling correct.
This patch only implements mixed merge but doesn't enable it. The
next one will update SCSI to make use of mixed merge.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
bio and request use the same set of failfast bits. This patch makes
the following changes to simplify things.
* enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_*
bits coincide with __REQ_FAILFAST_* bits.
* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
but the matching is useless anyway. init_request_from_bio() is
responsible for setting FAILFAST bits on FS requests and non-FS
requests never use BIO_RW_AHEAD. Drop the code and comment from
blk_rq_bio_prep().
* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
simplify FAILFAST flags handling in init_request_from_bio().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Also a debugging aid. We want to catch dirty inodes being added to
backing devices that don't do writeback.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This gets rid of pdflush for bdi writeout and kupdated style cleaning.
pdflush writeout suffers from lack of locality and also requires more
threads to handle the same workload, since it has to work in a
non-blocking fashion against each queue. This also introduces lumpy
behaviour and potential request starvation, since pdflush can be starved
for queue access if others are accessing it. A sample ffsb workload that
does random writes to files is about 8% faster here on a simple SATA drive
during the benchmark phase. File layout also seems a LOT more smooth in
vmstat:
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 0 608848 2652 375372 0 0 0 71024 604 24 1 10 48 42
0 1 0 549644 2712 433736 0 0 0 60692 505 27 1 8 48 44
1 0 0 476928 2784 505192 0 0 4 29540 553 24 0 9 53 37
0 1 0 457972 2808 524008 0 0 0 54876 331 16 0 4 38 58
0 1 0 366128 2928 614284 0 0 4 92168 710 58 0 13 53 34
0 1 0 295092 3000 684140 0 0 0 62924 572 23 0 9 53 37
0 1 0 236592 3064 741704 0 0 4 58256 523 17 0 8 48 44
0 1 0 165608 3132 811464 0 0 0 57460 560 21 0 8 54 38
0 1 0 102952 3200 873164 0 0 4 74748 540 29 1 10 48 41
0 1 0 48604 3252 926472 0 0 0 53248 469 29 0 7 47 45
where vanilla tends to fluctuate a lot in the creation phase:
r b swpd free buff cache si so bi bo in cs us sy id wa
1 1 0 678716 5792 303380 0 0 0 74064 565 50 1 11 52 36
1 0 0 662488 5864 319396 0 0 4 352 302 329 0 2 47 51
0 1 0 599312 5924 381468 0 0 0 78164 516 55 0 9 51 40
0 1 0 519952 6008 459516 0 0 4 78156 622 56 1 11 52 37
1 1 0 436640 6092 541632 0 0 0 82244 622 54 0 11 48 41
0 1 0 436640 6092 541660 0 0 0 8 152 39 0 0 51 49
0 1 0 332224 6200 644252 0 0 4 102800 728 46 1 13 49 36
1 0 0 274492 6260 701056 0 0 4 12328 459 49 0 7 50 43
0 1 0 211220 6324 763356 0 0 0 106940 515 37 1 10 51 39
1 0 0 160412 6376 813468 0 0 0 8224 415 43 0 6 49 45
1 1 0 85980 6452 886556 0 0 4 113516 575 39 1 11 54 34
0 2 0 85968 6452 886620 0 0 0 1640 158 211 0 0 46 54
A 10 disk test with btrfs performs 26% faster with per-bdi flushing. A
SSD based writeback test on XFS performs over 20% better as well, with
the throughput being very stable around 1GB/sec, where pdflush only
manages 750MB/sec and fluctuates wildly while doing so. Random buffered
writes to many files behave a lot better as well, as does random mmap'ed
writes.
A separate thread is added to sync the super blocks. In the long term,
adding sync_supers_bdi() functionality could get rid of this thread again.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is a first step at introducing per-bdi flusher threads. We should
have no change in behaviour, although sb_has_dirty_inodes() is now
ridiculously expensive, as there's no easy way to answer that question.
Not a huge problem, since it'll be deleted in subsequent patches.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This adds two new exported functions:
- writeback_inodes_sb(), which only attempts to writeback dirty inodes on
this super_block, for WB_SYNC_NONE writeout.
- sync_inodes_sb(), which writes out all dirty inodes on this super_block
and also waits for the IO to complete.
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
When an RSCN indicates changes to individual remote ports,
don't blindly log them out and then back in. Instead, determine
whether they're still in the directory, by doing GPN_ID.
If that is successful, call login, which will send ADISC and reverify,
otherwise, call logoff. Perhaps we should just delete the rport,
not send LOGO, but it seems safer.
Also, fix a possible issue where if a mix of records in the RSCN
cause us to queue disc_ports for disc_single and then we decide
to do full rediscovery, we leak memory for those disc_ports queued.
So, go through the list of disc_ports even if doing full discovery.
Free the disc_ports in any case. If any of the disc_single() calls
return error, do a full discovery.
The ability to fill in GPN_ID requests was added to fc_ct_fill().
For this, it needs the FC_ID to be passed in as an arg.
The did parameter for fc_elsct_send() is used for that, since the
actual D_DID will always be 0xfffffc for all CT requests so far.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When rport_login is called on an rport that is already thought
to be logged in, use ADISC. If that fails, redo PLOGI.
This is less disruptive after fabric changes that don't affect
the state of the target.
Implement the sending of ADISC via fc_els_fill.
Add ADISC state to the rport state machine. This is entered from READY
and returns to READY after successful completion. If it fails, the rport
is either logged off and deleted or re-does PLOGI.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Improve lport and rport debug messages to indicate whether
the response is LS_ACC, LS_RJT, closed, or timeout.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This moves the remote port lookup for incoming ELS requests into
fc_rport.c, in preparation for handing PLOGI and LOGO from
unknown rports.
This changes the arg to rport_recv_req from an rdata to an lport.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently these values are initialized by the callers. This was exposed
by a later patch that adds PLOGI request support. The patch failed to
initialize the new remote port's roles and it caused problems. This patch
has the rport_create routine initialize the identifiers and then the
callers can override them with real values.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
On some switches, an empty zone causes GPN_FT to be rejected
with reason 9 (unable) explanation 7 (FC-4 types not registered),
which causes discovery to be retried endlessly. Treat this as
just an empty response and consider discovery complete.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When receiving an RSCN, do not log off all rports. This is
extremely disruptive. If, after the GPN_FT response, some
rports haven't been listed, delete them.
Add field disc_id to structs fc_rport_priv and fc_disc.
disc_id is an arbitrary serial number used to identify the
rports found by the latest discovery. This eliminates the need
to go through the rport list when restarting discovery.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Delete unused disc->delay element.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There was no need to have the discovery status stored in struct fc_disc.
Change fc_disc_done() to take the discovery status as an argument
and just pass it on to the discovery callback.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Don't create a "dummy" remote port to go with fc_rport_priv.
Make the rport truly optional by allocating fc_rport_priv separately
and not requiring a dummy rport to be there if we haven't yet done
fc_remote_port_add().
The fc_rport_libfc_priv remains as a structure attached to the
rport for I/O purposes.
Be sure to hold references on rdata when the lock is dropped in
fc_rport_work().
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remote ports will become READY more than once after
ADISC is implemented in a later patch.
The event callback that has been called "CREATED" will mean "READY".
Rename it now in preparation for those changes.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Allow a struct fc_rport_priv to have no fc_rport associated with it.
This sets up to remove the need for "rogue" rports.
Add a few fields to fc_rport_priv that are needed before the fc_rport
is created. These are the ids, maxframe_size, classes, and rport pointer.
Remove the macro PRIV_TO_RPORT(). Just use rdata->rport where appropriate.
To take the place of the get_device()/put_device ops that were used to
hold both the rport and rdata, add a reference count to rdata structures
using kref. When kref_get decrements the refcount to zero, a new template
function releasing the rdata should be called. This will take care of
freeing the rdata and releasing the hold on the rport (for now). After
subsequent patches make the rport truly optional, this release function
will simply free the rdata.
Remove the simple inline function fc_rport_set_name(), which becomes
semanticly ambiguous otherwise. The caller will set the port_name and
node_name in the rdata->Ids, which will later be copied to the rport
when it its created.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
tt.elsct_send is used by both FCP and by the rport state machine.
After further patches, these two modules will use different
structures for the remote port.
So, change elsct_send to use the FC_ID instead of the fc_rport_priv
as its argument. It currently only uses the FC_ID anyway.
For CT requests the destination FC_ID is still implicitly 0xfffffc.
After further patches the did arg on CT requests will be used to
specify the FC_ID being inquired about for GPN_ID or other queries.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The rport and discovery modules deal with remote ports
before fc_remote_port_add() can be done, because the
full set of rport identifiers is not known at early stages.
In preparation for splitting the fc_rport/fc_rport_priv allocation,
make fc_rport_priv the primary interface for the remote port and
discovery engines.
The FCP / SCSI layers still deal with fc_rport and
fc_rport_libfc_priv, however.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
These macros introduce extra undesirable semicolons that keep
them from being used in expressions, and they don't protect
against being passed an expression.
Add parens and remove the semicolons.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The interface for lport->tt.rport_create() takes a fc_disc_port arg,
which is unnatural for most calls. The only reason for this was
to avoid passing in the local port as an argument, but otherwise
added to complexity.
Simplify by just using lport and fc_rport_identifiers.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
While the I/O and LLD interfaces use fc_rport_libfc_priv, the
disc and rport interfaces will use fc_rport_priv, which will
be separately allocated.
Change the disc and rport usage of fc_rport_libfc_priv to fc_rport_priv.
Use #define temporarily to make both names equivalent until a
subsequent patch splits them.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* topic/tlv-minmax:
ALSA: usb-audio - Correct bogus volume dB information
ALSA: usb-audio - Use the new TLV_DB_MINMAX type
ALSA: Add new TLV types for dBwith min/max
* topic/soundcore-preclaim:
sound: make OSS device number claiming optional and schedule its removal
sound: request char-major-* module aliases for missing OSS devices
chrdev: implement __[un]register_chrdev()
* topic/asoc: (226 commits)
ASoC: au1x: PSC-AC97 bugfixes
ASoC: Fix WM835x Out4 capture enumeration
ASoC: Remove unuused hw_read_t
ASoC: fix pxa2xx-ac97.c breakage
ASoC: Fully specify DC servo bits to update in wm_hubs
ASoC: Debugged improper setting of PLL fields in WM8580 driver
ASoC: new board driver to connect bfin-5xx with ad1836 codec
ASoC: OMAP: Add functionality to set CLKR and FSR sources in McBSP DAI
ASoC: davinci: i2c device creation moved into board files
ASoC: Don't reconfigure WM8350 FLL if not needed
ASoC: Fix s3c-i2s-v2 build
ASoC: Make platform data optional for TLV320AIC3x
ASoC: Add S3C24xx dependencies for Simtec machines
ASoC: SDP3430: Fix TWL GPIO6 pin mux request
ASoC: S3C platform: Fix s3c2410_dma_started() called at improper time
ARM: OMAP: McBSP: Merge two functions into omap_mcbsp_start/_stop
ASoC: OMAP: Fix setup of XCCR and RCCR registers in McBSP DAI
OMAP: McBSP: Use textual values in DMA operating mode sysfs files
ARM: OMAP: DMA: Add support for DMA channel self linking on OMAP1510
ASoC: Select core DMA when building for S3C64xx
...
kvm_para.h contains userspace interface and so
should be exported.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
So far unprivileged guest callers running in ring 3 can issue, e.g., MMU
hypercalls. Normally, such callers cannot provide any hand-crafted MMU
command structure as it has to be passed by its physical address, but
they can still crash the guest kernel by passing random addresses.
To close the hole, this patch considers hypercalls valid only if issued
from guest ring 0. This may still be relaxed on a per-hypercall base in
the future once required.
Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Now KVM allow guest to modify guest's physical address of EPT's identity mapping page.
(change from v1, discard unnecessary check, change ioctl to accept parameter
address rather than value)
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest. Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.
Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.
However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc). For these
patterns, the synchronous call is particularly expensive since we really
only want to simply get our notification transmitted asychronously and
return as quickly as possible. All the sychronous infrastructure to ensure
proper data-dependencies are met in the normal IO case are just unecessary
overhead for signalling. This adds additional computational load on the
system, as well as latency to the signalling path.
Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd. This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and perhipheral components.
To test this theory, we built a test-harness called "doorbell". This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled. It supports signalling
from either an eventfd, or an ioctl().
We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl(). The other is direct via
ioeventfd.
You can download this test harness here:
ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2
The measured results are as follows:
qemu-mmio: 110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio: 367300 iops, 2.72us rtt
I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy. However, for now we
can extrapolate based on the data from the NULLIO runs of +2.56us for MMIO,
and -350ns for HC, we get:
qemu-pio: 153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt
these are just for fun, for now, until I can gather more data.
Here is a graph for your convenience:
http://developer.novell.com/wiki/images/7/76/Iofd-chart.png
The conclusion to draw is that we save about 4us by skipping the userspace
hop.
--------------------
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Today kvm_io_bus_regsiter_dev() returns void and will internally BUG_ON
if it fails. We want to create dynamic MMIO/PIO entries driven from
userspace later in the series, so we need to enhance the code to be more
robust with the following changes:
1) Add a return value to the registration function
2) Fix up all the callsites to check the return code, handle any
failures, and percolate the error up to the caller.
3) Add an unregister function that collapses holes in the array
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When kvm is in hpet_legacy_mode, the hpet is providing the timer
interrupt and the pit should not be. So in legacy mode, the pit timer
is destroyed, but the *state* of the pit is maintained. So if kvm or
the guest tries to modify the state of the pit, this modification is
accepted, *except* that the timer isn't actually started. When we exit
hpet_legacy_mode, the current state of the pit (which is up to date
since we've been accepting modifications) is used to restart the pit
timer.
The saved_mode code in kvm_pit_load_count temporarily changes mode to
0xff in order to destroy the timer, but then restores the actual
value, again maintaining "current" state of the pit for possible later
reenablement.
[avi: add some reserved storage in the ioctl; make SET_PIT2 IOW]
[marcelo: fix memory corruption due to reserved storage]
Signed-off-by: Beth Kon <eak@us.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add tracepoint in msi/ioapic/pic set_irq() functions,
in IPI sending and in the point where IRQ is placed into
apic's IRR.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write
functions. in_range now becomes unused so it is removed from device ops in
favor of read/write callbacks performing range checks internally.
This allows aliasing (mostly for in-kernel virtio), as well as better error
handling by making it possible to pass errors up to userspace.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Use slots_lock to protect device list on the bus. slots_lock is already
taken for read everywhere, so we only need to take it for write when
registering devices. This is in preparation to removing in_range and
kvm->lock around it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Change kvm_vcpu_is_bsp to use vcpu_id instead of bsp_vcpu pointer, which
is only initialized at the end of kvm_vm_ioctl_create_vcpu.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This allows use of the powerful ftrace infrastructure.
See Documentation/trace/ for usage information.
[avi, stephen: various build fixes]
[sheng: fix control register breakage]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Disable usage of 2M pages if VMX_EPT_2MB_PAGE_BIT (bit 16) is clear
in MSR_IA32_VMX_EPT_VPID_CAP and EPT is enabled.
[avi: s/largepages_disabled/largepages_enabled/ to avoid negative logic]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Archs are free to use vcpu_id as they see fit. For x86 it is used as
vcpu's apic id. New ioctl is added to configure boot vcpu id that was
assumed to be 0 till now.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Protect irq injection/acking data structures with a separate irq_lock
mutex. This fixes the following deadlock:
CPU A CPU B
kvm_vm_ioctl_deassign_dev_irq()
mutex_lock(&kvm->lock); worker_thread()
-> kvm_deassign_irq() -> kvm_assigned_dev_interrupt_work_handler()
-> deassign_host_irq() mutex_lock(&kvm->lock);
-> cancel_work_sync() [blocked]
[gleb: fix ia64 path]
Reported-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce irq_lock, and use to protect ioapic data structures.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Changing s390 code in kvm_arch_vcpu_load/put come across this header
declarations. They are complete duplicates, not even useful forward
declarations as nothing using it is in between (maybe it was that in
the past).
This patch removes the two dispensable lines.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
We only trap one page for MSI-X entry now, so it's 4k/(128/8) = 256 entries at
most.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The in-kernel speaker emulation is only a dummy and also unneeded from
the performance point of view. Rather, it takes user space support to
generate sound output on the host, e.g. console beeps.
To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel
speaker port emulation via a flag passed along the new IOCTL. It also
leaves room for future extensions of the PIT configuration interface.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
KVM provides a complete virtual system environment for guests, including
support for injecting interrupts modeled after the real exception/interrupt
facilities present on the native platform (such as the IDT on x86).
Virtual interrupts can come from a variety of sources (emulated devices,
pass-through devices, etc) but all must be injected to the guest via
the KVM infrastructure. This patch adds a new mechanism to inject a specific
interrupt to a guest using a decoupled eventfd mechnanism: Any legal signal
on the irqfd (using eventfd semantics from either userspace or kernel) will
translate into an injected interrupt in the guest at the next available
interrupt window.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The related MSRs are emulated. MCE capability is exported via
extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new
vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation
such as the mcg_cap. MCE is injected via vcpu ioctl command
KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are
not implemented.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When new child qdiscs are attached to the mq qdisc, they are actually
attached as root qdiscs to the device queues. The lock selection for
new estimators incorrectly picks the root lock of the existing and
to be replaced qdisc, which results in a use-after-free once the old
qdisc has been destroyed.
Mark mq qdisc instances with a new flag and treat qdiscs attached to
mq as children similar to regular root qdiscs.
Additionally prevent estimators from being attached to the mq qdisc
itself since it only updates its byte and packet counters during dumps.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.
Quote Stephen Smalley
inode_setsecctx: Change the security context of an inode. Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context. Example usage: NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.
inode_notifysecctx: Notify the security module of what the security
context of an inode should be. Initializes the incore security context
managed by the security module for this inode. Example usage: NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
This factors out the part of the vfs_setxattr function that performs the
setting of the xattr and its notification. This is needed so the SELinux
implementation of inode_setsecctx can handle the setting of the xattr while
maintaining the proper separation of layers.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
The wakeup.prepared flag is used for marking devices that have the
wake-up power already enabled, so that the wake-up power is not
enabled twice in a row for the same device. This assumes, however,
that device wake-up power will only be enabled once, while the device
is being prepared for a system-wide sleep transition, and the second
attempt is made by acpi_enable_wakeup_device_prep().
With the upcoming PCI wake-up rework this assumption will not hold
any more for PCI bridges and the root bridge whose wake-up power
may be enabled as a result of wake-up enable propagation from other
devices (eg. add-on devices that are not associated with any GPEs).
Thus, there may be many attempts to enable wake-up power on a PCI
bridge or the root bridge during a system power state transition
and it's better to replace wakeup.prepared with a reference counter.
Reviewed-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce a new PCI device flag, wakeup_prepared, to prevent PCI
wake-up preparation code from being executed twice in a row for the
same device and for the same purpose.
Reviewed-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In general a BIOS may goof or we may hotplug in a hotplug controller.
In either case the kernel needs to reserve resources for plugging
in more devices in the future instead of creating a minimal resource
assignment.
We already do this for cardbus bridges I am just adding a variant
for pcie bridges.
v2: Make testing for pcie hotplug bridges based on a flag.
So far we only set the flag for pcie but a header_quirk
could easily be added for the non-standard pci hotplug
bridges.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Separate out pci_add_dynid() from store_new_id() and export it so that
in-kernel code can add PCI IDs dynamically. As the function will be
available regardless of HOTPLUG, put it and pull pci_free_dynids()
outside of CONFIG_HOTPLUG.
This will be used by pci-stub to initialize initial IDs via module
param.
While at it, remove bogus get_driver() failure check.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add support for PCI-E 5.0 GT/s in max_bus_speed and cur_bus_speed.
Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix some warnings reported in linux-next + also cleanup some
comment errors noticed by Pekka Paalanen.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This is the first of three patches that implement a bit field that PCI
Express device drivers can use to indicate they need a fundamental reset
during error recovery.
By default, the EEH framework on powerpc does what's known as a "hot
reset" during recovery of a PCI Express device. We've found a case
where the device needs a "fundamental reset" to recover properly. The
current PCI error recovery and EEH frameworks do not support this
distinction.
The attached patch (courtesy of Richard Lary) adds a bit field to
pci_dev that indicates whether the device requires a fundamental reset
during recovery.
These patches supersede the previously submitted patch that implemented
a fundamental reset bit field.
Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: Richard Lary <rlary@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Background:
Graphic devices are accessed through ranges in I/O or memory space. While most
modern devices allow relocation of such ranges, some "Legacy" VGA devices
implemented on PCI will typically have the same "hard-decoded" addresses as
they did on ISA. For more details see "PCI Bus Binding to IEEE Std 1275-1994
Standard for Boot (Initialization Configuration) Firmware Revision 2.1"
Section 7, Legacy Devices.
The Resource Access Control (RAC) module inside the X server currently does
the task of arbitration when more than one legacy device co-exists on the same
machine. But the problem happens when these devices are trying to be accessed
by different userspace clients (e.g. two server in parallel). Their address
assignments conflict. Therefore an arbitration scheme _outside_ of the X
server is needed to control the sharing of these resources. This document
introduces the operation of the VGA arbiter implemented for Linux kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Tiago Vignatti <tiago.vignatti@nokia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
IDs should generally only be added to pci_ids.h when they're shared
across several files in the tree. IDs that are just used by a single
driver should be defined in the driver instead.
Perhaps documenting this is a good idea to prevent things being moved there,
as it still seems to be happening judging from the git log.
(based on discussion w/gregkh and others).
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some devices allow an individual function to be reset without affecting
other functions in the same device: that's what pci_reset_function does.
For devices that have this support, expose reset attribite in sysfs.
This is useful e.g. for virtualization, where a qemu userspace
process wants to reset the device when the guest is reset,
to emulate machine reboot as closely as possible.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We can simplify ACPI drivers if we can tell whether a handle is an
ACPI PCI root or not.
Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This was #define'd as 0 on all platforms, so let's get rid of it.
This change makes pci_scan_slot() slightly easier to read.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Set child_runs_first default to off.
It hurts 'optimal' make -j<NR_CPUS> workloads as make jobs
get preempted by child tasks, reducing parallelism.
Note, this patch might make existing races in user
applications more prominent than before - so breakages
might be bisected to this commit.
Child-runs-first is broken on SMP to begin with, and we
already had it off briefly in v2.6.23 so most of the
offenders ought to be fixed. Would be nice not to revert
this commit but fix those apps finally ...
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1252486344.28645.18.camel@marge.simson.net>
[ made the sysctl independent of CONFIG_SCHED_DEBUG, in case
people want to work around broken apps. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add support for communicating with a Sonics Silicon Backplane through a
SDIO interface, as found in the Nintendo Wii WLAN daughter card.
The Nintendo Wii WLAN card includes a custom Broadcom 4318 chip with
a SDIO host interface.
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix apparent thinko related to RTM_DELADDRLABEL, introduced by commit
2a8cc6c890 ("[IPV6] ADDRCONF: Support
RFC3484 configurable address selection policy table.").
Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are cases where full date information is required instead of
just the year. Add month and day parsing to dmi_get_year() and rename
it to dmi_get_date().
As the original function only required '/' followed by any number of
parseable characters at the end of the string, keep that behavior to
avoid upsetting existing users.
The new function takes dates of format [mm[/dd]]/yy[yy]. Year, month
and date are checked to be in the ranges of [1-9999], [1-12] and
[1-31] respectively and any invalid or out-of-range component is
returned as zero.
The dummy implementation is updated accordingly but the return value
is updated to indicate field not found which is consistent with how
other dummy functions behave.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When mounting an "nfs" type file system, recognize "v4," "vers=4," or
"nfsvers=4" mount options, and convert the file system to "nfs4" under
the covers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[trondmy: fixed up binary mount code so it sets the 'version' field too]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
shmfs wants purely standard POSIX ACL semantics, so we can use the new
generic VFS layer POSIX ACL checking rather than cooking our own
'permission()' function.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is stage one in flattening out the callchains for the common
permission testing. Rather than have most filesystem implement their
own inode->i_op->permission function that just calls back down to the
VFS layers 'generic_permission()' with the per-filesystem ACL checking
function, the filesystem can just expose its 'check_acl' function
directly, and let the VFS layer do everything for it.
This is all just preparatory - no filesystem actually enables this yet.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add appropriate const prefix to char * arguments in proc helper functions.
Also fixed the caller side to be proper const pointers.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Re-export snd_pcm_format_name() function to be used outside the PCM core.
As a first example, usbaudio is changed to use it now again.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This was a non-trivial merge with some patches sent to Linus
in drm-fixes.
Conflicts:
drivers/gpu/drm/radeon/r300.c
drivers/gpu/drm/radeon/radeon_asic.h
drivers/gpu/drm/radeon/rs600.c
drivers/gpu/drm/radeon/rs690.c
drivers/gpu/drm/radeon/rv515.c
Now that SD_WAKE_IDLE doesn't make pipe-test suck anymore,
enable it by default for MC, CPU and NUMA domains.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The struct snd_monitor_file is used locally only in sound/core/init.c,
thus it should be moved there from the public sound/core.h.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Fix the default security_session_to_parent() in linux/security.h to have a
body.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
This is a brute force removal of the wierd slave interface done for
DLCI -> SDLA transmit. Before it was using non-standard return values
and freeing skb in caller. This changes it to using normal return
values, and freeing in the callee. Luckly only one driver pair was
doing this. Not tested on real hardware, in fact I wonder if this
driver pair is even being used by any users.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a function that can be used to add the default mode for the output device
without EDID.
It will add the default mode that meets with the requirements of given
hdisplay/vdisplay limit.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Make security_cred_alloc_blank() return int, not void, when CONFIG_SECURITY=n.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds a classful dummy scheduler which can be used as root qdisc
for multiqueue devices and exposes each device queue as a child class.
This allows to address queues individually and graft them similar to regular
classes. Additionally it presents an accumulated view of the statistics of
all real root qdiscs in the dummy root.
Two new callbacks are added to the qdisc_ops and qdisc_class_ops:
- cl_ops->select_queue selects the tx queue number for new child classes.
- qdisc_ops->attach() overrides root qdisc device grafting to attach
non-shared qdiscs to the queues.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It will be used in a following patch by the multiqueue qdisc.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the multiqueue integration with the qdisc API suffers from
a few problems:
- with multiple queues, all root qdiscs use the same handle. This means
they can't be exposed to userspace in a backwards compatible fashion.
- all API operations always refer to queue number 0. Newly created
qdiscs are automatically shared between all queues, its not possible
to address individual queues or restore multiqueue behaviour once a
shared qdisc has been attached.
- Dumps only contain the root qdisc of queue 0, in case of non-shared
qdiscs this means the statistics are incomplete.
This patch reintroduces dev->qdisc, which points to the (single) root qdisc
from userspace's point of view. Currently it either points to the first
(non-shared) default qdisc, or a qdisc shared between all queues. The
following patches will introduce a classful dummy qdisc, which will be used
as root qdisc and contain the per-queue qdiscs as children.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm snapshot: fix on disk chunk size validation
dm exception store: split set_chunk_size
dm snapshot: fix header corruption race on invalidation
dm snapshot: refactor zero_disk_area to use chunk_io
dm log: userspace add luid to distinguish between concurrent log instances
dm raid1: do not allow log_failure variable to unset after being set
dm log: remove incorrect field from userspace table output
dm log: fix userspace status output
dm stripe: expose correct io hints
dm table: add more context to terse warning messages
dm table: fix queue_limit checking device iterator
dm snapshot: implement iterate devices
dm multipath: fix oops when request based io fails when no paths
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since
"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d
commit in 2.6.31.
But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.
The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or the
tracee resumes.
With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:
- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.
- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().
or, if exec fails,
free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.
Reported-by: Tom Horsley <tom.horsley@att.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cancel_delayed_work() has to use del_timer_sync() to guarantee the timer
function is not running after return. But most users doesn't actually
need this, and del_timer_sync() has problems: it is not useable from
interrupt, and it depends on every lock which could be taken from irq.
Introduce __cancel_delayed_work() which calls del_timer() instead.
The immediate reason for this patch is
http://bugzilla.kernel.org/show_bug.cgi?id=13757
but hopefully this helper makes sense anyway.
As for 13757 bug, actually we need requeue_delayed_work(), but its
semantics are not yet clear.
Merge this patch early to resolves cross-tree interdependencies between
input and infiniband.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unlike on some other architectures ino_t is an unsigned int on s390.
So add an explicit cast to avoid lots of compile warnings:
In file included from include/trace/ftrace.h:285,
from include/trace/define_trace.h:61,
from include/trace/events/ext4.h:711,
from fs/ext4/super.c:50:
include/trace/events/ext4.h: In function 'ftrace_raw_output_ext4_free_inode':
include/trace/events/ext4.h:12: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
1. Updates fcoe_rcv() to queue incoming frames to the fcoe per
cpu thread on which this frame's exch was originated and simply
use current cpu for request exch not originated by initiator.
It is redundant to add this code under CONFIG_SMP, so removes
CONFIG_SMP uses around this code.
2. Updates fc_exch_em_alloc, fc_exch_delete, fc_exch_find to use
per cpu exch pools, here fc_exch_delete is rename of older
fc_exch_mgr_delete_ep since ep/exch are now deleted in pools
of EM and so brief new name is sufficient and better name.
Updates these functions to map exch id to their index into exch
pool using fc_cpu_mask, fc_cpu_order and EM min_xid.
This mapping is as per detailed explanation about this in
last patch and basically this is just as lower fc_cpu_mask
bits of exch id as cpu number and upper bit sum of EM min_xid
and exch index in pool.
Uses pool next_index to keep track of exch allocation from
pool along with pool_max_index as upper bound of exches array
in pool.
3. Adds exch pool ptr to fc_exch to free exch to its pool in
fc_exch_delete.
4. Updates fc_exch_mgr_reset to reset all exch pools of an EM,
this required adding fc_exch_pool_reset func to reset exches
in pool and then have fc_exch_mgr_reset call fc_exch_pool_reset
for each pool within each EM for a lport.
5. Removes no longer needed exches array, em_lock, next_xid, and
total_exches from struct fc_exch_mgr, these are not needed after
use of per cpu exch pool, also removes not used max_read,
last_read from struct fc_exch_mgr.
6. Updates locking notes for exch pool lock with fc_exch lock and
uses pool lock in exch allocation, lookup and reset.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adds per cpu exch pool for these reasons:-
1. Currently an EM instance is shared across all cpus to manage
all exches for all cpus. This required em_lock across all
cpus for an exch alloc, free, lookup and reset each frame
and that made em_lock expensive, so instead having per cpu
exch pool with their own per cpu pool lock will likely reduce
locking contention in fast path for an exch alloc, free and
lookup.
2. Per cpu exch pool will likely improve cache hit ratio since
all frames of an exch will be processed on the same cpu on
which exch originated.
This patch is only prep work to help in keeping complexity of next
patch low, so this patch only sets up per cpu exch pool and related
helper funcs to be used by next patch. The next patch fully makes
use of per cpu exch pool in all code paths ie. tx, rx and reset.
Divides per EM exch id range equally across all cpus to setup per
cpu exch pool. This division is such that lower bits of exch id
carries cpu number info on which exch originated, later a simple
bitwise AND operation on exch id of incoming frame with fc_cpu_mask
retrieves cpu number info to direct all frames to same cpu on which
exch originated. This required a global fc_cpu_mask and fc_cpu_order
initialized to max possible cpus number nr_cpu_ids rounded up to 2's
power, this will be used in mapping exch id and exch ptr array
index in pool during exch allocation, find or reset code paths.
Adds a check in fc_exch_mgr_alloc() to ensure specified min_xid
lower bits are zero since these bits are used to carry cpu info.
Adds and initializes struct fc_exch_pool with all required fields
to manage exches in pool.
Allocates per cpu struct fc_exch_pool with memory for exches array
for range of exches per pool. The exches array memory is followed
by struct fc_exch_pool.
Adds fc_exch_ptr_get/set() helper functions to get/set exch ptr in
pool exches array at specified array index.
Increases default FCOE_MAX_XID to 0x0FFF from 0x07EF, so that more
exches are available per cpu after above described exch id range
division across all cpus to each pool.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If a target closed the connection, we will detect it in the
state_changed or data_ready callout. This adds a new conn
error value to use for this problem, so it is not confused
with when the initiator throws a conn error and drops
the connection.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Since the ability to swap the cpu buffers adds a small overhead to
the recording of a trace, we only want to add it when needed.
Only the irqsoff and preemptoff tracers use this feature, and both are
not recommended for production kernels. This patch disables its use
when neither irqsoff nor preemptoff is configured.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The latency tracers (irqsoff and wakeup) can swap trace buffers
on the fly. If an event is happening and has reserved data on one of
the buffers, and the latency tracer swaps the global buffer with the
max buffer, the result is that the event may commit the data to the
wrong buffer.
This patch changes the API to the trace recording to be recieve the
buffer that was used to reserve a commit. Then this buffer can be passed
in to the commit.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This shrinks the size of struct sctp_association a little.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
This patch introduces a new sysctl option to make IPv4 Address Scoping
configurable <draft-stewart-tsvwg-sctp-ipv4-00.txt>.
In networking environments where DNAT rules in iptables prerouting
chains convert destination IP's to link-local/private IP addresses,
SCTP connections fail to establish as the INIT chunk is dropped by the
kernel due to address scope match failure.
For example to support overlapping IP addresses (same IP address with
different vlan id) a Layer-5 application listens on link local IP's,
and there is a DNAT rule that maps the destination IP to a link local
IP. Such applications never get the SCTP INIT if the address-scoping
draft is strictly followed.
This sysctl configuration allows SCTP to function in such
unconventional networking environments.
Sysctl options:
0 - Disable IPv4 address scoping draft altogether
1 - Enable IPv4 address scoping (default, current behavior)
2 - Enable address scoping but allow IPv4 private addresses in init/init-ack
3 - Enable address scoping but allow IPv4 link local address in init/init-ack
Signed-off-by: Bhaskar Dutta <bhaskar.dutta@globallogic.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
We had a bug that we never stored the user-defined value for
MAXSEG when setting the value on an association. Thus future
PMTU events ended up re-writing the frag point and increasing
it past user limit. Additionally, when setting the option on
the socket/endpoint, we effect all current associations, which
is against spec.
Now, we store the user 'maxseg' value along with the computed
'frag_point'. We inherit 'maxseg' from the socket at association
creation and use it as an upper limit for 'frag_point' when its
set.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
SCTP will delay the last part of a large write due to NAGLE, if that
part is smaller then MTU. Since we are doing large writes, we might
as well send the last portion now instead of waiting untill the next
large write happens. The small portion will be sent as is regardless,
so it's better to not delay it.
This is a result of much discussions with Wei Yongjun <yjwei@cn.fujitsu.com>
and Doug Graham <dgraham@nortel.com>. Many thanks go out to them.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
SCTP has a problem that when small chunks are used, it is possible
to exhaust the receiver buffer without fully closing receive window.
This happens due to all overhead that we have account for with small
messages. To fix this, when receive buffer is exceeded, we'll drop
the window to 0 and save the 'drop' portion. When application starts
reading data and freeing up recevie buffer space, we'll wait until
we've reached the 'drop' window and then add back this 'drop' one
mtu at a time. This worked well in testing and under stress produced
rather even recovery.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Currenlty, sctp breaks up user messages into fragments and
sends each fragment to the lower layer by itself. This means
that for each fragment we go all the way down the stack
and back up. This also discourages bundling of multiple
fragments when they can fit into a sigle packet (ex: due
to user setting a low fragmentation threashold).
We introduce a new command SCTP_CMD_SND_MSG and hand the
whole message down state machine. The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then un-cork the queue
thus causing the chunks to get transmitted.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If a socket has a lot of association that are in the process of
of being closed/aborted, it is possible for a remote to establish
new associations during the time period that the old ones are shutting
down. If this was a result of a close() call, there will be no socket
and will cause a memory leak. We'll prevent this by setting the
socket state to CLOSING and disallow new associations when in this state.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
This patch removes an unused union definition (sctp_cmsg_data_t)
from include/net/sctp/user.h.
Signed-off-by: Rami Rosen <rosenrami@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>