In active-backup mode, the current bonding code duplicates IGMP
traffic to all slaves, so that switches are up to date in case of a
failover from an active to a backup interface. If bonding then fails
back to the original active interface, it is likely that the "active
slave" switch's IGMP forwarding for the port will be out of date until
some event occurs to refresh the switch (e.g., a membership query).
This patch alters the behavior of bonding to no longer flood
IGMP to all ports, and to issue IGMP JOINs to the newly active port at
the time of a failover. This insures that switches are kept up to date
for all cases.
"GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally
reported this problem, and included a patch. His original patch was
modified by Jay Vosburgh to additionally remove the existing IGMP flood
behavior, use RCU, streamline code paths, fix trailing white space, and
adjust for style.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling:
- unconfirmed entries can not be killed manually, they are removed on
confirmation or final destruction of the conntrack entry, which means
we might iterate forever without making forward progress.
This can happen in combination with the conntrack event cache, which
holds a reference to the conntrack entry, which is only released when
the packet makes it all the way through the stack or a different
packet is handled.
- taking references to an unconfirmed entry and using it outside the
locked section doesn't work, the list entries are not refcounted and
another CPU might already be waiting to destroy the entry
What the code really wants to do is make sure the references of the hash
table to the selected conntrack entries are released, so they will be
destroyed once all references from skbs and the event cache are dropped.
Since unconfirmed entries haven't even entered the hash yet, simply mark
them as dying and skip confirmation based on that.
Reported and tested by Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing something like this on a two cpu system
# echo 0 > /sys/devices/system/cpu/cpu0/online
# echo 1 > /sys/devices/system/cpu/cpu0/online
# echo 0 > /sys/devices/system/cpu/cpu1/online
will give me this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.21-rc2-g562aa1d4-dirty #7
-------------------------------------------------------
bash/1282 is trying to acquire lock:
(&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240
but task is already holding lock:
(&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240
which lock already depends on the new lock.
This happens because we have the following code in kernel/hrtimer.c:
migrate_hrtimers(int cpu)
[...]
old_base = &per_cpu(hrtimer_bases, cpu);
new_base = &get_cpu_var(hrtimer_bases);
[...]
spin_lock(&new_base->lock);
spin_lock(&old_base->lock);
Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a
The same problem exists in kernel/timer.c: migrate_timers().
As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first.
To achieve this we introduce two new spinlock functions double_spin_lock
and double_spin_unlock which lock or unlock two locks in a given order.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Christian Borntraeger <cborntra@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we do not check for vma flags if sys_move_pages is called to move
individual pages. If sys_migrate_pages is called to move pages then we
check for vm_flags that indicate a non migratable vma but that still
includes VM_LOCKED and we can migrate mlocked pages.
Extract the vma_migratable check from mm/mempolicy.c, fix it and put it
into migrate.h so that is can be used from both locations.
Problem was spotted by Lee Schermerhorn
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
facilitiate nice working properly where cpu power is shared. The idling of
cpus in the presence of runnable tasks is considered too fragile, easy to
break with outside code, and the complexity of managing this system if an
architecture comes along with many logical cores sharing cpu power will be
unworkable.
Remove the associated per_cpu_gain variable in sched_domains used only by
this code.
Also:
The reason is that with dynticks enabled, this code breaks without yet
further tweaks so dynticks brought on the rapid demise of this code. So
either we tweak this code or kill it off entirely. It was Ingo's preference
to kill it off. Either way this needs to happen for 2.6.21 since dynticks
has gone in.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The gpio_keys driver is wrongly ARM-specific; it can't build on
other platforms with GPIO suport. This fixes that problem.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: pHilipp Zabel <philipp.zabel@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Ben Nizette <ben.nizette@iinet.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some cases when we are not using msi we need a way to ensure that the
hardware does not have an msi capability enabled. Currently the code has been
calling disable_msi_mode to try and achieve that. However disable_msi_mode
has several other side effects and is only available when msi support is
compiled in so it isn't really appropriate.
Instead this patch implements pci_msi_off which disables all msi and msix
capabilities unconditionally with no additional side effects.
pci_disable_device was redundantly clearing the bus master enable flag and
clearing the msi enable bit. A device that is not allowed to perform bus
mastering operations cannot generate intx or msi interrupt messages as those
are essentially a special case of dma, and require bus mastering. So the call
in pci_disable_device to disable msi capabilities was redundant.
quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[VLAN]: Avoid a 4-order allocation.
[HDLC] Fix dev->header_cache_update having a random value.
[NetLabel]: Verify sensitivity level has a valid CIPSO mapping
[PPPOE]: Key connections properly on local device.
[AF_UNIX]: Test against sk_max_ack_backlog properly.
[NET]: Fix bugs in "Whether sock accept queue is full" checking
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] MTX1: clear PCI errors
[MIPS] MTX1: add idsel cardbus ressources
[MIPS] MTX1: remove unneeded settings
[MIPS] dma_sync_sg_for_cpu is a no-op except for non-coherent R10000s.
[MIPS] Cobalt: update reserved resources
[MIPS] SN: PCI fixup needs to include <irq.h>.
[MIPS] DMA: Fix a bunch of warnings due to missing inline keywords.
[MIPS] RM: It should be #ifdef CONFIG_FOO not #if CONFIG_FOO ...
[MIPS] Fix and cleanup the mess that a dozen prom_printf variants are.
[MIPS] DEC: Remove redeclarations of mips_machgroup and mips_machtype.
[MIPS] No need to write c0_compare in plat_timer_setup
[MIPS] Convert to RTC-class ds1742 driver
[MIPS] Oprofile: Add missing break statements.
[MIPS] jmr3927: build fix
[MIPS] SNI: Fix mc146818_decode_year
[MIPS] Replace sys32_timer_create with the generic compat_sys_timer_create.
[MIPS] Replace sys32_socketcall with the generic compat_sys_socketcall.
[MIPS] N32 waitid is the same as o32.
The generic rtc-ds1742 driver can be used for RBTX4927 and JMR3927
(with __swizzle_addr trick). This patch also removes MIPS local
DS1742 stuff.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use the standard magic.h for kvmfs.
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Allocate a distinct inode for every vcpu in a VM. This has the following
benefits:
- the filp cachelines are no longer bounced when f_count is incremented on
every ioctl()
- the API and internal code are distinctly clearer; for example, on the
KVM_GET_REGS ioctl, there is no need to copy the vcpu number from
userspace and then copy the registers back; the vcpu identity is derived
from the fd used to make the call
Right now the performance benefits are completely theoretical since (a) we
don't support more than one vcpu per VM and (b) virtualization hardware
inefficiencies completely everwhelm any cacheline bouncing effects. But
both of these will change, and we need to prepare the API today.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This avoids having filp->f_op and the corresponding inode->i_fop different,
which is a little unorthodox.
The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new
ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This adds a special MSR based hypercall API to KVM. This is to be
used by paravirtual kernels and virtual drivers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The function ide_get_best_pio_mode() fails to return the correct IORDY setting
for the explicitly specified modes -- fix this along with the heading comment,
and also remove the long commented out code.
Also, while at it, correct the misliading comment about the PIO cycle time in
<linux/ide.h> -- it actually consists of only the active and recovery periods,
with only some chips also including the address setup time into equation...
[ bart: sl82c105 seems to be currently the only driver affected by this fix ]
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This patch splits the vlan_group struct into a multi-allocated struct. On
x86_64, the size of the original struct is a little more than 32KB, causing
a 4-order allocation, which is prune to problems caused by buddy-system
external fragmentation conditions.
I couldn't just use vmalloc() because vfree() cannot be called in the
softirq context of the RCU callback.
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The information contained within platform_data should be self-contained.
Replace the pointer to a MAC address with the actual MAC address in
struct mv643xx_eth_platform_data.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conditionalize all PM related stuff in libata core layer using
CONFIG_PM.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The initial simplex handling code is fooled if you suspend and resume.
This also causes problems with some single channel controllers which
claim to be simplex.
The fix is fairly simple, instead of keeping a flag to remember if we
gave away the simplex channel we remember the actual owner. As the owner
is always part of the host_set we don't even need a refcount.
Knowing the owner also means we can reassign simplex DMA channels in
future hotplug code etc if we need to
Signed-off-by: Alan Cox <alan@redhat.com>
(and a signed-off for the patch I sent before while I remember)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid:
HID: fix Logitech DiNovo Edge touchwheel and Logic3 /SpectraVideo middle button
HID: add git tree information to MAINTAINERS
HID: fix broken Logitech S510 keyboard report descriptor; make extra keys work
HID: fix possible double-free on error path in hid parser
HID: hid-debug.c should #include <linux/hid-debug.h>
HID: fix bug in zeroing the last field byte in output reports
USB HID: use CONFIG_HID_DEBUG for outputting report descriptor
USB HID: Fix USB vendor and product IDs endianness for USB HID devices
This patch provides the following hugetlb-related fixes to the recent stacked
shm files changes:
- Update is_file_hugepages() so it will reconize hugetlb shm segments.
- get_unmapped_area must be called with the nested file struct to handle
the sfd->file->f_ops->get_unmapped_area == NULL case.
- The fsync f_op must be wrapped since it is specified in the hugetlbfs
f_ops.
This is based on proposed fixes from Eric Biederman that were debugged and
tested by me. Without it, attempting to use hugetlb shared memory segments
on powerpc (and likely ia64) will kill your box.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The CAPI trace debug functions were using a fixed size buffer, which can be
overflowed if wrong formatted CAPI messages were sent to the kernel capi
layer. The code was also not protected against multiple callers. This fix
bug 8028.
Additionally the patch make the CAPI trace functions optional.
Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
linux/irq.h uses EINVAL but does not #include linux/errno.h. This results in
the compiler spitting out errors on some files.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
throttle_vm_writeout() is designed to wait for the dirty levels to subside.
But if the caller holds IO or FS locks, we might be holding up that writeout.
So change it to take a single nap to give other devices a chance to clean some
memory, then return.
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename PG_checked to PG_owner_priv_1 to reflect its availablilty as a
private flag for use by the owner/allocator of the page. In the case of
pagecache pages (which might be considered to be owned by the mm),
filesystems may use the flag.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shmem_{nopage,mmap} are no longer used in ipc/shm.c
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move VIDIOC_DBG_S/G_REGISTER from the internal ioctl list to the
public ioctls, but mark it as experimental for now.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Add support for starting, stopping, pausing and resuming an MPEG (or similar
compressed stream) encoder.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The VIDIOC_G_ENC_INDEX ioctl can obtain the MPEG index from an MPEG
encoder.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The chip matching in struct v4l2_register for VIDIOC_DBG_G/S_REGISTER
was rather primitive. It could not be extended to other busses besides
i2c and it lacked a way to.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Dongle shipped with Logitech DiNovo Edge (0x046d/0xc714) behaves in a weird
non-standard way - it contains multiple reports with the same usage, which
results in remapping of GenericDesktop.X and GenericDesktop.Y usages to
GenericDesktop.Z and GenericDesktop.RX respectively, thus rendering the
touchwheel unusable.
The commit 3506897691 solved this
in a way that it didn't remap certain usages. This however breaks
(at least) middle button of Logic3 / SpectraVideo (0x1267/0x0210),
which in contrary requires the remapping.
To make both of the harware work, allow remapping of these usages again,
and introduce a quirk for Logitech DiNovo Edge "touchwheel" instead - we
disable remapping for key, abs and rel events only for this hardware.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch makes extra keys (F1-F12 in special mode, zooming, rotate, shuffle)
on Logitech S510 keyboard work.
Logitech S510 keyboard sends in report no. 3 keys which are far above the
logical maximum described in descriptor for given report.
This patch introduces a HID quirk for this wireless USB receiver/keyboard
in order to fix the report descriptor before it's being parsed - the logical
maximum and the number of usages is bumped up to 0x104d). The values are in the
"Reserved" area of consumer HUT, so HID_MAX_USAGE had to be changed too.
In addition to proper extracting of the values from report descriptor, proper
HID-input mapping is introduced for them.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This reverts 57a87bb072.
As H. Peter Anvin states, this change broke klibc and it's
not very easy to fix things up without duplicating everything
into userspace.
In the longer term we should have a better solution to this
problem, but for now let's unbreak things.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit aeeddc1435, which was
half-baked and broken. It just resulted in compile errors, since
cpufreq_register_driver() still changes the 'driver_data' by setting
bits in the flags field. So claiming it is 'const' _really_ doesn't
work.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (78 commits)
[PARISC] Use symbolic last syscall in __NR_Linux_syscalls
[PARISC] Add missing statfs64 and fstatfs64 syscalls
Revert "[PARISC] Optimize TLB flush on SMP systems"
[PARISC] Compat signal fixes for 64-bit parisc
[PARISC] Reorder syscalls to match unistd.h
Revert "[PATCH] make kernel/signal.c:kill_proc_info() static"
[PARISC] fix sys_rt_sigqueueinfo
[PARISC] fix section mismatch warnings in harmony sound driver
[PARISC] do not export get_register/set_register
[PARISC] add ENTRY()/ENDPROC() and simplify assembly of HP/UX emulation code
[PARISC] convert to use CONFIG_64BIT instead of __LP64__
[PARISC] use CONFIG_64BIT instead of __LP64__
[PARISC] add ASM_EXCEPTIONTABLE_ENTRY() macro
[PARISC] more ENTRY(), ENDPROC(), END() conversions
[PARISC] fix ENTRY() and ENDPROC() for 64bit-parisc
[PARISC] Fixes /proc/cpuinfo cache output on B160L
[PARISC] implement standard ENTRY(), END() and ENDPROC()
[PARISC] kill ENTRY_SYS_CPUS
[PARISC] clean up debugging printks in smp.c
[PARISC] factor syscall_restart code out of do_signal
...
Fix conflict in include/linux/sched.h due to kill_proc_info() being made
publicly available to PARISC again.
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
Revert "Driver core: let request_module() send a /sys/modules/kmod/-uevent"
Driver core: fix error by cleanup up symlinks properly
make kernel/kmod.c:kmod_mk static
power management: fix struct layout and docs
power management: no valid states w/o pm_ops
Driver core: more fallout from class_device changes for pcmcia
sysfs: move struct sysfs_dirent to private header
driver core: refcounting fix
Driver core: remove class_device_rename
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: export autosuspend delay in sysfs
sysfs: allow attributes to be added to groups
USB: make autosuspend delay a module parameter
USB: minor cleanups for sysfs.c
USB: add a blacklist for devices that can't handle some things we throw at them.
USB: refactor usb device matching and create usb_device_match
USB: Wacom driver updates
gadgetfs: Fixed bug in ep_aio_read_retry.
USB: Use USB defines in usbmouse.c and usbkbd.c
USB: add rationale on why usb descriptor structures have to be packed
USB: ftdi_sio: Adding VID and PID for Tellstick
UHCI: Eliminate asynchronous skeleton Queue Headers
UHCI: Add macros for computing DMA values
USB: Davicom DM9601 usbnet driver
USB: asix.c - Add JVC-PRX1 ids
usbmon: Remove erroneous __exit
USB: add driver for iowarrior devices.
USB: option: add a bunch of new device ids
USB: option: remove duplicate device id table
This patch replaces all instances of "set_native_irq_info(irq, mask)"
with "irq_desc[irq].affinity = mask". The latter form is clearer
uses fewer abstractions, and makes access to this field uniform
accross different architectures.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Also export dev_disable as this is needed by drivers doing slave decode
filtering, which will follow shortly
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch (as860) adds two new sysfs routines:
sysfs_add_file_to_group() and sysfs_remove_file_from_group().
A later patch adds code that uses the new routines.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as859) makes the default USB autosuspend delay a module
parameter of usbcore. By setting the delay value at boot time, users
will be able to prevent the system from autosuspending devices which
for some reason can't handle it.
The patch also stores the autosuspend delay as a per-device value. A
later patch will allow the user to change the value, tailoring the
delay for each individual device. A delay value of 0 will prevent
autosuspend.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds a blacklist to the USB core to handle some autosuspend and
string issues that devices have.
Originally written by Oliver, but hacked up a lot by Greg.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add argumentation in defense of using __attribute__((packed)) in USB
descriptors authored by Dave Brownell. Necessary as in some cases it
seems superfluous.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ioctl is commented out for now, until we verify some userspace
application issues.
Cc: Christian Lucht <lucht@codemercs.com>
Cc: Robert Marquardt <marquardt@codemercs.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit c353c3fb07.
It turns out that we end up with a loop trying to load the unix
module and calling netfilter to do that. Will redo the patch
later to not have this loop.
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Because the pm ops in powermac are obviously not using them as intended, I
added documentation for it in kernel-doc format.
Reordering the fields in struct pm_ops not only makes the output of kernel-doc
make more sense but also removes a hole from the structure on 64-bit
platforms.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Macheck <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
struct sysfs_dirent is private to the fs/sysfs/ subtree. It is
not even referenced as an opaque structure outside of that subtree.
The following patch moves the declaration from include/linux/sysfs.h to
fs/sysfs/sysfs.h, making it clearer that nothing else in the kernel
dereferences it.
I have been running this patch for years. Please integrate and forward
upstream if there are no objections.
From: "Adam J. Richter" <adam@yggdrasil.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
No one uses it, and it wasn't exported to modules, so remove it. The
only other user of it was the network code, which is now converted to
use struct device instead.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
videodev2.h contains just the V4L2 API structs and defines.
By allowing this header file to be dual GPL/BSD will enable sharing
userspace apps between Linux and *BSD systems. It will also allow developing
newer BSD licensed drivers that can be shared on Linux and *BSD.
It should be noticed that most of the current V4L drivers, and v4l core
itself are GPL only. This won't be changed by this patch.
Signed-off-by: Michael H. Schimek <mschimek@gmx.at>
Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Bill Dirks <bill@thedirks.org>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Martin Rubli <mrubli@gmx.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Remove a section containing basically ideas for future sliced VBI standards.
This can be resurrected should any of this be actually implemented. For now
it only pollutes this header file.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The Sliced VBI API is no longer marked experimental. Introduced in 2.6.14
and with only a single modification in 2.6.19 I think we can consider this
API to be solid.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Maybe someday there will be a device with a register address space >
32-bits, or maybe an i2c device which uses a protocol > 4 bytes long to
address its registers.
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The direct register access ioctls were defined as kernel internal only,
but they are very useful for debugging hardware from userspace and are
used as such. Officially export them.
VIDIOC_INT_[SG]_REGISTER is renamed to VIDIOC_DBG_[SG]_REGISTER
Definition of ioctl and struct v4l2_register is moved from v4l2-common.h
to videodev2.h.
Types used in struct v4l2_register are changed to the userspace
exportable versions (u32 -> __u32, etc).
Use of VIDIOC_DBG_S_REGISTER requires CAP_SYS_ADMIN permission, so move
the check into the video_ioctl2() dispatcher so it doesn't need to be
duplicated in each driver's call-back function. CAP_SYS_ADMIN check is
added to pvrusb2 (which doesn't use video_ioctl2).
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Bill Dirks asked me to update his entries at kernel files, since
he change his e-mail.
I've also updated a few web broken links or obsolete info to the curent
sites where V4L drivers and API are being discussed currently.
CC: Bill Dirks <bill@thedirks.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
ata_port has two different id fields - id and port_no. id is
system-wide 1-based unique id for the port while port_no is 0-based
host-wide port number. The former is primarily used to identify the
ATA port to the user in printk messages while the latter is used in
various places in libata core and LLDs to index the port inside the
host.
The two fields feel quite similar and sometimes ap->id is used in
place of ap->port_no, which is very difficult to spot. This patch
renames ap->id to ap->print_id to reduce the possibility of such bugs.
Some printk messages are adjusted such that id string (ata%u[.%u])
isn't printed twice and/or to use ata_*_printk() instead of hardcoded
id format.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The current EH speed down code is more of a proof that the EH
framework is capable of adjusting transfer speed in response to error.
This patch puts some intelligence into EH speed down sequence. The
rules are..
* If there have been more than three timeout, HSM violation or
unclassified DEV errors for known supported commands during last 10
mins, NCQ is turned off.
* If there have been more than three timeout or HSM violation for known
supported command, transfer mode is slowed down. If DMA is active,
it is first slowered by one grade (e.g. UDMA133->100). If that
doesn't help, it's slowered to 40c limit (UDMA33). If PIO is
active, it's slowered by one grade first. If that doesn't help,
PIO0 is forced. Note that this rule does not change transfer mode.
DMA is never degraded into PIO by this rule.
* If there have been more than ten ATA bus, timeout, HSM violation or
unclassified device errors for known supported commands && speeding
down DMA mode didn't help, the device is forced into PIO mode. Note
that this rule is considered only for PATA devices and is pretty
difficult to trigger.
One error can only trigger one rule at a time. After a rule is
triggered, error history is cleared such that the next speed down
happens only after some number of errors are accumulated. This makes
sense because now speed down is done in bigger stride.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This is the patch for PATA controller of Celleb.
This driver uses the managed iomap (devres).
Because this driver needs special taskfile accesses, there is
a copy of ata_std_softreset(). ata_dev_try_classify() is exported
so that it can be used in this function.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Akira Iguchi <akira2.iguchi@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The current header file definitions for autofs version 5 have caused a couple
of problems for application builds downstream.
This fixes the problem by separating the definitions.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix 23 of these sparse warnings on x86_64 allmodconfig:
include/linux/cdrom.h:942:19: error: dubious bitfield without explicit
`signed' or `unsigned'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This driver provides the core functionality of the SM501, which is a
multi-function chip including two framebuffers, video acceleration, USB,
and many other peripheral blocks.
The driver exports a number of entries for the peripheral drivers to use.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The problem comes when ks0108/cfag12864b are built-in and no parallel port is
present. ks0108_init() is called first, as it should be, but fails to load
(as there is no parallel port to use).
After that, cfag12864b_init() gets called, without knowing anything about
ks0108 failed, and calls ks0108_writecontrol(), which dereferences an
uninitialized pointer.
Init order is OK, I think. The problem is how to stop cfag12864b_init() being
called if ks0108 failed to load. modprobe does it for us, but, how when
built-in?
Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We frequently need the maximum number of possible processors in order to
allocate arrays for all processors. So far this was done using
highest_possible_processor_id(). However, we do need the number of
processors not the highest id. Moreover the number was so far dynamically
calculated on each invokation. The number of possible processors does not
change when the system is running. We can therefore calculate that number
once.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
highest_possible_node_id() is currently used to calculate the last possible
node idso that the network subsystem can figure out how to size per node
arrays.
I think having the ability to determine the maximum amount of nodes in a
system at runtime is useful but then we should name this entry
correspondingly, it should return the number of node_ids, and the the value
needs to be setup only once on bootup. The node_possible_map does not
change after bootup.
This patch introduces nr_node_ids and replaces the use of
highest_possible_node_id(). nr_node_ids is calculated on bootup when the
page allocators pagesets are initialized.
[deweerdt@free.fr: fix oops]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
allnoconfig:
mm/mincore.c: In function 'do_mincore':
mm/mincore.c:122: warning: unused variable 'entry'
Yet another entry in the why-macros-are-wrong encyclopedia.
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several people have reported failures in dynamic major device number handling
due to the recent changes in there to avoid handing out the local/experimental
majors.
Rolf reports that this is due to a gcc-4.1.0 bug.
The patch refactors that code a lot in an attempt to provoke the compiler into
behaving.
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Somehow we got the layout of the v3 superblock wrong, which causes crashes due
to overindexing of the buffer_head array in statfs on large fielsystems.
Cc: "Cedric Augonnet" <cedric.augonnet@gmail.com>
Cc: "Daniel Aragones" <danarag@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If one of clear_bit, change_bit or set_bit is defined as a do { } while (0)
function usage of these functions in parenthesis like
(foo_bit(23, &var))
while be expaned to something like
(do { ... } while (0)}).
resulting in a build error. This patch removes the useless parenthesis.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (21 commits)
natsemi: Support Aculab E1/T1 PMXc cPCI carrier cards
natsemi: Add support for using MII port with no PHY
skge: race with workq and RTNL
Replace local random function with random32()
s2io: RTNL and flush_scheduled_work deadlock
8139too: RTNL and flush_scheduled_work deadlock
sis190: RTNL and flush_scheduled_work deadlock
r8169: RTNL and flush_scheduled_work deadlock
[PATCH] ieee80211softmac: Fix setting of initial transmit rates
[PATCH] bcm43xx: OFDM fix for rev 1 cards
[PATCH] bcm43xx: Fix for 4311 and 02/07/07 specification changes
[PATCH] prism54: correct assignment of DOT1XENABLE in WE-19 codepaths
[PATCH] zd1211rw: Readd zd_addr_t cast
[PATCH] bcm43xx: Fix for oops on resume
[PATCH] bcm43xx: Ignore ampdu status reports
[PATCH] wavelan: Use ARRAY_SIZE macro when appropriate
[PATCH] hostap: Use ARRAY_SIZE macro when appropriate
[PATCH] misc-wireless: Use ARRAY_SIZE macro when appropriate
[PATCH] ipw2100: Use ARRAY_SIZE macro when appropriate
[PATCH] bcm43xx: Janitorial change - remove two unused variables
...
Per device data such as brightness belongs to the indivdual device
and should therefore be separate from the the backlight operation
function pointers. This patch splits the two types of data and
allows simplifcation of some code.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
fb_info->bl_mutex is badly thought out and the backlight class doesn't
need it if the framebuffer/backlight register/unregister order is
consistent, particularly after the backlight locking fixes.
Fix the drivers to use the order:
backlight_device_register()
register_framebuffer()
unregister_framebuffer()
backlight_device_unregister()
and turn bl_mutex into a lock for the bl_curve data only.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
The backlight class wants notification whenever the console is blanked
but doesn't get this when hardware blanking fails and software blanking
is used. Changing FB_EVENT_BLANK to report both would be a behaviour
change which could confuse the console layer so add a new event for
software blanking and have the backlight class listen for both.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
backlight_device->sem has a very specific use as documented in the
header file. The external users of this are using it for a different
reason, to serialise access to the update_status() method.
backlight users were supposed to implement their own internal
serialisation of update_status() if needed but everyone is doing
things differently and incorrectly. Therefore add a global mutex to
take care of serialisation for everyone, once and for all.
Locking for get_brightness remains optional since most users don't
need it.
Also update the lcd class in a similar way.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Remove uneeded owner field from backlight_properties structure.
Nothing uses it and it is unlikely that it will ever be used. The
backlight class uses other means to ensure that nothing references
unloaded code.
Based on a patch from Dmitry Torokhov <dtor@insightbb.com>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
powerpc gets:
init/main.c: In function `do_basic_setup':
init/main.c:714: warning: implicit declaration of function `init_irq_proc'
but we cannot include linux/irq.h in generic code.
Fix it by moving the declaration into linux/interrupt.h instead.
And make sure all code that defines init_irq_proc() is including
linux/interrupt.h.
And nuke an ifdef-in-C
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/dtor/input:
Input: remove obsolete setup parameters from input drivers
Input: HIL - fix improper call to release_region()
Input: hid-lgff - treat devices as joysticks unless told otherwise
Input: HID - add support for Logitech Formula Force EX
Input: gpio-keys - switch to common GPIO API
Input: do not lock device when showing name, phys and uniq
Input: i8042 - let serio bus suspend ports
Input: psmouse - properly reset mouse on shutdown/suspend
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
Documentation/kernel-docs.txt update.
arch/cris: typo in KERN_INFO
Storage class should be before const qualifier
kernel/printk.c: comment fix
update I/O sched Kconfig help texts - CFQ is now default, not AS.
Remove duplicate listing of Cris arch from README
kbuild: more doc. cleanups
doc: make doc. for maxcpus= more visible
drivers/net/eexpress.c: remove duplicate comment
add a help text for BLK_DEV_GENERIC
correct a dead URL in the IP_MULTICAST help text
fix the BAYCOM_SER_HDX help text
fix SCSI_SCAN_ASYNC help text
trivial documentation patch for platform.txt
Fix typos concerning hierarchy
Fix comment typo "spin_lock_irqrestore".
Fix misspellings of "agressive".
drivers/scsi/a100u2w.c: trivial typo patch
Correct trivial typo in log2.h.
Remove useless FIND_FIRST_BIT() macro from cardbus.c.
...
* 'acpi' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] libata: wrong sizeof for BUFFER
[PATCH] libata: change order of _SDD/_GTF execution (resend #3)
[PATCH] libata: ACPI _SDD support
[PATCH] libata: ACPI and _GTF support
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (51 commits)
sk98lin: mark deprecated in Kconfig
Hostess SV-11 depends on INET
Fix link autonegotiation timer.
sk98lin: planned removal
B44: increase wait loop
b44: replace define
e1000: allow ethtool to see link status when down
e1000: remove obsolete custom pci_save_state code
e1000: fix shared interrupt warning message
atm: Use ARRAY_SIZE macro when appropriate
bugfixes and new hardware support for arcnet driver
pcnet32 NAPI no longer experimental
MAINTAINER
macb: Remove inappropriate spinlocks around mii calls
Convert meth to netdev_priv
sky2: v1.13
sky2: receive error handling improvements
sky2: transmit timeout
sky2: flow control negotiation for Yukon-FE
sky2: no need to reset pause bits on shutdown
...
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (117 commits)
[ARM] 4058/2: iop32x: set ->broken_parity_status on n2100 onboard r8169 ports
[ARM] 4140/1: AACI stability add ac97 timeout and retries
[ARM] 4139/1: AACI record support
[ARM] 4138/1: AACI: multiple channel support for IRQ handling
[ARM] 4211/1: Provide a defconfig for ns9xxx
[ARM] 4210/1: base for new machine type "NetSilicon NS9360"
[ARM] 4222/1: S3C2443: Remove reference to missing S3C2443_PM
[ARM] 4221/1: S3C2443: DMA support
[ARM] 4220/1: S3C24XX: DMA system initialised from sysdev
[ARM] 4219/1: S3C2443: DMA source definitions
[ARM] 4218/1: S3C2412: fix CONFIG_CPU_S3C2412_ONLY wrt to S3C2443
[ARM] 4217/1: S3C24XX: remove the dma channel show at startup
[ARM] 4090/2: avoid clash between PXA and SA1111 defines
[ARM] 4216/1: add .gitignore entries for ARM specific files
[ARM] 4214/2: S3C2410: Add Armzone QT2410
[ARM] 4215/1: s3c2410 usb device: per-platform vbus_draw
[ARM] 4213/1: S3C2410 - Update definition of ADCTSC_XY_PST
[ARM] 4098/1: ARM: rtc_lock only used with rtc_cmos
[ARM] 4137/1: Add kexec support
[ARM] 4201/1: SMP barriers pair needed for the secondary boot process
...
Fix up conflict due to typedef removal in sound/arm/aaci.h
Let serio subsystem take care of suspending the ports; concentrate
on suspending/resuming the controller itself.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Provide an audit record of the descriptor pair returned by pipe() and
socketpair(). Rewritten from the original posted to linux-audit by
John D. Ramsdell <ramsdell@mitre.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add device id for the Attansic L1 chip to pci_ids.h, then use it.
Signed-off-by: Chris Snook <csnook@redhat.com>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix the various misspellings of "agressive", as well as a couple
other things on the same lines while we're there.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Single typo correction in include/linux/log2.h.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Correct mis-spellings of "algorithm", "appear", "consistent" and
(shame, shame) "kernel".
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
* since ide_hwif_t.ide_dma_host_on is called either when drive->using_dma == 1
or when return value is discarded make it void, also drop "ide_" prefix
* make __ide_dma_host_on() void and drop "__" prefix
v2:
* while at it rename atiixp_ide_dma_host_on() to atiixp_dma_host_on()
and sgiioc4_ide_dma_host_on() to sgiioc4_dma_host_on().
[ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* since ide_hwif_t.ide_dma_{host_off,off_quietly} always return '0'
make these functions void and while at it drop "ide_" prefix
* fix comment for __ide_dma_off_quietly()
* make __ide_dma_{host_off,off_quietly,off}() void and drop "__" prefix
v2:
* while at it rename atiixp_ide_dma_host_off() to atiixp_dma_host_off(),
sgiioc4_ide_dma_{host_off,off_quietly}() to sgiioc4_dma_{host_off,off_quietly}()
and sl82c105_ide_dma_off_quietly() to sl82c105_dma_off_quietly()
[ Noticed by Sergei Shtylyov <sshtylyov@ru.mvista.com>. ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* add ide_set_dma() helper and make ide_hwif_t.ide_dma_check return
-1 when DMA needs to be disabled (== need to call ->ide_dma_off_quietly)
0 when DMA needs to be enabled (== need to call ->ide_dma_on)
1 when DMA setting shouldn't be changed
* fix IDE code to use ide_set_dma() instead if using ->ide_dma_check directly
v2:
* updated for scc_pata
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All users of ->mmio == 1 are gone so convert ->mmio into flag.
Noticed by Alan Cox.
v2:
* updated for scc_pata
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This results in smaller/faster/simpler code and allows future optimizations.
Also remove no longer needed ide[_mm]_{inl,outl}() and ide_hwif_t.{INL,OUTL}.
v2:
* updated for scc_pata
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* add ide_use_fast_pio() helper for use by host drivers
* add DMA capability and hwif->autodma checks to ide_use_dma()
- au1xxx-ide/it8213/it821x drivers didn't check for (id->capability & 1)
[ for the IT8211/2 in SMART mode this check shouldn't be made but since
in it821x_fixups() we set DMA bit explicitly:
if(strstr(id->model, "Integrated Technology Express")) {
/* In raid mode the ident block is slightly buggy
We need to set the bits so that the IDE layer knows
LBA28. LBA48 and DMA ar valid */
id->capability |= 3; /* LBA28, DMA */
we are better off using generic helper if we can ]
- ide-cris driver didn't set ->autodma
[ before the patch hwif->autodma was only checked in the chipset specific
hwif->ide_dma_check implementations, for ide-cris it is cris_dma_check()
function so there no behavior change here ]
v2:
* updated patch description (thanks to Alan Cox for the feedback)
v3:
* updated for scc_pata driver
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This field is no longer used by the core IDE code so fix ide-{disk,floppy}
drivers to keep openers count in the driver specific objects and remove
it from ide-{cd,scsi,tape} drivers (it was write-only).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
patch 2/2:
Remove clearing bmdma status from cdrom_decode_status() since ATA devices
might need it as well.
(http://lkml.org/lkml/2006/12/4/201 and http://lkml.org/lkml/2006/11/15/94)
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Adam W. Hawks" <awhawks@us.ibm.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
patch 1/2 (revised):
- Fix drive->waiting_for_dma to work with CDB-intr devices.
- Do the dma status clearing in ide_intr() and add a new
hwif->ide_dma_clear_irq for Intel ICHx controllers.
Revised per Alan, Sergei and Bart's advice.
Patch against 2.6.20-rc6. Tested ok on my ICH4 and pdc20275 adapters.
Please review/apply, thanks.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Adam W. Hawks" <awhawks@us.ibm.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
this is Joris' fixes reshuffelled and features renamed as David requested.
- acm_set_control is not mandatory, honour that
- throtteling is reset upon open
- throtteling is read consistently when processing input data
Signed-off-by: Joris van Rantwijk <jorispubl@xs4all.nl>
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The status in usb_iso_packet_descriptor should be signed, for the benefit
of someone who casts to a long or makes other benign misstep (the principle
of least surprise).
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Use __u32 rather than u32 in userspace ioctl defines.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
there's a USB mass storage device which exists in two version. One
reports the correct size and the other does not. Apart from that they
are identical and cannot be told apart. Here's a heuristic based on the
empirical finding that drives have even sizes.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
usb: descriptor structures have to be packed
Many of the Wireless USB decriptors added to usb_ch9.h don't have the
__attribute__((packed)) tag, and thus, they don't reflect the wire
size. This patch fixes that.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I added two fields to struct usb_serial_port to keep track of the
throttle state. Other usb-serial drivers typically use private data for
such things, but the generic driver can not really do that because some
of its code is also used by other drivers (which may have their own
private data needs).
As it is, I am not sure that this patch is useful in all scenarios.
It is certainly helpful for low-bandwidth devices that can hold their
data in response to throttling. But for devices that pump data in
real-time as fast as possible (webcam, A/D converter, etc), throttling
may actually cause more data loss.
From: Joris van Rantwijk <jorispubl@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CARDBUS_MEM_SIZE was increased to 64MB on 2.6.20-rc2, but larger size might
result in allocation failure for the reserving itself on some platforms
(for example typical 32bit MIPS). Make it (and CARDBUS_IO_SIZE too)
customizable by "pci=" option for such platforms.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Daniel Ritz <daniel.ritz@gmx.ch>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
debugfs: implement symbolic links
Implement a new function debugfs_create_symlink() which can be used
to create symbolic links in debugfs. This function can be useful
for people moving functionality from /proc to debugfs (e.g. the
gcov-kernel patch).
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On recent systems, calls to /sbin/modprobe are handled by udev depending
on the kind of device the kernel has discovered. This patch creates an
uevent for the kernels internal request_module(), to let udev take control
over the request, instead of forking the binary directly by the kernel.
The direct execution of /sbin/modprobe can be disabled by setting:
/sys/module/kmod/mod_request_helper (/proc/sys/kernel/modprobe)
to an empty string, the same way /proc/sys/kernel/hotplug is disabled on an
udev system.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
_GTF is an acpi method that is used to reinitialize the drive. It returns
a task file containing ata commands that are sent back to the drive to restore
it to boot up defaults.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
(cherry picked from 9c69cab24b51a89664f4c0dfaf8a436d32117624 commit)
Simplify the memory management and code a bit by representing acls with an
array instead of a linked list.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that disable_irq() defaults to delayed-disable semantics, the IRQ_DISABLED
flag is not needed anymore.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add SysRq-Q to print pending timers and other timer info.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement high resolution timers on top of the hrtimers infrastructure and the
clockevents / tick-management framework. This provides accurate timers for
all hrtimer subsystem users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With Ingo Molnar <mingo@elte.hu>
Add functions to provide dynamic ticks and high resolution timers. The code
which keeps track of jiffies and handles the long idle periods is shared
between tick based and high resolution timer based dynticks. The dyntick
functionality can be disabled on the kernel commandline. Provide also the
infrastructure to support high resolution timers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With Ingo Molnar <mingo@elte.hu>
The tick-management code is the first user of the clockevents layer. It takes
clock event devices from the clock events core and uses them to provide the
periodic tick.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Architectures register their clock event devices, in the clock events core.
Users of the clockevents core can get clock event devices for their use. The
clockevents core code provides notification mechanisms for various clock
related management events.
This allows to control the clock event devices without the architectures
having to worry about the details of function assignment. This is also a
preliminary for high resolution timers and dynamic ticks to allow the core
code to control the clock functionality without intrusive changes to the
architecture code.
[Fixes-by: Ingo Molnar <mingo@elte.hu>]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow early access to the power management timer by exposing the verified read
function and providing a helper function which checks the pmtmr_ioport
variable and returns either the pm timer readout or 0 in case the pm timer is
not available.
Create a new header file and replace also the ifdef'ed extern definition in
arch/i386/kernel/acpi/boot.c
This is a preperatory patch for the rework of the local apic timer
calibration.
No functional changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reintroduce ktimers feature "optimized away" by the ktimers review process:
remove the curr_timer pointer from the cpu-base and use the hrtimer state.
No functional changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reintroduce ktimers feature "optimized away" by the ktimers review process:
multiple hrtimer states to enable the running of hrtimers without holding the
cpu-base-lock.
(The "optimized" rbtree hack carried only 2 states worth of information and we
need 4 for high resolution timers and dynamic ticks.)
No functional changes.
Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improve kernel/hrtimers.c locking: use a per-CPU base with a lock to control
locking of all clocks belonging to a CPU. This simplifies code that needs to
lock all clocks at once. This makes life easier for high-res timers and
dyntick.
No functional changes.
[ optimization change from Andrew Morton <akpm@osdl.org> ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- hrtimers did not use the hrtimer_restart enum and relied on the implict
int representation. Fix the prototypes and the functions using the enums.
- Use seperate name spaces for the enumerations
- Convert hrtimer_restart macro to inline function
- Add comments
No functional changes.
[akpm@osdl.org: fix input driver]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a
given jiffie value. Extend the existing code to allow the extra 'now'
argument. Provide a compability function for the existing implementations to
call the function with now == jiffies. (This also solves the racyness of the
original code vs. jiffies changing during the iteration.)
No functional changes to existing users of this infrastructure.
[ remove WARN_ON() that triggered on s390, by Carsten Otte <cotte@de.ibm.com> ]
[ made new helper static, Adrian Bunk <bunk@stusta.de> ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Uninline irq_enter(). [dynticks adds more stuff to it]
No functional changes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The TSC needs to be verified against another clocksource. Instead of using
hardwired assumptions of available hardware, provide a generic verification
mechanism. The verification uses the best available clocksource and handles
the usability for high resolution timers / dynticks of the clocksource which
needs to be verified.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The clocksource code allows direct updates of the rating of a given
clocksource now. Change TSC unstable tracking to use this interface and
remove the update callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using a flag filed allows to encode more than one information into a variable.
Preparatory patch for the generic clocksource verification.
[mingo@elte.hu: convert vmitime.c to the new clocksource flag]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enqueue clocksources in rating order to make selection of the clocksource
easier. Also check the match with an user override at enqueue time.
Preparatory patch for the generic clocksource verification.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prevent timeout overflow if timer ticks are behind jiffies (due to high
softirq load or due to dyntick), by limiting the valid timeout range to
MAX_LONG/2.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are loads of fat functions hidden in jiffies.h. Uninline them. No code
changes.
[jeremy@goop.org: export fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Distangle the NTP update from HZ. This is necessary for dynamic tick enabled
kernels.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide funtions to:
- check, whether an interrupt can set the affinity
- pin the interrupt to a given cpu
Necessary for the ability to setup clocksources more flexible (e.g. use the
different HPET channels per CPU)
[akpm@osdl.org: alpha build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a flag so we can prevent the irq balancing of an interrupt. Move the
bits, so we have room for more :)
Necessary for the ability to setup clocksources more flexible (e.g. use the
different HPET channels per CPU)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add kexec support to ARM.
Improvements like commandline handling could be made but this patch gives
basic functional support. It uses the next available syscall number, 347.
Once the syscall number is known, userspace support will be
finalised/submitted to kexec-tools, various patches already exist.
Originally based on a patch by Maxim Syrchin but updated and forward
ported by various people.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is the first preparation to doing the !IORDY cases properly. Further
diffs will then add the needed logic to do it right.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The 80c wire bit is bit 13, not 14. Bit 14 is always 1 if word93 is
implemented. This increases the chance of incorrect wire detection
especially because host side cable detection is often unreliable and
we sometimes soley depend on drive side cable detection. Fix the test
and add word93 validity check.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
I just noticed the comments about even/odd ioctl command numbers in
Linux's wireless.h file are mixed up.
Signed-off-by: Ingo van Lil <inguin@gmx.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It isn't needed anymore, all of the users are gone, and all of the ctl_table
initializers have been converted to use explicit names of the fields they are
initializing.
[akpm@osdl.org: NTFS fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a parent entry into the ctl_table so you can walk the list of parents and
find the entire path to a ctl_table entry.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With this change the sysctl inodes can be cached and nothing needs to be done
when removing a sysctl table.
For a cost of 2K code we will save about 4K of static tables (when we remove
de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
about half that on a 32bit arch.
The speed feels about the same, even though we can now cache the sysctl
dentries :(
We get the core advantage that we don't need to have a 1 to 1 mapping between
ctl table entries and proc files. Making it possible to have /proc/sys vary
depending on the namespace you are in. The currently merged namespaces don't
have an issue here but the network namespace under /proc/sys/net needs to have
different directories depending on which network adapters are visible. By
simply being a cache different directories being visible depending on who you
are is trivial to implement.
[akpm@osdl.org: fix uninitialised var]
[akpm@osdl.org: fix ARM build]
[bunk@stusta.de: make things static]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current logic to walk through the list of sysctl table headers is slightly
painful and implement in a way it cannot be used by code outside sysctl.c
I am in the process of implementing a version of the sysctl proc support that
instead of using the proc generic non-caching monster, just uses the existing
sysctl data structure as backing store for building the dcache entries and for
doing directory reads. To use the existing data structures however I need a
way to get at them.
[akpm@osdl.org: warning fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.
I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.
So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are currently no users in the kernel for CTL_ANY and it only has effect
on the binary interface which is practically unused.
So this complicates sysctl lookups for no good reason so just remove it.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2 was did not have the binary number it uses under CTL_FS registered in
sysctl.h. Register it to avoid future conflicts, and change the name of the
definition to be in line with the rest of the sysctl numbers.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to have the the definition of all top level sysctl directories
registers in sysctl.h so we don't conflict by accident and cause abi problems.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for using a filesystem UUID to identify and export point in the
filehandle.
For NFSv2, this UUID is xor-ed down to 4 or 8 bytes so that it doesn't take up
too much room. For NFSv3+, we use the full 16 bytes, and possibly also a
64bit inode number for exports beneath the root of a filesystem.
When generating an fsid to return in 'stat' information, use the UUID (hashed
down to size) if it is available and a small 'fsid' was not specifically
provided.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IDs have been defined but not used by most of the I2C adapters.
By having a unique ID, clients can check for correct connection
during probe.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
* The Voodoo3 has no SMBus, it has two bit-banged busses which
already have an ID assigned (I2C_HW_B_VOO).
* The i2c-ipmi bus driver was a non-sense, it'll never be ported
to Linux 2.6.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Yani Ioannou <yani.ioannou@gmail.com>
Driver model updates for the I2C core:
- Add new suspend(), resume(), and shutdown() methods. Use them in the
standard driver model style; document them.
- Minor doc updates to highlight zero-initialized fields in drivers, and
the driver model accessors for "clientdata".
If any i2c drivers were previously using the old suspend/resume calls
in "struct driver", they were getting warning messages ... and will
now no longer work. Other than that, this patch changes no behaviors;
and it lets I2C drivers use conventional PM and shutdown support.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about
10%-20% of the time in acpi_init():
Calling initcall 0xc055ce1a: topology_init+0x0/0x2f()
Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c()
Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175()
Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17()
Calling initcall 0xc0569f99: init_bio+0x0/0xf4()
Calling initcall 0xc056b865: genhd_device_init+0x0/0x50()
Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87()
Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee()
It's a hard hang that not even an NMI could punch through! Frustratingly,
adding printks or function tracing to the ACPI code made the hangs go away
...
After some time an additional detail emerged: disabling the NMI watchdog
made these occasional hangs go away.
So i spent the better part of today trying to debug this and trying out
various theories when i finally found the likely reason for the hang: if
acpi_ns_initialize_devices() executes an _INI AML method and an NMI
happens to hit that AML execution in the wrong moment, the machine would
hang. (my theory is that this must be some sort of chipset setup method
doing stores to chipset mmio registers?)
Unfortunately given the characteristics of the hang it was sheer
impossible to figure out which of the numerous AML methods is impacted
by this problem.
As a workaround i wrote an interface to disable chipset-based NMIs while
executing _INI sections - and indeed this fixed the hang. I did a
boot-loop of 100 separate reboots and none hung - while without the patch
it would hang every 5-10 attempts. Out of caution i did not touch the
nmi_watchdog=2 case (it's not related to the chipset anyway and didnt
hang).
I implemented this for both x86_64 and i686, tested the i686 laptop both
with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and
tested an Athlon64 box with the 64-bit kernel as well. Everything builds
and works with the patch applied.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ARCH_HAVE_XTIME_LOCK is used by x86_64 arch . This arch needs to place a
read only copy of xtime_lock into vsyscall page. This read only copy is
named __xtime_lock, and xtime_lock is defined in
arch/x86_64/kernel/vmlinux.lds.S as an alias. So the declaration of
xtime_lock in kernel/timer.c was guarded by ARCH_HAVE_XTIME_LOCK define,
defined to true on x86_64.
We can get same result with _attribute__((weak)) in the declaration. linker
should do the job.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
acpi_table_parse_madt_family() is also used to parse SRAT entries.
So re-name it to acpi_table_parse_entries(), and re-name the
madt-specific variables within it accordingly.
cosmetic only.
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_madt_entry_handler() is also used for the SRAT,
so re-name it acpi_table_entry_handler().
cosmetic only.
Signed-off-by: Len Brown <len.brown@intel.com>
RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure. Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.
Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout. The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.
Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.
See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
CONNTRACK_STAT_INC assumes rcu_read_lock in nf_hook_slow disables
preemption as well, making it legal to use __get_cpu_var without
disabling preemption manually. The assumption is not correct anymore
with preemptable RCU, additionally we need to protect against softirqs
when not holding ip_conntrack_lock.
Add CONNTRACK_STAT_INC_ATOMIC macro, which disables local softirqs,
and use where necessary.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
- rename nf_logging to nf_loggers since its an array of registered loggers
- rename nf_log_unregister_logger() to nf_log_unregister() to make it
symetrical to nf_log_register() and convert all users
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the only user of nf_log_unregister_pf (nfnetlink_log) doesn't
check the return value, change it to void and bail out silently when
a non-existant address family is supplied.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is inspired by Arjan's "Patch series to mark struct
file_operations and struct inode_operations const".
Compile tested with gcc & sparse.
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fbdev modedb: make more input and output pointer parameters const
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: James Simmons <jsimmons@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a proper prototype for tosh_smm() to include/linux/toshiba.h
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: James Simmons <jsimmons@infradead.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a driver for S3 Trio / S3 Virge. Driver is tested with most versions
of S3 Trio and with S3 Virge/DX, on i386.
(akpm: We kind-of have support for this hardware already, but...
virgefb.c
- amiga/zorro specific,
- broken (according to Kconfig),
- uses obsolete/nonexistent interface (struct display_switch)
- recent Adrian Bunk's patch removes this driver
S3triofb.c
- ppc/openfirmware specific
- minimal functionality
- broken (according to Kconfig),
- uses obsolete/nonexistent interface (struct display_switch)
)
Signed-off-by: Ondrej Zajicek <santiago@crfreenet.org>
Cc: James Simmons <jsimmons@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following patchset allows a host with running virtual machines to be
suspended and, on at least a subset of the machines tested, resumed. Note
that this is orthogonal to suspending and resuming an individual guest to a
file.
A side effect of implementing suspend/resume is that cpu hotplug is now
supported. This should please the owners of big iron.
This patch:
KVM wants the cpu hotplug notifications, both for cpu hotplug itself, but more
commonly for host suspend/resume.
In order to avoid extensive #ifdefs, provide stubs when CONFIG_CPU_HOTPLUG is
not defined.
In all, we have four cases:
- UP: register and unregister stubbed out
- SMP+hotplug: full register and unregister
- SMP, no hotplug, core: register as __init, unregister stubbed
(cpus are brought up during core initialization)
- SMP, no hotplug, module: register and unregister stubbed out
(cpus cannot be brought up during module lifetime)
Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We report the value of cr8 to userspace on an exit. Also let userspace change
cr8 when we re-enter the guest. The lets 64-bit guest code maintain the tpr
correctly.
Thanks for Yaniv Kamay for the idea.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch adds ability to work with 64bit metadata, this made by replacing work
with 32bit pointers by inline functions.
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds into write inode path function to write UFS2 inode, and
modifys allocate inode path to allocate and init additional inode chunks.
Also some cleanups:
- remove not used parameters in some functions
- remove i_gen field from ufs_inode_info structure,
there is i_generation in inode structure with same purposes.
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current implementation stores a static command-line buffer allocated to
COMMAND_LINE_SIZE size. Most architectures stores two copies of this buffer,
one for future reference and one for parameter parsing.
Current kernel command-line size for most architecture is much too small for
module parameters, video settings, initramfs paramters and much more. The
problem is that setting COMMAND_LINE_SIZE to a grater value, allocates static
buffers.
In order to allow a greater command-line size, these buffers should be
dynamically allocated or marked as init disposable buffers, so unused memory
can be released.
This patch renames the static saved_command_line variable into
boot_command_line adding __initdata attribute, so that it can be disposed
after initialization. This rename is required so applications that use
saved_command_line will not be affected by this change.
It reintroduces saved_command_line as dynamically allocated buffer to match
the data in boot_command_line.
It also mark secondary command-line buffer as __initdata, and copies it to
dynamically allocated static_command_line buffer components may hold reference
to it after initialization.
This patch is for linux-2.6.20-rc4-mm1 and is divided to target each
architecture. I could not check this in any architecture so please forgive me
if I got it wrong.
The per-architecture modification is very simple, use boot_command_line in
place of saved_command_line. The common code is the change into dynamic
command-line.
This patch:
1. Rename saved_command_line into boot_command_line, mark as init
disposable.
2. Add dynamic allocated saved_command_line.
3. Add dynamic allocated static_command_line.
4. During startup copy: boot_command_line into saved_command_line. arch
command_line into static_command_line.
5. Parse static_command_line and not arch command_line, so arch
command_line may be freed.
Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the transport code for public key functionality in eCryptfs. It
manages encryption/decryption request queues with a transport mechanism.
Currently, netlink is the only implemented transport.
Each inode has a unique File Encryption Key (FEK). Under passphrase, a File
Encryption Key Encryption Key (FEKEK) is generated from a salt/passphrase
combo on mount. This FEKEK encrypts each FEK and writes it into the header of
each file using the packet format specified in RFC 2440. This is all
symmetric key encryption, so it can all be done via the kernel crypto API.
These new patches introduce public key encryption of the FEK. There is no
asymmetric key encryption support in the kernel crypto API, so eCryptfs pushes
the FEK encryption and decryption out to a userspace daemon. After
considering our requirements and determining the complexity of using various
transport mechanisms, we settled on netlink for this communication.
eCryptfs stores authentication tokens into the kernel keyring. These tokens
correlate with individual keys. For passphrase mode of operation, the
authentication token contains the symmetric FEKEK. For public key, the
authentication token contains a PKI type and an opaque data blob managed by
individual PKI modules in userspace.
Each user who opens a file under an eCryptfs partition mounted in public key
mode must be running a daemon. That daemon has the user's credentials and has
access to all of the keys to which the user should have access. The daemon,
when started, initializes the pluggable PKI modules available on the system
and registers itself with the eCryptfs kernel module. Userspace utilities
register public key authentication tokens into the user session keyring.
These authentication tokens correlate key signatures with PKI modules and PKI
blobs. The PKI blobs contain PKI-specific information necessary for the PKI
module to carry out asymmetric key encryption and decryption.
When the eCryptfs module parses the header of an existing file and finds a Tag
1 (Public Key) packet (see RFC 2440), it reads in the public key identifier
(signature). The asymmetrically encrypted FEK is in the Tag 1 packet;
eCryptfs puts together a decrypt request packet containing the signature and
the encrypted FEK, then it passes it to the daemon registered for the
current->euid via a netlink unicast to the PID of the daemon, which was
registered at the time the daemon was started by the user.
The daemon actually just makes calls to libecryptfs, which implements request
packet parsing and manages PKI modules. libecryptfs grabs the public key
authentication token for the given signature from the user session keyring.
This auth tok tells libecryptfs which PKI module should receive the request.
libecryptfs then makes a decrypt() call to the PKI module, and it passes along
the PKI block from the auth tok. The PKI uses the blob to figure out how it
should decrypt the data passed to it; it performs the decryption and passes
the decrypted data back to libecryptfs. libecryptfs then puts together a
reply packet with the decrypted FEK and passes that back to the eCryptfs
module.
The eCryptfs module manages these request callouts to userspace code via
message context structs. The module maintains an array of message context
structs and places the elements of the array on two lists: a free and an
allocated list. When eCryptfs wants to make a request, it moves a msg ctx
from the free list to the allocated list, sets its state to pending, and fires
off the message to the user's registered daemon.
When eCryptfs receives a netlink message (via the callback), it correlates the
msg ctx struct in the alloc list with the data in the message itself. The
msg->index contains the offset of the array of msg ctx structs. It verifies
that the registered daemon PID is the same as the PID of the process that sent
the message. It also validates a sequence number between the received packet
and the msg ctx. Then, it copies the contents of the message (the reply
packet) into the msg ctx struct, sets the state in the msg ctx to done, and
wakes up the process that was sleeping while waiting for the reply.
The sleeping process was whatever was performing the sys_open(). This process
originally called ecryptfs_send_message(); it is now in
ecryptfs_wait_for_response(). When it wakes up and sees that the msg ctx
state was set to done, it returns a pointer to the message contents (the reply
packet) and returns. If all went well, this packet contains the decrypted
FEK, which is then copied into the crypt_stat struct, and life continues as
normal.
The case for creation of a new file is very similar, only instead of a decrypt
request, eCryptfs sends out an encrypt request.
> - We have a great clod of key mangement code in-kernel. Why is that
> not suitable (or growable) for public key management?
eCryptfs uses Howells' keyring to store persistent key data and PKI state
information. It defers public key cryptographic transformations to userspace
code. The userspace data manipulation request really is orthogonal to key
management in and of itself. What eCryptfs basically needs is a secure way to
communicate with a particular daemon for a particular task doing a syscall,
based on the UID. Nothing running under another UID should be able to access
that channel of communication.
> - Is it appropriate that new infrastructure for public key
> management be private to a particular fs?
The messaging.c file contains a lot of code that, perhaps, could be extracted
into a separate kernel service. In essence, this would be a sort of
request/reply mechanism that would involve a userspace daemon. I am not aware
of anything that does quite what eCryptfs does, so I was not aware of any
existing tools to do just what we wanted.
> What happens if one of these daemons exits without sending a quit
> message?
There is a stale uid<->pid association in the hash table for that user. When
the user registers a new daemon, eCryptfs cleans up the old association and
generates a new one. See ecryptfs_process_helo().
> - _why_ does it use netlink?
Netlink provides the transport mechanism that would minimize the complexity of
the implementation, given that we can have multiple daemons (one per user). I
explored the possibility of using relayfs, but that would involve having to
introduce control channels and a protocol for creating and tearing down
channels for the daemons. We do not have to worry about any of that with
netlink.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NFS_SUPER_MAGIC is already defined in include/linux/magic.h
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for IPv6 addresses in the RPC server's UDP receive path.
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The rq_daddr field must support larger addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Expand the rq_addr field to allow it to contain larger addresses.
Specifically, we replace a 'sockaddr_in' with a 'sockaddr_storage', then
everywhere the 'sockaddr_in' was referenced, we use instead an accessor
function (svc_addr_in) which safely casts the _storage to _in.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sockaddr_storage will allow us to store arbitrary socket addresses in the
svc_deferred_req struct.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are loads of places where the RPC server assumes that the rq_addr fields
contains an IPv4 address. Top among these are error and debugging messages
that display the server's IP address.
Let's refactor the address printing into a separate function that's smart
enough to figure out the difference between IPv4 and IPv6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The remote peer's address won't change after the socket has been accepted. We
don't need to call ->getname on every incoming request.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sometimes we need to create an RPC service but not register it with the local
portmapper. NFSv4 delegation callback, for example.
Change the svc_makesock() API to allow optionally creating temporary or
permanent sockets, optionally registering with the local portmapper, and make
it return the ephemeral port of the new socket.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently in the RPC server, registering with the local portmapper and
creating "permanent" sockets are tied together. Expand the internal APIs to
allow these two socket characteristics to be separately specified.
This will be externalized in the next patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that I have changed all of the in-tree users remove the old version of
these functions. This should make it clear to any out of tree users that they
should be using kill_pgrp kill_pgrp_info or __kill_pgrp_info instead.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that I have changed all of the users remove the old version of these
functions. This should be a clear hint to any out of tree users that they
should use do_each_pid_task and while_each_pid_task for new code.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Of kernel subsystems that work with pids the tty layer is probably the largest
consumer. But it has the nice virtue that the assiation with a session only
lasts until the session leader exits. Which means that no reference counting
is required. So using struct pid winds up being a simple optimization to
avoid hash table lookups.
In the long term the use of pid_nr also ensures that when we have multiple pid
spaces mixed everything will work correctly.
Signed-off-by: Eric W. Biederman <eric@maxwell.lnxi.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every call to is_orphaned_pgrp passed in process_group(current) which is racy
with respect to another thread changing our process group. It didn't bite us
because we were dealing with integers and the worse we would get would be a
stale answer.
In switching the checks to use struct pid to be a little more efficient and
prepare the way for pid namespaces this race became apparent.
So I simplified the calls to the more specialized is_current_pgrp_orphaned so
I didn't have to worry about making logic changes to avoid the race.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To properly implement a pid namespace I need to deal exclusively in terms of
struct pid, because pid_t values become ambiguous.
To this end session_of_pgrp is transformed to take and return a struct pid
pointer. To avoid the need to worry about reference counting I now require my
caller to hold the appropriate locks. Leaving callers repsonsible for
increasing the reference count if they need access to the result outside of
the locks.
Since session_of_pgrp currently only has one caller and that caller simply
uses only test the result for equality with another process group, the locking
change means I don't actually have to acquire the tasklist_lock at all.
tiocspgrp is also modified to take and release the lock. The logic there is a
little more complicated but nothing I won't need when I convert pgrp of a tty
to a struct pid pointer.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The aim of this patch set is to start wrapping up the struct pid conversions.
As such this patchset culminates with the removal of kill_pg, kill_pg_info,
__kill_pg_info, do_each_task_pid, and while_each_task_pid.
kill_proc, daemonize, and kernel_thread are still in my sights but there is
still work to get to them.
The first three are basic cleanups around disassociate_ctty, while working on
converting it I found several issues. tty_old_pgrp can be a tricky concept to
wrap your head around.
1 tty: Make __proc_set_tty static.
2 tty: Clarify disassociate_ctty
3 tty: Fix the locking for signal->session in disassociate_ctty
These just stop using the old helper functions.
4 signal: Use kill_pgrp not kill_pg in the sunos compatibility code.
5 signal: Rewrite kill_something_info so it uses newer helpers.
Then the grind to convert the tty layer and all of it's helper functions to
struct pid.
6 pid: Make session_of_pgrp use struct pid instead of pid_t.
7 pid: Use struct pid for talking about process groups in exit.c
8 pid: Replace is_orphaned_pgrp with is_current_pgrp_orphaned
9 tty: Update the tty layer to work with struct pid.
A final helper function update.
10 pid: Replace do/while_each_task_pid with do/while_each_pid_task
And the removal of the functions that are now unused.
11 pid: Remove now unused do_each_task_pid and while_each_task_pid
12 pid: Remove the now unused kill_pg kill_pg_info and __kill_pg_info
All of these should be fairly simple and to the point.
This patch:
Currently all users of __proc_set_tty are in tty_io.c so make the function
static.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This morning I needed to read a Minix V3 filesystem, but unfortunately my
2.6.19 did not support that, and neither did the downloaded 2.6.20rc4.
Fortunately, google told me that Daniel Aragones had already done the work,
patch found at http://www.terra.es/personal2/danarag/
Unfortunaly, looking at the patch was painful to my eyes, so I polished it
a bit before applying. The resulting kernel boots, and reads the
filesystem it needed to read.
Signed-off-by: Daniel Aragones <danarag@gmail.com>
Signed-off-by: Andries Brouwer <aeb@cwi.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is adds a simple SPI EEPROM driver, providing access to the EEPROM
through sysfs much like the I2C "eeprom" driver ... except this driver
supports write access, and multiple EEPROM sizes.
From: "Tuppa, Walter" <walter.tuppa@siemens.com>
Since I have EEPROMs on SPI with different address sizing, I made some
changes to your at25.c to support them. Works perfectly. (Also includes a
small bugfix for the "what size address" test.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Walter Tuppa <walter.tuppa@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This clarifies some aspects of the SPI programming interface, based on
feedback from Hans-Peter Nilsson. The in-memory representation of words is
right-aligned, so for example a twelve bit word is stored using sixteen bits
with four undefined bits in the MSB. And controller drivers must reject
protocol tweaking modes they do not support.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I'd like to assign NULL to kfree()d members of a structure. I can't do
that without ugly casting (see the PXA patch) when the structure pointed to
is const-qualified. I don't really see a reason why the cleanup method
isn't allowed to alter the object it should clean up. :-)
No, I didn't test the PXA patch, but I verified that the NULL-assignment
doesn't stop me from doing rmmod/insmodding my own spi_bitbang-based
driver.
Signed-off-by: Hans-Peter Nilsson <hp@axis.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the spi_unregister_driver() code fit in with the rest of the header
file, and only do the action if the driver passed is non-NULL.
This also makes the code a line smaller.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add wrappers for getting and setting the driver data using spi_device
instead of using dev_{get|set}_drvdata with &spi->dev, to mirror the
platform_{get|set}_drvdata.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the line discipline based driver for the Gigaset M101
wireless RS232 adapter. It also improves the documentation a bit.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Was ufs_fs.h purposefully not exported to userspace or did it just slip
through the cracks ? assuming the latter scenario, the attached patch touches
up the relationship between ufs_fs.h and its sub headers (like ufs_fs_sb.h) so
that we can export it ... the silo bootloader takes advantage of this header
for example.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3117df0453 causes this:
In file included from arch/s390/kernel/early.c:13:
include/linux/lockdep.h:300: warning:
"struct task_struct" declared inside parameter list
include/linux/lockdep.h:300:
warning: its scope is only this definition or
declaration, which is probably not what you want
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While compiling my code with -Wconversion using gcc-trunk, I always get a
bunch of warrning from headers, here is fix for them:
__getblk is alawys called with unsigned argument,
but it takes signed, the same story with __bread,__breadahead and so on.
Signed-off-by: Tomasz Kvarsin
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove_dquot_ref can move to dqout.c instead of beeing in inode.c under
#ifdef CONFIG_QUOTA. Also clean the resulting code up a tiny little bit by
testing sb->dq_op earlier - it's constant over a filesystems lifetime.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since quota.h declares a R/W semaphore, it should include rwsem.h
explicitly.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove labs() since it is not used/needed.
Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, XFS uses BH_PrivateStart for flagging unwritten extent state in a
bufferhead. Recently, I found the long standing mmap/unwritten extent
conversion bug, and it was to do with partial page invalidation not clearing
the unwritten flag from bufferheads attached to the page but beyond EOF. See
here for a full explaination:
http://oss.sgi.com/archives/xfs/2006-12/msg00196.html
The solution I have checked into the XFS dev tree involves duplicating code
from block_invalidatepage to clear the unwritten flag from the bufferhead(s),
and then calling block_invalidatepage() to do the rest.
Christoph suggested that this would be better solved by pushing the unwritten
flag into the common buffer head flags and just adding the call to
discard_buffer():
http://oss.sgi.com/archives/xfs/2006-12/msg00239.html
The following patch makes BH_Unwritten a first class citizen.
Signed-off-by: Dave Chinner <dgc@sgi.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a NOPFN_REFAULT return code for vm_ops->nopfn() equivalent to
NOPAGE_REFAULT for vmops->nopage() indicating that the handler requests a
re-execution of the faulting instruction
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a vm_insert_pfn helper, so that ->fault handlers can have nopfn
functionality by installing their own pte and returning NULL.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Split the implementation-agnostic stuff in separate files.
* Make sure that targets using non-default request_irq() pull
kernel/irq/devres.o
* Introduce new symbols (HAS_IOPORT and HAS_IOMEM) defaulting to positive;
allow architectures to turn them off (we needed these symbols anyway for
dependencies of quite a few drivers).
* protect the ioport-related parts of lib/devres.o with CONFIG_HAS_IOPORT.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains two fixes for RapisIO enumeration logic:
1. Fix enumeration in configurations with multiple switches. The patch adds:
a. Enumeration of an empty switch. Empty switch is a switch that
does not have any endpoint devices attached to it (except host device
or previous switch in a chain). New code assigns a phony destination
ID associated with the switch and sets up corresponding routes.
b. Adds a second pass to the enumeration to setup routes to
devices discovered after switch was scanned.
2. Fix enumeration failure when riohdid parameter has non-zero value.
Current version fails to setup response path to the host when it has
destination ID other that 0.
Signed-off-by: Alexandre Bounine <alexandreb@tundra.com>
Acked-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They are fat: 4x8 bytes in task_struct.
They are uncoditionally updated in every fork, read, write and sendfile.
They are used only if you have some "extended acct fields feature".
And please, please, please, read(2) knows about bytes, not characters,
why it is called "rchar"?
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SMP systems without premption and spinlock debugging enabled use unlock
macros that don't tell sparse that the lock is being released. Add sparse
annotations in this case.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace a small number of expressions with a call to the "container_of()"
macro.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- #ifdef guard this header for multiple inclusion
- adjust the #include's to what is actually required by this header
- remove an unneeded #ifdef
- #endif comments
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- reduce the userspace visible part
- fix the in-kernel compilation
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extend the set of "__attribute__" shortcut macros, and remove identical
(and now superfluous) definitions from a couple of source files.
based on a page at robert love's blog:
http://rlove.org/log/2005102601
extend the set of shortcut macros defined in compiler-gcc.h with the
following:
#define __packed __attribute__((packed))
#define __weak __attribute__((weak))
#define __naked __attribute__((naked))
#define __noreturn __attribute__((noreturn))
#define __pure __attribute__((pure))
#define __aligned(x) __attribute__((aligned(x)))
#define __printf(a,b) __attribute__((format(printf,a,b)))
Once these are in place, it's up to subsystem maintainers to decide if they
want to take advantage of them. there is already a strong precedent for
using shortcuts like this in the source tree.
The ones that might give people pause are "__aligned" and "__printf", but
shortcuts for both of those are already in use, and in some ways very
confusingly. note the two very different definitions for a macro named
"ALIGNED":
drivers/net/sgiseeq.c:#define ALIGNED(x) ((((unsigned long)(x)) + 0xf) & ~(0xf))
drivers/scsi/ultrastor.c:#define ALIGNED(x) __attribute__((aligned(x)))
also:
include/acpi/platform/acgcc.h:
#define ACPI_PRINTF_LIKE(c) __attribute__ ((__format__ (__printf__, c, c+1)))
Given the precedent, then, it seems logical to at least standardize on a
consistent set of these macros.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- no longer a userspace header
- add #include <linux/types.h> for in-kernel compilation
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for the CPCI-ASIO4 quad port CompactPCI UART board from
electronic system design gmbh.
Signed-off-by: Matthias Fuchs <matthias.fuchs@esd-electronics.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Get the Perle quad-modem PCI card (PCI-RAS4) detected by serial driver. It
may also get the PCI-RAS8 running, but can't guarantee as I didn't had one for
testing.
Signed-off-by: Thomas Hoehn <thomas.hoehn@avocent.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an "RTC framework" driver for the "CMOS" RTCs which are standard on
PCs and some other platforms. That's MC146818 compatible silicon.
Advantages of this vs. drivers/char/rtc.c (use one _or_ the other, only
one will be able to claim the RTC irq) include:
- This leverages both the new RTC framework and the driver model; both
PNPACPI and platform device modes are supported. (A separate patch
creates a platform device on PCs where PNPACPI isn't configured.)
- It supports common extensions like longer alarms. (A separate patch
exports that information from ACPI through platform_data.)
- Likewise, system wakeup events use "real driver model support", with
policy control via sysfs "wakeup" attributes and and using normal rtc
ioctls to manage wakeup. (Patch in the works. The ACPI hooks are
known; /proc/acpi/alarm can vanish. Making it work with EFI will
be a minor challenge to someone with e.g. a MiniMac.)
It's not yet been tested on non-x86 systems, without ACPI, or with HPET.
And the RTC framework will surely have teething pains on "mainstream"
PC-based systems (though must embedded Linux systems use it heavily), not
limited to sorting out the "/dev/rtc0" issue (udev easily tweaked). Also,
the ALSA rtctimer code doesn't use the new RTC API.
Otherwise, this should be a no-known-regressions replacement for the old
drivers/char/rtc.c driver, and should help the non-embedded distros (and
the new timekeeping code) start to switch to the framework.
Note also that any systems using "rtc-m48t86" are candidates to switch over
to this more functional driver; the platform data is different, and the way
bytes are read is different, but otherwise those chips should be compatible.
[akpm@osdl.org: sparc32 fix]
[akpm@osdl.org: sparc64 fix]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Woody Suwalski <woodys@xandros.com>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed that almost all architectures implemented exactly the same
sys32_sysinfo... except parisc, where a bug was to be found in handling of
the uptime. So let's remove a whole whack of code for fun and profit.
Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it
would be the best tested.
This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but
instead extracting out the common code from sys_sysinfo.
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A variety of (mostly) innocuous fixes to the embedded kernel-doc content in
source files, including:
* make multi-line initial descriptions single line
* denote some function names, constants and structs as such
* change erroneous opening '/*' to '/**' in a few places
* reword some text for clarity
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arguments are really const. Mark them const to allow these functions
being called from places where the arguments are const without getting
useless compiler warnings.
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch to identify AIX disks and ignore them has caused at least one
machine to fail to find the root partition on 2.6.19. The patch is:
http://lkml.org/lkml/2006/7/31/117
The problem is some disk formatters do not blow away the first 4 bytes
of the disk. If the disk we are installing to used to have AIX on it,
then the first 4 bytes will still have IBMA in EBCDIC.
The install in question was debian etch. Im not sure what the best fix
is, perhaps the AIX detection code could check more than the first 4
bytes.
The whole partition info for primary partitions is in this block:
dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be ))
All other data do not matter, beside the 0x55aa marker at the end of the
first block.
Signed-off-by: Olaf Hering <olh@suse.de>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is in support of the IPMI driver. I have tested this with the
IPMI driver changes coming in the next patch.
Add a list_splice_init_rcu() function to splice an RCU-protected list into
another list. This takes the sync function as an argument, so one would do
something like:
INIT_LIST_HEAD(&list);
list_splice_init_rcu(&source, &dest, synchronize_rcu);
The idea being to keep the RCU API proliferation down to a dull roar.
[akpm@osdl.org: build fix]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 32bit and 64bit PARISC Linux kernels suffers from the problem, that the
gettimeofday() call sometimes returns non-monotonic times.
The easiest way to fix this, is to drop the PARISC-specific implementation
and switch over to the generic TIME_INTERPOLATION framework.
But in order to make it even compile on 32bit PARISC, the patch below which
touches the generic Linux code, is mandatory.
More information and the full patch with the parisc-specific changes is included in this thread: http://lists.parisc-linux.org/pipermail/parisc-linux/2006-December/031003.html
As far as I could see, this patch does not change anything for the existing
architectures which use this framework (IA64 and SPARC64), since "cycles_t"
is defined there as unsigned 64bit-integer anyway (which then makes this
patch a no-change for them).
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert all calls to invalidate_inode_pages() into open-coded calls to
invalidate_mapping_pages().
Leave the invalidate_inode_pages() wrapper in place for now, marked as
deprecated.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It makes no sense to me to export invalidate_inode_pages() and not
invalidate_mapping_pages() and I actually need invalidate_mapping_pages()
because of its range specification ability...
akpm: also remove the export of invalidate_inode_pages() by making it an
inlined wrapper.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow taint flags to be set from userspace by writing to
/proc/sys/kernel/tainted, and add a new taint flag, TAINT_USER, to be used
when userspace has potentially done something dangerous that might
compromise the kernel. This will allow support personnel to ask further
questions about what may have caused the user taint flag to have been set.
For example, they might examine the logs of the realtime JVM to see if the
Java program has used the really silly, stupid, dangerous, and
completely-non-portable direct access to physical memory feature which MUST
be implemented according to the Real-Time Specification for Java (RTSJ).
Sigh. What were those silly people at Sun thinking?
[akpm@osdl.org: build fix]
[bunk@stusta.de: cleanup]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PNP framework doesn't export "pnp_bus_type", which is an unfortunate
exception to the policy followed by pretty much every other bus. I noticed
this when I had to find a device in order to provide its platform_data.
Note that per advice from Arjan, the "export" scope has been been minimized to
avoid the hundred-plus bytes needed to support access from modules. In this
case, the symbol is only needed by statically linked kernel code that lives
outside the drivers/pnp directory.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mathieu originally needed to add this for tracing Xen, but it's something
that's needed for any application that can be tracing while cpus are added.
unplug isn't supported by this patch. The thought was that at minumum a new
buffer needs to be added when a cpu comes up, but it wasn't worth the effort
to remove buffers on cpu down since they'd be freed soon anyway when the
channel was closed.
[zanussi@us.ibm.com: avoid lock_cpu_hotplug deadlock]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Tom Zanussi <zanussi@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add proper prototypes for two functions in drivers/char/vc_screen.c
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Userspace should be worrying about userspace, so having the socket.h
and stat.h pollute the namespace in the non-glibc case is wrong and
pretty much prevents any other libc from utilizing these headers
sanely unless they set up the __GLIBC__ define themselves (which
sucks)
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The line discipline numbers N_* are currently defined for each architecture
individually, but (except for a seeming mistake) identically, in
asm/termios.h. There is no obvious reason why these numbers should be
architecture specific, nor any apparent relationship with the termios
structure. The total number of these, NR_LDISCS, is defined in linux/tty.h
anyway. So I propose the following patch which moves the definitions of
the individual line disciplines to linux/tty.h too.
Three of these numbers (N_MASC, N_PROFIBUS_FDL, and N_SMSBLOCK) are unused
in the current kernel, but the patch still keeps the complete set in case
there are plans to use them yet.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Generate locking graph information into /proc/lockdep, for lock hierarchy
documentation and visualization purposes.
sample output:
c089fd5c OPS: 138 FD: 14 BD: 1 --..: &tty->termios_mutex
-> [c07a3430] tty_ldisc_lock
-> [c07a37f0] &port_lock_key
-> [c07afdc0] &rq->rq_lock_key#2
The lock classes listed are all the first-hop lock dependencies that
lockdep has seen so far.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I added IS_NOATIME(inode) macro definition in include/linux/fs.h, true if
the inode superblock is marked readonly or noatime.
This new macro is then used in touch_atime() instead of separatly testing
MS_RDONLY and MS_NOATIME
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed cache misses in touch_atime() that can be avoided if we keep
mnt_count & mnt_expiry_mark in a different cache line than mnt_flags
(mostly read)
mnt_count & mnt_expiry_mark are modified each time a file is opened/closed
in a file system.
touch_atime() is called each time a file is read, and generally needs to
read mnt_flags.
Other fields of struct vfsmount are mostly read so I chose to move
mnt_count & mnt_expiry_mark at the end of struct vfsmount. And adding a
comment so that nobody tries to re-arrange fields to fill the holes :)
On 64bits platforms, the new offsetof(mnt_count) is 0xC0
On 32bits platforms, it is 0x60, so I didnot add a
____cacheline_aligned_in_smp because it would have a too big impact on the
size of this object (in particular if CONFIG_X86_L1_CACHE_SHIFT=7)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Prevent things like this:
block/ll_rw_blk.c: In function 'submit_bio':
block/ll_rw_blk.c:3222: warning: unused variable 'count'
inlines are very, very preferable to macros.
- remove unused get_cpu_vm_events() macro
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/byteorder/pdp_endian.h is completely unused, and the comment in
the file itself states that it's both untested and only a proof-of-concept.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This does several things.
- It moves looking up of the current foreground console into process
context where we can safely take the semaphore that protects this
operation.
- It uses the new flavor of work queue processing.
- This generates a factor of do_SAK, __do_SAK that runs immediately.
- This calls __do_SAK with the console semaphore held ensuring nothing
else happens to the console while we process the SAK operation.
- With the console SAK processing moved into process context this
patch removes the xchg operations that I used to attempt to attomically
update struct pid, because of the strange locking used in the SAK processing.
With SAK using the normal console semaphore nothing special is needed.
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for auxiliary displays, the ks0108 LCD controller, the
cfag12864b LCD and adds a framebuffer device: cfag12864bfb.
- Add a "auxdisplay/" folder in "drivers/" for auxiliary display
drivers.
- Add support for the ks0108 LCD Controller as a device driver. (uses
parport interface)
- Add support for the cfag12864b LCD as a device driver. (uses ks0108
LCD Controller driver)
- Add a framebuffer device called cfag12864bfb. (uses cfag12864b LCD
driver)
- Add the usual Documentation, includes, Makefiles, Kconfigs,
MAINTAINERS, CREDITS...
- Miguel Ojeda will maintain all the stuff above.
[rdunlap@xenotime.net: workqueue fixups]
[akpm@osdl.org: kconfig fix]
Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Paulo Marques <pmarques@grupopie.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shmem backed file does not have page writeback, nor it participates in
backing device's dirty or writeback accounting. So using generic
__set_page_dirty_nobuffers() for its .set_page_dirty aops method is a bit
overkill. It unnecessarily prolongs shm unmap latency.
For example, on a densely populated large shm segment (sevearl GBs), the
unmapping operation becomes painfully long. Because at unmap, kernel
transfers dirty bit in PTE into page struct and to the radix tree tag. The
operation of tagging the radix tree is particularly expensive because it
has to traverse the tree from the root to the leaf node on every dirty
page. What's bothering is that radix tree tag is used for page write back.
However, shmem is memory backed and there is no page write back for such
file system. And in the end, we spend all that time tagging radix tree and
none of that fancy tagging will be used. So let's simplify it by introduce
a new aops __set_page_dirty_no_writeback and this will speed up shm unmap.
Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently if we have a non-zero ZONES_SHIFT we assume we are able to rely
on that as the bottom edge of the ZONEID, if not then we use the
NODES_PGOFF as the right end of either NODES _or_ SECTION. This latter is
more luck than judgement and would be incorrect if we reordered the
SECTION,NODE,ZONE options in the fields space.
Really what we want is the lower of the right hand end of the two fields we
are using (either NODE,ZONE or SECTION,ZONE). Codify that explicitly. As
always allow for there being no bits in either of the fields, such as might
be valid in a non-numa machine with only a zone NORMAL.
I have checked that the compiler is still able to constant fold all of this
away correctly.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make ZONE_DMA optional in core code.
- ifdef all code for ZONE_DMA and related definitions following the example
for ZONE_DMA32 and ZONE_HIGHMEM.
- Without ZONE_DMA, ZONE_HIGHMEM and ZONE_DMA32 we get to a ZONES_SHIFT of
0.
- Modify the VM statistics to work correctly without a DMA zone.
- Modify slab to not create DMA slabs if there is no ZONE_DMA.
[akpm@osdl.org: cleanup]
[jdike@addtoit.com: build fix]
[apw@shadowen.org: Simplify calculation of the number of bits we need for ZONES_SHIFT]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Values are readily available via ZVC per node and global sums.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Function is unnecessary now. We can use the summing features of the ZVCs to
get the values we need.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nr_free_pages is now a simple access to a global variable. Make it a macro
instead of a function.
The nr_free_pages now requires vmstat.h to be included. There is one
occurrence in power management where we need to add the include. Directly
refrer to global_page_state() there to clarify why the #include was added.
[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The global and per zone counter sums are in arrays of longs. Reorder the ZVCs
so that the most frequently used ZVCs are put into the same cacheline. That
way calculations of the global, node and per zone vm state touches only a
single cacheline. This is mostly important for 64 bit systems were one 128
byte cacheline takes only 8 longs.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is again simplifies some of the VM counter calculations through the use
of the ZVC consolidated counters.
[michal.k.k.piotrowski@gmail.com: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The determination of the dirty ratio to determine writeback behavior is
currently based on the number of total pages on the system.
However, not all pages in the system may be dirtied. Thus the ratio is always
too low and can never reach 100%. The ratio may be particularly skewed if
large hugepage allocations, slab allocations or device driver buffers make
large sections of memory not available anymore. In that case we may get into
a situation in which f.e. the background writeback ratio of 40% cannot be
reached anymore which leads to undesired writeback behavior.
This patchset fixes that issue by determining the ratio based on the actual
pages that may potentially be dirty. These are the pages on the active and
the inactive list plus free pages.
The problem with those counts has so far been that it is expensive to
calculate these because counts from multiple nodes and multiple zones will
have to be summed up. This patchset makes these counters ZVC counters. This
means that a current sum per zone, per node and for the whole system is always
available via global variables and not expensive anymore to calculate.
The patchset results in some other good side effects:
- Removal of the various functions that sum up free, active and inactive
page counts
- Cleanup of the functions that display information via the proc filesystem.
This patch:
The use of a ZVC for nr_inactive and nr_active allows a simplification of some
counter operations. More ZVC functionality is used for sums etc in the
following patches.
[akpm@osdl.org: UP build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some partitioning systems create special partitions that
span the entire disk. One example are Sun partitions, and
this whole-disk partition exists to tell the firmware the
extent of the entire device so it can load the boot block
and do other things.
Such partitions should not be treated as normal partitions,
because all the other partitions overlap this whole-disk one.
So we'd see multiple instances of the same UUID etc. which
we do not want. udev and friends can thus search for this
'whole_disk' attribute and use it to decide to ignore the
partition.
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yet another attempt to resolve cpufreq and hotplug locking issues.
Patchset has 3 patches:
* Rewrite the lock infrastructure of cpufreq using a per cpu rwsem.
* Minor restructuring of work callback in ondemand driver.
* Use the new cpufreq rwsem infrastructure in ondemand work.
This patch:
Convert policy->lock to rwsem and move it to per_cpu area.
This rwsem will protect against both changing/accessing policy
related parameters and CPU hot plug/unplug.
[malattia@linux.it: fix oops in kref_put()]
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>