2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-09 14:14:00 +08:00
Commit Graph

169145 Commits

Author SHA1 Message Date
Russell King
1508c99506 ARM: PNX4008: fix watchdog device driver name
The PNX core code calls the device 'pnx4008-watchdog' not 'watchdog'

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Wim Van Sebroeck <wim@iguana.be>
2009-11-20 14:23:36 +00:00
Russell King
4ff1fa278b [ARM] kmap: fix build errors with DEBUG_HIGHMEM enabled
d451564 broke ARM by requiring KM_IRQ_PTE, KM_NMI and KM_NMI_PTE to
always be defined.  Solve this by providing invalid definitions for
these constants, but only if CONFIG_DEBUG_HIGHMEM is enabled.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-11-20 14:23:36 +00:00
Nicolas Ferre
985f37f827 AT91: add touchscreen support for at91sam9g45ekes
New at91sam9g45ekes board provides a LCD with resistive touchscreen.
This is the support of this feature by atmel_tsadcc driver. This also
sets up platform parameters to be passed to the driver.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:55:38 -08:00
Nicolas Ferre
423c9b0dc3 AT91: add platform parameters for atmel_tsadcc in at91sam9rlek
Setup platform parameters in at91sam9rl-ek board to be passed to
atmel_tsadcc touchscreen.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:55:29 -08:00
Nicolas Ferre
970435a141 Input: atmel_tsadcc - use platform parameters
Add a number of plafrom dependent parameters to atmel_tsadcc.  The
touchscreeen driver can now take into account the slight differences
that exist between IPs included in diferent products.  This will also
allow to adapt its behaivior to the caracteristics of the resistive
panel used.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:55:21 -08:00
Nicolas Ferre
ab9122cd33 Input: atmel_tsadcc - rework setting touchscreen capabilities
Tiny patch for setting capabilities using input API function.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:55:12 -08:00
Alan Jenkins
0c09b2ac35 Input: keyboard - fix theoretical race on vt switch
A VT switch can theoretically change fg_console between

        vc = vc_cons[fg_console].d
and
        kbd = kbd_table + fg_console

Fix it by replacing the second fg_console with vc->vc_num.

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:14 -08:00
Maxim Levitsky
71bb21b677 Input: ALPS - add support for touchpads with 4-directional button
The touchpad on Acer Aspire 5720, 5520 and some other Aspire models
(signature 0x73, 0x02, 0x50) has a button that can be rocked in 4
different directions. Make the driver to generate BTN_0..BTN_3 events
in response. The Synaptics driver by default maps BTN_0 and BTN_1 to
up and down, so there should be no visible changes with the old setup
that generated BTN_FORWARD and BTN_BACK (also mapped to up and down).

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:13 -08:00
Dmitry Torokhov
315eb996d5 Input: psmouse - rework setting of BTN_MIDDLE capability
Do not start protocol detection assuming that middle mouse is present,
instead let individual protocols explicitly set this capability.
This fixes issue with Synaptics touchpads pretending that they have
middle button when hardware clearly reports otherwise.

Reported-and-tested-by: Andrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:12 -08:00
Dmitry Torokhov
0cc1770b66 Input: lifebook - do not advertise unsupported buttons
The main input device of Lifebook touchscreens does not generate
left/right/middle button events and therefore should not be advertising
them in its capabilities.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:12 -08:00
Dmitry Torokhov
c7a1f3ccfc Input: elantech - do not advertise relative events
Elantech touchpads work in absolute mode and do not generate relative
events so they should not be advertising them.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:11 -08:00
Dmitry Torokhov
21f25573fb Input: touchkit_ps2 - do not advertise unsupported buttons
Touchkit PS/2 touchscreen does not have left/right/middle buttons and
should not be advertising as capable of generating these events.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:10 -08:00
Dmitry Torokhov
d69249f4b6 Input: input-polldev, matrix-keypad - include in kernel doc
Make sure that polled input device and matrix keypad APIs are included
with the rest of input API when generating kernel documentation. Also
description of absres was missing as well.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:09 -08:00
Samu Onkalo
dad725d089 Input: input-polldev - add sysfs interface for controlling poll interval
Sysfs entry for reading and setting of the polling interval. If the
interval is set to 0, polling is stopped. Polling is restarted when
interval is changed to non-zero.

sysfs entries:
poll = current polling interval in msec (RW)
max = max allowed polling interval (RO)
min = min allowed polling interval (RO)

Minimum and maximum limit for interval can be set while setting up the
device.

Interval can be adjusted even if the input device is not currently open.

[dtor@mail.ru: add kernel doc markup for the new fields]
Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:09 -08:00
Ben Dooks
bc8f1eaf68 Input: gpio_keys - seperate individual button setup to make code neater
Move the code that deals with setting up each individual button out into
a new function to reduce the indentation and allow us to common up some
of the error recovery code.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:08 -08:00
Ben Dooks
111bc59c08 Input: gpio_keys - use <linux/gpio.h> instead of <asm/gpio.h>
The gpio keys driver should be using <linux/gpio.h> instead
of <asm/gpio.h>

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:07 -08:00
Ben Dooks
db19fd8b3a Input: gpio_keys - use dev_ macros to report information
The gpio_keys driver is binding to a platform device but using pr_err()
to report errors. Change to using dev_err() so that all messages are
prefixed by the device name.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:06 -08:00
Marek Vasut
fb14159755 Input: ucb1400_ts - allow passing IRQ through platfrom_data
This patch allows UCB1400 to get IRQ GPIO from platform data. In case
platform_data are not supplied or the IRQ supplied in the platform_data
is negative, fall back to the old IRQ detection algorithm.

Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-20 00:52:05 -08:00
Dan Williams
49954c1567 ioat3: fix pq completion versus channel deallocation race
The completion of a pq operation is notified with a null descriptor
appended to the end of the chain.  This descriptor needs to be visible
to dma clients otherwise the client is precluded from ensuring all
operations are quiesced before freeing channel resources, i.e. due to
descriptor polling it may get the completion notification ahead of the
interrupt delivered by the null descriptor.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 23:21:03 -07:00
Dan Williams
7b3cc2b1fc async_tx: build-time toggling of async_{syndrome,xor}_val dma support
ioat3.2 does not support asynchronous error notifications which makes
the driver experience latencies when non-zero pq validate results are
expected.  Provide a mechanism for turning off async_xor_val and
async_syndrome_val via Kconfig.  This approach is generally useful for
any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
to force the async_tx api to fall back to the synchronous path for
certain operations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 23:21:03 -07:00
Dan Williams
4499a24dec dmaengine: include xor/pq validate in device_has_all_tx_types()
A channel must include these capabilities to satisfy
ASYNC_TX_DISABLE_CHANNEL_SWITCH.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 23:21:03 -07:00
Dan Williams
b57014def9 ioat2,3: report all uncorrectable errors
Modify is_ioat_bug() to catch all errors that are uncorrectable, or not
currently handled.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 23:21:03 -07:00
Linus Torvalds
a8a8a669ea Merge branch 'i2c-pnx-fixes' of git://git.fluff.org/bjdooks/linux
* 'i2c-pnx-fixes' of git://git.fluff.org/bjdooks/linux:
  i2c: i2c-pnx: Added missing mach/i2c.h and linux/io.h header file includes
  i2c: i2c-pnx: Made buf type unsigned to prevent sign extension
  i2c: i2c-pnx: Limit minimum jiffie timeout to 2
2009-11-19 20:29:34 -08:00
Linus Torvalds
931ed94430 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2: Trivial cleanup of jbd compatibility layer removal
  ocfs2: Refresh documentation
  ocfs2: return f_fsid info in ocfs2_statfs()
  ocfs2: duplicate inline data properly during reflink.
  ocfs2: Move ocfs2_complete_reflink to the right place.
  ocfs2: Return -EINVAL when a device is not ocfs2.
2009-11-19 20:29:05 -08:00
Kevin Wells
a7d73d8c68 i2c: i2c-pnx: Added missing mach/i2c.h and linux/io.h header file includes
Added missing mach/i2c.h and linux/io.h header file includes

Signed-off-by: Kevin Wells <kevin.wells@nxp.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-11-20 00:25:42 +00:00
Kevin Wells
4ced24c897 i2c: i2c-pnx: Made buf type unsigned to prevent sign extension
Made buf type unsigned to prevent sign extension

Signed-off-by: Kevin Wells <kevin.wells@nxp.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-11-20 00:25:42 +00:00
Kevin Wells
b2f125bcf5 i2c: i2c-pnx: Limit minimum jiffie timeout to 2
Limit minimum jiffie timeout to 2 to prevent early timeout on systems
with low tick rates

Signed-off-by: Kevin Wells <kevin.wells@nxp.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-11-20 00:25:41 +00:00
Dan Williams
de581b65f6 ioat3: specify valid address for disabled-Q or disabled-P
Although disabled, hardware still checks address validity, so duplicate
the known address.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 17:08:45 -07:00
Dan Williams
6f82b83b7a ioat2,3: disable asynchronous error notifications
Error interrupts and error completions may cause channel hangs, so
poll the channel status register after a timeout.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 17:07:57 -07:00
Dan Williams
228c4f5cfb ioat3: dca and raid operations are incompatible
RAID operations cause a system hang on platforms with DCA
(Direct-Cache-Access) enabled.  So turn off RAID capabilities in this
case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19 17:07:10 -07:00
Jiang Yutang
a0a74d1ee2 sata_fsl: Split hard and soft reset
Split sata_fsl_softreset() into hard and soft resets to make
error-handling more efficient & device and PMP detection more
reliable.

Also includes fix for PMP support, driver tested with Sil3726,
Sil4726 & Exar PMP controllers.

[AV: Also fixes resuming from deep sleep on MPC8315 CPUs]

Signed-off-by: Jiang Yutang <b14898@freescale.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-11-19 18:18:17 -05:00
Linus Torvalds
648f4e3e50 Linux 2.6.32-rc8 2009-11-19 14:32:38 -08:00
Linus Torvalds
e6236f781c Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Address buffer overrun in rpc_uaddr2sockaddr()
  NFSv4: Fix a cache validation bug which causes getcwd() to return ENOENT
2009-11-19 13:43:19 -08:00
Alan Cox
308efab5e2 vt: Fix use of "new" in a struct field
As this struct is exposed to user space and the API was added for this
release it's a bit of a pain for the C++ world and we still have time to
fix it. Rename the fields before we end up with that pain in an actual
release.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Reported-by: Olivier Goffart
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-19 13:43:06 -08:00
David Woodhouse
5854d9c8d1 Fix handling of the HP/Acer 'DMAR at zero' BIOS error for machines with <4GiB RAM.
Commit 86cf898e1d ("intel-iommu: Check for
'DMAR at zero' BIOS error earlier.") was supposed to work by pretending
not to detect an IOMMU if it was actually being reported by the BIOS at
physical address zero.

However, the intel_iommu_init() function is called unconditionally, as
are the corresponding functions for other IOMMU hardware.

So the patch only worked if you have RAM above the 4GiB boundary. It
caused swiotlb to be initialised when no IOMMU was detected during early
boot, and thus the later IOMMU init would refuse to run.

But if you have less RAM than that, swiotlb wouldn't get set up and the
IOMMU _would_ still end up being initialised, even though we never
claimed to detect it.

This patch also sets the dmar_disabled flag when the error is detected
during the initial detection phase -- so that the later call to
intel_iommu_init() will return without doing anything, regardless of
whether swiotlb is used or not.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-19 13:42:02 -08:00
Patrick McHardy
6440fe059e netfilter: nf_log: fix sleeping function called from invalid context in seq_show()
[  171.925285] BUG: sleeping function called from invalid context at kernel/mutex.c:280
[  171.925296] in_atomic(): 1, irqs_disabled(): 0, pid: 671, name: grep
[  171.925306] 2 locks held by grep/671:
[  171.925312]  #0:  (&p->lock){+.+.+.}, at: [<c10b8acd>] seq_read+0x25/0x36c
[  171.925340]  #1:  (rcu_read_lock){.+.+..}, at: [<c1391dac>] seq_start+0x0/0x44
[  171.925372] Pid: 671, comm: grep Not tainted 2.6.31.6-4-netbook #3
[  171.925380] Call Trace:
[  171.925398]  [<c105104e>] ? __debug_show_held_locks+0x1e/0x20
[  171.925414]  [<c10264ac>] __might_sleep+0xfb/0x102
[  171.925430]  [<c1461521>] mutex_lock_nested+0x1c/0x2ad
[  171.925444]  [<c1391c9e>] seq_show+0x74/0x127
[  171.925456]  [<c10b8c5c>] seq_read+0x1b4/0x36c
[  171.925469]  [<c10b8aa8>] ? seq_read+0x0/0x36c
[  171.925483]  [<c10d5c8e>] proc_reg_read+0x60/0x74
[  171.925496]  [<c10d5c2e>] ? proc_reg_read+0x0/0x74
[  171.925510]  [<c10a4468>] vfs_read+0x87/0x110
[  171.925523]  [<c10a458a>] sys_read+0x3b/0x60
[  171.925538]  [<c1002a49>] syscall_call+0x7/0xb

Fix it by replacing RCU with nf_log_mutex.

Reported-by: "Yin, Kangkai" <kangkai.yin@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-19 13:16:31 -08:00
Patrick McHardy
d667b9cfd0 netfilter: xt_osf: fix xt_osf_remove_callback() return value
Return a negative error value.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-19 13:16:26 -08:00
Eric Dumazet
2b1c8b0f92 veth: Fix veth_get_stats()
veth_get_stats() can be called in parallel on several cpus.

It's better to not reset dev->stats as it could give wrong result on
one cpu. Use temporary variables, then store the final results.

Also, we should loop on every possible cpus, not only online cpus,
or cpu hotplug can suddenly give wrong veth stats.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-19 13:16:22 -08:00
Eric Dumazet
56cf54831f ieee802154: dont leak skbs in ieee802154_fake_xmit()
ieee802154_fake_xmit() should free skbs since it returns NETDEV_TX_OK

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-19 13:16:21 -08:00
David Howells
14e69647c8 CacheFiles: Don't log lookup/create failing with ENOBUFS
Don't log the CacheFiles lookup/create object routined failing with ENOBUFS as
under high memory load or high cache load they can do this quite a lot.  This
error simply means that the requested object cannot be created on disk due to
lack of space, or due to failure of the backing filesystem to find sufficient
resources.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:12:08 +00:00
David Howells
fee096deb4 CacheFiles: Catch an overly long wait for an old active object
Catch an overly long wait for an old, dying active object when we want to
replace it with a new one.  The probability is that all the slow-work threads
are hogged, and the delete can't get a look in.

What we do instead is:

 (1) if there's nothing in the slow work queue, we sleep until either the dying
     object has finished dying or there is something in the slow work queue
     behind which we can queue our object.

 (2) if there is something in the slow work queue, we return ETIMEDOUT to
     fscache_lookup_object(), which then puts us back on the slow work queue,
     presumably behind the deletion that we're blocked by.  We are then
     deferred for a while until we work our way back through the queue -
     without blocking a slow-work thread unnecessarily.

A backtrace similar to the following may appear in the log without this patch:

	INFO: task kslowd004:5711 blocked for more than 120 seconds.
	"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
	kslowd004     D 0000000000000000     0  5711      2 0x00000080
	 ffff88000340bb80 0000000000000046 ffff88002550d000 0000000000000000
	 ffff88002550d000 0000000000000007 ffff88000340bfd8 ffff88002550d2a8
	 000000000000ddf0 00000000000118c0 00000000000118c0 ffff88002550d2a8
	Call Trace:
	 [<ffffffff81058e21>] ? trace_hardirqs_on+0xd/0xf
	 [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
	 [<ffffffffa011c4e1>] cachefiles_wait_bit+0x9/0xd [cachefiles]
	 [<ffffffff81353153>] __wait_on_bit+0x43/0x76
	 [<ffffffff8111ae39>] ? ext3_xattr_get+0x1ec/0x270
	 [<ffffffff813531ef>] out_of_line_wait_on_bit+0x69/0x74
	 [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
	 [<ffffffff8104c125>] ? wake_bit_function+0x0/0x2e
	 [<ffffffffa011bc79>] cachefiles_mark_object_active+0x203/0x23b [cachefiles]
	 [<ffffffffa011c209>] cachefiles_walk_to_object+0x558/0x827 [cachefiles]
	 [<ffffffffa011a429>] cachefiles_lookup_object+0xac/0x12a [cachefiles]
	 [<ffffffffa00aa1e9>] fscache_lookup_object+0x1c7/0x214 [fscache]
	 [<ffffffffa00aafc5>] fscache_object_state_machine+0xa5/0x52d [fscache]
	 [<ffffffffa00ab4ac>] fscache_object_slow_work_execute+0x5f/0xa0 [fscache]
	 [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1
	 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308
	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
	 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308
	 [<ffffffff8104be91>] kthread+0x7a/0x82
	 [<ffffffff8100beda>] child_rip+0xa/0x20
	 [<ffffffff8100b87c>] ? restore_args+0x0/0x30
	 [<ffffffff8104be17>] ? kthread+0x0/0x82
	 [<ffffffff8100bed0>] ? child_rip+0x0/0x20
	1 lock held by kslowd004/5711:
	 #0:  (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [<ffffffffa011be64>] cachefiles_walk_to_object+0x1b3/0x827 [cachefiles]

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:12:05 +00:00
David Howells
d0e27b7808 CacheFiles: Better showing of debugging information in active object problems
Show more debugging information if cachefiles_mark_object_active() is asked to
activate an active object.

This may happen, for instance, if the netfs tries to register an object with
the same key multiple times.

The code is changed to (a) get the appropriate object lock to protect the
cookie pointer whilst we dereference it, and (b) get and display the cookie key
if available.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:12:02 +00:00
David Howells
6511de33c8 CacheFiles: Mark parent directory locks as I_MUTEX_PARENT to keep lockdep happy
Mark parent directory locks as I_MUTEX_PARENT in the callers of
cachefiles_bury_object() so that lockdep doesn't complain when that invokes
vfs_unlink():

=============================================
[ INFO: possible recursive locking detected ]
2.6.32-rc6-cachefs #47
---------------------------------------------
kslowd002/3089 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#7){+.+.+.}, at: [<ffffffff810bbf72>] vfs_unlink+0x8b/0x128

but task is already holding lock:
 (&sb->s_type->i_mutex_key#7){+.+.+.}, at: [<ffffffffa00e4e61>] cachefiles_walk_to_object+0x1b0/0x831 [cachefiles]

other info that might help us debug this:
1 lock held by kslowd002/3089:
 #0:  (&sb->s_type->i_mutex_key#7){+.+.+.}, at: [<ffffffffa00e4e61>] cachefiles_walk_to_object+0x1b0/0x831 [cachefiles]

stack backtrace:
Pid: 3089, comm: kslowd002 Not tainted 2.6.32-rc6-cachefs #47
Call Trace:
 [<ffffffff8105ad7b>] __lock_acquire+0x1649/0x16e3
 [<ffffffff8118170e>] ? inode_has_perm+0x5f/0x61
 [<ffffffff8105ae6c>] lock_acquire+0x57/0x6d
 [<ffffffff810bbf72>] ? vfs_unlink+0x8b/0x128
 [<ffffffff81353ac3>] mutex_lock_nested+0x54/0x292
 [<ffffffff810bbf72>] ? vfs_unlink+0x8b/0x128
 [<ffffffff8118179e>] ? selinux_inode_permission+0x8e/0x90
 [<ffffffff8117e271>] ? security_inode_permission+0x1c/0x1e
 [<ffffffff810bb4fb>] ? inode_permission+0x99/0xa5
 [<ffffffff810bbf72>] vfs_unlink+0x8b/0x128
 [<ffffffff810adb19>] ? kfree+0xed/0xf9
 [<ffffffffa00e3f00>] cachefiles_bury_object+0xb6/0x420 [cachefiles]
 [<ffffffff81058e21>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffffa00e7e24>] ? cachefiles_check_object_xattr+0x233/0x293 [cachefiles]
 [<ffffffffa00e51b0>] cachefiles_walk_to_object+0x4ff/0x831 [cachefiles]
 [<ffffffff81032238>] ? finish_task_switch+0x0/0xb2
 [<ffffffffa00e3429>] cachefiles_lookup_object+0xac/0x12a [cachefiles]
 [<ffffffffa00741e9>] fscache_lookup_object+0x1c7/0x214 [fscache]
 [<ffffffffa0074fc5>] fscache_object_state_machine+0xa5/0x52d [fscache]
 [<ffffffffa00754ac>] fscache_object_slow_work_execute+0x5f/0xa0 [fscache]
 [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1
 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308
 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308
 [<ffffffff8104be91>] kthread+0x7a/0x82
 [<ffffffff8100beda>] child_rip+0xa/0x20
 [<ffffffff8100b87c>] ? restore_args+0x0/0x30
 [<ffffffff8104be17>] ? kthread+0x0/0x82
 [<ffffffff8100bed0>] ? child_rip+0x0/0x20

Signed-off-by: Daivd Howells <dhowells@redhat.com>
2009-11-19 18:11:58 +00:00
David Howells
5e929b33c3 CacheFiles: Handle truncate unlocking the page we're reading
Handle truncate unlocking the page we're attempting to read from the backing
device before the read has completed.

This was causing reports like the following to occur:

	Pid: 4765, comm: kslowd Not tainted 2.6.30.1 #1
	Call Trace:
	 [<ffffffffa0331d7a>] ? cachefiles_read_waiter+0xd9/0x147 [cachefiles]
	 [<ffffffff804b74bd>] ? __wait_on_bit+0x60/0x6f
	 [<ffffffff8022bbbb>] ? __wake_up_common+0x3f/0x71
	 [<ffffffff8022cc32>] ? __wake_up+0x30/0x44
	 [<ffffffff8024a41f>] ? __wake_up_bit+0x28/0x2d
	 [<ffffffffa003a793>] ? ext3_truncate+0x4d7/0x8ed [ext3]
	 [<ffffffff80281f90>] ? pagevec_lookup+0x17/0x1f
	 [<ffffffff8028c2ff>] ? unmap_mapping_range+0x59/0x1ff
	 [<ffffffff8022cc32>] ? __wake_up+0x30/0x44
	 [<ffffffff8028e286>] ? vmtruncate+0xc2/0xe2
	 [<ffffffff802b82cf>] ? inode_setattr+0x22/0x10a
	 [<ffffffffa003baa5>] ? ext3_setattr+0x17b/0x1e6 [ext3]
	 [<ffffffff802b853d>] ? notify_change+0x186/0x2c9
	 [<ffffffffa032d9de>] ? cachefiles_attr_changed+0x133/0x1cd [cachefiles]
	 [<ffffffffa032df7f>] ? cachefiles_lookup_object+0xcf/0x12a [cachefiles]
	 [<ffffffffa0318165>] ? fscache_lookup_object+0x110/0x122 [fscache]
	 [<ffffffffa03188c3>] ? fscache_object_slow_work_execute+0x590/0x6bc
	[fscache]
	 [<ffffffff80278f82>] ? slow_work_thread+0x285/0x43a
	 [<ffffffff8024a446>] ? autoremove_wake_function+0x0/0x2e
	 [<ffffffff80278cfd>] ? slow_work_thread+0x0/0x43a
	 [<ffffffff8024a317>] ? kthread+0x54/0x81
	 [<ffffffff8020c93a>] ? child_rip+0xa/0x20
	 [<ffffffff8024a2c3>] ? kthread+0x0/0x81
	 [<ffffffff8020c930>] ? child_rip+0x0/0x20
	CacheFiles: I/O Error: Readpage failed on backing file 200000000000810
	FS-Cache: Cache cachefiles stopped due to I/O error

Reported-by: Christian Kujau <lists@nerdbynature.de>
Reported-by: Takashi Iwai <tiwai@suse.de>
Reported-by: Duc Le Minh <duclm.vn@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:55 +00:00
David Howells
a17754fb8c CacheFiles: Don't write a full page if there's only a partial page to cache
cachefiles_write_page() writes a full page to the backing file for the last
page of the netfs file, even if the netfs file's last page is only a partial
page.

This causes the EOF on the backing file to be extended beyond the EOF of the
netfs, and thus the backing file will be truncated by cachefiles_attr_changed()
called from cachefiles_lookup_object().

So we need to limit the write we make to the backing file on that last page
such that it doesn't push the EOF too far.

Also, if a backing file that has a partial page at the end is expanded, we
discard the partial page and refetch it on the basis that we then have a hole
in the file with invalid data, and should the power go out...  A better way to
deal with this could be to record a note that the partial page contains invalid
data until the correct data is written into it.

This isn't a problem for netfs's that discard the whole backing file if the
file size changes (such as NFS).

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:52 +00:00
David Howells
868411be3f FS-Cache: Actually requeue an object when requested
FS-Cache objects have an FSCACHE_OBJECT_EV_REQUEUE event that can theoretically
be raised to ask the state machine to requeue the object for further processing
before the work function returns to the slow-work facility.

However, fscache_object_work_execute() was clearing that bit before checking
the event mask to see whether the object has any pending events that require it
to be requeued immediately.

Instead, the bit should be cleared after the check and enqueue.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:48 +00:00
David Howells
60d543ca72 FS-Cache: Start processing an object's operations on that object's death
Start processing an object's operations when that object moves into the DYING
state as the object cannot be destroyed until all its outstanding operations
have completed.

Furthermore, make sure that read and allocation operations handle being woken
up on a dead object.  Such events are recorded in the Allocs.abt and
Retrvls.abt statistics as viewable through /proc/fs/fscache/stats.

The code for waiting for object activation for the read and allocation
operations is also extracted into its own function as it is much the same in
all cases, differing only in the stats incremented.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:45 +00:00
David Howells
d461d26dde FS-Cache: Make sure FSCACHE_COOKIE_LOOKING_UP cleared on lookup failure
We must make sure that FSCACHE_COOKIE_LOOKING_UP is cleared on lookup failure
(if an object reaches the LC_DYING state), and we should clear it before
clearing FSCACHE_COOKIE_CREATING.

If this doesn't happen then fscache_wait_for_deferred_lookup() may hold
allocation and retrieval operations indefinitely until they're interrupted by
signals - which in turn pins the dying object until they go away.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:41 +00:00
David Howells
2175bb06dc FS-Cache: Add a retirement stat counter
Add a stat counter to count retirement events rather than ordinary release
events (the retire argument to fscache_relinquish_cookie()).

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:38 +00:00
David Howells
201a15428b FS-Cache: Handle pages pending storage that get evicted under OOM conditions
Handle netfs pages that the vmscan algorithm wants to evict from the pagecache
under OOM conditions, but that are waiting for write to the cache.  Under these
conditions, vmscan calls the releasepage() function of the netfs, asking if a
page can be discarded.

The problem is typified by the following trace of a stuck process:

	kslowd005     D 0000000000000000     0  4253      2 0x00000080
	 ffff88001b14f370 0000000000000046 ffff880020d0d000 0000000000000007
	 0000000000000006 0000000000000001 ffff88001b14ffd8 ffff880020d0d2a8
	 000000000000ddf0 00000000000118c0 00000000000118c0 ffff880020d0d2a8
	Call Trace:
	 [<ffffffffa00782d8>] __fscache_wait_on_page_write+0x8b/0xa7 [fscache]
	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
	 [<ffffffffa0078240>] ? __fscache_check_page_write+0x63/0x70 [fscache]
	 [<ffffffffa00b671d>] nfs_fscache_release_page+0x4e/0xc4 [nfs]
	 [<ffffffffa00927f0>] nfs_release_page+0x3c/0x41 [nfs]
	 [<ffffffff810885d3>] try_to_release_page+0x32/0x3b
	 [<ffffffff81093203>] shrink_page_list+0x316/0x4ac
	 [<ffffffff8109372b>] shrink_inactive_list+0x392/0x67c
	 [<ffffffff813532fa>] ? __mutex_unlock_slowpath+0x100/0x10b
	 [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130
	 [<ffffffff8135330e>] ? mutex_unlock+0x9/0xb
	 [<ffffffff81093aa2>] shrink_list+0x8d/0x8f
	 [<ffffffff81093d1c>] shrink_zone+0x278/0x33c
	 [<ffffffff81052d6c>] ? ktime_get_ts+0xad/0xba
	 [<ffffffff81094b13>] try_to_free_pages+0x22e/0x392
	 [<ffffffff81091e24>] ? isolate_pages_global+0x0/0x212
	 [<ffffffff8108e743>] __alloc_pages_nodemask+0x3dc/0x5cf
	 [<ffffffff81089529>] grab_cache_page_write_begin+0x65/0xaa
	 [<ffffffff8110f8c0>] ext3_write_begin+0x78/0x1eb
	 [<ffffffff81089ec5>] generic_file_buffered_write+0x109/0x28c
	 [<ffffffff8103cb69>] ? current_fs_time+0x22/0x29
	 [<ffffffff8108a509>] __generic_file_aio_write+0x350/0x385
	 [<ffffffff8108a588>] ? generic_file_aio_write+0x4a/0xae
	 [<ffffffff8108a59e>] generic_file_aio_write+0x60/0xae
	 [<ffffffff810b2e82>] do_sync_write+0xe3/0x120
	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
	 [<ffffffff810b18e1>] ? __dentry_open+0x1a5/0x2b8
	 [<ffffffff810b1a76>] ? dentry_open+0x82/0x89
	 [<ffffffffa00e693c>] cachefiles_write_page+0x298/0x335 [cachefiles]
	 [<ffffffffa0077147>] fscache_write_op+0x178/0x2c2 [fscache]
	 [<ffffffffa0075656>] fscache_op_execute+0x7a/0xd1 [fscache]
	 [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1
	 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308
	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
	 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308
	 [<ffffffff8104be91>] kthread+0x7a/0x82
	 [<ffffffff8100beda>] child_rip+0xa/0x20
	 [<ffffffff8100b87c>] ? restore_args+0x0/0x30
	 [<ffffffff8102ef83>] ? tg_shares_up+0x171/0x227
	 [<ffffffff8104be17>] ? kthread+0x0/0x82
	 [<ffffffff8100bed0>] ? child_rip+0x0/0x20

In the above backtrace, the following is happening:

 (1) A page storage operation is being executed by a slow-work thread
     (fscache_write_op()).

 (2) FS-Cache farms the operation out to the cache to perform
     (cachefiles_write_page()).

 (3) CacheFiles is then calling Ext3 to perform the actual write, using Ext3's
     standard write (do_sync_write()) under KERNEL_DS directly from the netfs
     page.

 (4) However, for Ext3 to perform the write, it must allocate some memory, in
     particular, it must allocate at least one page cache page into which it
     can copy the data from the netfs page.

 (5) Under OOM conditions, the memory allocator can't immediately come up with
     a page, so it uses vmscan to find something to discard
     (try_to_free_pages()).

 (6) vmscan finds a clean netfs page it might be able to discard (possibly the
     one it's trying to write out).

 (7) The netfs is called to throw the page away (nfs_release_page()) - but it's
     called with __GFP_WAIT, so the netfs decides to wait for the store to
     complete (__fscache_wait_on_page_write()).

 (8) This blocks a slow-work processing thread - possibly against itself.

The system ends up stuck because it can't write out any netfs pages to the
cache without allocating more memory.

To avoid this, we make FS-Cache cancel some writes that aren't in the middle of
actually being performed.  This means that some data won't make it into the
cache this time.  To support this, a new FS-Cache function is added
fscache_maybe_release_page() that replaces what the netfs releasepage()
functions used to do with respect to the cache.

The decisions fscache_maybe_release_page() makes are counted and displayed
through /proc/fs/fscache/stats on a line labelled "VmScan".  There are four
counters provided: "nos=N" - pages that weren't pending storage; "gon=N" -
pages that were pending storage when we first looked, but weren't by the time
we got the object lock; "bsy=N" - pages that we ignored as they were actively
being written when we looked; and "can=N" - pages that we cancelled the storage
of.

What I'd really like to do is alter the behaviour of the cancellation
heuristics, depending on how necessary it is to expel pages.  If there are
plenty of other pages that aren't waiting to be written to the cache that
could be ejected first, then it would be nice to hold up on immediate
cancellation of cache writes - but I don't see a way of doing that.

Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19 18:11:35 +00:00