Commit Graph

190418 Commits

Author SHA1 Message Date
Bruce Allan
6f461f6c7c e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata
Prompted by a previous patch submitted by Matthew Garret <mjg@redhat.com>,
further digging into errata documentation reveals the current enabling or
disabling of ASPM L0s and L1 states for certain parts supported by this
driver are incorrect.  82571 and 82572 should always disable L1.  For
standard frames, 82573/82574/82583 can enable L1 but L0s must be disabled,
and for jumbo frames 82573/82574 must disable L1.  This allows for some
parts to enable L1 in certain configurations leading to better power
savings.

Also according to the same errata, Early Receive (ERT) should be disabled
on 82573 when using jumbo frames.

Cc: Matthew Garret <mjg@redhat.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 10:18:26 -07:00
Peter Waskiewicz
61fac744dd ixgbe: Power down PHY during driver resets
The PHY laser is still on during driver init.  It's allowing
garbage to hit our FIFO, which eventually can cause the entire
device to die.  Power down the laser while setting up the device,
and re-enable the laser before getting link.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-27 10:18:25 -07:00
Christoph Egger
16a5b3c414 Remove redundant check for CONFIG_MMU
The checks for CONFIG_MMU at this location are duplicated as all the code is
located inside a #ifndef CONFIG_MMU block. So the first conditional block will
always be included while the second never will.

Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-27 09:01:26 -07:00
Linus Torvalds
bc113f151a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  squashfs: fix potential buffer over-run on 4K block file systems
  squashfs: add missing buffer free
  squashfs: fix warn_on when root inode is corrupted
  squashfs: fix locking bug in zlib wrapper
2010-04-27 08:59:38 -07:00
Linus Torvalds
93a9248af2 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: more swap extent fixes for dynamic fork offsets
2010-04-27 08:32:21 -07:00
Linus Torvalds
17282b9855 Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (39 commits)
  omap: delete unused bootloader tag variables
  omap: Devkit8000: Remove unused pins
  omap: Devkit8000: Change position of init calls
  omap: Devkit8000: Remove unnecessary include file
  omap: Devkit8000: Fix typo in pin name
  omap: Devkit8000: Add missing package selection
  omap: Devkit8000: Fix typo in supplies
  n8x0_defconfig: remove CONFIG_NILFS2_FS override
  omap: board-sdp-flash.c: Fix typos in debug output
  omap4: Fix McBSP4 base address
  omap: rx51_defconfig: Remove CONFIG_SYSFS_DEPRECATED*=y options
  omap: rx51_defconfig: Remove duplicate phonet
  omap: fix a gpmc nand problem
  AM3517: initialize i2c subsystem after mux subsystem
  omap: remove one of the define of INT_34XX_BENCH_MPU_EMUL
  omap: fix the compile error if CONFIG_MTD_NAND_OMAP2 is notenabled
  OMAP4: Clocks: Change SPI Instance Names
  omap: Devkit8000: Fix wrong usb port on Devkit8000
  OMAP4: Fix for CONTROL register Base
  OMAP4-HSMMC: FIX for MMC5 Controller IRQ Base
  ...
2010-04-27 08:27:26 -07:00
Rik van Riel
5892753383 mmap: check ->vm_ops before dereferencing
Check whether the VMA has a vm_ops before calling close, just
like we check vm_ops before calling open a few dozen lines
higher up in the function.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-27 08:26:51 -07:00
Linus Torvalds
a231a1f271 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: authenc - Add EINPROGRESS check
2010-04-27 08:26:09 -07:00
Linus Torvalds
0bfb82449c Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: Fix sparc regression in r300_scratch()
  drm: make sure vblank interrupts are disabled at DPMS time
  drm/radeon/kms/evergreen: No EnableYUV table
  drm/radeon: 9800 SE has only one quadpipe
  drm/radeon/kms: don't print error for legal crtcs.
  drm/radeon/kms/evergreen: fix LUT setup
2010-04-27 08:22:50 -07:00
Tejun Heo
6b4517a791 block: implement bd_claiming and claiming block
Currently, device claiming for exclusive open is done after low level
open - disk->fops->open() - has completed successfully.  This means
that exclusive open attempts while a device is already exclusively
open will fail only after disk->fops->open() is called.

cdrom driver issues commands during open() which means that O_EXCL
open attempt can unintentionally inject commands to in-progress
command stream for burning thus disturbing burning process.  In most
cases, this doesn't cause problems because the first command to be
issued is TUR which most devices can process in the middle of burning.
However, depending on how a device replies to TUR during burning,
cdrom driver may end up issuing further commands.

This can't be resolved trivially by moving bd_claim() before doing
actual open() because that means an open attempt which will end up
failing could interfere other legit O_EXCL open attempts.
ie. unconfirmed open attempts can fail others.

This patch resolves the problem by introducing claiming block which is
started by bd_start_claiming() and terminated either by bd_claim() or
bd_abort_claiming().  bd_claim() from inside a claiming block is
guaranteed to succeed and once a claiming block is started, other
bd_start_claiming() or bd_claim() attempts block till the current
claiming block is terminated.

bd_claim() can still be used standalone although now it always
synchronizes against claiming blocks, so the existing users will keep
working without any change.

blkdev_open() and open_bdev_exclusive() are converted to use claiming
blocks so that exclusive open attempts from these functions don't
interfere with the existing exclusive open.

This problem was discovered while investigating bko#15403.

  https://bugzilla.kernel.org/show_bug.cgi?id=15403

The burning problem itself can be resolved by updating userspace
probing tools to always open w/ O_EXCL.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Matthias-Christian Ott <ott@mirix.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-27 10:57:54 +02:00
Tejun Heo
1a3cbbc5a5 block: factor out bd_may_claim()
Factor out bd_may_claim() from bd_claim(), add comments and apply a
couple of cosmetic edits.  This is to prepare for further updates to
claim path.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-27 10:57:54 +02:00
Anton Vorontsov
d8d8b63b6d watchdog: booke_wdt: fix build - unconstify watchdog_info
commit 42747d712d ("[WATCHDOG] watchdog_info
constify") introduced the following build failure:

   CC      booke_wdt.o
 booke_wdt.c: In function 'booke_wdt_init':
 booke_wdt.c:220: error: assignment of read-only variable 'ident'

Fix this by removing 'const' qualifier from watchdog_info struct.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2010-04-27 07:58:47 +00:00
Jens Axboe
0661b1ac5d mtd: ensure that bdi entries are properly initialized and registered
They will be holding dirty inodes and be responsible for flushing
them out, so they need to be setup properly.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-27 09:51:30 +02:00
Jörn Engel
a33eb6b910 Move mtd_bdi_*mappable to mtdcore.c
Removes one .h and one .c file that are never used outside of
mtdcore.c.

Signed-off-by: Joern Engel <joern@logfs.org>

Edited to remove on leftover debug define.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-27 09:50:58 +02:00
David Miller
88b045077a drm/radeon: Fix sparc regression in r300_scratch()
Commit b4fe945405 ("drm/radeon: Fix
memory allocation failures in the preKMS command stream checking.")
added a regression in that it completely tossed the get_unaligned()
done by r300_scratch() which we added in commit
958a6f8ccb ("drm: radeon: Fix unaligned
access in r300_scratch().").

Put it back.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-04-27 09:40:57 +10:00
Jesse Barnes
e32ee7fa54 drm: make sure vblank interrupts are disabled at DPMS time
When we call drm_vblank_off() at DPMS off time (to wake any clients so
they don't hang) we need to make sure interrupts are actually disabled.
If drm_vblank_off() gets called before the vblank usage timer expires,
it'll prevent the timer from disabling interrupts since it also clears
the vblank_enabled flag for the pipe.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-04-27 09:37:39 +10:00
françois romieu
908ba2bfd2 r8169: more broken register writes workaround
78f1cd0245 ("fix broken register writes")
does not work for Al Viro's r8169 (XID 18000000).

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-26 15:36:48 -07:00
françois romieu
87aeec767e r8169: failure to enable mwi should not be fatal
Few (6) network drivers enable mwi explicitly. Fewer worry about a
failure.

It is not a fix but it should avoid some annoyance like
http://bugzilla.kernel.org/show_bug.cgi?id=15454

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Conrad Kostecki <conikost@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-26 15:36:47 -07:00
Neil Brown
2bc3c1179c nfsd4: bug in read_buf
When read_buf is called to move over to the next page in the pagelist
of an NFSv4 request, it sets argp->end to essentially a random
number, certainly not an address within the page which argp->p now
points to.  So subsequent calls to READ_BUF will think there is much
more than a page of spare space (the cast to u32 ensures an unsigned
comparison) so we can expect to fall off the end of the second
page.

We never encountered thsi in testing because typically the only
operations which use more than two pages are write-like operations,
which have their own decoding logic.  Something like a getattr after a
write may cross a page boundary, but it would be very unusual for it to
cross another boundary after that.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-26 15:39:08 -04:00
Bjorn Helgaas
55051feb57 x86/PCI: never allocate PCI MMIO resources below BIOS_END
When we move a PCI device or assign resources to a device not configured
by the BIOS, we want to avoid the BIOS region below 1MB.  Note that if the
BIOS places devices below 1MB, we leave them there.

See https://bugzilla.kernel.org/show_bug.cgi?id=15744
and https://bugzilla.kernel.org/show_bug.cgi?id=15841

Tested-by: Andy Isaacson <adi@hexapodia.org>
Tested-by: Andy Bailey <bailey@akamai.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-04-26 12:30:03 -07:00
YOSHIFUJI Hideaki / 吉藤英明
4eb8b9031a bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
Even with commit 32dec5dd02 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP snooping support is
compiled and disabled, so we can see garbage.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-26 11:25:31 -07:00
Denis Turischev
322af98c56 watchdog: sbc_fitpc2_wdt: fixed "scheduling while atomic" bug.
spinlock need to be replaced by mutex because of sleep functions
inside wdt_send_data.

Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-04-26 18:22:30 +00:00
Stefan Schmidt
93c0c8b4a5 ieee802154: Fix oops during ieee802154_sock_ioctl
Trying to run izlisten (from lowpan-tools tests) on a device that does not
exists I got the oops below. The problem is that we are using get_dev_by_name
without checking if we really get a device back. We don't in this case and
writing to dev->type generates this oops.

[Oops code removed by Dmitry Eremin-Solenikov]

If possible this patch should be applied to the current -rc fixes branch.

Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-26 11:20:32 -07:00
Denis Turischev
fcf1dd7e68 watchdog: sbc_fitpc2_wdt: fixed I/O operations order
There are fitpc2 compatible boards that hang with existent i/o
operations order. Solution is to switch between writing to data
and command ports.

Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-04-26 18:17:34 +00:00
Andre Detsch
dc8bf1b1a6 tg3: Fix INTx fallback when MSI fails
tg3: Fix INTx fallback when MSI fails

MSI setup changes the value of irq_vec in struct tg3 *tp.
This attribute must be taken into account and restored before
we try to do a new request_irq for INTx fallback.

In powerpc, the original code was leading to an EINVAL return within
request_irq, because the driver was trying to use the disabled MSI
virtual irq number instead of tp->pdev->irq.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-26 11:15:49 -07:00
Guenter Roeck
86913315de Watchdog: sb_wdog.c: Fix sibyte watchdog initialization
Watchdog configuration register and timer count register were interchanged,
causing wrong values to be written into both registers.
This caused watchdog triggered resets even if the watchdog was reset in time.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-04-26 18:14:03 +00:00
Alexander Kurz
83bf6f11e8 pcmcia: fix matching rules for pseudo-multi-function cards
Prevent PCMCIA_DEV_ID_MATCH_FUNC_ID from grabbing PFC-cards:
I changed the code, so that the first matching struct
pcmcia_device_id _PFC_ entry will mark the card has_pfc,
preventing PCMCIA_DEV_ID_MATCH_FUNC_ID to match.

[linux-pcmcia@lists.infradead.org: re-order commit message]
Signed-off-by: Alexander Kurz <linux@kbdbabel.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2010-04-26 20:09:07 +02:00
Dave Chinner
dd77ef924c xfs: more swap extent fixes for dynamic fork offsets
A new xfsqa test (226) with a prototype xfs_fsr change to try to
handle dynamic fork offsets better triggers an assertion failure
where the inode data fork is in btree format, yet there is room in
the inode for it to be in extent format. The two inodes look like:

before: ino 0x101 (target), num_extents 11, Max in-fork extents 6, broot size 40, fork offset 96
before: ino 0x115 (temp),  num_extents 5, Max in-fork extents 3, broot size 40, fork offset 56
after: ino 0x101 (target), num_extents 5, Max in-fork extents 6, broot size 40, fork offset 96
after: ino 0x115 (temp), num_extents 11, Max in-fork extents 3, broot size 40, fork offset 56

Basically the target inode ends up with 5 extents in btree format,
but it had space for 6 extents in extent format, so ends up
incorrect. Notably here the broot size is the same, and that is
where the kernel code is going wrong - the btree root will fit, so
it lets the swap go ahead.

The check should not allow the swap to take place if the number of
extents while in btree format is less than the number of extents
that can fit in the inode in extent format. Adding that check will
prevent this swap and corruption from occurring.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-04-26 12:38:51 -05:00
Vivek Goyal
afc24d49c1 blk-cgroup: config options re-arrangement
This patch fixes few usability and configurability issues.

o All the cgroup based controller options are configurable from
  "Genral Setup/Control Group Support/" menu. blkio is the only exception.
  Hence make this option visible in above menu and make it configurable from
  there to bring it inline with rest of the cgroup based controllers.

o Get rid of CONFIG_DEBUG_CFQ_IOSCHED.

  This option currently does two things.

  - Enable printing of cgroup paths in blktrace
  - Enables CONFIG_DEBUG_BLK_CGROUP, which in turn displays additional stat
    files in cgroup.

  If we are using group scheduling, blktrace data is of not really much use
  if cgroup information is not present. To get this data, currently one has to
  also enable CONFIG_DEBUG_CFQ_IOSCHED, which in turn brings the overhead of
  all the additional debug stat files which is not desired.

  Hence, this patch moves printing of cgroup paths under
  CONFIG_CFQ_GROUP_IOSCHED.

  This allows us to get rid of CONFIG_DEBUG_CFQ_IOSCHED completely. Now all
  the debug stat files are controlled only by CONFIG_DEBUG_BLK_CGROUP which
  can be enabled through config menu.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Divyesh Shah <dpshah@google.com>
Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-26 19:27:56 +02:00
Vivek Goyal
e5ff082e8a blkio: Fix another BUG_ON() crash due to cfqq movement across groups
o Once in a while, I was hitting a BUG_ON() in blkio code. empty_time was
  assuming that upon slice expiry, group can't be marked empty already (except
  forced dispatch).

  But this assumption is broken if cfqq can move (group_isolation=0) across
  groups after receiving a request.

  I think most likely in this case we got a request in a cfqq and accounted
  the rq in one group, later while adding the cfqq to tree, we moved the queue
  to a different group which was already marked empty and after dispatch from
  slice we found group already marked empty and raised alarm.

  This patch does not error out if group is already marked empty. This can
  introduce some empty_time stat error only in case of group_isolation=0. This
  is better than crashing. In case of group_isolation=1 we should still get
  same stats as before this patch.

[  222.308546] ------------[ cut here ]------------
[  222.309311] kernel BUG at block/blk-cgroup.c:236!
[  222.309311] invalid opcode: 0000 [#1] SMP
[  222.309311] last sysfs file: /sys/devices/virtual/block/dm-3/queue/scheduler
[  222.309311] CPU 1
[  222.309311] Modules linked in: dm_round_robin dm_multipath qla2xxx scsi_transport_fc dm_zero dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
[  222.309311]
[  222.309311] Pid: 4780, comm: fio Not tainted 2.6.34-rc4-blkio-config #68 0A98h/HP xw8600 Workstation
[  222.309311] RIP: 0010:[<ffffffff8121ad88>]  [<ffffffff8121ad88>] blkiocg_set_start_empty_time+0x50/0x83
[  222.309311] RSP: 0018:ffff8800ba6e79f8  EFLAGS: 00010002
[  222.309311] RAX: 0000000000000082 RBX: ffff8800a13b7990 RCX: ffff8800a13b7808
[  222.309311] RDX: 0000000000002121 RSI: 0000000000000082 RDI: ffff8800a13b7a30
[  222.309311] RBP: ffff8800ba6e7a18 R08: 0000000000000000 R09: 0000000000000001
[  222.309311] R10: 000000000002f8c8 R11: ffff8800ba6e7ad8 R12: ffff8800a13b78ff
[  222.309311] R13: ffff8800a13b7990 R14: 0000000000000001 R15: ffff8800a13b7808
[  222.309311] FS:  00007f3beec476f0(0000) GS:ffff880001e40000(0000) knlGS:0000000000000000
[  222.309311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  222.309311] CR2: 000000000040e7f0 CR3: 00000000a12d5000 CR4: 00000000000006e0
[  222.309311] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  222.309311] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  222.309311] Process fio (pid: 4780, threadinfo ffff8800ba6e6000, task ffff8800b3d6bf00)
[  222.309311] Stack:
[  222.309311]  0000000000000001 ffff8800bab17a48 ffff8800bab17a48 ffff8800a13b7800
[  222.309311] <0> ffff8800ba6e7a68 ffffffff8121da35 ffff880000000001 00ff8800ba5c5698
[  222.309311] <0> ffff8800ba6e7a68 ffff8800a13b7800 0000000000000000 ffff8800bab17a48
[  222.309311] Call Trace:
[  222.309311]  [<ffffffff8121da35>] __cfq_slice_expired+0x2af/0x3ec
[  222.309311]  [<ffffffff8121fd7b>] cfq_dispatch_requests+0x2c8/0x8e8
[  222.309311]  [<ffffffff8120f1cd>] ? spin_unlock_irqrestore+0xe/0x10
[  222.309311]  [<ffffffff8120fb1a>] ? blk_insert_cloned_request+0x70/0x7b
[  222.309311]  [<ffffffff81210461>] blk_peek_request+0x191/0x1a7
[  222.309311]  [<ffffffffa0002799>] dm_request_fn+0x38/0x14c [dm_mod]
[  222.309311]  [<ffffffff810ae61f>] ? sync_page_killable+0x0/0x35
[  222.309311]  [<ffffffff81210fd4>] __generic_unplug_device+0x32/0x37
[  222.309311]  [<ffffffff81211274>] generic_unplug_device+0x2e/0x3c
[  222.309311]  [<ffffffffa00011a6>] dm_unplug_all+0x42/0x5b [dm_mod]
[  222.309311]  [<ffffffff8120ca37>] blk_unplug+0x29/0x2d
[  222.309311]  [<ffffffff8120ca4d>] blk_backing_dev_unplug+0x12/0x14
[  222.309311]  [<ffffffff81109a7a>] block_sync_page+0x35/0x39
[  222.309311]  [<ffffffff810ae616>] sync_page+0x41/0x4a
[  222.309311]  [<ffffffff810ae62d>] sync_page_killable+0xe/0x35
[  222.309311]  [<ffffffff8158aa59>] __wait_on_bit_lock+0x46/0x8f
[  222.309311]  [<ffffffff810ae4f5>] __lock_page_killable+0x66/0x6d
[  222.309311]  [<ffffffff81056f9c>] ? wake_bit_function+0x0/0x33
[  222.309311]  [<ffffffff810ae528>] lock_page_killable+0x2c/0x2e
[  222.309311]  [<ffffffff810afbc5>] generic_file_aio_read+0x361/0x4f0
[  222.309311]  [<ffffffff810ea044>] do_sync_read+0xcb/0x108
[  222.309311]  [<ffffffff811e42f7>] ? security_file_permission+0x16/0x18
[  222.309311]  [<ffffffff810ea6ab>] vfs_read+0xab/0x108
[  222.309311]  [<ffffffff810ea7c8>] sys_read+0x4a/0x6e
[  222.309311]  [<ffffffff81002b5b>] system_call_fastpath+0x16/0x1b
[  222.309311] Code: 58 01 00 00 00 48 89 c6 75 0a 48 83 bb 60 01 00 00 00 74 09 48 8d bb a0 00 00 00 eb 35 41 fe cc 74 0d f6 83 c0 01 00 00 04 74 04 <0f> 0b eb fe 48 89 75 e8 e8 be e0 de ff 66 83 8b c0 01 00 00 04
[  222.309311] RIP  [<ffffffff8121ad88>] blkiocg_set_start_empty_time+0x50/0x83
[  222.309311]  RSP <ffff8800ba6e79f8>
[  222.309311] ---[ end trace 32b4f71dffc15712 ]---

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Divyesh Shah <dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-26 19:25:11 +02:00
Jens Axboe
e6d086d83c btrfs: convert to using bdi_setup_and_register()
It's now a provided helper, so get rid of the internal setup
and btrfs atomic_t bdi enumerator.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-26 10:27:54 +02:00
Herbert Xu
180ce7e810 crypto: authenc - Add EINPROGRESS check
When Steffen originally wrote the authenc async hash patch, he
correctly had EINPROGRESS checks in place so that we did not invoke
the original completion handler with it.

Unfortuantely I told him to remove it before the patch was applied.

As only MAY_BACKLOG request completion handlers are required to
handle EINPROGRESS completions, those checks are really needed.

This patch restores them.

Reported-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-04-26 09:14:05 +08:00
Linus Torvalds
b91ce4d14a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipv6: Fix inet6_csk_bind_conflict()
  e100: Fix the TX workqueue race
2010-04-25 16:28:56 -07:00
Eric Dumazet
6443bb1fc2 ipv6: Fix inet6_csk_bind_conflict()
Commit fda48a0d7a (tcp: bind() fix when many ports are bound)
introduced a bug on IPV6 part.
We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but
ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is
IPV6.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-25 15:09:42 -07:00
Linus Torvalds
202f2bb070 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Issue the discard operation *before* releasing the blocks to be reused
  ext4: Fix buffer head leaks after calls to ext4_get_inode_loc()
  ext4: Fix possible lost inode write in no journal mode
2010-04-25 10:01:51 -07:00
Jörn Engel
5129a469a9 Catch filesystems lacking s_bdi
noop_backing_dev_info is used only as a flag to mark filesystems that
don't have any backing store, like tmpfs, procfs, spufs, etc.

Signed-off-by: Joern Engel <joern@logfs.org>

Changed the BUG_ON() to a WARN_ON(). Note that adding dirty inodes
to the noop_backing_dev_info is not legal and will not result in
them being flushed, but we already catch this condition in
__mark_inode_dirty() when checking for a registered bdi.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-25 08:54:42 +02:00
Alan Cox
401da6aea3 e100: Fix the TX workqueue race
Nothing stops the workqueue being left to run in parallel with close or a
few other operations. This causes double unmaps and the like.

See kerneloops.org #1041230 for an example

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-24 21:09:29 -07:00
Phillip Lougher
e0d1f70010 squashfs: fix potential buffer over-run on 4K block file systems
Sizing the buffer based on block size is incorrect, leading
to a potential buffer over-run on 4K block size file systems
(because the metadata block size is always 8K).  This bug
doesn't seem have triggered because 4K block size file systems
are not default, and also because metadata blocks after
compression tend to be less than 4K.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
2010-04-25 02:09:05 +01:00
Phillip Lougher
370ec3d1ed squashfs: add missing buffer free
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
2010-04-25 02:09:05 +01:00
Phillip Lougher
1cb08e9738 squashfs: fix warn_on when root inode is corrupted
Fix warn_on triggered by mounting a fsfuzzer corrupted file system, where
the root inode has been corrupted.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Steve Grubb <sgrubb@redhat.com>
2010-04-25 01:49:17 +01:00
Linus Torvalds
ddc9b34c3b Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] use max load in conservative governor
  [CPUFREQ] fix a lockdep warning
2010-04-24 11:35:21 -07:00
Linus Torvalds
8e500ff8df Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
  gianfar: Fix potential oops during OF address translation
  fsl_pq_mdio: Fix kernel oops during OF address translation
  tcp: bind() fix when many ports are bound
  rdma: potential ERR_PTR dereference
  rtnetlink: potential ERR_PTR dereference
  net: ipv6 bind to device issue
  ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU field less than IPV6_MIN_MTU
  drivers/net/usb: Add new driver ipheth
  cxgb3: fix linkup issue
  X25 fix dead unaccepted sockets
  KS8851: NULL pointer dereference if list is empty
  net: 3c574_cs fix stats.tx_bytes counter
  xfrm6: ensure to use the same dev when building a bundle
  can: Fix possible NULL pointer dereference in ems_usb.c
  net: Fix an RCU warning in dev_pick_tx()
  ipv6: Fix tcp_v6_send_response transport header setting.
  bridge: add a missing ntohs()
  8139too: Fix a typo in the function name.
  mac80211: pass HT changes to driver when off channel
  mac80211: remove bogus TX agg state assignment
  ...
2010-04-24 11:34:17 -07:00
Linus Torvalds
383bee6b54 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: Ensure we re-enable devices on resume
  x86/PCI: parse additional host bridge window resource types
  PCI: revert broken device warning
  PCI aerdrv: use correct bit defines and add 2ms delay to aer_root_reset
  x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions
2010-04-24 11:32:12 -07:00
Linus Torvalds
b39c8be6d5 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  eeepc-laptop: add missing sparse_keymap_free
  eeepc-wmi: Build fix
  asus: don't modify bluetooth/wlan on boot
  dell-wmi: Fix memory leak
  eeepc-wmi: add backlight support
  eeepc-wmi: use a platform device as parent device of all sub-devices
  eeepc-wmi: add an eeepc_wmi context structure
2010-04-24 11:31:57 -07:00
Phillip Lougher
df37bd156d initramfs: handle unrecognised decompressor when unpacking
The unpack routine fails to handle the decompress_method() returning
unrecognised decompressor (compress_name == NULL).  This results in the
routine looping eventually oopsing on an out of bounds memory access.

Note this bug is usually hidden, only triggering on trailing junk after
one or more correct compressed blocks.  The case of the compressed archive
being complete junk is (by accident?) caught by the if (state != Reset)
check because state is initialised to Start, but not updated due to the
decompressor not having been called.  Obviously if the junk is trailing a
correctly decompressed buffer, state == Reset from the previous call to
the decompressor.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Dan Carpenter
22eccdd7d2 ksm: check for ERR_PTR from follow_page()
The follow_page() function can potentially return -EFAULT so I added
checks for this.

Also I silenced an uninitialized variable warning on my version of gcc
(version 4.3.2).

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Izik Eidus <ieidus@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Dmitry Torokhov
453dc65931 VMware Balloon driver
This is a standalone version of VMware Balloon driver.  Ballooning is a
technique that allows hypervisor dynamically limit the amount of memory
available to the guest (with guest cooperation).  In the overcommit
scenario, when hypervisor set detects that it needs to shuffle some
memory, it instructs the driver to allocate certain number of pages, and
the underlying memory gets returned to the hypervisor.  Later hypervisor
may return memory to the guest by reattaching memory to the pageframes and
instructing the driver to "deflate" balloon.

We are submitting a standalone driver because KVM maintainer (Avi Kivity)
expressed opinion (rightly) that our transport does not fit well into
virtqueue paradigm and thus it does not make much sense to integrate with
virtio.

There were also some concerns whether current ballooning technique is the
right thing.  If there appears a better framework to achieve this we are
prepared to evaluate and switch to using it, but in the meantime we'd like
to get this driver upstream.

We want to get the driver accepted in distributions so that users do not
have to deal with an out-of-tree module and many distributions have
"upstream first" requirement.

The driver has been shipping for a number of years and users running on
VMware platform will have it installed as part of VMware Tools even if it
will not come from a distribution, thus there should not be additional
risk in pulling the driver into mainline.  The driver will only activate
if host is VMware so everyone else should not be affected at all.

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Anton Blanchard
b8af67e268 fs/block_dev.c: fix performance regression in O_DIRECT|O_SYNC writes to block devices
We are seeing a large regression in database performance on recent
kernels.  The database opens a block device with O_DIRECT|O_SYNC and a
number of threads write to different regions of the file at the same time.

A simple test case is below.  I haven't defined DEVICE since getting it
wrong will destroy your data :) On an 3 disk LVM with a 64k chunk size we
see about 17MB/sec and only a few threads in IO wait:

procs  -----io---- -system-- -----cpu------
 r  b     bi    bo   in   cs us sy id wa st
 0  3      0 16170  656 2259  0  0 86 14  0
 0  2      0 16704  695 2408  0  0 92  8  0
 0  2      0 17308  744 2653  0  0 86 14  0
 0  2      0 17933  759 2777  0  0 89 10  0

Most threads are blocking in vfs_fsync_range, which has:

        mutex_lock(&mapping->host->i_mutex);
        err = fop->fsync(file, dentry, datasync);
        if (!ret)
                ret = err;
        mutex_unlock(&mapping->host->i_mutex);

commit 148f948ba8 (vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode) offers
some explanation of what is going on:

    Use these new helpers for syncing from generic VFS functions. This makes
    O_SYNC writes to block devices acquire i_mutex for syncing. If we really
    care about this, we can make block_fsync() drop the i_mutex and reacquire
    it before it returns.

Thanks Jan for such a good commit message!  As well as dropping i_mutex,
Christoph suggests we should remove the call to sync_blockdev():

> sync_blockdev is an overcomplicated alias for filemap_write_and_wait on
> the block device inode, which is exactly what we did just before calling
> into ->fsync

The patch below incorporates both suggestions. With it the testcase improves
from 17MB/s to 68M/sec:

procs  -----io---- -system-- -----cpu------
 r  b     bi    bo   in   cs us sy id wa st
 0  7      0 65536 1000 3878  0  0 70 30  0
 0 34      0 69632 1016 3921  0  1 46 53  0
 0 57      0 69632 1000 3921  0  0 55 45  0
 0 53      0 69640  754 4111  0  0 81 19  0

Testcase:

#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define NR_THREADS 64
#define BUFSIZE (64 * 1024)

#define DEVICE "/dev/mapper/XXXXXX"

#define ALIGN(VAL, SIZE) (((VAL)+(SIZE)-1) & ~((SIZE)-1))

static int fd;

static void *doit(void *arg)
{
	unsigned long offset = (long)arg;
	char *b, *buf;

	b = malloc(BUFSIZE + 1024);
	buf = (char *)ALIGN((unsigned long)b, 1024);
	memset(buf, 0, BUFSIZE);

	while (1)
		pwrite(fd, buf, BUFSIZE, offset);
}

int main(int argc, char *argv[])
{
	int flags = O_RDWR|O_DIRECT;
	int i;
	unsigned long offset = 0;

	if (argc > 1 && !strcmp(argv[1], "O_SYNC"))
		flags |= O_SYNC;

	fd = open(DEVICE, flags);
	if (fd == -1) {
		perror("open");
		exit(1);
	}

	for (i = 0; i < NR_THREADS-1; i++) {
		pthread_t tid;
		pthread_create(&tid, NULL, doit, (void *)offset);
		offset += BUFSIZE;
	}
	doit((void *)offset);

	return 0;
}

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Hans Verkuil
98d5ce0d00 lib/vsprintf.c: add missing EXPORT_SYMBOL(simple_strtoll)
Add a missing EXPORT_SYMBOL.

I must be the first person that wants to use this function :-)

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Amit Kucheria
81fa08f25b w1: fix omap 1-wire driver compilation
Fixes the following error:

  drivers/w1/masters/omap_hdq.c: In function 'hdq_wait_for_flag':
  drivers/w1/masters/omap_hdq.c:137: error: implicit declaration of function 'schedule_timeout_uninterruptible'
  drivers/w1/masters/omap_hdq.c: In function 'hdq_write_byte':
  drivers/w1/masters/omap_hdq.c:177: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function)
  drivers/w1/masters/omap_hdq.c:177: error: (Each undeclared identifier is reported only once
  drivers/w1/masters/omap_hdq.c:177: error: for each function it appears in.)
  drivers/w1/masters/omap_hdq.c:177: error: implicit declaration of function 'schedule_timeout'
  drivers/w1/masters/omap_hdq.c: In function 'hdq_isr':
  drivers/w1/masters/omap_hdq.c:221: error: 'TASK_NORMAL' undeclared (first use in this function)
  drivers/w1/masters/omap_hdq.c: In function 'omap_hdq_break':
  drivers/w1/masters/omap_hdq.c:316: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function)

Signed-off-by: Amit Kucheria <amit.kucheria@canonical.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:25 -07:00