Commit Graph

132045 Commits

Author SHA1 Message Date
Ingo Molnar
f701d35407 Merge branches 'tracing/ftrace' and 'linus' into tracing/core
2009-02-27 09:04:43 +01:00
Ingo Molnar
1b49061d40 Merge branch 'sched/clock' into tracing/ftrace
Conflicts:
	kernel/sched_clock.c
2009-02-27 08:35:19 +01:00
Ingo Molnar
83ce400928 x86: set X86_FEATURE_TSC_RELIABLE
If the TSC is constant and non-stop, also set it reliable.

(We will turn this off in DMI quirks for multi-chassis systems)

The performance number on a 16-way Nehalem system running
32 tasks that context-switch between each other is significant:

   sched_clock_stable=0		sched_clock_stable=1
   ....................         ....................
   22.456925 million/sec        24.306972 million/sec   [+8.2%]

lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to
0.59 microseconds - a 6.7% increase in context-switching
performance.

Perfstat of 1 million pipe context switches between two tasks:

 Performance counter stats for './pipe-test-1m':

       [before]           [after]
   ............      ............
   37621.421089      36436.848378    task clock ticks     (msecs)

              0                 0    CPU migrations       (events)
        2000274           2000189    context switches     (events)
            194               193    pagefaults           (events)
     8433799643        8171016416    CPU cycles           (events) -3.21%
     8370133368        8180999694    instructions         (events) -2.31%
        4158565           3895941    cache references     (events) -6.74%
          44312             46264    cache misses         (events)

    2349.287976       2279.362465    wall-time            (msecs)  -3.06%

The speedup comes straight from the reduction in the instruction
count. sched_clock_cpu() got simpler and the whole workload thus
executes faster.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 21:20:25 +01:00
Ingo Molnar
b342501cd3 sched: allow architectures to specify sched_clock_stable
Allow CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures to still specify
that their sched_clock() implementation is reliable.

This will be used by x86 to switch on a faster sched_clock_cpu()
implementation on certain CPU types.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 21:20:22 +01:00
Linus Torvalds
64e71303e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: try committing transaction before returning ENOSPC
  Btrfs: add better -ENOSPC handling
2009-02-26 10:37:00 -08:00
Linus Torvalds
babb29b0a3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  xen/blkfront: use blk_rq_map_sg to generate ring entries
  block: reduce stack footprint of blk_recount_segments()
  cciss: shorten 30s timeout on controller reset
  block: add documentation for register_blkdev()
  block: fix bogus gcc warning for uninitialized var usage
2009-02-26 10:36:35 -08:00
Linus Torvalds
6fc79d40d3 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix 64bit __copy_tofrom_user() regression
  powerpc: Fix 64bit memcpy() regression
  powerpc: Fix load/store float double alignment handler
2009-02-26 10:36:19 -08:00
Linus Torvalds
86883c2736 Make ieee1394_init a fs-initcall
It needs to happen before any firewire driver actually registers itself,
and that was previously handled by having the Makefile list the core
ieee1394 files before the drivers.

But now there are firewire drivers in drivers/media, and the Makefile
games aren't enough.  So just make ieee1394_init happen earlier in the
init sequence, the way all other bus layers already do.

Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Henrik Kurelid <henrik@kurelid.se>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Ben Backx <ben@bbackx.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-26 10:32:31 -08:00
Ingo Molnar
14131f2f98 tracing: implement trace_clock_*() APIs
Impact: implement new tracing timestamp APIs

Add three trace clock variants, with differing scalability/precision
tradeoffs:

 -   local: CPU-local trace clock
 -  medium: scalable global clock with some jitter
 -  global: globally monotonic, serialized clock

Make the ring-buffer use the local trace clock internally.
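
As a rough illustration of the "global" variant's tradeoff, the sketch
below serializes every reading behind a lock and never lets the returned
timestamp go backwards. All names here (global_clock_sketch, clock_lock,
prev_time) are hypothetical, and the raw reading is passed in as a
parameter; this is not the trace_clock_*() code itself.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical state for the sketch; not the kernel's implementation. */
static atomic_flag clock_lock = ATOMIC_FLAG_INIT;
static uint64_t prev_time;

/*
 * raw_now stands in for a CPU-local clock reading, which may lag behind
 * a reading already handed out on another CPU.
 */
static uint64_t global_clock_sketch(uint64_t raw_now)
{
    uint64_t now = raw_now;

    /* Serialize all callers: this is the scalability cost. */
    while (atomic_flag_test_and_set_explicit(&clock_lock,
                                             memory_order_acquire))
        ;

    /* Never return a value older than one already handed out. */
    if (now < prev_time)
        now = prev_time;
    prev_time = now;

    atomic_flag_clear_explicit(&clock_lock, memory_order_release);
    return now;
}
```

A purely local clock would skip both the lock and the comparison, which
is why it scales better but gives no ordering guarantee across CPUs.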

Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 18:44:06 +01:00
Ingo Molnar
6409c4da28 sched: sched_clock() improvement: use in_nmi()
Make sure we don't execute more complex sched_clock() code in NMI context.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 18:44:05 +01:00
Jason Baron
af39241b90 tracing, genirq: add irq enter and exit trace events
Impact: add new tracepoints

Add them to the generic IRQ code, that way every architecture
gets these new tracepoints, not just x86.

Using Steve's new 'TRACE_FORMAT', I can get function graph
trace as follows using the original two IRQ tracepoints:

 3)               |    handle_IRQ_event() {
 3)               |    /* (irq_handler_entry) irq=28 handler=eth0 */
 3)               |    e1000_intr_msi() {
 3)   2.460 us    |      __napi_schedule();
 3)   9.416 us    |    }
 3)               |    /* (irq_handler_exit) irq=28 handler=eth0 return=handled */
 3) + 22.935 us   |  }

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 18:43:50 +01:00
Frederic Weisbecker
8656e7a2fa tracing/core: make the per cpu trace files in per cpu directories
Impact: restructure the VFS layout of per CPU trace buffers

The per cpu trace files are all in a single directory:
/debug/tracing/per_cpu. With a large number of cpus, the
content of this directory becomes messy, so we now create one
directory per cpu inside /debug/tracing/per_cpu, each of which
contains its own trace_pipe and trace files.

Ie:

 /debug/tracing$ ls -R per_cpu
 per_cpu:
 cpu0  cpu1

 per_cpu/cpu0:
 trace  trace_pipe

 per_cpu/cpu1:
 trace  trace_pipe

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 14:04:08 +01:00
Jens Axboe
9e973e64ac xen/blkfront: use blk_rq_map_sg to generate ring entries
On occasion, the request will apparently have more segments than we
fit into the ring. Jens says:

> The second problem is that the block layer then appears to create one
> too many segments, but from the dump it has rq->nr_phys_segments ==
> BLKIF_MAX_SEGMENTS_PER_REQUEST. I suspect the latter is due to
> xen-blkfront not handling the merging on its own. It should check that
> the new page doesn't form part of the previous page. The
> rq_for_each_segment() iterates all single bits in the request, not dma
> segments. The "easiest" way to do this is to call blk_rq_map_sg() and
> then iterate the mapped sg list. That will give you what you are
> looking for.

> Here's a test patch, compiles but otherwise untested. I spent more
> time figuring out how to enable XEN than to code it up, so YMMV!
> Probably the sg list wants to be put inside the ring and only
> initialized on allocation, then you can get rid of the sg on stack and
> sg_init_table() loop call in the function. I'll leave that, and the
> testing, to you.

[Moved sg array into info structure, and initialize once. -J]
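
The merging that the quoted text expects from a mapped sg list can be
sketched as follows. This is a simplified illustration only: the type
chunk_sketch and the function map_segments_sketch are hypothetical, and
the sketch ignores the segment size and boundary limits a real mapping
would have to respect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one piece of a request or one sg entry. */
struct chunk_sketch {
    uintptr_t addr;
    size_t len;
};

/*
 * Merge physically contiguous input chunks into segments.
 * "out" must have room for n entries. Returns the segment count.
 */
static size_t map_segments_sketch(const struct chunk_sketch *in, size_t n,
                                  struct chunk_sketch *out)
{
    size_t nsegs = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (nsegs &&
            out[nsegs - 1].addr + out[nsegs - 1].len == in[i].addr)
            out[nsegs - 1].len += in[i].len;  /* extends previous segment */
        else
            out[nsegs++] = in[i];             /* starts a new segment */
    }
    return nsegs;
}
```

Iterating over the merged list rather than over the individual pieces
yields a count that can be smaller than the number of pieces.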

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-26 10:45:48 +01:00
Jens Axboe
1e42807918 block: reduce stack footprint of blk_recount_segments()
blk_recalc_rq_segments() requires a request structure passed in, which
we don't have from blk_recount_segments(). So the latter allocates one on
the stack, using > 400 bytes of stack for that. This can cause us to spill
over one page of stack from ext4 at least:

 0)     4560     400   blk_recount_segments+0x43/0x62
 1)     4160      32   bio_phys_segments+0x1c/0x24
 2)     4128      32   blk_rq_bio_prep+0x2a/0xf9
 3)     4096      32   init_request_from_bio+0xf9/0xfe
 4)     4064     112   __make_request+0x33c/0x3f6
 5)     3952     144   generic_make_request+0x2d1/0x321
 6)     3808      64   submit_bio+0xb9/0xc3
 7)     3744      48   submit_bh+0xea/0x10e
 8)     3696     368   ext4_mb_init_cache+0x257/0xa6a [ext4]
 9)     3328     288   ext4_mb_regular_allocator+0x421/0xcd9 [ext4]
10)     3040     160   ext4_mb_new_blocks+0x211/0x4b4 [ext4]
11)     2880     336   ext4_ext_get_blocks+0xb61/0xd45 [ext4]
12)     2544      96   ext4_get_blocks_wrap+0xf2/0x200 [ext4]
13)     2448      80   ext4_da_get_block_write+0x6e/0x16b [ext4]
14)     2368     352   mpage_da_map_blocks+0x7e/0x4b3 [ext4]
15)     2016     352   ext4_da_writepages+0x2ce/0x43c [ext4]
16)     1664      32   do_writepages+0x2d/0x3c
17)     1632     144   __writeback_single_inode+0x162/0x2cd
18)     1488      96   generic_sync_sb_inodes+0x1e3/0x32b
19)     1392      16   sync_sb_inodes+0xe/0x10
20)     1376      48   writeback_inodes+0x69/0xb3
21)     1328     208   balance_dirty_pages_ratelimited_nr+0x187/0x2f9
22)     1120     224   generic_file_buffered_write+0x1d4/0x2c4
23)      896     176   __generic_file_aio_write_nolock+0x35f/0x393
24)      720      80   generic_file_aio_write+0x6c/0xc8
25)      640      80   ext4_file_write+0xa9/0x137 [ext4]
26)      560     320   do_sync_write+0xf0/0x137
27)      240      48   vfs_write+0xb3/0x13c
28)      192      64   sys_write+0x4c/0x74
29)      128     128   system_call_fastpath+0x16/0x1b

Split the segment counting out into a __blk_recalc_rq_segments() helper
to avoid allocating an onstack request just for checking the physical
segment count.
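
The shape of that split can be sketched as below. The types and names
(bio_sketch, request_sketch, count_segments_sketch) are hypothetical
stand-ins, the 400-byte padding only mimics a large structure, and the
counting is reduced to a plain sum; the real segment logic is more
involved.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the block-layer types. */
struct bio_sketch {
    unsigned int nr_vecs;
    struct bio_sketch *next;
};

struct request_sketch {
    struct bio_sketch *bio;
    char other_state[400];          /* illustrative bulk only */
    unsigned int nr_phys_segments;
};

/* The helper works on the bio chain directly, so no request is needed. */
static unsigned int count_segments_sketch(const struct bio_sketch *bio)
{
    unsigned int segs = 0;

    for (; bio; bio = bio->next)
        segs += bio->nr_vecs;
    return segs;
}

/* Callers that already have a request keep a thin wrapper. */
static void recalc_rq_segments_sketch(struct request_sketch *rq)
{
    rq->nr_phys_segments = count_segments_sketch(rq->bio);
}
```

A caller that only holds a bio calls the helper directly and never has
to place a request_sketch on its stack.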

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Jens Axboe
5e4c91c84b cciss: shorten 30s timeout on controller reset
If reset_devices is set for kexec, then cciss will delay 30 seconds
since the old 5i controller _may_ need that long to recover. Replace
the long sleep with incremental sleeps and tests, so that 30 seconds
is only the worst case for the 5i and other controllers proceed quickly.
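
A minimal sketch of the "incremental sleep and test" idea, assuming a
readiness test and a sleep routine are supplied by the caller. Every
name here is hypothetical, and the fake controller exists only so the
loop can be exercised without hardware.

```c
#include <stdbool.h>

/* Hypothetical callbacks: a readiness test and a sleep routine. */
typedef bool (*ready_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int msecs);

/* Returns the milliseconds waited, or -1 if max_ms elapsed first. */
static int wait_for_ready_sketch(ready_fn ready, void *ctx,
                                 sleep_fn do_sleep,
                                 unsigned int step_ms, unsigned int max_ms)
{
    unsigned int waited = 0;

    for (;;) {
        if (ready(ctx))
            return (int)waited;     /* fast controllers leave early */
        if (waited >= max_ms)
            return -1;              /* worst case reached */
        do_sleep(step_ms);
        waited += step_ms;
    }
}

/* Fake controller that becomes ready after a number of polls. */
struct fake_ctrl {
    unsigned int polls_until_ready;
};

static bool fake_ready(void *ctx)
{
    struct fake_ctrl *c = ctx;

    if (c->polls_until_ready == 0)
        return true;
    c->polls_until_ready--;
    return false;
}

/* No-op sleep so the sketch runs instantly. */
static void fake_sleep(unsigned int msecs)
{
    (void)msecs;
}
```

With a 100 ms step and a 30000 ms limit, a controller that is ready on
the third poll costs 200 ms instead of the full 30 seconds.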

Reviewed-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Márton Németh
9e8c0bccdc block: add documentation for register_blkdev()
Add documentation for register_blkdev() function and for the parameters.

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Jens Axboe
b2bf96833c block: fix bogus gcc warning for uninitialized var usage
Newer gcc throws this warning:

        fs/bio.c: In function 'bio_alloc_bioset':
        fs/bio.c:305: warning: 'p' may be used uninitialized in this function

since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL.
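
The pattern that triggers this kind of warning looks roughly like the
sketch below: a variable assigned under one condition and used later
under the same condition. The names are hypothetical, and the explicit
NULL initializer is just one common way to quiet the compiler; it is
not necessarily what this patch does.

```c
#include <stdlib.h>

/* Hypothetical one-slot pool standing in for a bio_set. */
struct pool_sketch {
    char slot[64];
    int in_use;
};

static void pool_put_sketch(struct pool_sketch *bs, void *p)
{
    if (p == bs->slot)
        bs->in_use = 0;
}

static int alloc_and_undo_sketch(struct pool_sketch *bs)
{
    void *p = NULL;     /* initializer added only to quiet the warning */
    void *obj;

    if (bs) {
        p = bs->slot;   /* p is assigned only when bs is non-NULL ... */
        bs->in_use = 1;
        obj = p;
    } else {
        obj = malloc(64);
        if (!obj)
            return -1;
    }

    /* Pretend a later setup step failed and the object must go back. */
    if (bs)
        pool_put_sketch(bs, p);   /* ... and used only when bs is non-NULL */
    else
        free(obj);
    return 0;
}
```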

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Mark Nelson
f72b728bf1 powerpc: Fix 64bit __copy_tofrom_user() regression
This fixes a regression introduced by commit
a4e22f02f5 ("powerpc: Update 64bit
__copy_tofrom_user() using CPU_FTR_UNALIGNED_LD_STD").

The same bug that existed in the 64bit memcpy() also exists here so fix
it here too. The fix is the same as that applied to memcpy() with the
addition of fixes for the exception handling code required for
__copy_tofrom_user().

This stops us reading beyond the end of the source region we were told
to copy.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:54 +11:00
Mark Nelson
e423b9ecd6 powerpc: Fix 64bit memcpy() regression
This fixes a regression introduced by commit
25d6e2d7c5 ("powerpc: Update 64bit memcpy()
using CPU_FTR_UNALIGNED_LD_STD").

This commit allowed CPUs that have the CPU_FTR_UNALIGNED_LD_STD CPU
feature bit present to do the memcpy() with unaligned load doubles. But,
along with this came a bug where our final load double would read bytes
beyond a page boundary and into the next (unmapped) page. This was caught
by enabling CONFIG_DEBUG_PAGEALLOC.

The fix was to read only the number of bytes that we need to store rather
than reading a full 8-byte doubleword and storing only a portion of that.

In order to minimise the amount of existing code touched we use the
original do_tail for the src_unaligned case.
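
In C terms, the idea of the fix can be sketched as below: the final
n < 8 bytes are copied in 4, 2 and 1 byte pieces, so no load ever
touches memory past the end of the source. The function name is
hypothetical and the real fix is in powerpc assembly; this only
illustrates the access pattern.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the final n (< 8) bytes by reading exactly n bytes,
 * instead of loading a full 8-byte doubleword and storing part of it.
 */
static void copy_tail_sketch(unsigned char *dst, const unsigned char *src,
                             size_t n)
{
    if (n & 4) {
        memcpy(dst, src, 4);
        dst += 4;
        src += 4;
    }
    if (n & 2) {
        memcpy(dst, src, 2);
        dst += 2;
        src += 2;
    }
    if (n & 1)
        *dst = *src;
}
```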

Below is an example of the regression, as reported by Sachin Sant:

Unable to handle kernel paging request for data at address 0xc00000003f380000
Faulting instruction address: 0xc000000000039574
cpu 0x1: Vector: 300 (Data Access) at [c00000003baf3020]
    pc: c000000000039574: .memcpy+0x74/0x244
    lr: d00000000244916c: .ext3_xattr_get+0x288/0x2f4 [ext3]
    sp: c00000003baf32a0
   msr: 8000000000009032
   dar: c00000003f380000
 dsisr: 40000000
  current = 0xc00000003e54b010
  paca    = 0xc000000000a53680
    pid   = 1840, comm = readahead
enter ? for help
[link register   ] d00000000244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
[c00000003baf32a0] d000000002449104 .ext3_xattr_get+0x220/0x2f4 [ext3] (unreliable)
[c00000003baf3390] d00000000244a6e8 .ext3_xattr_security_get+0x40/0x5c [ext3]
[c00000003baf3400] c000000000148154 .generic_getxattr+0x74/0x9c
[c00000003baf34a0] c000000000333400 .inode_doinit_with_dentry+0x1c4/0x678
[c00000003baf3560] c00000000032c6b0 .security_d_instantiate+0x50/0x68
[c00000003baf35e0] c00000000013c818 .d_instantiate+0x78/0x9c
[c00000003baf3680] c00000000013ced0 .d_splice_alias+0xf0/0x120
[c00000003baf3720] d00000000243e05c .ext3_lookup+0xec/0x134 [ext3]
[c00000003baf37c0] c000000000131e74 .do_lookup+0x110/0x260
[c00000003baf3880] c000000000134ed0 .__link_path_walk+0xa98/0x1010
[c00000003baf3970] c0000000001354a0 .path_walk+0x58/0xc4
[c00000003baf3a20] c000000000135720 .do_path_lookup+0x138/0x1e4
[c00000003baf3ad0] c00000000013645c .path_lookup_open+0x6c/0xc8
[c00000003baf3b70] c000000000136780 .do_filp_open+0xcc/0x874
[c00000003baf3d10] c0000000001251e0 .do_sys_open+0x80/0x140
[c00000003baf3dc0] c00000000016aaec .compat_sys_open+0x24/0x38
[c00000003baf3e30] c00000000000855c syscall_exit+0x0/0x40

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:53 +11:00
Michael Neuling
49f297f8df powerpc: Fix load/store float double alignment handler
When we introduced VSX, we changed the way FPRs are stored in the
thread_struct.  Unfortunately we missed the load/store float double
alignment handler code when updating how we access FPRs in the
thread_struct.

The patch below fixes this and merges the little/big endian cases.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:53 +11:00
Ingo Molnar
f4abfb8d0d Merge branch 'tip/tracing/ftrace' of ssh://master.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
2009-02-26 03:48:44 +01:00
Ingo Molnar
e36b1e136a Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'linus' into tracing/core
2009-02-26 03:47:27 +01:00
Steven Rostedt
3cdfdf91fc tracing: wrap arguments with PARAMS
Peter Zijlstra warned that TPPROTO and TPARGS might become something
other than a simple copy of itself. To prevent this from having
side effects in the TRACE_FORMAT macro in tracepoint.h, we add a
PARAMS() macro to be defined as just a wrapper.
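
Why such a wrapper matters when one macro forwards a comma-separated
list to another can be sketched as below. The DEFINE_INNER_SKETCH and
DEFINE_OUTER_SKETCH macros and the add() function are hypothetical, and
PARAMS is written in the standard C99 variadic form, which may differ
from the spelling used in tracepoint.h.

```c
/* A wrapper that expands to its arguments unchanged. */
#define PARAMS(...) __VA_ARGS__

/* Inner macro: builds a wrapper function from a prototype and arguments. */
#define DEFINE_INNER_SKETCH(name, proto, args) \
    static int wrap_##name(proto) { return name(args); }

/*
 * Outer macro: by the time proto and args are forwarded they have
 * expanded to bare comma-separated lists, so they are re-wrapped in
 * PARAMS() to stay single macro arguments for the inner macro.
 */
#define DEFINE_OUTER_SKETCH(name, proto, args) \
    DEFINE_INNER_SKETCH(name, PARAMS(proto), PARAMS(args))

static int add(int a, int b)
{
    return a + b;
}

/* Expands to: static int wrap_add(int a, int b) { return add(a, b); } */
DEFINE_OUTER_SKETCH(add, PARAMS(int a, int b), PARAMS(a, b))
```

Without the PARAMS() re-wrap in the outer macro, the inner macro would
be invoked with five arguments instead of three.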

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25 21:44:26 -05:00
Steven Rostedt
eef62a6826 tracing: rename DEFINE_TRACE_FMT to just TRACE_FORMAT
There's been a bit of confusion as to whether DEFINE/DECLARE_TRACE_FMT
should be a DEFINE or a DECLARE. Ingo Molnar suggested simply calling it
TRACE_FORMAT.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-25 21:44:22 -05:00
Linus Torvalds
169d418b12 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: emu10k1 - Fix digital/analog switch on audigy2 ZS
  ALSA: hda - Quirk for Acer Aspire 6530G
  ALSA: hda - add another MacBook Pro 3,1 SSID
  ALSA: fix excessive background noise introduced by OSS emulation rate shrink
  ALSA: aw2: do not grab every saa7146 based device
  ALSA: hda - Fix parse of init_verbs sysfs entry
  ALSA: pcxhr.h replace signed one-bit bitfields
2009-02-25 15:16:18 -08:00
Linus Torvalds
70c01f01a2 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Don't go beyond iosapic_intr_info's arraysize
  [IA64] Do not go beyond ARRAY_SIZE of unw.hash
  [IA64] enable setting DMAR on by default
2009-02-25 15:14:37 -08:00
Linus Torvalds
c4eb1bf63f Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] pata_legacy: for VLB 32bit PIO don't try tricks with slop
  [libata] pata_amd: program FIFO
  sata_mv: fix SoC interrupt breakage
  pata_it821x: resume from hibernation fails with RAID volume
2009-02-25 15:12:48 -08:00
Alan Cox
c55af1f5ab [libata] pata_legacy: for VLB 32bit PIO don't try tricks with slop
These devices are generally used with ATA anyway and it seems that some
ATAPI will need us to issue the right number of words.  Therefore as we
can't switch mid burst on VLB devices we should only use 32bit I/O for
suitable block sizes.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25 15:30:23 -05:00
Alan Cox
c48052cc36 [libata] pata_amd: program FIFO
With 32bit PIO we can use the posted write buffers, but only for 32bit I/O
cycles.  This means we must disable the FIFO for ATAPI where a final 16bit
cycle may occur.

Rework the FIFO logic so that we disable the FIFO then selectively
re-enable it when we set the timings on AMD devices.  Also fix a case
where we scribbled on PCI config 0x41 of Nvidia chips when we shouldn't.

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25 15:30:16 -05:00
Mark Lord
6be96ac15e sata_mv: fix SoC interrupt breakage
For some reason, sata_mv doesn't clear interrupt status during init
when it's running on an SoC host adapter.  If the bootloader has
touched the SATA controller before starting Linux, Linux can end up
enabling the SATA interrupt with events pending, which will cause the
interrupt to be marked as spurious and then be disabled, which then
breaks all further accesses to the controller.

This patch makes the SoC path clear interrupt status on init like in
the non-SoC case.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25 15:25:35 -05:00
Ondrej Zary
7ba07d16bd pata_it821x: resume from hibernation fails with RAID volume
Hibernation didn't work for me since I started to use an IT8212 controller.
I did some debugging (booting with no_console_suspend init=/bin/sh).

Found that resume fails (2.6.28) with "serial number mismatch 'some
garbage' != 'some other garbage'" and "revalidation failed" messages.
That's because the controller firmware fills different serial number in
the IDENTIFY every boot.

The patch below fixes the resume by simply clearing the serial number.  The
proper fix would probably be to fill in the serial number of the RAID
volume instead.  I assume that there must be something like that stored on
the drives but I don't know where.

Fix resume on pata_it821x RAID volume by clearing the serial number in
IDENTIFY data, which is otherwise different on each boot.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-02-25 15:22:44 -05:00
Linus Torvalds
a36e4f0cab Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: fix refcounting in device drivers
  ide-cd: document capacity hack
  it821x: remove dead URL
  atiixp: fix missing parentheses
  amd74xx: device/vendor confusion
  ide: ide.c 'clear' fix, update "ide=nodma" documentation
2009-02-25 12:22:06 -08:00
Hugh Dickins
0b0a0806b0 shmem: fix shared anonymous accounting
Each time I exit Firefox, /proc/meminfo's Committed_AS goes down almost
400 kB: OVERCOMMIT_NEVER would be allowing overcommits it should
prohibit.

Commit fc8744adc8 "Stop playing silly
games with the VM_ACCOUNT flag" changed shmem_file_setup() to set the
shmem file's VM_ACCOUNT flag according to VM_NORESERVE not being set in
the vma flags; but did so only _after_ the shmem_acct_size(flags, size)
call which is expected to pre-account a shared anonymous object.

It's all clearer if we switch shmem.c over to use VM_NORESERVE
throughout in place of !VM_ACCOUNT.

But I very nearly sent in a patch which mistakenly removed the
accounting from tmpfs files: shmem_get_inode()'s memset was good for not
setting VM_ACCOUNT, but now it needs to set VM_NORESERVE.

Rather than setting that by default, then perhaps clearing it again in
shmem_file_setup(), let's pass it as a flag to shmem_get_inode(): that
allows us to remove the #ifdef CONFIG_SHMEM from shmem_file_setup().

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-25 12:21:42 -08:00
Roel Kluin
5b5923975f [IA64] Don't go beyond iosapic_intr_info's arraysize
vi arch/ia64/kernel/iosapic.c +142
static struct iosapic_intr_info {
	...
} iosapic_intr_info[NR_IRQS];

But at line 510 we have:
	for (i = 0; i <= NR_IRQS; i++) {

s/<=/</
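
For illustration, the corrected loop bound on a small stand-in array
(NR_SLOTS_SKETCH and slots are hypothetical names, not the ia64 code):

```c
#include <stddef.h>

#define NR_SLOTS_SKETCH 8

/* Valid indices are 0 .. NR_SLOTS_SKETCH - 1. */
static int slots[NR_SLOTS_SKETCH];

/* Returns the number of elements initialized. */
static size_t init_slots_sketch(void)
{
    size_t i;
    size_t touched = 0;

    /* "<" stops at the last element; "<=" would write one past it. */
    for (i = 0; i < NR_SLOTS_SKETCH; i++) {
        slots[i] = -1;
        touched++;
    }
    return touched;
}
```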

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-02-25 11:50:53 -08:00
Roel Kluin
aa2f63c954 [IA64] Do not go beyond ARRAY_SIZE of unw.hash
static struct {

... :114
        unsigned short hash[UNW_HASH_SIZE];

... :2152
	for (index = 0; index <= UNW_HASH_SIZE; ++index) {

This is a bug, isn't it?

s/<=/</

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-02-25 11:48:04 -08:00
Kyle McMartin
6b1ff036d4 [IA64] enable setting DMAR on by default
The previous commit which introduced the DMAR_DEFAULT_ON setting in
drivers/pci/dmar.c neglected to add the ability for ia64 to enable
the IOMMU by default. Rectify that mistake, doh!

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-02-25 11:40:27 -08:00
Bartlomiej Zolnierkiewicz
8fed436841 ide: fix refcounting in device drivers
During host driver module removal del_gendisk() results in a final
put on drive->gendev and freeing the drive by drive_release_dev().

Convert device drivers from using struct kref to use struct device
so device driver's object holds reference on ->gendev and prevents
drive from prematurely going away.

Also fix ->remove methods to not erroneously drop reference on a
host driver by using only put_device() instead of ide*_put().

Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
Tested-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:24 +01:00
Bartlomiej Zolnierkiewicz
d3dd7107f4 ide-cd: document capacity hack
Just copy the comment from drivers/scsi/sr.c::sr_done()
(from which the capacity hack originated).

Cc: Borislav Petkov <petkovbb@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:23 +01:00
Bartlomiej Zolnierkiewicz
f38344b0a0 it821x: remove dead URL
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:22 +01:00
Roel Kluin
f76bee16fc atiixp: fix missing parentheses
Fix missing parentheses so PIO/DMA timings for master device on the
second channel are programmed correctly (IOW "8 0 24 16" offset values
should be used instead of the current "8 0 16 16").

[ The bug went unnoticed because after PIO/DMA timings get programmed
  incorrectly for the third device they are overwritten with timings
  for the fourth device and since BIOS should also program timings for
  the third device everything should work fine until suspend/resume
  cycle or user requested transfer mode changes. ]
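
The two sets of offsets quoted above can be reproduced with a pair of
conditional expressions that differ only in their parentheses. The
expression below is an assumed reconstruction chosen to match the
"8 0 16 16" and "8 0 24 16" values, with dn as a hypothetical device
number 0..3; it is not copied from the driver.

```c
/* Without parentheses, "0 + (dn & 1)" becomes the condition of the
 * second "?:", so the whole thing parses as
 * (dn & 2) ? 16 : ((0 + (dn & 1)) ? 0 : 8). */
static int timing_shift_buggy(int dn)
{
    return (dn & 2) ? 16 : 0 + (dn & 1) ? 0 : 8;
}

/* With parentheses, the channel and device parts are added. */
static int timing_shift_fixed(int dn)
{
    return ((dn & 2) ? 16 : 0) + ((dn & 1) ? 0 : 8);
}
```

Only dn == 2, the master device on the second channel, differs between
the two: 16 in the first form and 24 in the second.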

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[bart: update patch description]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:22 +01:00
Roel Kluin
43a12216d3 amd74xx: device/vendor confusion
Device and vendor ids were confused

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:22 +01:00
David Fries
0af80c04e2 ide: ide.c 'clear' fix, update "ide=nodma" documentation
Documentation/kernel-parameters.txt
- ide=nodma is no longer valid.

drivers/ide/Kconfig
- The module is ide-core.ko not ide.

drivers/ide/ide.c
- It took me a while to figure out what the arguments %d.%d:%d to nodma
  module parameter meant, so I added a comment to each.
- Added a comment to each of the sscanf lines.
- There is a bug, if j is 0 it would previously clear all the other bits
  except the current device, changed in three different places.
  mask &= (1 << i) should be mask &= ~(1 << i).
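
The difference between the two expressions, shown on a plain unsigned
mask (the function names are hypothetical):

```c
/* Keeps ONLY bit i and clears every other bit. */
static unsigned int clear_bit_buggy(unsigned int mask, unsigned int i)
{
    mask &= (1u << i);
    return mask;
}

/* Clears ONLY bit i and keeps every other bit. */
static unsigned int clear_bit_fixed(unsigned int mask, unsigned int i)
{
    mask &= ~(1u << i);
    return mask;
}
```

Starting from 0xF with i = 1, the first form leaves 0x2 and the second
leaves 0xD.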

Signed-off-by: David Fries <david@fries.net>
[bart: s/disk/device/ in ide.c, beautify patch description]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25 20:28:21 +01:00
Linus Torvalds
c15d8a6499 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/i915: convert DRM_ERROR to DRM_DEBUG in phys object pwrite path
  drm/i915: make hw page ioremap use ioremap_wc
  drm: edid revision 0 is valid
  drm: Correct unbalanced drm_vblank_put() during mode setting.
  drm: disable encoders before re-routing them
  drm: Fix ordering of bit fields in EDID structure leading huge vsync values.
  drm: Fix shifts of EDID vsync offset/width fields.
  drm/i915: handle bogus VBT panel timing
  drm/i915: remove PLL debugging messages
2009-02-25 09:49:30 -08:00
Linus Torvalds
490213556a Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  md: avoid races when stopping resync.
  md/raid10:  Don't call bitmap_cond_end_sync when we are doing recovery.
  md/raid10:  Don't skip more than 1 bitmap-chunk at a time during recovery.
2009-02-25 09:34:27 -08:00
Linus Torvalds
f8dacde8c0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix crashes in jbusmc_print_dimm()
2009-02-25 09:31:56 -08:00
Linus Torvalds
60042600c5 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: fix endless "Unknown DMAR structure type" loop
  VT-d: handle Invalidation Queue Error to avoid system hang
  intel-iommu: fix build error with INTR_REMAP=y and DMAR=n
2009-02-25 09:31:21 -08:00
Fenghua Yu
6aa03ab069 Fix iwlan DMA mapping direction
When iwlan runs on IOMMU, IOMMU generates a lot of PTE write faults
because PTE write bit is not set on some of PTE's.  This is because
iwlan driver calls DMA mapping with PCI_DMA_TODEVICE which is read only
in mapping PTE.  But iwlan device actually writes to the mapped page to
update its contents.  This issue is not exposed in swiotlb.  But VT-d
hardware can capture this fault and stop the fault transaction.

The following patch fixes the issue.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Bhavesh Davda <bhavesh@vmware.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-25 09:30:56 -08:00
Frederic Weisbecker
d7350c3f45 tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.

Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.

Ie: it means that if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, the access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.

The trace_types_lock mutex is mostly here to serialize the access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.

The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to be changed often, so we use a branch
prediction hint for that.
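
A simplified sketch of that copy-and-recheck scheme follows. All names
carry a _sketch suffix because they are hypothetical, locking around
the global pointer is omitted, and the hint macro relies on the
GCC/Clang builtin __builtin_expect.

```c
/* Branch hint: the condition is expected to be false. */
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

struct tracer_sketch {
    const char *name;
    int (*read)(void);
};

/* Global "current tracer" that may be switched at any time. */
static struct tracer_sketch *current_trace_sketch;

struct iter_sketch {
    struct tracer_sketch trace;     /* private copy, not a pointer */
};

static int read_a(void) { return 1; }
static int read_b(void) { return 2; }

static struct tracer_sketch tracer_a = { "a", read_a };
static struct tracer_sketch tracer_b = { "b", read_b };

static void iter_open_sketch(struct iter_sketch *it)
{
    it->trace = *current_trace_sketch;
}

static int iter_read_sketch(struct iter_sketch *it)
{
    /* Rarely true: refresh the copy only when the tracer was switched. */
    if (unlikely_sketch(it->trace.name != current_trace_sketch->name))
        it->trace = *current_trace_sketch;

    /* Callbacks are taken from the iterator's own copy. */
    return it->trace.read();
}
```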

Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize the accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
readers' accesses. This is to prevent weird trace output in
case of concurrent consumers. But these mutexes can be dropped
anyway, that would not result in any crash. Just tell me what
you think about it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 13:40:58 +01:00
Frederic Weisbecker
b04cc6b1f6 tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu

Currently, in the tracing debugfs directory, three files are
available to let the user extract the trace output:

- trace is an iterator through the ring-buffer. It's a reader
  but not a consumer. It doesn't block when no more traces are
  available.

- trace pretty similar to the former, except that it adds more
  information such as preempt count, irq flag, ...

- trace_pipe is a reader and a consumer, it will also block
  waiting for traces if necessary (heh, yes it's a pipe).

The traces coming from different cpus are currently mixed up
inside these files. Sometimes it messes up the information,
sometimes it's useful, depending on what the tracer
captures.

The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But still it is not powerful enough to extract one trace buffer
per cpu at the same time.

So this patch creates a new directory: /debug/tracing/per_cpu/.

Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.

Which means if you have two cpus, you will have:

 trace0
 trace1
 trace_pipe0
 trace_pipe1

And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.

The original all-in-one cpu trace files are still available in
their original place.

Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach changed a
bit, you can have only one consumer per cpu.

Which means you are allowed to read concurrently trace_pipe0 and
trace_pipe1. But you can't have two readers on trace_pipe0 or
trace_pipe1.

Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is already
in essence a consumer of all cpu buffers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 13:40:58 +01:00
Ingo Molnar
2b1b858f69 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
2009-02-25 12:50:07 +01:00