Commit Graph

1783 Commits

Author SHA1 Message Date
Artem Bityutskiy
f3dc0e3842 lib/list_sort: test: improve errors handling
The 'lib_sort()' test does not free memory if it fails, and it makes the
kernel panic if it cannot allocate memory.  This patch fixes the problem.

This patch also changes several small things:
 o use 'list_add()' helper instead of adding manually
 o introduce temporary 'el1' variable to avoid ugly and unreadalbe
   "if" statement
 o make 'head' to be stack variable instead of 'kmalloc()'ed, which
   simplifies code a bit

Overall, this patch is of clean-up type.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:19 -07:00
Artem Bityutskiy
eeee9ebb54 lib/list_sort: test: use generic random32
Instead of using own pseudo-random generator, use generic linux
'random32()' function.  Presumably, this should improve test coverage.

At the same time, do the following changes:
  o Use shorter macro name for test list length
  o Do not use strange 'l_h' name for 'struct list_head' element,
    use 'list', because it is traditional name and thus, makes the
    code more obvious and readable.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:19 -07:00
Artem Bityutskiy
bb2ab10fa6 lib/list_sort: test: use more reasonable printk levels
I do not see any reason to use KERN_WARN for normal messages and
KERN_EMERG for error messages in the lib_sort testing routine.  Let's use
more reasonable KERN_NORM and KERN_ERR levels.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:18 -07:00
Artem Bityutskiy
6d411e6c01 lib/Kconfig.debug: add list_sort debugging switch
While hunting a non-existing bug in 'list_sort()', I've improved the
'list_sort_test()' function which tests the 'list_sort()' library call.
Although at the end I found a bug in my code, but not in 'list_sort()', I
think my clean-ups and improvements are worth merging because they make
the test function better.

This patch:

Make the self-tests selectable via Kconfig rather than by manual enabling
of DEBUG_LIST_SORT.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:18 -07:00
Tejun Heo
e2852ae825 percpu_counter: add debugobj support
All percpu counters are linked to a global list on initialization and
removed from it on destruction.  The list is walked during CPU up/down.
If a percpu counter is freed without being properly destroyed, the system
will oops only on the next CPU up/down making it pretty nasty to track
down.  This patch adds debugobj support for percpu counters so that such
problems can be found easily.

As percpu counters don't make sense on stack and can't be statically
initialized, debugobj support is pretty simple.  It's initialized and
activated on counter initialization, and deactivatd and destroyed on
counter destruction.  With this patch applied, the bug fixed by commit
602586a83b (shmem: put_super must
percpu_counter_destroy) triggers the following warning on tmpfs unmount
and the system won't oops on the next cpu up/down operation.

 ------------[ cut here ]------------
 WARNING: at lib/debugobjects.c:259 debug_print_object+0x5c/0x70()
 Hardware name: Bochs
 ODEBUG: free active (active state 0) object type: percpu_counter
 Modules linked in:
 Pid: 3999, comm: umount Not tainted 2.6.36-rc2-work+ #5
 Call Trace:
  [<ffffffff81083f7f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff81084076>] warn_slowpath_fmt+0x46/0x50
  [<ffffffff813b45cc>] debug_print_object+0x5c/0x70
  [<ffffffff813b50e5>] debug_check_no_obj_freed+0x125/0x210
  [<ffffffff811577d3>] kfree+0xb3/0x2f0
  [<ffffffff81132edd>] shmem_put_super+0x1d/0x30
  [<ffffffff81162e96>] generic_shutdown_super+0x56/0xe0
  [<ffffffff81162f86>] kill_anon_super+0x16/0x60
  [<ffffffff81162ff7>] kill_litter_super+0x27/0x30
  [<ffffffff81163295>] deactivate_locked_super+0x45/0x60
  [<ffffffff81163cfa>] deactivate_super+0x4a/0x70
  [<ffffffff8117d446>] mntput_no_expire+0x86/0xe0
  [<ffffffff8117df7f>] sys_umount+0x6f/0x360
  [<ffffffff8103f01b>] system_call_fastpath+0x16/0x1b
 ---[ end trace cce2a341ba3611a7 ]---

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglxlinutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:18 -07:00
Naohiro Aota
066a9be6c0 idr: fix idr_pre_get() locking description
Despite the idr_pre_get() kernel-doc, there are some cases where you can
call idr_pre_get() from within locked regions.  Add a description for such
cases.

See also: http://lkml.org/lkml/2010/9/16/462

[akpm@linux-foundation.org: cleanups, grammatical fixes]
Signed-off-by: Naohiro Aota <naota@elisp.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:18 -07:00
Andy Shevchenko
66f1991bc2 lib/bitmap.c: use hex_to_bin()
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:18 -07:00
Changli Gao
b903c0b889 lib: fix scnprintf() if @size is == 0
scnprintf() should return 0 if @size is == 0. Update the comment for it,
as @size is unsigned.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:16 -07:00
Joe Perches
5e05798128 vsprintf.c: use default pointer field size for "(null)" strings
It might be nicer to align the output.

For instance, ACPI messages sometimes have "(null)" pointers.

$ dmesg | grep "(null)"  -A 1 -B 1
[    0.198733] ACPI: Dynamic OEM Table Load:
[    0.198745] ACPI: SSDT (null) 00239 (v02  PmRef  Cpu0Ist 00003000 INTL 20051117)
[    0.199294] ACPI: SSDT 7f596e10 001C7 (v02  PmRef  Cpu0Cst 00003001 INTL 20051117)
[    0.200708] ACPI: Dynamic OEM Table Load:
[    0.200721] ACPI: SSDT (null) 001C7 (v02  PmRef  Cpu0Cst 00003001 INTL 20051117)
[    0.201950] ACPI: SSDT 7f597f10 000D0 (v02  PmRef  Cpu1Ist 00003000 INTL 20051117)
[    0.203386] ACPI: Dynamic OEM Table Load:
[    0.203398] ACPI: SSDT (null) 000D0 (v02  PmRef  Cpu1Ist 00003000 INTL 20051117)
[    0.203871] ACPI: SSDT 7f595f10 00083 (v02  PmRef  Cpu1Cst 00003000 INTL 20051117)
[    0.205301] ACPI: Dynamic OEM Table Load:
[    0.205315] ACPI: SSDT (null) 00083 (v02  PmRef  Cpu1Cst 00003000 INTL 20051117)

[akpm@linux-foundation.org: add code comment]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:16 -07:00
Masanori ITOH
8474b591fa percpu: fix list_head init bug in __percpu_counter_init()
WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
Hardware name: Express5800/B120a [N8400-085]
list_add corruption. next->prev should be prev (ffffffff81a7ea00), but was dead000000200200. (next=ffff88080b872d58).
Modules linked in: aoe ipt_MASQUERADE iptable_nat nf_nat autofs4 sunrpc bridge 8021q garp stp llc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_round_robin dm_multipath kvm_intel kvm uinput lpfc scsi_transport_fc igb ioatdma scsi_tgt i2c_i801 i2c_core dca iTCO_wdt iTCO_vendor_support pcspkr shpchp megaraid_sas [last unloaded: aoe]
Pid: 54, comm: events/3 Tainted: G        W  2.6.34-vanilla1 #1
Call Trace:
[<ffffffff8104bd77>] warn_slowpath_common+0x7c/0x94
[<ffffffff8104bde6>] warn_slowpath_fmt+0x41/0x43
[<ffffffff8120fd2e>] __list_add+0x3f/0x81
[<ffffffff81212a12>] __percpu_counter_init+0x59/0x6b
[<ffffffff810d8499>] bdi_init+0x118/0x17e
[<ffffffff811f2c50>] blk_alloc_queue_node+0x79/0x143
[<ffffffff811f2d2b>] blk_alloc_queue+0x11/0x13
[<ffffffffa02a931d>] aoeblk_gdalloc+0x8e/0x1c9 [aoe]
[<ffffffffa02aa655>] aoecmd_sleepwork+0x25/0xa8 [aoe]
[<ffffffff8106186c>] worker_thread+0x1a9/0x237
[<ffffffffa02aa630>] ? aoecmd_sleepwork+0x0/0xa8 [aoe]
[<ffffffff81065827>] ? autoremove_wake_function+0x0/0x39
[<ffffffff810616c3>] ? worker_thread+0x0/0x237
[<ffffffff810653ad>] kthread+0x7f/0x87
[<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
[<ffffffff8106532e>] ? kthread+0x0/0x87
[<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10

It's because there is no initialization code for a list_head contained in
the struct backing_dev_info under CONFIG_HOTPLUG_CPU, and the bug comes up
when block device drivers calling blk_alloc_queue() are used.  In case of
me, I got them by using aoe.

Signed-off-by: Masanori Itoh <itoumsn@nttdata.co.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:04 -07:00
Linus Torvalds
229aebb873 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Update broken web addresses in arch directory.
  Update broken web addresses in the kernel.
  Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget
  Revert "Fix typo: configuation => configuration" partially
  ida: document IDA_BITMAP_LONGS calculation
  ext2: fix a typo on comment in ext2/inode.c
  drivers/scsi: Remove unnecessary casts of private_data
  drivers/s390: Remove unnecessary casts of private_data
  net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data
  drivers/infiniband: Remove unnecessary casts of private_data
  drivers/gpu/drm: Remove unnecessary casts of private_data
  kernel/pm_qos_params.c: Remove unnecessary casts of private_data
  fs/ecryptfs: Remove unnecessary casts of private_data
  fs/seq_file.c: Remove unnecessary casts of private_data
  arm: uengine.c: remove C99 comments
  arm: scoop.c: remove C99 comments
  Fix typo configue => configure in comments
  Fix typo: configuation => configuration
  Fix typo interrest[ing|ed] => interest[ing|ed]
  Fix various typos of valid in comments
  ...

Fix up trivial conflicts in:
	drivers/char/ipmi/ipmi_si_intf.c
	drivers/usb/gadget/rndis.c
	net/irda/irnet/irnet_ppp.c
2010-10-24 13:41:39 -07:00
Pekka Enberg
6d4121f6c2 Merge branch 'master' into for-linus
Conflicts:
	include/linux/percpu.h
	mm/percpu.c
2010-10-24 19:57:05 +03:00
Linus Torvalds
b9da057105 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (31 commits)
  driver core: Display error codes when class suspend fails
  Driver core: Add section count to memory_block struct
  Driver core: Add mutex for adding/removing memory blocks
  Driver core: Move find_memory_block routine
  hpilo: Despecificate driver from iLO generation
  driver core: Convert link_mem_sections to use find_memory_block_hinted.
  driver core: Introduce find_memory_block_hinted which utilizes kset_find_obj_hinted.
  kobject: Introduce kset_find_obj_hinted.
  driver core: fix build for CONFIG_BLOCK not enabled
  driver-core: base: change to new flag variable
  sysfs: only access bin file vm_ops with the active lock
  sysfs: Fail bin file mmap if vma close is implemented.
  FW_LOADER: fix kconfig dependency warning on HOTPLUG
  uio: Statically allocate uio_class and use class .dev_attrs.
  uio: Support 2^MINOR_BITS minors
  uio: Cleanup irq handling.
  uio: Don't clear driver data
  uio: Fix lack of locking in init_uio_class
  SYSFS: Allow boot time switching between deprecated and modern sysfs layout
  driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices
  ...
2010-10-22 19:36:42 -07:00
Linus Torvalds
092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds
5704e44d28 Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  BKL: introduce CONFIG_BKL.
  dabusb: remove the BKL
  sunrpc: remove the big kernel lock
  init/main.c: remove BKL notations
  blktrace: remove the big kernel lock
  rtmutex-tester: make it build without BKL
  dvb-core: kill the big kernel lock
  dvb/bt8xx: kill the big kernel lock
  tlclk: remove big kernel lock
  fix rawctl compat ioctls breakage on amd64 and itanic
  uml: kill big kernel lock
  parisc: remove big kernel lock
  cris: autoconvert trivial BKL users
  alpha: kill big kernel lock
  isapnp: BKL removal
  s390/block: kill the big kernel lock
  hpet: kill BKL, add compat_ioctl
2010-10-22 10:43:11 -07:00
Robin Holt
c25d1dfbd4 kobject: Introduce kset_find_obj_hinted.
One call chain getting to kset_find_obj is:
  link_mem_sections()
    find_mem_section()
      kset_find_obj()

This is done during boot.  The memory sections were added in a linearly
increasing order and link_mem_sections tends to utilize them in that
same linear order.

Introduce a kset_find_obj_hinted which is passed the result of the
previous kset_find_obj which it uses for a quick "is the next object
our desired object" check before falling back to the old behavior.

Signed-off-by: Robin Holt <holt@sgi.com>
To: Robert P. J. Day <rpjday@crashcourse.ca>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:16:44 -07:00
Thomas Renninger
6a5c083de2 Dynamic Debug: Initialize dynamic debug earlier via arch_initcall
Having the ddebug_query= boot parameter it makes sense to set up
dynamic debug as soon as possible.

I expect sysfs files cannot be set up via an arch_initcall, because
this one is even before fs_initcall. Therefore I splitted the
dynamic_debug_init function into an early one and a later one providing
/sys/../dynamic_debug/control file.

Possibly dynamic_debug can be initialized even earlier, not sure whether
this still makes sense then. I picked up arch_initcall as it covers
quite a lot already.

Dynamic debug needs to allocate memory, therefore it's not easily possible to
set it up even before the command line gets parsed.
Therefore the boot param query string is stored in a temp string which is
applied when dynamic debug gets set up.

This has been tested with ddebug_query="file ec.c +p"
and I could retrieve pr_debug() messages early at boot during ACPI setup:
ACPI: EC: Look up EC in DSDT
ACPI: EC: ---> status = 0x08
ACPI: EC: transaction start
ACPI: EC: <--- command = 0x80
ACPI: EC: ~~~> interrupt
ACPI: EC: ---> status = 0x08
ACPI: EC: <--- data = 0xa4
...
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: EC: ---> status = 0x00
ACPI: EC: transaction start
ACPI: EC: <--- command = 0x80


Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
CC: linux-acpi@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:16:42 -07:00
Thomas Renninger
a648ec05bb Dynamic Debug: Introduce ddebug_query= boot parameter
Dynamic debug lacks the ability to enable debug messages at boot time.
One could patch initramfs or service startup scripts to write to
/sys/../dynamic_debug/control, but this sucks.

This patch makes it possible to pass a query in the same format one can
write to /sys/../dynamic_debug/control via boot param.
When dynamic debug gets initialized, this query will automatically be
applied.


Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:16:42 -07:00
Thomas Renninger
fd89cfb871 Dynamic Debug: Split out query string parsing/setup from proc_write
The parsing and applying of dynamic debug strings is not only useful for
/sys/../dynamic_debug/control write access, but can also be used for
boot parameter parsing.
The boot parameter is introduced in a follow up patch.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: jbaron@redhat.com
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:16:42 -07:00
Linus Torvalds
a9ccd80aad Merge branch 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6
* 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
  swiotlb: Use page alignment for early buffer allocation
  swiotlb: make io_tlb_overflow static
2010-10-21 14:04:25 -07:00
Linus Torvalds
5d70f79b5e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (163 commits)
  tracing: Fix compile issue for trace_sched_wakeup.c
  [S390] hardirq: remove pointless header file includes
  [IA64] Move local_softirq_pending() definition
  perf, powerpc: Fix power_pmu_event_init to not use event->ctx
  ftrace: Remove recursion between recordmcount and scripts/mod/empty
  jump_label: Add COND_STMT(), reducer wrappery
  perf: Optimize sw events
  perf: Use jump_labels to optimize the scheduler hooks
  jump_label: Add atomic_t interface
  jump_label: Use more consistent naming
  perf, hw_breakpoint: Fix crash in hw_breakpoint creation
  perf: Find task before event alloc
  perf: Fix task refcount bugs
  perf: Fix group moving
  irq_work: Add generic hardirq context callbacks
  perf_events: Fix transaction recovery in group_sched_in()
  perf_events: Fix bogus AMD64 generic TLB events
  perf_events: Fix bogus context time tracking
  tracing: Remove parent recording in latency tracer graph options
  tracing: Use one prologue for the preempt irqs off tracer function tracers
  ...
2010-10-21 12:54:49 -07:00
Arnd Bergmann
6de5bd128d BKL: introduce CONFIG_BKL.
With all the patches we have queued in the BKL removal tree, only a
few dozen modules are left that actually rely on the BKL, and even
there are lots of low-hanging fruit. We need to decide what to do
about them, this patch illustrates one of the options:

Every user of the BKL is marked as 'depends on BKL' in Kconfig,
and the CONFIG_BKL becomes a user-visible option. If it gets
disabled, no BKL using module can be built any more and the BKL
code itself is compiled out.

The one exception is file locking, which is practically always
enabled and does a 'select BKL' instead. This effectively forces
CONFIG_BKL to be enabled until we have solved the fs/lockd
mess and can apply the patch that removes the BKL from fs/locks.c.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-21 15:44:13 +02:00
Peter Zijlstra
3b6e901f83 jump_label: Use more consistent naming
Now that there's still only a few users around, rename things to make
them more consistent.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.448565169@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-18 19:58:56 +02:00
Arnd Bergmann
6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Chris Metcalf
6b945df742 kmemleak: add TILE to the list of supported architectures.
All the necessary functionality was already there; we just need
to make it possible to select the config option.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-14 14:54:47 -04:00
Yinghai Lu
e79f86b2ef swiotlb: Use page alignment for early buffer allocation
We could call free_bootmem_late() if swiotlb is not used, and
it will shrink to page alignment.

So alloc them with page alignment at first, to avoid lose two pages

before patch:
[    0.000000]     memblock_x86_reserve_range: [00d3600000, 00d7600000]   swiotlb buffer
[    0.000000]     memblock_x86_reserve_range: [00d7e7ef40, 00d7e9ef40]     swiotlb list
[    0.000000]     memblock_x86_reserve_range: [00d7e3ef40, 00d7e7ef40]  swiotlb orig_ad
[    0.000000]     memblock_x86_reserve_range: [000008a000, 0000092000]  swiotlb overflo

after patch will get
[    0.000000]     memblock_x86_reserve_range: [00d3600000, 00d7600000]   swiotlb buffer
[    0.000000]     memblock_x86_reserve_range: [00d7e7e000, 00d7e9e000]     swiotlb list
[    0.000000]     memblock_x86_reserve_range: [00d7e3e000, 00d7e7e000]  swiotlb orig_ad
[    0.000000]     memblock_x86_reserve_range: [000008a000, 0000092000]  swiotlb overflo

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-11 17:08:36 -04:00
FUJITA Tomonori
03620b2d75 swiotlb: make io_tlb_overflow static
We don't need to export io_tlb_overflow_buffer. I'll remove
io_tlb_overflow_buffer completely in the long term though.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-11 14:54:27 -04:00
Ingo Molnar
7cd2541cf2 Merge commit 'v2.6.36-rc7' into perf/core
Conflicts:
	arch/x86/kernel/module.c

Merge reason: Resolve the conflict, pick up fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 10:46:27 +02:00
Dan Williams
400fb7f6a0 move async raid6 test to lib/Kconfig.debug
The prompt for "Self test for hardware accelerated raid6 recovery" does not
belong in the top level configuration menu.  All the options in
crypto/async_tx/Kconfig are selected and do not depend on CRYPTO.
Kconfig.debug seems like a reasonable fit.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-10-07 15:25:04 -07:00
Ingo Molnar
556ef63255 Merge commit 'v2.6.36-rc7' into core/rcu
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-07 09:43:45 +02:00
Ingo Molnar
d4f8f217b8 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-10-07 09:43:11 +02:00
Christoph Lameter
ab4d5ed5ee slub: Enable sysfs support for !CONFIG_SLUB_DEBUG
Currently disabling CONFIG_SLUB_DEBUG also disabled SYSFS support meaning
that the slabs cannot be tuned without DEBUG.

Make SYSFS support independent of CONFIG_SLUB_DEBUG

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2010-10-06 16:54:36 +03:00
Linus Torvalds
5336377d62 modules: Fix module_bug_list list corruption race
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling.  That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
 - move the module list handling code into kernel/module.c where it
   belongs.
 - get rid of 'module_bug_list' and just use the regular list of modules
   (called 'modules' - imagine that) that we already create and maintain
   for other reasons.

Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05 11:29:27 -07:00
Don Mullis
f015ac3edd lib/list_sort: do not pass bad pointers to cmp callback
If the original list is a POT in length, the first callback from line 73
will pass a==b both pointing to the original list_head.  This is dangerous
because the 'list_sort()' user can use 'container_of()' and accesses the
"containing" object, which does not necessary exist for the list head.  So
the user can access RAM which does not belong to him.  If this is a write
access, we can end up with memory corruption.

Signed-off-by: Don Mullis <don.mullis@gmail.com>
Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-01 10:50:58 -07:00
Paul E. McKenney
2dfbf4dfbe rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter
The PROVE_RCU_REPEATEDLY has no "Say Y"/"Say N" advice, so this commit
adds it.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-09-23 09:16:54 -07:00
Jason Baron
52159d98be jump label: Convert dynamic debug to use jump labels
Convert the 'dynamic debug' infrastructure to use jump labels.

Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <b77627358cea3e27d7be4386f45f66219afb8452.1284733808.git.jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-22 16:31:19 -04:00
Ingo Molnar
3aabae7d9d Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-09-15 10:27:31 +02:00
Linus Torvalds
ff3cb3fec3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Range check cpu in blk_cpu_to_group
  scatterlist: prevent invalid free when alloc fails
  writeback: Fix lost wake-up shutting down writeback thread
  writeback: do not lose wakeup events when forking bdi threads
  cciss: fix reporting of max queue depth since init
  block: switch s390 tape_block and mg_disk to elevator_change()
  block: add function call to switch the IO scheduler from a driver
  fs/bio-integrity.c: return -ENOMEM on kmalloc failure
  bio-integrity.c: remove dependency on __GFP_NOFAIL
  BLOCK: fix bio.bi_rw handling
  block: put dev->kobj in blk_register_queue fail path
  cciss: handle allocation failure
  cfq-iosched: Documentation help for new tunables
  cfq-iosched: blktrace print per slice sector stats
  cfq-iosched: Implement tunable group_idle
  cfq-iosched: Do group share accounting in IOPS when slice_idle=0
  cfq-iosched: Do not idle if slice_idle=0
  cciss: disable doorbell reset on reset_devices
  blkio: Fix return code for mkdir calls
2010-09-10 07:26:27 -07:00
Steven Rostedt
46b93b74fc tracing/lockdep: Fix dependency of TRACE_IRQFLAGS
When CONFIG_IRQSOFF_TRACER is set and CONFIG_PROVE_LOCKING is not, we
get the following error:

$  make oldconfig
scripts/kconfig/conf --oldconfig arch/x86/Kconfig
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)

This is because IRQSOFF_TRACER selects TRACE_IRQFLAGS but TRACE_IRQFLAGS
has PROVE_LOCKING as a dependency. This code is incorrect, and
this patch changes the TRACE_IRQFLAGS to be just a simple bool that
does not depend or select anything. Instead both IRQSOFF_TRACER and
PROVE_LOCKING select it.

Reported-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-08-31 16:35:20 -04:00
Naohiro Aota
1458ce166c idr: describe how nextidp works in idr_get_next().
It was unclear in original kernel-doc how nextidp worked in
idr_get_next(). Let's describe it.

Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-08-31 09:43:59 +02:00
Naohiro Aota
ea24ea850b idr: fix kernel-doc warnings.
Fix the following kernel-doc warnings.

% perl scripts/kernel-doc lib/idr.c > /dev/null
Warning(lib/idr.c:300): No description found for parameter 'starting_id'
Warning(lib/idr.c:300): Excess function parameter 'start_id' description in 'idr_get_new_above'
Warning(lib/idr.c:485): No description found for parameter 'idp'
Warning(lib/idr.c:596): No description found for parameter 'nextidp'
Warning(lib/idr.c:596): Excess function parameter 'id' description in 'idr_get_next'
Warning(lib/idr.c:774): No description found for parameter 'starting_id'
Warning(lib/idr.c:774): Excess function parameter 'staring_id' description in 'ida_get_new_above'
Warning(lib/idr.c:918): No description found for parameter 'ida'

Signed-off-by: Naohiro Aota <naota@elisp.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-08-31 09:32:02 +02:00
Jeffrey Carlyle
edce6820a9 scatterlist: prevent invalid free when alloc fails
When alloc fails, free_table is being called. Depending on the number of
bytes requested, we determine if we are going to call _get_free_page()
or kmalloc(). When alloc fails, our math is wrong (due to sg_size - 1),
and the last buffer is wrongfully assumed to have been allocated by
kmalloc. Hence, kfree gets called and a panic occurs.

Signed-off-by: Jeffrey Carlyle <jeff.carlyle@motorola.com>
Signed-off-by: Olusanya Soyannwo <c23746@motorola.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-30 19:55:09 +02:00
NeilBrown
7c44ece988 Move .gitignore from drivers/md to lib/raid6
Another missing bit of the raid6 -> /lib move.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-30 17:35:52 +10:00
Xiaotian Feng
f6e6e7799e kobject_uevent: fix typo in comments
s/ending/sending, s/kobject_uevent()/kobject_uevent_env() in the comments.

Signed-off-by: Xiaotian Feng <xtfeng@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-23 18:12:46 -07:00
Ingo Molnar
a6b9b4d50f Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-08-23 11:32:34 +02:00
Linus Torvalds
9ee47476d6 Merge branch 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev
* 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
  radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
  radix-tree: clear all tags in radix_tree_node_rcu_free
2010-08-22 19:55:14 -07:00
Dave Chinner
144dcfc012 radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
Commit ebf8aa44be ("radix-tree:
omplement function radix_tree_range_tag_if_tagged") does not safely
set tags on on intermediate tree nodes. The code walks down the tree
setting tags before it has fully resolved the path to the leaf under
the assumption there will be a leaf slot with the tag set in the
range it is searching.

Unfortunately, this is not a valid assumption - we can abort after
setting a tag on an intermediate node if we overrun the number of
tags we are allowed to set in a batch, or stop scanning because we
we have passed the last scan index before we reach a leaf slot with
the tag we are searching for set.

As a result, we can leave the function with tags set on intemediate
nodes which can be tripped over later by tag-based lookups. The
result of these stale tags is that lookup may end prematurely or
livelock because the lookup cannot make progress.

The fix for the problem involves reocrding the traversal path we
take to the leaf nodes, and only propagating the tags back up the
tree once the tag is set in the leaf node slot. We are already
recording the path for efficient traversal, so there is no
additional overhead to do the intermediately node tag setting in
this manner.

This fixes a radix tree lookup livelock triggered by the new
writeback sync livelock avoidance code introduced in commit
f446daaea9 ("mm: implement writeback
livelock avoidance using page tagging").

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
2010-08-23 10:33:53 +10:00
Dave Chinner
b6dd08652e radix-tree: clear all tags in radix_tree_node_rcu_free
Commit f446daaea9 ("mm: implement
writeback livelock avoidance using page tagging") introduced a new
radix tree tag, increasing the number of tags in each node from 2 to
3. It did not, however, fix up the code in
radix_tree_node_rcu_free() that cleans up after radix_tree_shrink()
and hence could leave stray tags set in the new tag array.

The result is that the livelock avoidance code added in the the
above commit would hit stale tags when doing tag based lookups,
resulting in livelocks when trying to traverse the tree.

Fix this problem in radix_tree_node_rcu_free() so it doesn't happen
again in the future by using a loop to walk all the tags up to
RADIX_TREE_MAX_TAGS to clear the stray tags radix_tree_shrink()
leaves behind.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Acked-by: Jan Kara <jack@suse.cz>
2010-08-23 10:33:19 +10:00
Jan Kara
d5ed3a4af7 lib/radix-tree.c: fix overflow in radix_tree_range_tag_if_tagged()
When radix_tree_maxindex() is ~0UL, it can happen that scanning overflows
index and tree traversal code goes astray reading memory until it hits
unreadable memory.  Check for overflow and exit in that case.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-20 09:34:55 -07:00
Paul E. McKenney
910b1b7e19 rcu: Allow RCU CPU stall warnings to be off at boot, but manually enablable
Currently, if RCU CPU stall warnings are enabled, they are enabled
immediately upon boot.  They can be manually disabled via /sys (and
also re-enabled via /sys), and are automatically disabled upon panic.
However, some users need RCU CPU stalls to be disabled at boot time,
but to be enabled without rebuilding/rebooting.  For example, someone
running a real-time application in production might not want the
additional latency of RCU CPU stall detection in normal operation, but
might need to enable it at any point for fault isolation purposes.

This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel configuration parameter that maintains the current behavior
(enable at boot) by default, but allows a kernel to be configured
with RCU CPU stall detection built into the kernel, but disabled at
boot time.

Requested-by: Clark Williams <williams@redhat.com>
Requested-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-08-19 17:18:04 -07:00
Arnd Bergmann
a1115570b3 radix-tree: __rcu annotations
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:18:03 -07:00
Paul E. McKenney
b163760e37 rcu: make CPU stall warning timeout configurable
Also set the default to 60 seconds, up from the previous hard-coded timeout
of 10 seconds.  This allows people who care to set short timeouts, while
avoiding people with unusual configurations (make randconfig!!!) from being
bothered with spurious CPU stall warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:18:02 -07:00
Paul E. McKenney
ca5ecddfa8 rcu: define __rcu address space modifier for sparse
This commit provides definitions for the __rcu annotation defined earlier.
This annotation permits sparse to check for correct use of RCU-protected
pointers.  If a pointer that is annotated with __rcu is accessed
directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
or one of their variants), sparse can be made to complain.  To enable
such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
kernel configuration option.  Please note that these sparse complaints are
intended to be a debugging aid, -not- a code-style-enforcement mechanism.

There are special rcu_dereference_protected() and rcu_access_pointer()
accessors for use when RCU read-side protection is not required, for
example, when no other CPU has access to the data structure in question
or while the current CPU hold the update-side lock.

This patch also updates a number of docbook comments that were showing
their age.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Christopher Li <sparse@chrisli.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:17:59 -07:00
Randy Dunlap
625fdcaa6d latencytop: Fix kconfig dependency warnings
warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHED_DEBUG which has unmet direct dependencies (DEBUG_KERNEL &&
PROC_FS) warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHEDSTATS which has unmet direct dependencies (DEBUG_KERNEL && PROC_FS)

Add depends on STACKTRACE_SUPPORT for 'select STACKTRACE'.
Add depends on PROC_FS since that is where the output goes.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20100812123121.a7c99cde.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-17 09:09:03 +02:00
Linus Torvalds
7367f5b013 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  Further tidyup of raid6 naming in lib/raid6
  Make lib/raid6/test build correctly.
  Rename raid6 files now they're in a 'raid6' directory.
2010-08-12 10:08:10 -07:00
David Howells
1490cf5f0c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader
Don't try and #include <linux/slab.h> in lib/inflate.c from the bootloader code
as linux/slab.h hauls in function defs that aren't available in the bootloader
code and may also haul in conflicting functions.

To fix this, make the inclusion of linux/slab.h contingent on NO_INFLATE_MALLOC
as are the usages of kmalloc() and kfree().

In MN10300, this causes the following errors:

In file included from include/linux/string.h:21,
                 from include/linux/bitmap.h:8,
                 from include/linux/nodemask.h:93,
                 from include/linux/mmzone.h:16,
                 from include/linux/gfp.h:4,
                 from include/linux/slab.h:12,
                 from arch/mn10300/boot/compressed/../../../../lib/inflate.c:106,
                 from arch/mn10300/boot/compressed/misc.c:170:
/warthog/am33/linux-2.6-mn10300/arch/mn10300/include/asm/string.h:19: error: conflicting types for 'memset'
arch/mn10300/boot/compressed/misc.c:59: error: previous definition of 'memset' was here

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-12 09:51:35 -07:00
NeilBrown
a8e026c785 Further tidyup of raid6 naming in lib/raid6
Rename raid6/raid6x86.h to raid6/x86.h
and modify some comments.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-12 06:44:54 +10:00
NeilBrown
d5302fe41f Make lib/raid6/test build correctly.
Some bit-rot needs to be cleaned out.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-12 06:38:24 +10:00
Prarit Bhargava
dd21e9bdff lib/decompress_bunzip2.c: fix checkstack warning
Fix checkstack error:

lib/decompress_bunzip2.c: In function `get_next_block':
lib/decompress_bunzip2.c:511: warning: the frame size of 1932 bytes is larger than 1024 bytes

byteCount, symToByte, and mtfSymbol cannot be declared static or allocated
dynamically so place them in the bunzip_data struct.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:23 -07:00
Anton Blanchard
863a604920 lib/bug.c: add oops end marker to WARN implementation
We are missing the oops end marker for the exception based WARN implementation
in lib/bug.c. This is useful for logfile analysis tools.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:22 -07:00
Anton Blanchard
e2e7e09325 lib/bug.c: make WARN implementation match the kernel/panic.c one
There are a few issues with the exception based WARN implementation in
lib/bug.c:

- Inconsistent printk flags. The "cut here" line is printed at KERN_EMERG, so
  the console and all logged in users see the single line:

------------[ cut here ]------------

  for each WARN. Fix this so we print everything at KERN_WARNING to match the
  kernel/panic.c version.

- The lib/bug.c WARN would print "Badness at". Change it to match the
  kernel/panic.c version which prints "WARNING: at".

- Print the list of modules, similar to kernel/panic.c of modules, similar to
  kernel/panic.c

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:22 -07:00
David Woodhouse
cc4589ebfa Rename raid6 files now they're in a 'raid6' directory.
Linus asks 'why "raid6" twice?'. No reason.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-08-11 00:19:05 +01:00
Linus Torvalds
3d30701b58 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (24 commits)
  md: clean up do_md_stop
  md: fix another deadlock with removing sysfs attributes.
  md: move revalidate_disk() back outside open_mutex
  md/raid10: fix deadlock with unaligned read during resync
  md/bitmap:  separate out loading a bitmap from initialising the structures.
  md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
  md/bitmap: optimise scanning of empty bitmaps.
  md/bitmap: clean up plugging calls.
  md/bitmap: reduce dependence on sysfs.
  md/bitmap: white space clean up and similar.
  md/raid5: export raid5 unplugging interface.
  md/plug: optionally use plugger to unplug an array during resync/recovery.
  md/raid5: add simple plugging infrastructure.
  md/raid5: export is_congested test
  raid5: Don't set read-ahead when there is no queue
  md: add support for raising dm events.
  md: export various start/stop interfaces
  md: split out md_rdev_init
  md: be more careful setting MD_CHANGE_CLEAN
  md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk
  ...
2010-08-10 15:38:19 -07:00
Linus Torvalds
4c619407b0 Merge branch 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm
* 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
  kmemleak: Fix typo in the comment
  lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
  kmemleak: Add DocBook style comments to kmemleak.c
  kmemleak: Introduce a default off mode for kmemleak
  kmemleak: Show more information for objects found by alias
2010-08-10 13:58:11 -07:00
Michel Lespinasse
a8618a0e8a rwsem: smaller wrappers around rwsem_down_failed_common
More code can be pushed from rwsem_down_read_failed and
rwsem_down_write_failed into rwsem_down_failed_common.

Following change adding down_read_critical infrastructure support also
enjoys having flags available in a register rather than having to fish it
out in the struct rwsem_waiter...

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse
424acaaeb3 rwsem: wake queued readers when writer blocks on active read lock
This change addresses the following situation:

- Thread A acquires the rwsem for read
- Thread B tries to acquire the rwsem for write, notices there is already
  an active owner for the rwsem.
- Thread C tries to acquire the rwsem for read, notices that thread B already
  tried to acquire it.
- Thread C grabs the spinlock and queues itself on the wait queue.
- Thread B grabs the spinlock and queues itself behind C. At this point A is
  the only remaining active owner on the rwsem.

In this situation thread B could notice that it was the last active writer
on the rwsem, and decide to wake C to let it proceed in parallel with A
since they both only want the rwsem for read.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse
fd41b33435 rwsem: let RWSEM_WAITING_BIAS represent any number of waiting threads
Previously each waiting thread added a bias of RWSEM_WAITING_BIAS.  With
this change, the bias is added only once to indicate that the wait list is
non-empty.

This has a few nice properties which will be used in following changes:
- when the spinlock is held and the waiter list is known to be non-empty,
  count < RWSEM_WAITING_BIAS  <=>  there is an active writer on that sem
- count == RWSEM_WAITING_BIAS  <=>  there are waiting threads and no
                                     active readers/writers on that sem

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse
70bdc6e064 rwsem: lighter active count checks when waking up readers
In __rwsem_do_wake(), we can skip the active count check unless we come
there from up_xxxx().  Also when checking the active count, it is not
actually necessary to increment it; this allows us to get rid of the read
side undo code and simplify the calculation of the final rwsem count
adjustment once we've counted the reader threads to wake.

The basic observation is the following.  When there are waiter threads on
a rwsem and the spinlock is held, other threads can only increment the
active count by trying to grab the rwsem in down_xxxx().  However
down_xxxx() will notice there are waiter threads and take the down_failed
path, blocking to acquire the spinlock on the way there.  Therefore, a
thread observing an active count of zero with waiters queued and the
spinlock held, is protected against other threads acquiring the rwsem
until it wakes the last waiter or releases the spinlock.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:10 -07:00
Michel Lespinasse
345af7bf33 rwsem: fully separate code paths to wake writers vs readers
This is in preparation for later changes in the series.

In __rwsem_do_wake(), the first queued waiter is checked first in order to
determine whether it's a writer or a reader.  The code paths diverge at
this point.  The code that checks and increments the rwsem active count is
duplicated on both sides - the point is that later changes in the series
will be able to independently modify both sides.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:10 -07:00
Eric Paris
ea98eed9bc flex_array: add helpers to get and put to make pointers easy to use
Getting and putting arrays of pointers with flex arrays is a PITA.  You
have to remember to pass &ptr to the _put and you have to do weird and
wacky casting to get the ptr back from the _get.  Add two functions
flex_array_get_ptr() and flex_array_put_ptr() to handle all of the magic.

[akpm@linux-foundation.org: simplification suggested by Joe]
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: James Morris <jmorris@namei.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:09 -07:00
Michal Nazarewicz
559b140a36 lib: vsprintf: useless strlen() removed
The strict_strtoul() and strict_strtoull() functions used strlen() to
check argument's length in a situation where it wasn't strictly necessary

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: "Yi Yang" <yi.y.yang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:09 -07:00
Baruch Siach
e3f76e3386 list debugging: warn when deleting a deleted entry
Use the magic LIST_POISON* values to detect an incorrect use of list_del
on a deleted entry.  This DEBUG_LIST specific warning is easier to
understand than the generic Oops message caused by LIST_POISON
dereference.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:08 -07:00
Anton Blanchard
e269b08517 iommu: inline iommu_num_pages
A profile of a network benchmark showed iommu_num_pages rather high up:

     0.52%  iommu_num_pages

Looking at the profile, an integer divide is taking almost all of the time:

      %
         :      c000000000376ea4 <.iommu_num_pages>:
    1.93 :      c000000000376ea4:       fb e1 ff f8     std     r31,-8(r1)
    0.00 :      c000000000376ea8:       f8 21 ff c1     stdu    r1,-64(r1)
    0.00 :      c000000000376eac:       7c 3f 0b 78     mr      r31,r1
    3.86 :      c000000000376eb0:       38 84 ff ff     addi    r4,r4,-1
    0.00 :      c000000000376eb4:       38 05 ff ff     addi    r0,r5,-1
    0.00 :      c000000000376eb8:       7c 84 2a 14     add     r4,r4,r5
   46.95 :      c000000000376ebc:       7c 00 18 38     and     r0,r0,r3
   45.66 :      c000000000376ec0:       7c 84 02 14     add     r4,r4,r0
    0.00 :      c000000000376ec4:       7c 64 2b 92     divdu   r3,r4,r5
    0.00 :      c000000000376ec8:       38 3f 00 40     addi    r1,r31,64
    0.00 :      c000000000376ecc:       eb e1 ff f8     ld      r31,-8(r1)
    1.61 :      c000000000376ed0:       4e 80 00 20     blr

Since every caller of iommu_num_pages passes in a constant power of two
we can inline this such that the divide is replaced by a shift. The
entire function is only a few instructions once optimised, so it is
a good candidate for inlining overall.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:05 -07:00
Jan Kara
ebf8aa44be radix-tree: omplement function radix_tree_range_tag_if_tagged
Implement function for setting one tag if another tag is set for each item
in given range.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:44:59 -07:00
Tim Chen
27f5e0f694 tmpfs: add accurate compare function to percpu_counter library
Add percpu_counter_compare that allows for a quick but accurate comparison
of percpu_counter with a given value.

A rough count is provided by the count field in percpu_counter structure,
without accounting for the other values stored in individual cpu counters.

The actual count is a sum of count and the cpu counters.  However, count
field is never different from the actual value by a factor of
batch*num_online_cpu.  We do not need to get actual count for comparison
if count is different from the given value by this factor and allows for
quick comparison without summing up all the per cpu counters.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:44:58 -07:00
David Woodhouse
2144381da4 Merge branch 'async' of macbook:git/btrfs-unstable
Conflicts:
	drivers/md/Makefile
	lib/raid6/unroll.pl
2010-08-09 10:36:44 +01:00
Linus Torvalds
ab69bcd66f Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (28 commits)
  driver core: device_rename's new_name can be const
  sysfs: Remove owner field from sysfs struct attribute
  powerpc/pci: Remove owner field from attribute initialization in PCI bridge init
  regulator: Remove owner field from attribute initialization in regulator core driver
  leds: Remove owner field from attribute initialization in bd2802 driver
  scsi: Remove owner field from attribute initialization in ARCMSR driver
  scsi: Remove owner field from attribute initialization in LPFC driver
  cgroupfs: create /sys/fs/cgroup to mount cgroupfs on
  Driver core: Add BUS_NOTIFY_BIND_DRIVER
  driver core: fix memory leak on one error path in bus_register()
  debugfs: no longer needs to depend on SYSFS
  sysfs: Fix one more signature discrepancy between sysfs implementation and docs.
  sysfs: fix discrepancies between implementation and documentation
  dcdbas: remove a redundant smi_data_buf_free in dcdbas_exit
  dmi-id: fix a memory leak in dmi_id_init error path
  sysfs: sysfs_chmod_file's attr can be const
  firmware: Update hotplug script
  Driver core: move platform device creation helpers to .init.text (if MODULE=n)
  Driver core: reduce duplicated code for platform_device creation
  Driver core: use kmemdup in platform_device_add_resources
  ...
2010-08-06 11:36:30 -07:00
Linus Torvalds
9faa1e5942 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Ioremap: fix wrong physical address handling in PAT code
  x86, tlb: Clean up and correct used type
  x86, iomap: Fix wrong page aligned size calculation in ioremapping code
  x86, mm: Create symbolic index into address_markers array
  x86, ioremap: Fix normal ram range check
  x86, ioremap: Fix incorrect physical address handling in PAE mode
  x86-64, mm: Initialize VDSO earlier on 64 bits
  x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages
2010-08-06 10:17:52 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Linus Torvalds
3a3527b646 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "net: Make accesses to ->br_port safe for sparse RCU"
  mce: convert to rcu_dereference_index_check()
  net: Make accesses to ->br_port safe for sparse RCU
  vfs: add fs.h to define struct file
  lockdep: Add an in_workqueue_context() lockdep-based test function
  rcu: add __rcu API for later sparse checking
  rcu: add an rcu_dereference_index_check()
  tree/tiny rcu: Add debug RCU head objects
  mm: remove all rcu head initializations
  fs: remove all rcu head initializations, except on_stack initializations
  powerpc: remove all rcu head initializations
2010-08-06 09:23:07 -07:00
Linus Torvalds
da9e82b3b8 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  modpost: support objects with more than 64k sections
  trivial: fix a typo in a filename
  frv: clean up arch/frv/Makefile
  kbuild: allow assignment to {A,C}FLAGS_KERNEL on the command line
  kbuild: allow assignment to {A,C,LD}FLAGS_MODULE on the command line
  Kbuild: Add option to set -femit-struct-debug-baseonly
  Makefile: "make kernelrelease" should show the correct full kernel version
  Makefile.build: make KBUILD_SYMTYPES work again
2010-08-05 14:10:07 -07:00
Randy Dunlap
c462e8cd57 debugfs: no longer needs to depend on SYSFS
debugfs no longer uses 'kernel_subsys' (which is gone), and other
kernel/ksysfs.c code is always built, so DEBUG_FS does not need
to depend on SYSFS.

Fixes this kconfig warning:

warning: (TREE_RCU_TRACE || AMD_IOMMU_STATS && AMD_IOMMU || MTD_UBI_DEBUG && MTD && SYSFS && MTD_UBI || UBIFS_FS_DEBUG && MISC_FILESYSTEMS && UBIFS_FS || DEBUG_KMEMLEAK && DEBUG_KERNEL && EXPERIMENTAL && !MEMORY_HOTPLUG && (X86 || ARM || PPC || S390 || SPARC64 || SUPERH || MICROBLAZE) && SYSFS || TRACING || X86_PTDUMP && DEBUG_KERNEL || BLK_DEV_IO_TRACE && TRACING_SUPPORT && FTRACE && SYSFS && BLOCK) selects DEBUG_FS which has unmet direct dependencies (SYSFS)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-05 13:53:34 -07:00
Linus Torvalds
bbc4fd12a6 Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (49 commits)
  microblaze: Add KGDB support
  microblaze: Support brki rX, 0x18 for user application debugging
  microblaze: Remove nop after MSRCLR/SET, MTS, MFS instructions
  microblaze: Simplify syscall rutine
  microblaze: Move PT_MODE saving to delay slot
  microblaze: Fix _interrupt function
  microblaze: Fix _user_exception function
  microblaze: Put together addik instructions
  microblaze: Use delay slot in syscall macros
  microblaze: Save kernel mode in delay slot
  microblaze: Do not mix register saving and mode setting
  microblaze: Move SAVE_STATE upward
  microblaze: entry.S: Macro optimization
  microblaze: Optimize hw exception rutine
  microblaze: Implement clear_ums macro and fix SAVE_STATE macro
  microblaze: Remove additional setup for kernel_mode
  microblaze: Optimize SAVE_STATE macro
  microblaze: Remove additional loading
  microblaze: Completely remove working with R11 register
  microblaze: Do not setup BIP in _debug_exception
  ...
2010-08-05 08:59:22 -07:00
Ingo Molnar
61be7fdec2 Merge branch 'perf/nmi' into perf/core
Conflicts:
	kernel/Makefile

Merge reason: Add the now complete topic, fix the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-05 08:45:05 +02:00
Linus Torvalds
3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Linus Torvalds
4a35cee066 Merge branch 'stable/swiotlb-0.8.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6
* 'stable/swiotlb-0.8.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
  swiotlb: Make swiotlb bookkeeping functions visible in the header file.
  swiotlb: search and replace "int dir" with "enum dma_data_direction dir"
  swiotlb: Make internal bookkeeping functions have 'swiotlb_tbl' prefix.
  swiotlb: add the swiotlb initialization function with iotlb memory
  swiotlb: add swiotlb_tbl_map_single library function
2010-08-04 10:36:39 -07:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Michal Marek
772320e845 Merge commit 'v2.6.35' into kbuild/kbuild
Conflicts:
	arch/powerpc/Makefile
2010-08-04 13:59:13 +02:00
Michal Simek
79aac88903 microblaze: Disable FRAME_POINTER selection
Microblaze doesn't support frame pointers. Ftrace code
uses CALLER_ADDR1 which is defined in linux/ftrace.h. For Microblaze
is 0.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:32 +02:00
Chris Wilson
b94de9bb75 lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
kmemleak ignores page_alloc() and so believes the final sub-page
allocation using the plain kmalloc is decoupled and lost. This leads to
lots of false-positives with code that uses scatterlists.

The options seem to be either to tell kmemleak that the kmalloc is not
leaked or to notify kmemleak of the page allocations. The danger of the
first approach is that we may hide a real leak, so choose the latter
approach (of which I am not sure of the downsides).

v2: Added comments on the suggestion of Catalin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2010-07-28 22:59:02 +01:00
Will Deacon
c58bbd39f8 ARM: 6213/1: atomic64_test: add ARM as supported architecture
ARM has support for the atomic64_dec_if_positive operation
so ensure that it is tested by the atomic64_test routine.

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-07-27 10:43:46 +01:00
Takuya Yoshikawa
f4d0143951 Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
The last 't' of 'fault' is missing in the description of FAIL_IO_TIMEOUT.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-21 10:36:50 +02:00
Jason Baron
ab0155a22a kmemleak: Introduce a default off mode for kmemleak
Introduce a new DEBUG_KMEMLEAK_DEFAULT_OFF config parameter that allows
kmemleak to be disabled by default, but enabled on the command line
via: kmemleak=on. Although a reboot is required to turn it on, its still
useful to not require a re-compile.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-19 11:54:17 +01:00
Andi Kleen
d6f4ceb796 Kbuild: Add option to set -femit-struct-debug-baseonly
Newer gcc has a -femit-struct-debug-baseonly option that dramatically
reduces the size of object files with debug info. What it does
is to only emit type information for structures when the structures
are defined in the same file or in a header file.

This means the type information for most headers are not included.
This is not good when the type information is actually
needed (e.g. with kgdb or systemtap)

But often kernel hackers only care about line numbers and don't
need all the type information anyways. In this case setting
the option can be a big win:

A build dir for a specific x86-64 configuration with gcc 4.5
shrunk from 2.3G to 1.2G. The compilation was also nearly a minute
faster.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
[mmarek: reformatted help text]
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-07-14 17:21:28 +02:00
Yinghai Lu
95f72d1ed4 lmb: rename to memblock
via following scripts

      FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

      sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

      for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
      done

and remove some wrong change like lmbench and dlmb etc.

also move memblock.c from lib/ to mm/

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 17:14:00 +10:00
Kulikov Vasiliy
4d45ada36b lib/devres.c: fix comment typo
'Unamp' should be 'Unmap'.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-11 22:16:32 +02:00
Kenji Kaneshige
ffa71f33a8 x86, ioremap: Fix incorrect physical address handling in PAE mode
Current x86 ioremap() doesn't handle physical address higher than
32-bit properly in X86_32 PAE mode. When physical address higher than
32-bit is passed to ioremap(), higher 32-bits in physical address is
cleared wrongly. Due to this bug, ioremap() can map wrong address to
linear address space.

In my case, 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system. Because of the ioremap()'s bug, wrong physical
address (instead of MMIO region) was mapped to linear address space.
Because of this, loading ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-07-09 11:42:03 -07:00
Linus Torvalds
47a716cf0c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rbtree: Undo augmented trees performance damage and regression
  x86, Calgary: Limit the max PHB number to 256
2010-07-06 17:16:09 -07:00
Peter Zijlstra
b945d6b255 rbtree: Undo augmented trees performance damage and regression
Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).

This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.

For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.

[ This patch also fixes a video driver regression reported by
  Ali Gholami Rudi - the memtype->subtree_max_end was updated
  incorrectly. ]

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Fabio Checconi <fabio@gandalf.sssup.it>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1275414172.27810.27961.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 14:43:50 +02:00
Yehuda Sadeh
ff49d74ad3 module: initialize module dynamic debug later
We should initialize the module dynamic debug datastructures
only after determining that the module is not loaded yet. This
fixes a bug that introduced in 2.6.35-rc2, where when a trying
to load a module twice, we also load it's dynamic printing data
twice which causes all sorts of nasty issues. Also handle
the dynamic debug cleanup later on failure.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed a #ifdef)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-04 20:17:22 -07:00
Joe Perches
7db6f5fb65 vsprintf: Recursive vsnprintf: Add "%pV", struct va_format
Add the ability to print a format and va_list from a structure pointer

Allows __dev_printk to be implemented as a single printk while
minimizing string space duplication.

%pV should not be used without some mechanism to verify the
format and argument use ala __attribute__(format (printf(...))).

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 10:40:17 -07:00
Ingo Molnar
0a54cec0c2 Merge branch 'linus' into core/rcu
Conflicts:
	fs/fs-writeback.c

Merge reason: Resolve the conflict

Note, i picked the version from Linus's tree, which effectively reverts
the fs-writeback.c bits of:

  b97181f: fs: remove all rcu head initializations, except on_stack initializations

As the upstream changes to this file changed this code heavily and the
first attempt to resolve the conflict resulted in a non-booting kernel.
It's safer to re-try this portion of the commit cleanly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-01 09:31:25 +02:00
Imre Deak
e621ba9932 genalloc: fix allocation from end of pool
bitmap_find_next_zero_area requires the size of the bitmap, we instead
passed the last suitable position.  This made it impossible to allocate
from the end of the pool.

Fixes a regression introduced by 243797f59b
("genalloc: use bitmap_find_next_zero_area").

Signed-off-by: Imre Deak <imre.deak@nokia.com>
Cc: Zygo Blaxell <zygo.blaxell@xandros.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-29 15:29:30 -07:00
Paul E. McKenney
94bfa3b669 idr: fix RCU lockdep splat in idr_get_next()
Convert to rcu_dereference_raw() given that many callers may have many
different locking models.

Located-by: Miles Lane <miles.lane@gmail.com>
Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-06-23 06:50:45 -07:00
Jiri Kosina
f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König
421f91d21a fix typos concerning "initiali[zs]e"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:05:05 +02:00
Mathieu Desnoyers
551d55a944 tree/tiny rcu: Add debug RCU head objects
Helps finding racy users of call_rcu(), which results in hangs because list
entries are overwritten and/or skipped.

Changelog since v4:
- Bissectability is now OK
- Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
  call_rcu(). Statically initialized objects are detected with
  object_is_static().
- Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
- Remove init_rcu_head() completely.

Changelog since v3:
- Include comments from Lai Jiangshan

This new patch version is based on the debugobjects with the newly introduced
"active state" tracker.

Non-initialized entries are all considered as "statically initialized". An
activation fixup (triggered by call_rcu()) takes care of performing the debug
object initialization without issuing any warning. Since we cannot increase the
size of struct rcu_head, I don't see much room to put an identifier for
statically initialized rcu_head structures. So for now, we have to live without
"activation without explicit init" detection. But the main purpose of this debug
option is to detect double-activations (double call_rcu() use of a rcu_head
before the callback is executed), which is correctly addressed here.

This also detects potential internal RCU callback corruption, which would cause
the callbacks to be executed twice.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2010-06-14 16:37:26 -07:00
Konrad Rzeszutek Wilk
d7ef1533a9 swiotlb: Make swiotlb bookkeeping functions visible in the header file.
We put the functions dealing with the operations on
the SWIOTLB buffer in the header and make those functions non-static.
And also make the functions exported via EXPORT_SYMBOL_GPL.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: swiotlb_sync_single_range_for_* no more. Remove usage.]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:27 -04:00
Konrad Rzeszutek Wilk
22d4826998 swiotlb: search and replace "int dir" with "enum dma_data_direction dir"
.. to catch anybody doing something funky.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: swiotlb_sync_single_range_* no more - removed usage]
[v3: enum dma_data_direction direction -> enum dma_data_direction dir]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:26 -04:00
Konrad Rzeszutek Wilk
bfc5501f6d swiotlb: Make internal bookkeeping functions have 'swiotlb_tbl' prefix.
The functions that operate on io_tlb_list/io_tlb_start/io_tlb_orig_addr
have the prefix 'swiotlb_tbl' now.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:26 -04:00
FUJITA Tomonori
abbceff7d7 swiotlb: add the swiotlb initialization function with iotlb memory
This enables the caller to initialize swiotlb with its own iotlb
memory.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: changed ..with_tlb to ..with_tbl]

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:25 -04:00
FUJITA Tomonori
eb605a5754 swiotlb: add swiotlb_tbl_map_single library function
swiotlb_tbl_map_single() takes the dma address of iotlb instead of
using swiotlb_virt_to_bus().

[v2: changed swiotlb_tlb to swiotlb_tbl]
[v3: changed u64 to dma_addr_t]

This patch:

This is a set of patches that separate the address translation
(virt_to_phys, virt_to_bus, etc) and allocation of the SWIOTLB buffer
from the SWIOTLB library.

The idea behind this set of patches is to make it possible to have separate
mechanisms for translating virtual to physical or virtual to DMA addresses
on platforms which need an SWIOTLB, and where physical != PCI bus address
and also to allocate the core IOTLB memory outside SWIOTLB.

One customers of this is the pv-ops project, which can switch between
different modes of operation depending on the environment it is running in:
bare-metal or virtualized (Xen for now). Another is the Wii DMA - used to
implement the MEM2 DMA facility needed by its EHCI controller (for details:
http://lkml.org/lkml/2010/5/18/303)

On bare-metal SWIOTLB is used when there are no hardware IOMMU. In virtualized
environment it used when PCI pass-through is enabled for the guest. The problems
with PCI pass-through is that the guest's idea of PFN's is not the real thing.
To fix that, there is translation layer for PFN->machine frame number and vice-versa.
To bubble that up to the SWIOTLB layer there are two possible solutions.

One solution has been to wholesale copy the SWIOTLB, stick it in
arch/x86/xen/swiotlb.c and modify the virt_to_phys, phys_to_virt and others
to use the Xen address translation functions. Unfortunately, since the kernel can
run on bare-metal, there would be big code overlap with the real SWIOTLB.
(git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git xen/dom0/swiotlb-new)

Another approach, which this set of patches explores, is to abstract the
address translation and address determination functions away from the
SWIOTLB book-keeping functions. This way the core SWIOTLB library functions
are present in one place, while the address related functions are in
a separate library that can be loaded when running under non-bare-metal platform.

Changelog:
Since the last posting [v8.2] Konrad has done:
 - Added this changelog in the patch and referenced in the other patches
   this description.
 - 'enum dma_data_direction direction' to 'enum dma.. dir' so to be
   unified.
[v8-v8.2 changes:]
 - Rolled-up the last two patches in one.
 - Rebased against linus latest. That meant dealing with swiotlb_sync_single_range_* changes.
 - added Acked-by: Fujita Tomonori and Tested-by: Albert Herranz
[v7-v8 changes:]
 - Minimized the list of exported functions.
 - Integrated Fujita's patches and changed "swiotlb_tlb" to "swiotlb_tbl" in them.
[v6-v7 changes:]
 - Minimized the amount of exported functions/variable with a prefix of: "swiotbl_tbl".
 - Made the usage of 'int dir' to be 'enum dma_data_direction'.
[v5-v6 changes:]
 - Made the exported functions/variables have the 'swiotlb_bk' prefix.
 - dropped the checkpatches/other reworks

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:13 -04:00
Linus Torvalds
f9196e7c03 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  fix setattr error handling in sysfs, configfs
  kobject: free memory if netlink_kernel_create() fails
  lib/kobject_uevent.c: fix CONIG_NET=n warning
2010-06-04 15:27:27 -07:00
Heiko Carstens
007d08678e lib: add s390 to atomic64_dec_if_positive archs
Add s390 to list of architectures that have atomic64_dec_if_positive
implemented so we get rid of this warning:

lib/atomic64_test.c:129:2: warning: #warning Please implement
atomic64_dec_if_positive for your architecture, and add it to the IF above

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Dan Carpenter
743db2d903 kobject: free memory if netlink_kernel_create() fails
There is a kfree(ue_sk) missing on the error path if
netlink_kernel_create() fails.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-06-04 13:27:52 -07:00
Andrew Morton
c842128607 lib/kobject_uevent.c: fix CONIG_NET=n warning
lib/kobject_uevent.c:87: warning: 'kobj_bcast_filter' defined but not used

Repairs "hotplug: netns aware uevent_helper"

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-06-04 13:27:52 -07:00
Linus Torvalds
021fad8b70 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cpufeature: Unbreak compile with gcc 3.x
  x86, pat: Fix memory leak in free_memtype
  x86, k8: Fix section mismatch for powernowk8_exit()
  lib/atomic64_test: fix missing include of linux/kernel.h
  x86: remove last traces of quicklist usage
  x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012
  x86: "nosmp" command line option should force the system into UP mode
  arch/x86/pci: use kasprintf
  x86, apic: ack all pending irqs when crashed/on kexec
2010-05-30 09:06:13 -07:00
Linus Torvalds
35926ff5fb Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()"
This reverts commit 0ac0c0d0f8, which
caused cross-architecture build problems for all the wrong reasons.
IA64 already added its own version of __node_random(), but the fact is,
there is nothing architectural about the function, and the original
commit was just badly done. Revert it, since no fix is forthcoming.

Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 09:00:03 -07:00
Linus Torvalds
9a90e09854 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
  ACPI: Don't let acpi_pad needlessly mark TSC unstable
  drivers/acpi/sleep.h: Checkpatch cleanup
  ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion
  ACPI: delete unused c-state promotion/demotion data strucutures
  ACPI: video: fix acpi_backlight=video
  ACPI: EC: Use kmemdup
  drivers/acpi: use kasprintf
  ACPI, APEI, EINJ injection parameters support
  Add x64 support to debugfs
  ACPI, APEI, Use ERST for persistent storage of MCE
  ACPI, APEI, Error Record Serialization Table (ERST) support
  ACPI, APEI, Generic Hardware Error Source memory error support
  ACPI, APEI, UEFI Common Platform Error Record (CPER) header
  Unified UUID/GUID definition
  ACPI Hardware Error Device (PNP0C33) support
  ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup
  ACPI, APEI, Document for APEI
  ACPI, APEI, EINJ support
  ACPI, APEI, HEST table parsing
  ACPI, APEI, APEI supporting infrastructure
  ...
2010-05-28 14:42:18 -07:00
Cesar Eduardo Barros
edcd1d843a radix-tree: fix radix_tree_prev_hole() underflow case
radix_tree_prev_hole() used LONG_MAX to detect underflow; however,
ULONG_MAX is clearly what was intended, both here and by its only user
(count_history_pages at mm/readahead.c).

Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:53 -07:00
FUJITA Tomonori
38388301b7 swiotlb: remove unnecessary swiotlb_sync_single_range_*
swiotlb_sync_single_range_for_cpu and swiotlb_sync_single_range_for_device
are unnecessary because swiotlb_sync_single_for_cpu and
swiotlb_sync_single_for_device can be used instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:52 -07:00
Joe Eykholt
5960164fde lib/random32: export pseudo-random number generator for modules
This patch moves the definition of struct rnd_state and the inline
__seed() function to linux/random.h.  It renames the static __random32()
function to prandom32() and exports it for use in modules.

prandom32() is useful as a privately-seeded pseudo random number generator
that can give the same result every time it is initialized.

For FCoE FC-BB-6 VN2VN mode self-selected unique FC address generation, we
need an pseudo-random number generator seeded with the 64-bit world-wide
port name.  A truly random generator or one seeded with randomness won't
do because the same sequence of numbers should be generated each time we
boot or the link comes up.

A prandom32_seed() inline function is added to the header file.  It is
inlined not for speed, but so the function won't be expanded in the base
kernel, but only in the module that uses it.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:52 -07:00
Imre Deak
2dcb22b346 idr: fix backtrack logic in idr_remove_all
Currently idr_remove_all will fail with a use after free error if
idr::layers is bigger than 2, which on 32 bit systems corresponds to items
more than 1024.  This is due to stepping back too many levels during
backtracking.  For simplicity let's assume that IDR_BITS=1 -> we have 2
nodes at each level below the root node and each leaf node stores two IDs.
 (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root
level and 32 IDs in each leaf node).  The sequence of freeing the nodes at
the moment is as follows:

layer
1 ->                       a(7)
2 ->            b(3)                  c(5)
3 ->        d(1)   e(2)           f(4)    g(6)

Until step 4 things go fine, but then node c is freed, whereas node g
should be freed first.  Since node c contains the pointer to node g we'll
have a use after free error at step 6.

How many levels we step back after visiting the leaf nodes is currently
determined by the msb of the id we are currently visiting:

Step
1.          node d with IDs 0,1 is freed, current ID is advanced to 2.
            msb of the current ID bit 1. This means we need to step back
            1 level to node b and take the next sibling, node e.
2-3.        node e with IDs 2,3 is freed, current ID is 4, msb is bit 2.
            This means we need to step back 2 levels to node a, freeing
            node b on the way.
4-5.        node f with IDs 4,5 is freed, current ID is 6, msb is still
            bit 2. This means we again need to step back 2 levels to node
            a and free c on the way.
6.          We should visit node g, but its pointer is not available as
            node c was freed.

The fix changes how we determine the number of levels to step back.
Instead of deducting this merely from the msb of the current ID, we should
really check if advancing the ID causes an overflow to a bit position
corresponding to a given layer.  In the above example overflow from bit 0
to bit 1 should mean stepping back 1 level.  Overflow from bit 1 to bit 2
should mean stepping back 2 levels and so on.

The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers
on 32 bit systems.

Signed-off-by: Imre Deak <imre.deak@nokia.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org>		[2.6.34.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Akinobu Mita
c9d221f86e fault-injection: add CPU notifier error injection module
I used this module to test the series of modification to the cpu notifiers
code.

Example1: inject CPU offline error (-1 == -EPERM)

	# modprobe cpu-notifier-error-inject cpu_down_prepare_error=-1
	# echo 0 > /sys/devices/system/cpu/cpu1/online
	bash: echo: write error: Operation not permitted

Example2: inject CPU online error (-2 == -ENOENT)

	# modprobe cpu-notifier-error-inject cpu_up_prepare_error=-2
	# echo 1 > /sys/devices/system/cpu/cpu1/online
	bash: echo: write error: No such file or directory

[akpm@linux-foundation.org: fix Kconfig help text]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Jack Steiner
0ac0c0d0f8 cpusets: randomize node rotor used in cpuset_mem_spread_node()
Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems).  Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.

This patch changes the rotor to be initialized to a random node number of
the cpuset.

[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:44 -07:00
Andrew Morton
0d2daf5cc8 revert "crc32: use __BYTE_ORDER macro for endian detection"
It doesn't work on big-endian - those architectures don't define
__LITTLE_ENDIAN.

Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-26 08:19:23 -07:00
Joakim Tjernlund
4762bbc1a3 crc32: use __BYTE_ORDER macro for endian detection.
Since crc32.c contains a nifty test program that can be executed in user
space, make sure endian detection works reliably in user space too.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00
Joakim Tjernlund
836e2af925 crc32: major optimization
Precompute more crc32 values(0xcc00, 0xcc0000 and 0xcc000000) into tables.
 This increases the table size from 1KB to 4KB but the performance benfit
makes it worth it:

28% faster on MPC8321, 266 MHz
2x faster on Core 2 Duo, 3.1GHz

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00
Andy Shevchenko
903788892e lib: introduce common method to convert hex digits
hex_to_bin() is a little method which converts hex digit to its actual
value.  There are plenty of places where such functionality is needed.

[akpm@linux-foundation.org: use tolower(), saving 3 bytes, test the more common case first - it's quicker]
[akpm@linux-foundation.org: relocate tolower to make it even faster! (Joe)]
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: Duncan Sands <duncan.sands@free.fr>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Joe Perches
db0fd97c27 lib/hexdump.c: reduce stack variable size and cleanups
Reduce char linebuf[200] to the actual size required., which is 32 * 3 + 2
+ 32 + 1, ie: linebuf[131].

Change examples to use bool true not int 1.

Align multiline argument indentation to open parenthesis.

Use temporary for ptr[j] so trigraph fits on single line.

Convert printk ptr from %*p, (int)(2 * sizeof(void *)) to %p as %p uses
the same calculation for size.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Florian Ragwitz
2b2f68b538 DYNAMIC_DEBUG: fix documentation errors
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Florian Ragwitz <rafl@debian.org>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Dan Carpenter
ea46c8f774 dynamic_debug: small cleanup in ddebug_proc_write()
This doesn't change behavior at all.  In the original code, if nwords was
zero then ddebug_parse_query() would return -EINVAL, now we just do it
earlier.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Joe Perches
cf3b429b03 vsprintf.c: use noinline_for_stack
Mark static functions with noinline_for_stack

Before:

  akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
  0x00000e82 pointer [vsprintf.o]:                        344
  0x0000198c pointer [vsprintf.o]:                        344
  0x000025d6 scnprintf [vsprintf.o]:                      216
  0x00002648 scnprintf [vsprintf.o]:                      216
  0x00002565 snprintf [vsprintf.o]:                       208
  0x0000267c sprintf [vsprintf.o]:                        208
  0x000030a3 bprintf [vsprintf.o]:                        208
  0x00003b1e sscanf [vsprintf.o]:                         208
  0x00000608 number [vsprintf.o]:                         136
  0x00000937 number [vsprintf.o]:                         136

After:

  akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
  0x00000a7c symbol_string [vsprintf.o]:                  248
  0x00000ae8 symbol_string [vsprintf.o]:                  248
  0x00002310 scnprintf [vsprintf.o]:                      216
  0x00002382 scnprintf [vsprintf.o]:                      216
  0x0000229f snprintf [vsprintf.o]:                       208
  0x000023b6 sprintf [vsprintf.o]:                        208
  0x00002ddd bprintf [vsprintf.o]:                        208
  0x00003858 sscanf [vsprintf.o]:                         208
  0x00000625 number [vsprintf.o]:                         136
  0x00000954 number [vsprintf.o]:                         136

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:04 -07:00
Alexey Dobriyan
4be929be34 kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN
- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
  USHORT_MAX/SHORT_MAX/SHORT_MIN.

- Make SHRT_MIN of type s16, not int, for consistency.

[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:02 -07:00
Peter Huewe
0dbdd1bfe0 lib/atomic64_test: fix missing include of linux/kernel.h
Fix a build-failure
(http://kisskb.ellerman.id.au/kisskb/buildresult/2601239/) by adding the
missing include file (linux/kernel.h) for printk and KERN_INFO.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
LKML-Reference: <201005241913.o4OJDKdf010884@imap1.linux-foundation.org>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24 13:33:32 -07:00
Linus Torvalds
0961d6581c Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
  intel-iommu: Combine the BIOS DMAR table warning messages
  panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
  panic: Allow warnings to set different taint flags
  intel-iommu: intel_iommu_map_range failed at very end of address space
  intel-iommu: errors with smaller iommu widths
  intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
  intel-iommu: use physfn to search drhd for VF
  intel-iommu: Print out iommu seq_id
  intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
  intel-iommu: Avoid global flushes with caching mode.
  intel-iommu: Use correct domain ID when caching mode is enabled
  intel-iommu mistakenly uses offset_pfn when caching mode is enabled
  intel-iommu: use for_each_set_bit()
  intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
2010-05-21 17:25:01 -07:00
Linus Torvalds
90b9a32d8f Merge branch 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: (25 commits)
  kdb,debug_core: Allow the debug core to receive a panic notification
  MAINTAINERS: update kgdb, kdb, and debug_core info
  debug_core,kdb: Allow the debug core to process a recursive debug entry
  printk,kdb: capture printk() when in kdb shell
  kgdboc,kdb: Allow kdb to work on a non open console port
  kgdb: Add the ability to schedule a breakpoint via a tasklet
  mips,kgdb: kdb low level trap catch and stack trace
  powerpc,kgdb: Introduce low level trap catching
  x86,kgdb: Add low level debug hook
  kgdb: remove post_primary_code references
  kgdb,docs: Update the kgdb docs to include kdb
  kgdboc,keyboard: Keyboard driver for kdb with kgdb
  kgdb: gdb "monitor" -> kdb passthrough
  sparc,sunzilog: Add console polling support for sunzilog serial driver
  sh,sh-sci: Use NO_POLL_CHAR in the SCIF polled console code
  kgdb,8250,pl011: Return immediately from console poll
  kgdb: core changes to support kdb
  kdb: core for kgdb back end (2 of 2)
  kdb: core for kgdb back end (1 of 2)
  kgdb,blackfin: Add in kgdb_arch_set_pc for blackfin
  ...
2010-05-21 11:08:05 -07:00
Eric W. Biederman
417daa1e8f hotplug: netns aware uevent_helper
It only makes sense for uevent_helper to get events
in the intial namespaces.  It's invocation is not
per namespace and it is not clear how we could make
it's invocation namespace aware.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:33 -07:00
Eric W. Biederman
5f71a29629 kobj: Send hotplug events in the proper namespace.
Utilize netlink_broacast_filtered to allow sending hotplug events
in the proper namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:32 -07:00
Eric W. Biederman
07e98962fa kobject: Send hotplug events in all network namespaces
Open a copy of the uevent kernel socket in each network
namespace so we can send uevents in all network namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:32 -07:00
Serge E. Hallyn
be867b194a sysfs: Comment sysfs directory tagging logic
Add some in-line comments to explain the new infrastructure, which
was introduced to support sysfs directory tagging with namespaces.
I think an overall description someplace might be good too, but it
didn't really seem to fit into Documentation/filesystems/sysfs.txt,
which appears more geared toward users, rather than maintainers, of
sysfs.

(Tejun, please let me know if I can make anything clearer or failed
altogether to comment something that should be commented.)

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
Eric W. Biederman
3ff195b011 sysfs: Implement sysfs tagged directory support.
The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.

What this patch does is to add an additional tag field to the
sysfs dirent structure.  For directories that should show different
contents depending on the context such as /sys/class/net/, and
/sys/devices/virtual/net/ this tag field is used to specify the
context in which those directories should be visible.  Effectively
this is the same as creating multiple distinct directories with
the same name but internally to sysfs the result is nicer.

I am calling the concept of a single directory that looks like multiple
directories all at the same path in the filesystem tagged directories.

For the networking namespace the set of directories whose contents I need
to filter with tags can depend on the presence or absence of hotplug
hardware or which modules are currently loaded.  Which means I need
a simple race free way to setup those directories as tagged.

To achieve a reace free design all tagged directories are created
and managed by sysfs itself.

Users of this interface:
- define a type in the sysfs_tag_type enumeration.
- call sysfs_register_ns_types with the type and it's operations
- sysfs_exit_ns when an individual tag is no longer valid

- Implement mount_ns() which returns the ns of the calling process
  so we can attach it to a sysfs superblock.
- Implement ktype.namespace() which returns the ns of a syfs kobject.

Everything else is left up to sysfs and the driver layer.

For the network namespace mount_ns and namespace() are essentially
one line functions, and look to remain that.

Tags are currently represented a const void * pointers as that is
both generic, prevides enough information for equality comparisons,
and is trivial to create for current users, as it is just the
existing namespace pointer.

The work needed in sysfs is more extensive.  At each directory
or symlink creating I need to check if the directory it is being
created in is a tagged directory and if so generate the appropriate
tag to place on the sysfs_dirent.  Likewise at each symlink or
directory removal I need to check if the sysfs directory it is
being removed from is a tagged directory and if so figure out
which tag goes along with the name I am deleting.

Currently only directories which hold kobjects, and
symlinks are supported.  There is not enough information
in the current file attribute interfaces to give us anything
to discriminate on which makes it useless, and there are
no potential users which makes it an uninteresting problem
to solve.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
Eric W. Biederman
bc451f2058 kobj: Add basic infrastructure for dealing with namespaces.
Move complete knowledge of namespaces into the kobject layer
so we can use that information when reporting kobjects to
userspace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
NeilBrown
db1afffab0 kref: remove kref_set
Of the three uses of kref_set in the kernel:

 One really should be kref_put as the code is letting go of a
    reference,
 Two really should be kref_init because the kref is being
    initialised.

This suggests that making kref_set available encourages bad code.
So fix the three uses and remove kref_set completely.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:29 -07:00
Linus Torvalds
05ec7dd8dd Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (154 commits)
  mtd: cfi_cmdset_0002: use AMD standard command-set with Winbond flash chips
  mtd: cfi_cmdset_0002: Fix MODULE_ALIAS and linkage for new 0701 commandset ID
  mtd: mxc_nand: Remove duplicate NAND_CMD_RESET case value
  mtd: update gfp/slab.h includes
  jffs2: Stop triggering block erases from jffs2_write_super()
  jffs2: Rename jffs2_erase_pending_trigger() to jffs2_dirty_trigger()
  jffs2: Use jffs2_garbage_collect_trigger() to trigger pending erases
  jffs2: Require jffs2_garbage_collect_trigger() to be called with lock held
  jffs2: Wake GC thread when there are blocks to be erased
  jffs2: Erase pending blocks in GC pass, avoid invalid -EIO return
  jffs2: Add 'work_done' return value from jffs2_erase_pending_blocks()
  mtd: mtdchar: Do not corrupt backing device of device node inode
  mtd/maps/pcmciamtd: Fix printk format for ssize_t in debug messages
  drivers/mtd: Use kmemdup
  mtd: cfi_cmdset_0002: Fix argument order in bootloc warning
  mtd: nand: add Toshiba TC58NVG0 device ID
  pcmciamtd: add another ID
  pcmciamtd: coding style cleanups
  pcmciamtd: fixing obvious errors
  mtd: chips: add SST39WF160x NOR-flashes
  ...

Trivial conflicts due to dev_node removal in drivers/mtd/maps/pcmciamtd.c
2010-05-21 07:25:43 -07:00
Jason Wessel
5dd11d5d47 mips,kgdb: kdb low level trap catch and stack trace
The only way the debugger can handle a trap in inside rcu_lock,
notify_die, or atomic_notifier_call_chain without a recursive fault is
to have a low level "first opportunity handler" do_trap_or_bp() handler.

Generally this will be something the vast majority of folks will not
need, but for those who need it, it is added as a kernel .config
option called KGDB_LOW_LEVEL_TRAP.

Also added was a die notification for oops such that kdb can catch an
oops for analysis.

There appeared to be no obvious way to pass the struct pt_regs from
the original exception back to the stack back tracer, so a special
case was added to show_stack() for when kdb is active because you
generally desire to generally look at the back trace of the original
exception.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
2010-05-20 21:04:26 -05:00
Jason Wessel
f503b5ae53 x86,kgdb: Add low level debug hook
The only way the debugger can handle a trap in inside rcu_lock,
notify_die, or atomic_notifier_call_chain without a triple fault is
to have a low level "first opportunity handler" in the int3 exception
handler.

Generally this will be something the vast majority of folks will not
need, but for those who need it, it is added as a kernel .config
option called KGDB_LOW_LEVEL_TRAP.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: H. Peter Anvin <hpa@zytor.com>
CC: x86@kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:25 -05:00
Jason Wessel
ada64e4c98 kgdboc,keyboard: Keyboard driver for kdb with kgdb
This patch adds in the kdb PS/2 keyboard driver.  This was mostly a
direct port from the original kdb where I cleaned up the code against
checkpatch.pl and added the glue to stitch it into kgdb.

This patch also enables early kdb debug via kgdbwait and the keyboard.

All the access to configure kdb using either a serial console or the
keyboard is done via kgdboc.

If you want to use only the keyboard and want to break in early you
would add to your kernel command arguments:

    kgdboc=kbd kgdbwait

If you wanted serial and or the keyboard access you could use:

    kgdboc=kbd,ttyS0

You can also configure kgdboc as a kernel module or at run time with
the sysfs where you can activate and deactivate kgdb.

Turn it on:
    echo kbd,ttyS0 > /sys/module/kgdboc/parameters/kgdboc

Turn it off:
    echo "" > /sys/module/kgdboc/parameters/kgdboc

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2010-05-20 21:04:24 -05:00
Jason Wessel
dcc7871128 kgdb: core changes to support kdb
These are the minimum changes to the kgdb core in order to enable an
API to connect a new front end (kdb) to the debug core.

This patch introduces the dbg_kdb_mode variable controls where the
user level I/O is routed.  It will be routed to the gdbstub (kgdb) or
to the kdb front end which is a simple shell available over the kgdboc
connection.

You can switch back and forth between kdb or the gdb stub mode of
operation dynamically.  From gdb stub mode you can blindly type
"$3#33", or from the kdb mode you can enter "kgdb" to switch to the
gdb stub.

The logic in the debug core depends on kdb to look for the typical gdb
connection sequences and return immediately with KGDB_PASS_EVENT if a
gdb serial command sequence is detected.  That should allow a
reasonably seamless transition between kdb -> gdb without leaving the
kernel exception state.  The two gdb serial queries that kdb is
responsible for detecting are the "?" and "qSupported" packets.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Martin Hicks <mort@sgi.com>
2010-05-20 21:04:21 -05:00
Huang Ying
fab1c23242 Unified UUID/GUID definition
There are many different UUID/GUID definitions in kernel, such as that
in EFI, many file systems, some drivers, etc. Every kernel components
need UUID/GUID has its own definition. This patch provides a unified
definition for UUID/GUID.

UUID is defined via typedef. This makes that UUID appears more like a
preliminary type, and makes the data type explicit (comparing with
implicit "u8 uuid[16]").

The binary representation of UUID/GUID can be little-endian (used by
EFI, etc) or big-endian (defined by RFC4122), so both is defined.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:40:47 -04:00
Frederic Weisbecker
e35e7fb0e9 lockup_detector: Don't enable the lockup detector by default
The lockup detector is a new feature that now involves the
nmi watchdog. Drop the default y and let the user choose.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
2010-05-19 11:36:49 +02:00
Ben Hutchings
b2be05273a panic: Allow warnings to set different taint flags
WARN() is used in some places to report firmware or hardware bugs that
are then worked-around.  These bugs do not affect the stability of the
kernel and should not set the flag for TAINT_WARN.  To allow for this,
add WARN_TAINT() and WARN_TAINT_ONCE() macros that take a taint number
as argument.

Architectures that implement warnings using trap instructions instead
of calls to warn_slowpath_*() now implement __WARN_TAINT(taint)
instead of __WARN().

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Helge Deller <deller@gmx.de>
Tested-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-05-19 08:36:48 +01:00
Linus Torvalds
c4fd308ed6 Merge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, pat: Update the page flags for memtype atomically instead of using memtype_lock
  x86, pat: In rbt_memtype_check_insert(), update new->type only if valid
  x86, pat: Migrate to rbtree only backend for pat memtype management
  x86, pat: Preparatory changes in pat.c for bigger rbtree change
  rbtree: Add support for augmented rbtrees
2010-05-18 09:28:04 -07:00
Linus Torvalds
cb41838bbc Merge branch 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hweight: Use a 32-bit popcnt for __arch_hweight32()
  arch, hweight: Fix compilation errors
  x86: Add optimized popcnt variants
  bitops: Optimize hweight() by making use of compile-time evaluation
2010-05-18 09:17:01 -07:00
Linus Torvalds
93c9d7f60c Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix LOCK_PREFIX_HERE for uniprocessor build
  x86, atomic64: In selftest, distinguish x86-64 from 586+
  x86-32: Fix atomic64_inc_not_zero return value convention
  lib: Fix atomic64_inc_not_zero test
  lib: Fix atomic64_add_unless return value convention
  x86-32: Fix atomic64_add_unless return value convention
  lib: Fix atomic64_add_unless test
  x86: Implement atomic[64]_dec_if_positive()
  lib: Only test atomic64_dec_if_positive on archs having it
  x86-32: Rewrite 32-bit atomic64 functions in assembly
  lib: Add self-test for atomic64_t
  x86-32: Allow UP/SMP lock replacement in cmpxchg64
  x86: Add support for lock prefix in alternatives
2010-05-18 08:40:05 -07:00
Linus Torvalds
f262af3d08 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
  rcu: remove all rcu head initializations, except on_stack initializations
  rcu head introduce rcu head init on stack
  Debugobjects transition check
  rcu: fix build bug in RCU_FAST_NO_HZ builds
  rcu: RCU_FAST_NO_HZ must check RCU dyntick state
  rcu: make SRCU usable in modules
  rcu: improve the RCU CPU-stall warning documentation
  rcu: reduce the number of spurious RCU_SOFTIRQ invocations
  rcu: permit discontiguous cpu_possible_mask CPU numbering
  rcu: improve RCU CPU stall-warning messages
  rcu: print boot-time console messages if RCU configs out of ordinary
  rcu: disable CPU stall warnings upon panic
  rcu: enable CPU_STALL_VERBOSE by default
  rcu: slim down rcutiny by removing rcu_scheduler_active and friends
  rcu: refactor RCU's context-switch handling
  rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk
  rcu: shrink rcutiny by making synchronize_rcu_bh() be inline
  rcu: fix now-bogus rcu_scheduler_active comments.
  rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality.
  rcu: ignore offline CPUs in last non-dyntick-idle CPU check
  ...
2010-05-18 08:17:58 -07:00
Linus Torvalds
06ee772043 Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: Section mismatch cleanup
2010-05-18 07:20:19 -07:00
Frederic Weisbecker
23637d477c lockup_detector: Introduce CONFIG_HARDLOCKUP_DETECTOR
This new config is deemed to simplify even more the lockup detector
dependencies and can make it easier to bring a smooth sorting
between archs that support the new generic lockup detector and those
that still have their own, especially for those that are in the
middle of this migration.

Instead of checking whether we have CONFIG_LOCKUP_DETECTOR +
CONFIG_PERF_EVENTS_NMI each time an arch wants to know if it needs
to build its own lockup detector, take a shortcut with this new
config. It is enabled only if the hardlockup detection part of
the whole lockup detector is on.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
2010-05-16 01:57:42 +02:00
Frederic Weisbecker
e16bb1d7fe lockup_detector: Update some config
We kept CONFIG_DETECT_SOFTLOCKUP around for compatibility with
older configs. But it was enabled by default if CONFIG_DEBUG_KERNEL.

So if we want to enable CONFIG_LOCKUP_DETECTOR on configs that had
CONFIG_DETECT_SOFTLOCKUP, all we need is to have the same enabling
by default if CONFIG_DEBUG_KERNEL. We can then remove
CONFIG_DETECT_SOFTLOCKUP directly.

So tag CONFIG_LOCKUP_DETECTOR as default y. This is what we want for
most serious kernel debugging anyway.

And also forbid the lockup detector in S390 as it was for the
previous softlockup detector, event though the true reason for that
is not outlined.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
2010-05-16 01:57:27 +02:00
kirjanov@gmail.com
43aa7ac736 lib/btree: fix possible NULL pointer dereference
mempool_alloc() can return null in atomic case.

Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-15 12:48:10 -07:00
Michel Lespinasse
91af708141 rwsem: Test for no active locks in __rwsem_do_wake undo code
If there are no active threasd using a semaphore, it is always correct
to unqueue blocked threads.  This seems to be what was intended in the
undo code.

What was done instead, was to look for a sem count of zero - this is an
impossible situation, given that at least one thread is known to be
queued on the semaphore.  The code might be correct as written, but it's
hard to reason about and it's not what was intended (otherwise the goto
out would have been unconditional).

Go for checking the active count - the alternative is not worth the
headache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-12 18:23:34 -07:00
Frederic Weisbecker
89d7ce2a21 lockup_detector: Make BOOTPARAM_SOFTLOCKUP_PANIC depend on LOCKUP_DETECTOR
Panic on softlockups was still depending on the softlockup detector.
But the latter has been merged into the lockup detector now.

Let's update this config dependency.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
2010-05-13 00:27:20 +02:00
Don Zickus
58687acba5 lockup_detector: Combine nmi_watchdog and softlockup detector
The new nmi_watchdog (which uses the perf event subsystem) is very
similar in structure to the softlockup detector.  Using Ingo's
suggestion, I combined the two functionalities into one file:
kernel/watchdog.c.

Now both the nmi_watchdog (or hardlockup detector) and softlockup
detector sit on top of the perf event subsystem, which is run every
60 seconds or so to see if there are any lockups.

To detect hardlockups, cpus not responding to interrupts, I
implemented an hrtimer that runs 5 times for every perf event
overflow event.  If that stops counting on a cpu, then the cpu is
most likely in trouble.

To detect softlockups, tasks not yielding to the scheduler, I used the
previous kthread idea that now gets kicked every time the hrtimer fires.
If the kthread isn't being scheduled neither is anyone else and the
warning is printed to the console.

I tested this on x86_64 and both the softlockup and hardlockup paths
work.

V2:
- cleaned up the Kconfig and softlockup combination
- surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
- seperated out the softlockup case from perf event subsystem
- re-arranged the enabling/disabling nmi watchdog from proc space
- added cpumasks for hardlockup failure cases
- removed fallback to soft events if no PMU exists for hard events

V3:
- comment cleanups
- drop support for older softlockup code
- per_cpu cleanups
- completely remove software clock base hardlockup detector
- use per_cpu masking on hard/soft lockup detection
- #ifdef cleanups
- rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
- documentation additions

V4:
- documentation fixes
- convert per_cpu to __get_cpu_var
- powerpc compile fixes

V5:
- split apart warn flags for hard and soft lockups

TODO:
- figure out how to make an arch-agnostic clock2cycles call
  (if possible) to feed into perf events as a sample period

[fweisbec: merged conflict patch]

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-12 23:55:33 +02:00
Frederic Weisbecker
a9aa1d02de Merge commit 'v2.6.34-rc7' into perf/nmi
Merge reason: catch up with latest softlockup detector changes.
2010-05-12 23:20:33 +02:00
Mathieu Desnoyers
a5d8e467f8 Debugobjects transition check
Implement a basic state machine checker in the debugobjects.

This state machine checker detects races and inconsistencies within the "active"
life of a debugobject. The checker only keeps track of the current state; all
the state machine logic is kept at the object instance level.

The checker works by adding a supplementary "unsigned int astate" field to the
debug_obj structure. It keeps track of the current "active state" of the object.

The only constraints that are imposed on the states by the debugobjects system
is that:

- activation of an object sets the current active state to 0,
- deactivation of an object expects the current active state to be 0.

For the rest of the states, the state mapping is determined by the specific
object instance. Therefore, the logic keeping track of the state machine is
within the specialized instance, without any need to know about it at the
debugobject level.

The current object active state is changed by calling:

debug_object_active_state(addr, descr, expect, next)

where "expect" is the expected state and "next" is the next state to move to if
the expected state is found. A warning is generated if the expected is not
found.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-10 16:08:01 -07:00
Paul E. McKenney
55ec936ff4 rcu: enable CPU_STALL_VERBOSE by default
The CPU_STALL_VERBOSE kernel configuration parameter was added to
2.6.34 to identify any preempted/blocked tasks that were preventing
the current grace period from completing when running preemptible
RCU.  As is conventional for new configurations parameters, this
defaulted disabled.  It is now time to enable it by default.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-10 11:08:34 -07:00
Lai Jiangshan
2b3fc35f69 rcu: optionally leave lockdep enabled after RCU lockdep splat
There is no need to disable lockdep after an RCU lockdep splat,
so remove the debug_lockdeps_off() from lockdep_rcu_dereference().
To avoid repeated lockdep splats, use a static variable in the inlined
rcu_dereference_check() and rcu_dereference_protected() macros so that
a given instance splats only once, but so that multiple instances can
be detected per boot.

This is controlled by a new config variable CONFIG_PROVE_RCU_REPEATEDLY,
which is disabled by default.  This provides the normal lockdep behavior
by default, but permits people who want to find multiple RCU-lockdep
splats per boot to easily do so.

Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-10 11:08:31 -07:00
David Woodhouse
0ae28a35bc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	drivers/mtd/mtdcore.c

Pull in the bdi fixes and ARM platform changes that other outstanding
patches depend on.
2010-05-10 14:32:46 +01:00
H. Peter Anvin
d9c5841e22 Merge branch 'x86/asm' into x86/atomic
Merge reason:
	Conflict between LOCK_PREFIX_HERE and relative alternatives
	pointers

Resolved Conflicts:
	arch/x86/include/asm/alternative.h
	arch/x86/kernel/alternative.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-29 16:53:17 -07:00
Hans Verkuil
98d5ce0d00 lib/vsprintf.c: add missing EXPORT_SYMBOL(simple_strtoll)
Add a missing EXPORT_SYMBOL.

I must be the first person that wants to use this function :-)

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00
Albin Tonnerre
ccdb40048b lib: fix the use of LZO to decompress initramfs images
This patch fixes 2 issues with the LZO decompressor:

- It doesn't handle the case where a block isn't compressed at all.  In
  this case, calling lzo1x_decompress_safe will fail, so we need to just
  use memcpy() instead (the upstream LZO code does something similar)

- Since commit 54291362d2 ("initramfs: add
  missing decompressor error check") , the decompressor return code is
  checked in the init/initramfs.c The LZO decompressor didn't return the
  expected value, causing the initramfs code to falsely believe a
  decompression error occured

Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Tested-by: bert schulze <spambemyguest@googlemail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:25 -07:00
Changli Gao
e59464c735 flex_array: fix the panic when calling flex_array_alloc() without __GFP_ZERO
memset() is called with the wrong address and the kernel panics.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:24 -07:00
Linus Torvalds
dc57da3875 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/gart: Disable GART explicitly before initialization
  dma-debug: Cleanup for copy-loop in filter_write()
  x86/amd-iommu: Remove obsolete parameter documentation
  x86/amd-iommu: use for_each_pci_dev
  Revert "x86: disable IOMMUs on kernel crash"
  x86/amd-iommu: warn when issuing command to uninitialized cmd buffer
  x86/amd-iommu: enable iommu before attaching devices
  x86/amd-iommu: Use helper function to destroy domain
  x86/amd-iommu: Report errors in acpi parsing functions upstream
  x86/amd-iommu: Pt mode fix for domain_destroy
  x86/amd-iommu: Protect IOMMU-API map/unmap path
  x86/amd-iommu: Remove double NULL check in check_device
2010-04-15 12:20:56 -07:00
Joe Perches
4e310fda91 vsprintf: Change struct printf_spec.precision from s8 to s16
Commit ef0658f3de changed precision
from int to s8.

There is existing kernel code that uses a larger precision.

An example from the audit code:
	vsnprintf(...,..., " msg='%.1024s'", (char *)data);
which overflows precision and truncates to nothing.

Extending precision size fixes the audit system issue.

Other changes:

Change the size of the struct printf_spec.type from u16 to u8 so
sizeof(struct printf_spec) stays as small as possible.
Reorder the struct members so sizeof(struct printf_spec) remains 64 bits
without alignment holes.
Document the struct members a bit more.

Original-patch-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-14 10:32:35 -07:00
Ingo Molnar
2b2f862ee6 Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2010-04-13 13:24:54 +02:00
David S. Miller
9343af084c Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	lib/Kconfig.debug
2010-04-13 00:28:45 -07:00
David S. Miller
8b8d8e2840 sparc64: Support kmemleak.
Only missing thing was an _sdata marker in vmlinux.lds.S

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 23:46:17 -07:00
Linus Torvalds
2f4084209a Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
  cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
  loop: Update mtime when writing using aops
  block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
  backing-dev: Handle class_create() failure
  Block: Fix block/elevator.c elevator_get() off-by-one error
  drbd: lc_element_by_index() never returns NULL
  cciss: unlock on error path
  cfq-iosched: Do not merge queues of BE and IDLE classes
  cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
  i2o: Remove the dangerous kobj_to_i2o_device macro
  block: remove 16 bytes of padding from struct request on 64bits
  cfq-iosched: fix a kbuild regression
  block: make CONFIG_BLK_CGROUP visible
  Remove GENHD_FL_DRIVERFS
  block: Export max number of segments and max segment size in sysfs
  block: Finalize conversion of block limits functions
  block: Fix overrun in lcm() and move it to lib
  vfs: improve writeback_inodes_wb()
  paride: fix off-by-one test
  drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
  ...
2010-04-09 11:50:29 -07:00
David Howells
ce82653d6c radix_tree_tag_get() is not as safe as the docs make out [ver #2]
radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
or radix_tree_tag_clear().  The problem is that the double tag_get() in
radix_tree_tag_get():

		if (!tag_get(node, tag, offset))
			saw_unset_tag = 1;
		if (height == 1) {
			int ret = tag_get(node, tag, offset);

may see the value change due to the action of set/clear.  RCU is no protection
against this as no pointers are being changed, no nodes are being replaced
according to a COW protocol - set/clear alter the node directly.

The documentation in linux/radix-tree.h, however, says that
radix_tree_tag_get() is an exception to the rule that "any function modifying
the tree or tags (...) must exclude other modifications, and exclude any
functions reading the tree".

The problem is that the next statement in radix_tree_tag_get() checks that the
tag doesn't vary over time:

			BUG_ON(ret && saw_unset_tag);

This has been seen happening in FS-Cache:

	https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html

To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various
comments that the value of the tag may change whilst the RCU read lock is held,
and thus that the return value of radix_tree_tag_get() may not be relied upon
unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
running concurrently with it.

Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-09 10:12:03 -07:00
Kevin Hilman
3eac4abaa6 rwsem generic spinlock: use IRQ save/restore spinlocks
rwsems can be used with IRQs disabled, particularily in early boot
before IRQs are enabled.  Currently the spin_unlock_irq() usage in the
slow-patch will unconditionally enable interrupts and cause problems
since interrupts are not yet initialized or enabled.

This patch uses save/restore versions of IRQ spinlocks in the slowpath
to ensure interrupts are not unintentionally disabled.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 16:15:05 -07:00
Linus Torvalds
addb2d6c13 Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Remove unused variable from ptrace
  microblaze: io.h: Add io big-endian function
  microblaze: Enable memory leak detector
  microblaze: Fix futex code
  microblaze: Fix ftrace_update_ftrace_func panic
2010-04-07 08:48:39 -07:00
Yong Zhang
57119c34e5 ratelimit: fix the return value when __ratelimit() fails to acquire the lock
The log of commit edaac8e316 ("ratelimit:
Fix/allow use in atomic contexts"), indicates that we want to suppress the
callback when the trylock fails.

Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 08:38:04 -07:00
Yong Zhang
2a7268abc4 ratelimit: annotate ___ratelimit()
To prevent from wrongly using the return value.

[akpm@linux-foundation.org: fix spello]
Signed-off-by: Yong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 08:38:04 -07:00
Dan Carpenter
39a37ce1cc dma-debug: Cleanup for copy-loop in filter_write()
Earlier in this function we set the last byte of "buf" to NULL so we
always hit the break statement and "i" is never equal to NAME_MAX_LEN.
This patch doesn't change how the driver works but it silences a Smatch
warning and it makes it clearer that we don't write past the end of the
array.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2010-04-07 14:36:27 +02:00
Michal Simek
47c4c864af microblaze: Enable memory leak detector
Enable DEBUG_KMEMLEAK for microblaze

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-04-07 07:27:26 +02:00
Borislav Petkov
d61931d89b x86: Add optimized popcnt variants
Add support for the hardware version of the Hamming weight function,
popcnt, present in CPUs which advertize it under CPUID, Function
0x0000_0001_ECX[23]. On CPUs which don't support it, we fallback to the
default lib/hweight.c sw versions.

A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost
a 3x speedup on a F10h machine.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100318112015.GC11152@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-06 15:52:11 -07:00
Peter Zijlstra
1527bc8b92 bitops: Optimize hweight() by making use of compile-time evaluation
Rename the extisting runtime hweight() implementations to
__arch_hweight(), rename the compile-time versions to __const_hweight()
and then have hweight() pick between them.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100318111929.GB11152@aftab>
Acked-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1265028224.24455.154.camel@laptop>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-06 15:52:11 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Henrik Kretzschmar
1fb2f77c03 debugobjects: Section mismatch cleanup
This patch marks two functions, which only get called at
initialization, as __init.

Here is also interesting, that modpost doesn't catch here the right
function name.

WARNING: lib/built-in.o(.text+0x585f): Section mismatch in reference
from the function T.506() to the variable .init.data:obj
The function T.506() references the variable __initdata obj.
This is often because T.506 lacks a __initdata annotation or the 
annotation of obj is wrong.

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
LKML-Reference: <1269632315-19403-1-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-03-26 21:52:29 +01:00
David Woodhouse
329f9052db Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	drivers/mtd/nand/sh_flctl.c

Maxim's patch to initialise sysfs attributes depends on the patch which
actually adds sysfs_attr_init().
2010-03-26 14:55:59 +00:00
Mike Frysinger
1d53661d26 blackfin: enable DEBUG_SECTION_MISMATCH
We see only one section mismatch now after thousands of randconfigs, and a
bug has been filed about that one.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:20 -07:00
Jens Axboe
b4b7a4ef09 Merge branch 'master' into for-linus
Conflicts:
	block/Kconfig

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-19 08:05:10 +01:00
Martin K. Petersen
2cda2728aa block: Fix overrun in lcm() and move it to lib
lcm() was defined to take integer-sized arguments.  The supplied
arguments are multiplied, however, causing us to overflow given
sufficiently large input.  That in turn led to incorrect optimal I/O
size reporting in some cases (RAID over RAID).

Switch lcm() over to unsigned long similar to gcd() and move the
function from blk-settings.c to lib.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-15 12:47:59 +01:00
Len Brown
ec28dcc6b4 Merge branches 'battery-2.6.34', 'bugzilla-10805', 'bugzilla-14668', 'bugzilla-531916-power-state', 'ht-warn-2.6.34', 'pnp', 'processor-rename', 'sony-2.6.34', 'suse-bugzilla-531547', 'tz-check', 'video' and 'misc-2.6.34' into release 2010-03-14 21:30:17 -04:00
Bjorn Helgaas
9d7cca0421 resource: add window support
Add support for resource windows.  This is for bridge resources, i.e.,
regions where a bridge forwards transactions from the primary to the
secondary side.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-03-14 20:08:36 -04:00
Bjorn Helgaas
0f4050c7d3 resource: add bus number support
Add support for bus number resources.  This is for bridges with a range of
bus numbers behind them.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-03-14 20:08:35 -04:00
Linus Torvalds
9fdfbc2bff Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Provide generic perf_sample_data initialization
  MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
  perf trace: Don't use pager if scripting
  perf trace/scripting: Remove extraneous header read
  perf, ARM: Modify kuser rmb() call to compile for Thumb-2
  x86/stacktrace: Don't dereference bad frame pointers
  perf archive: Don't try to collect files without a build-id
  perf_events, x86: Fixup fixed counter constraints
  perf, x86: Restrict the ANY flag
  perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
  perf, x86: add some IBS macros to perf_event.h
  perf, x86: make IBS macros available in perf_event.h
  hw-breakpoints: Remove stub unthrottle callback
  x86/hw-breakpoints: Remove the name field
  perf: Remove pointless breakpoint union
  perf lock: Drop the buffers multiplexing dependency
  perf lock: Fix and add misc documentally things
  percpu: Add __percpu sparse annotations to hw_breakpoint
2010-03-13 14:39:42 -08:00
Joakim Tjernlund
51ea3f6a45 inflate_fast: sout is already a short so ptr arith was off by one.
inflate_fast() can do either POST INC or PRE INC on its pointers walking
the memory to decompress.  Default is PRE INC.

The sout pointer offset was miscalculated in one case as the calculation
assumed sout was a char * This breaks inflate_fast() iff configured to do
POST INC.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:44 -08:00
Joakim Tjernlund
e69eae6552 zlib: make new optimized inflate endian independent
Commit 6846ee5ca6 ("zlib: Fix build of
powerpc boot wrapper") made the new optimized inflate only available on
arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

This patch will again enable the optimization for all arch's by defining
our own endian independent version of unaligned access.  As an added
bonus, arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS do a
plain load instead.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:44 -08:00
Ingo Molnar
548b841669 Merge commit 'v2.6.34-rc1' into perf/urgent
Conflicts:
	tools/perf/util/probe-event.c

Merge reason: Pick up -rc1 and resolve the conflict as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-09 17:11:53 +01:00
Emese Revfy
52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Emese Revfy
9cd43611cc kobject: Constify struct kset_uevent_ops
Constify struct kset_uevent_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Linus Torvalds
b8fa05719b Revert "lib: build list_sort() only if needed"
This reverts commit a069c266ae.

It turns ou that not only was it missing a case (XFS) that needed it,
but perhaps more importantly, people sometimes want to enable new
modules that they hadn't had enabled before, and if such a module uses
list_sort(), it can't easily be inserted any more.

So rather than add a "select LIST_SORT" to the XFS case, just leave it
compiled in.  It's not all _that_ big, after all, and the inconvenience
isn't worth it.

Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Don Mullis <don.mullis@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-07 09:54:44 -08:00
Bjorn Helgaas
4da0b66c6e vsprintf: move %pR resource printf_specs off the stack
This adds separate I/O and memory specs, so we don't have to change the
field width in a shared spec, which then lets us make all the specs const
and static, since they never change.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 17:53:07 -08:00
Bjorn Helgaas
b89dc5d6b0 vsprintf: clarify comments for printf_spec flags
Add clues about what the SMALL and SPECIAL flags do.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 17:53:07 -08:00
Joe Perches
ef0658f3de vsprintf.c: Reduce sizeof struct printf_spec from 24 to 8 bytes
Reducing the size of struct printf_spec is a good thing because multiple
instances are commonly passed on stack.

It's possible for type to be u8 and field_width to be s8, but this is
likely small enough for now.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 17:47:45 -08:00
Linus Torvalds
66b89159c2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs
* git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
  [LogFS] Change magic number
  [LogFS] Remove h_version field
  [LogFS] Check feature flags
  [LogFS] Only write journal if dirty
  [LogFS] Fix bdev erases
  [LogFS] Silence gcc
  [LogFS] Prevent 64bit divisions in hash_index
  [LogFS] Plug memory leak on error paths
  [LogFS] Add MAINTAINERS entry
  [LogFS] add new flash file system

Fixed up trivial conflict in lib/Kconfig, and a semantic conflict in
fs/logfs/inode.c introduced by write_inode() being changed to use
writeback_control' by commit a9185b41a4
("pass writeback_control to ->write_inode")
2010-03-06 13:18:03 -08:00
Joakim Tjernlund
4f2a9463d1 crc32: some minor cleanups
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:45 -08:00
Akinobu Mita
08564fb7ab bitmap: use for_each_set_bit()
Replace open-coded loop with for_each_set_bit().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
Ben Hutchings
9a86e2bad0 lib: fix first line of kernel-doc for a few functions
The function name must be followed by a space, hypen, space, and a short
description.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
Don Mullis
a069c266ae lib: build list_sort() only if needed
Build list_sort() only for configs that need it -- those that don't save
~581 bytes (i386).

Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
Don Mullis
02b12b7a28 lib: revise list_sort() header comment
Clarify and correct header comment of list_sort().

Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
Don Mullis
835cc0c847 lib: more scalable list_sort()
XFS and UBIFS can pass long lists to list_sort(); this alternative
implementation scales better, reaching ~3x performance gain when list
length exceeds the L2 cache size.

Stand-alone program timings were run on a Core 2 duo L1=32KB L2=4MB,
gcc-4.4, with flags extracted from an Ubuntu kernel build.  Object size is
581 bytes compared to 455 for Mark J.  Roberts' code.

Worst case for either implementation is a list length just over a power of
two, and to roughly the same degree, so here are timing results for a
range of 2^N+1 lengths.  List elements were 16 bytes each including malloc
overhead; initial order was random.

                      time (msec)
                      Tatham-Roberts
                      |       generic-Mullis-v2
loop_count  length    |       |    ratio
4000000       2     206     294    1.427
2000000       3     176     227    1.289
1000000       5     199     172    0.864
 500000       9     235     178    0.757
 250000      17     243     182    0.748
 125000      33     261     196    0.750
  62500      65     277     209    0.754
  31250     129     292     219    0.75
  15625     257     317     235    0.741
   7812     513     340     252    0.741
   3906    1025     362     267    0.737
   1953    2049     388     283    0.729  ~ L1 size
    976    4097     556     323    0.580
    488    8193     678     361    0.532
    244   16385     773     395    0.510
    122   32769     844     418    0.495
     61   65537     917     454    0.495
     30  131073    1128     543    0.481
     15  262145    2355     869    0.369  ~ L2 size
      7  524289    5597    1714    0.306
      3 1048577    6218    2022    0.325

Mark's code does not actually implement the usual or generic mergesort,
but rather a variant from Simon Tatham described here:

    http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

Simon's algorithm performs O(log N) passes over the entire input list,
doing merges of sublists that double in size on each pass.  The generic
algorithm instead merges pairs of equal length lists as early as possible,
in recursive order.  For either algorithm, the elements that extend the
list beyond power-of-two length are a special case, handled as nearly as
possible as a "rounding-up" to a full POT.

Some intuition for the locality of reference implications of merge order
may be gotten by watching this animation:

    http://www.sorting-algorithms.com/merge-sort

Simon's algorithm requires only O(1) extra space rather than the generic
algorithm's O(log N), but in my non-recursive implementation the actual
O(log N) data is merely a vector of ~20 pointers, which I've put on the
stack.

Long-running list_sort() calls: If the list passed in may be long, or the
client's cmp() callback function is slow, the client's cmp() may
periodically invoke cond_resched() to voluntarily yield the CPU.  All
inner loops of list_sort() call back to cmp().

Stability of the sort: distinct elements that compare equal emerge from
the sort in the same order as with Mark's code, for simple test cases.  A
boot-time test is provided to verify this and other correctness
requirements.

A kernel that uses drm.ko appears to run normally with this change; I have
no suitable hardware to similarly test the use by UBIFS.

[akpm@linux-foundation.org: style tweaks, fix comment, make list_sort_test __init]
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
André Goddard Rosa
d6a2eedfdd lib/string.c: simplify strnstr()
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Joe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
André Goddard Rosa
a11d2b64e1 lib/string.c: simplify stricmp()
Removes 32 bytes on core2 with gcc 4.4.1:
   text    data     bss     dec     hex filename
   3196       0       0    3196     c7c lib/string-BEFORE.o
   3164       0       0    3164     c5c lib/string-AFTER.o

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:35 -08:00
Simon Kagstrom
0347af4ee3 lkdtm: add debugfs access and loosen KPROBE ties
Add adds a debugfs interface and additional failure modes to LKDTM to
provide similar functionality to the provoke-crash driver submitted here:

  http://lwn.net/Articles/371208/

Crashes can now be induced either through module parameters (as before)
or through the debugfs interface as in provoke-crash.

The patch also provides a new "direct" interface, where KPROBES are not
used, i.e., the crash is invoked directly upon write to the debugfs
file. When built without KPROBES configured, only this mode is available.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Cc: M. Mohan Kumar <mohan@in.ibm.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:32 -08:00
Amerigo Wang
f047f4f379 mm: use the same log level for show_mem()
Use the same log level for printk's in show_mem(), so that those messages
can be shown completely when using log level 6.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:27 -08:00
H. Peter Anvin
a5c9161f27 x86, atomic64: In selftest, distinguish x86-64 from 586+
The x86-64 implementation of the atomics is totally different from the
i586+ implementation, which makes it quite confusing to call it
"586+".  Also fix indentation, and add "i" for "i386" and "i586" as
used elsewhere in the kernel.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-4-git-send-email-luca@luca-barbieri.com>
2010-03-01 11:51:56 -08:00
Luca Barbieri
25a304f277 lib: Fix atomic64_inc_not_zero test
atomic64_inc_not_zero must return 1 if it perfomed the add and 0 otherwise.
The test assumed the opposite convention.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-5-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:39:02 -08:00
Luca Barbieri
97577896f6 lib: Fix atomic64_add_unless return value convention
atomic64_add_unless must return 1 if it perfomed the add and 0 otherwise.
The generic implementation did the opposite thing.

Reported-by: H. Peter Anvin <hpa@zytor.com>
Confirmed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-4-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:38:46 -08:00
Luca Barbieri
9efbcd5902 lib: Fix atomic64_add_unless test
atomic64_add_unless must return 1 if it perfomed the add and 0 otherwise.
The test assumed the opposite convention.

Reported-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:38:46 -08:00
Luca Barbieri
d7f6de1e9c x86: Implement atomic[64]_dec_if_positive()
Add support for atomic_dec_if_positive(), and
atomic64_dec_if_positive() for x86-64.

atomic64_dec_if_positive() for x86-32 was already implemented in a previous patch.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267183361-20775-2-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:38:42 -08:00
Luca Barbieri
8f4f202b33 lib: Only test atomic64_dec_if_positive on archs having it
Currently atomic64_dec_if_positive() is only supported by PowerPC,
MIPS and x86-32.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267183361-20775-1-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:37:55 -08:00
David S. Miller
47871889c6 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/firmware/iscsi_ibft.c
2010-02-28 19:23:06 -08:00
Ingo Molnar
c99c30fead nmi_watchdog: Turn it off by default
It was nice to enable it by default for testing - but before we
push it upstream we want it to be off - so that people can
opt-in gradually.

Cc: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266880143-24943-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-28 20:49:00 +01:00
Linus Torvalds
a7f16d10b5 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Mark atomic irq ops raw for 32bit legacy
  x86: Merge show_regs()
  x86: Macroise x86 cache descriptors
  x86-32: clean up rwsem inline asm statements
  x86: Merge asm/atomic_{32,64}.h
  x86: Sync asm/atomic_32.h and asm/atomic_64.h
  x86: Split atomic64_t functions into seperate headers
  x86-64: Modify memcpy()/memset() alternatives mechanism
  x86-64: Modify copy_user_generic() alternatives mechanism
  x86: Lift restriction on the location of FIX_BTMAP_*
  x86, core: Optimize hweight32()
2010-02-28 10:35:09 -08:00
Linus Torvalds
642c4c75a7 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits)
  rcu: Fix accelerated GPs for last non-dynticked CPU
  rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot
  rcu: Fix accelerated grace periods for last non-dynticked CPU
  rcu: Export rcu_scheduler_active
  rcu: Make rcu_read_lock_sched_held() take boot time into account
  rcu: Make lockdep_rcu_dereference() message less alarmist
  sched, cgroups: Fix module export
  rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information
  rcu: Fix rcutorture mod_timer argument to delay one jiffy
  rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection
  rcu: Convert to raw_spinlocks
  rcu: Stop overflowing signed integers
  rcu: Use canonical URL for Mathieu's dissertation
  rcu: Accelerate grace period if last non-dynticked CPU
  rcu: Fix citation of Mathieu's dissertation
  rcu: Documentation update for CONFIG_PROVE_RCU
  security: Apply lockdep-based checking to rcu_dereference() uses
  idr: Apply lockdep-based diagnostics to rcu_dereference() uses
  radix-tree: Disable RCU lockdep checking in radix tree
  vfs: Abstract rcu_dereference_check for files-fdtable use
  ...
2010-02-28 10:13:16 -08:00
Linus Torvalds
ef1a8de8ea Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (88 commits)
  powerpc: Fix lwsync feature fixup vs. modules on 64-bit
  powerpc: Convert pmc_owner_lock to raw_spinlock
  powerpc: Convert die.lock to raw_spinlock
  powerpc: Convert tlbivax_lock to raw_spinlock
  powerpc: Convert mpic locks to raw_spinlock
  powerpc: Convert pmac_pic_lock to raw_spinlock
  powerpc: Convert big_irq_lock to raw_spinlock
  powerpc: Convert feature_lock to raw_spinlock
  powerpc: Convert i8259_lock to raw_spinlock
  powerpc: Convert beat_htab_lock to raw_spinlock
  powerpc: Convert confirm_error_lock to raw_spinlock
  powerpc: Convert ipic_lock to raw_spinlock
  powerpc: Convert native_tlbie_lock to raw_spinlock
  powerpc: Convert beatic_irq_mask_lock to raw_spinlock
  powerpc: Convert nv_lock to raw_spinlock
  powerpc: Convert context_lock to raw_spinlock
  powerpc/85xx: Add NOR, LEDs and PIB support for MPC8568E-MDS boards
  powerpc/86xx: Enable VME driver on the GE SBC610
  powerpc/86xx: Enable VME driver on the GE PPC9A
  powerpc/86xx: Add MSI section to GE PPC9A DTS
  ...
2010-02-27 13:26:18 -08:00
Frederic Weisbecker
dd8b1cf681 perf: Remove pointless breakpoint union
Remove pointless union in the breakpoint field of hw_perf_event.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
2010-02-27 17:10:39 +01:00
Hitoshi Mitake
84c6f88fc8 perf lock: Fix and add misc documentally things
I've forgot to add 'perf lock' line to command-list.txt,
so users of perf could not find perf lock when they type 'perf'.

Fixing command-list.txt requires document
(tools/perf/Documentation/perf-lock.txt).
But perf lock is too much "under construction" to write a
stable document, so this is something like pseudo document for now.

And I wrote description of perf lock at help section of
CONFIG_LOCK_STAT, this will navigate users of lock trace events.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-02-27 17:05:22 +01:00
Linus Torvalds
64d497f553 Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (187 commits)
  sh: remove dead LED code for migo-r and ms7724se
  sh: ecovec build fix for CONFIG_I2C=n
  sh: ecovec r-standby support
  sh: ms7724se r-standby support
  sh: SH-Mobile R-standby register save/restore
  clocksource: Fix up a registration/IRQ race in the sh drivers.
  sh: ms7724: modify scan_timing for KEYSC
  sh: ms7724: Add sh_sir support
  sh: mach-ecovec24: Add sh_sir support
  sh: wire up SET/GET_UNALIGN_CTL.
  sh: allow alignment fault mode to be configured at kernel boot.
  sh: sh7724: Update FSI/SPU2 clock
  sh: always enable sh7724 vpu_clk and set to 166MHz on Ecovec
  sh: add sh7724 kick callback to clk_div4_table
  sh: introduce struct clk_div4_table
  sh: clock-cpg div4 set_rate() shift fix
  sh: Turn on speculative return for SH7785 and SH7786
  sh: Merge legacy and dynamic PMB modes.
  sh: Use uncached I/O helpers in PMB setup.
  sh: Provide uncached I/O helpers.
  ...
2010-02-26 16:54:27 -08:00
David Woodhouse
a7790532f5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
The SmartMedia FTL code depends on new kfifo bits from 2.6.33
2010-02-26 19:06:24 +00:00
Luca Barbieri
86a8938078 lib: Add self-test for atomic64_t
This patch adds self-test on boot code for atomic64_t.

This has been used to test the later changes in this patchset.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-4-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25 20:47:12 -08:00
Benjamin Herrenschmidt
874f2f997d Merge commit 'origin/master' into next
Manual merge of:
	drivers/char/hvc_console.c
	drivers/char/hvc_console.h
2010-02-26 14:41:00 +11:00
Ben Hutchings
4d1ee80f3a idr: export idr_get_next()
idr_get_next() was accidentally not exported when added.  It is about
to be used by mtdcore, which may be built as a module.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-02-25 11:54:51 +00:00
Paul E. McKenney
1ed509a225 rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information
When RCU detects a grace-period stall, it currently just prints
out the PID of any tasks doing the stalling.  This patch adds
RCU_CPU_STALL_VERBOSE, which enables the more-verbose reporting
from sched_show_task().

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-21-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25 10:35:02 +01:00
Paul E. McKenney
96be753af9 idr: Apply lockdep-based diagnostics to rcu_dereference() uses
Because idr can be used with any of a number of locks or with
any flavor of RCU, just disable the lockdep-based diagnostics.
If idr needs diagnostics, the check expression will need to be
passed into the relevant idr primitives as an additional
argument.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-11-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25 10:34:51 +01:00
Paul E. McKenney
2676a58c98 radix-tree: Disable RCU lockdep checking in radix tree
Because the radix tree is used with many different locking
designs, we cannot do any effective checking without changing
the radix-tree APIs. It might make sense to do this later, but
only if the RCU lockdep checking proves itself sufficiently
valuable.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-10-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25 10:34:50 +01:00
Paul E. McKenney
632ee20013 rcu: Introduce lockdep-based checking to RCU read-side primitives
Inspection is proving insufficient to catch all RCU misuses,
which is understandable given that rcu_dereference() might be
protected by any of four different flavors of RCU (RCU, RCU-bh,
RCU-sched, and SRCU), and might also/instead be protected by any
of a number of locking primitives. It is therefore time to
enlist the aid of lockdep.

This set of patches is inspired by earlier work by Peter
Zijlstra and Thomas Gleixner, and takes the following approach:

o	Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.

o	Set up separate lockdep classes for each instance of SRCU.

o	Create primitives that check for being in an RCU read-side
	critical section.  These return exact answers if lockdep is
	fully enabled, but if unsure, report being in an RCU read-side
	critical section.  (We want to avoid false positives!)
	The primitives are:

	For RCU: rcu_read_lock_held(void)

	For RCU-bh: rcu_read_lock_bh_held(void)

	For RCU-sched: rcu_read_lock_sched_held(void)

	For SRCU: srcu_read_lock_held(struct srcu_struct *sp)

o	Add rcu_dereference_check(), which takes a second argument
	in which one places a boolean expression based on the above
	primitives and/or lockdep_is_held().

o	A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
	rcu_dereference_check().  This depends on CONFIG_PROVE_LOCKING,
	and should be quite helpful during the transition period while
	CONFIG_PROVE_RCU-unaware patches are in flight.

The existing rcu_dereference() primitive does no checking, but
upcoming patches will change that.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-1-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25 09:40:59 +01:00
Ingo Molnar
996de8c6fe Merge commit 'v2.6.33' into core/rcu
Merge reason: Update from -rc4 to -final.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25 09:40:26 +01:00
Tejun Heo
d2e7276b6b idr: fix a critical misallocation bug, take#2
This is retry of reverted 859ddf0974
("idr: fix a critical misallocation bug") which contained two bugs.

* pa[idp->layers] should be cleared even if it's not used by
  sub_alloc() because it's used by mark idr_mark_full().

* The original condition check also assigned pa[l] to p which the new
  code didn't do thus leaving p pointing at the wrong layer.

Both problems have been fixed and the idr code has received good amount
testing using userland testing setup where simple bitmap allocator is
run parallel to verify the result of idr allocation.

The bug this patch fixes is caused by sub_alloc() optimization path
bypassing out-of-room condition check and restarting allocation loop
with starting value higher than maximum allowed value.  For detailed
description, please read commit message of 859ddf09.

Signed-off-by: Tejun Heo <tj@kernel.org>
Based-on-patch-from: Eric Paris <eparis@redhat.com>
Reported-by: Eric Paris <eparis@redhat.com>
Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-22 19:50:34 -08:00
Pallipadi, Venkatesh
17d9ddc72f rbtree: Add support for augmented rbtrees
Add support for augmented rbtrees in core rbtree code.

This will be used in subsequent patches, in x86 PAT code, which needs
interval trees to efficiently keep track of PAT ranges.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20100210232343.GA11465@linux-os.sc.intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-18 15:40:56 -08:00
David S. Miller
2bb4646fce Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-02-16 22:09:29 -08:00
Don Zickus
c3128fb6ad nmi_watchdog: Use a boolean config flag for compiling
Determines if an arch has setup arch specific perf_events and
nmi_watchdog code.  This should restrict compiles to only those
arches ready.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266013161-31197-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-14 09:19:43 +01:00
Ingo Molnar
8e7672cdb4 nmi_watchdog: Only enable on x86 for now
It wont even build on other platforms just yet - so restrict it
to x86 for now.

Cc: Don Zickus <dzickus@redhat.com>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-09 06:11:00 +01:00
Don Zickus
84e478c6f1 nmi_watchdog: Config option to enable new nmi_watchdog
These are the bits that enable the new nmi_watchdog and safely
isolate the old nmi_watchdog.  Only one or the other can run,
not both at the same time.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-08 08:29:03 +01:00
Tejun Heo
6f14a668f1 idr: revert misallocation bug fix
Commit 859ddf0974 tried to fix
misallocation bug but broke full bit marking by not clearing
pa[idp->layers] and also is causing X failures due to lookup failure
in drm code.  The cause of the latter hasn't been found yet.  Revert
the fix for now.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-04 16:03:41 -08:00
Michael Ellerman
24551f64d4 lmb: Add lmb_free()
We can free memory allocated with lmb_alloc() by removing it from the
list of reserved LMBs. Rework lmb_remove() to allow that possibility
and add lmb_free() which exploits it.

BenH: Removed some useless parenthesis

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-03 17:39:50 +11:00
Tejun Heo
859ddf0974 idr: fix a critical misallocation bug
Eric Paris located a bug in idr.  With IDR_BITS of 6, it grows to three
layers when id 4096 is first allocated.  When that happens, idr wraps
incorrectly and searches the idr array ignoring the high bits.  The
following test code from Eric demonstrates the bug nicely.

#include <linux/idr.h>
#include <linux/kernel.h>
#include <linux/module.h>

static DEFINE_IDR(test_idr);

int init_module(void)
{
	int ret, forty95, forty96;
	void *addr;

	/* add 2 entries both with 4095 as the start address */
again1:
	if (!idr_pre_get(&test_idr, GFP_KERNEL))
		return -ENOMEM;
	ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95);
	if (ret) {
		if (ret == -EAGAIN)
			goto again1;
		return ret;
	}
	if (forty95 != 4095)
		printk(KERN_ERR "hmmm, forty95=%d\n", forty95);

again2:
	if (!idr_pre_get(&test_idr, GFP_KERNEL))
		return -ENOMEM;
	ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96);
	if (ret) {
		if (ret == -EAGAIN)
			goto again2;
		return ret;
	}
	if (forty96 != 4096)
		printk(KERN_ERR "hmmm, forty96=%d\n", forty96);

	/* try to find the 2 entries, noticing that 4096 broke */
	addr = idr_find(&test_idr, forty95);
	if ((int)addr != forty95)
		printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr);
	addr = idr_find(&test_idr, forty96);
	if ((int)addr != forty96)
		printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr);
	/* really weird, the entry which should be at 4096 is actually at 0!! */
	addr = idr_find(&test_idr, 0);
	if ((int)addr)
		printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr);

	idr_remove(&test_idr, forty95);
	idr_remove(&test_idr, forty96);

	return 0;
}

void cleanup_module(void)
{
}

MODULE_AUTHOR("Eric Paris <eparis@redhat.com>");
MODULE_DESCRIPTION("Simple idr test");
MODULE_LICENSE("GPL");

This happens because when sub_alloc() back tracks it doesn't always do it
step-by-step while the over-the-limit detection assumes step-by-step
backtracking.  The logic in sub_alloc() looks like the following.

  restart:
    clear pa[top level + 1] for end cond detection
    l = top level
    while (true) {
	search for empty slot at this level
	if (not found) {
	    push id to the next possible value
	    l++
A:	    if (pa[l] is clear)
	        failed, return asking caller to grow the tree
	    if (going up 1 level gives more slots to search)
	        continue the while loop above with the incremented l
	    else
C:	        goto restart
	}
	adjust id accordingly to the found slot
	if (l == 0)
	    return found id;
	create lower level if not there yet
	record pa[l] and l--
    }

Test A is the fail exit condition but this assumes that failure is
propagated upwared one level at a time but the B optimization path breaks
the assumption and restarts the whole thing with a start value which is
above the possible limit with the current layers.  sub_alloc() assumes the
start id value is inside the limit when called and test A is the only exit
condition check, so it ends up searching for empty slot while ignoring
high set bit.

So, for 4095->4096 test, level0 search fails but pa[1] contains a valid
pointer.  However, going up 1 level wouldn't give any more empty slot so
it takes C and when the whole thing restarts nobody notices the high bit
set beyond the top level.

This patch fixes the bug by changing the fail exit condition check to full
id limit check.

Based-on-patch-from: Eric Paris <eparis@redhat.com>
Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-02 18:11:21 -08:00