31637 Commits

Author SHA1 Message Date
Ingo Molnar
c9e1585b1b Merge branch 'tip/x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into x86/mm 2009-02-20 18:51:43 +01:00
Ingo Molnar
7a5714e018 x86, pat: add large-PAT check to split_large_page()
Impact: future-proof the split_large_page() function

Linus noticed that split_large_page() is not safe wrt. the
PAT bit: it is bit 12 on the 1GB and 2MB page table level
(_PAGE_BIT_PAT_LARGE), and it is bit 7 on the 4K page
table level (_PAGE_BIT_PAT).

Currently it is not a problem because we never set
_PAGE_BIT_PAT_LARGE on any of the large-page mappings - but
should this happen in the future the split_large_page() would
silently lift bit 12 into the lowlevel 4K pte and would start
corrupting the physical page frame offset. Not fun.

So add a debug warning, to make sure if something ever sets
the PAT bit then this function gets updated too.
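
As an illustration only (this is not the patch itself), the hazard can be modelled in user space. The two bit positions are the ones quoted above; the helper names and the 4K offset mask are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions quoted in the commit message above. */
#define PAGE_BIT_PAT		7	/* PAT bit in a 4K pte */
#define PAGE_BIT_PAT_LARGE	12	/* PAT bit in a 2MB/1GB entry */

/* Hypothetical helper: a 4K pte keeps the frame address in bits 12 and up. */
static uint64_t pte_frame_addr(uint64_t pte)
{
	return pte & ~0xfffULL;
}

/* Hypothetical helper: naively copy the large-page flag bits into a 4K pte. */
static uint64_t naive_split_pte(uint64_t frame_addr, uint64_t large_flags)
{
	return frame_addr | large_flags;
}

/* The condition a debug warning would look for: large-page PAT bit set. */
static int has_large_pat(uint64_t large_flags)
{
	return (large_flags >> PAGE_BIT_PAT_LARGE) & 1;
}
```

With bit 12 set in the large-page flags, the naive copy shifts the resulting 4K frame address by one page, which is the silent corruption described above.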

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 17:48:49 +01:00
Steven Rostedt
3c3e5694ad x86: check PMD in spurious_fault handler
Impact: fix to prevent hard lockup on bad PMD permissions

If the PMD does not have the correct permissions for a page access,
but the PTE does, the spurious fault handler will mistake the fault
as a lazy TLB transaction. This will result in an infinite loop of:

 fault -> spurious_fault check (pass) -> return to code -> fault

This patch adds a check, plus a warning, for the case where the PTE passes
the permission check but the PMD does not.
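
A minimal user-space sketch of the idea, assuming a simplified two-level model with a single hypothetical "writable" flag (FLAG_RW and both function names are made up for illustration; the real handler walks more levels and checks more bits):

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_RW 0x2u	/* hypothetical "writable" flag */

/* Does one page-table level permit this access? */
static bool level_allows(unsigned int entry_flags, bool is_write)
{
	return !is_write || (entry_flags & FLAG_RW) != 0;
}

/*
 * A fault counts as spurious (a stale TLB entry) only if every level
 * already permits the access. Checking the PTE alone would report
 * "spurious" forever when the PMD is the level that denies access.
 */
static bool fault_is_spurious(unsigned int pmd_flags, unsigned int pte_flags,
			      bool is_write)
{
	if (!level_allows(pte_flags, is_write))
		return false;
	/* The added check: the PTE passes, so the PMD must pass too. */
	if (!level_allows(pmd_flags, is_write))
		return false;	/* the patch also warns once here */
	return true;
}
```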

[ Updated: Ingo Molnar suggested using WARN_ONCE with some text ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 11:44:47 -05:00
Ingo Molnar
609162850d Merge branches 'x86/asm', 'x86/cleanups' and 'x86/headers' into x86/core 2009-02-20 17:40:50 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Vegard Nossum
ecab22aa6d x86: use symbolic constants for MSR_IA32_MISC_ENABLE bits
Impact: Cleanup. No functional changes.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 12:07:43 +01:00
Ingo Molnar
07a66d7c53 x86: use the right protections for split-up pagetables
Steven Rostedt found a bug where, in his modified kernel,
ftrace was unable to modify the kernel text, due to the PMD
itself having been marked read-only as well in
split_large_page().

The fix, suggested by Linus, is to not try to 'clone' the
reference protection of a huge-page, but to use the standard
(and permissive) page protection bits of KERNPG_TABLE.

The 'cloning' makes sense for the ptes but it's a confused and
incorrect concept at the page table level - because the
pagetable entry is a set of all ptes and hence cannot
'clone' any single protection attribute - the ptes can be any
mixture of protections.

With the permissive KERNPG_TABLE, even if the pte protections
get changed after this point (due to ftrace doing code-patching
or other similar activities like kprobes), the resulting combined
protections will still be correct and the pte's restrictive
(or permissive) protections will control it.
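
To make the "combined protections" argument concrete, here is a small model, assuming that write access is granted only when both the page-table entry and the pte allow it. TABLE_RW and the function name are illustrative, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_RW 0x2u	/* illustrative writable flag, same bit at both levels */

/* Effective write permission is the AND of the two levels. */
static bool effective_writable(unsigned int table_flags, unsigned int pte_flags)
{
	return (table_flags & TABLE_RW) && (pte_flags & TABLE_RW);
}
```

With a read-only "cloned" table entry, a pte that is later made writable stays effectively read-only; with a permissive table entry, the pte alone decides.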

Also update the comment.

This bug was there for a long time but has not caused visible
problems before as it needs a rather large read-only area to
trigger. Steve possibly hacked his kernel with some really
large arrays or so. Anyway, the bug is definitely worth fixing.

[ Huang Ying also experienced problems in this area when writing
  the EFI code, but the real bug in split_large_page() was not
  realized back then. ]

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Huang Ying <ying.huang@intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 08:35:03 +01:00
Alok N Kataria
48ffc70b67 x86, vmi: TSC going backwards check in vmi clocksource
Impact: fix time warps under vmware

Similar to the check for TSC going backwards in the TSC clocksource,
we also need this check for VMI clocksource.
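
A sketch of such a check, assuming a clocksource that remembers the last cycle value it returned. The struct and function names are hypothetical, and in this simplified model the read function updates the remembered value itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clocksource state: the last cycle value handed out. */
struct clock_state {
	uint64_t cycle_last;
};

/* Never return a value older than the last one, so time cannot warp back. */
static uint64_t read_monotonic(struct clock_state *cs, uint64_t raw)
{
	if (raw < cs->cycle_last)
		return cs->cycle_last;
	cs->cycle_last = raw;
	return raw;
}
```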

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
2009-02-20 07:53:08 +01:00
Linus Torvalds
402a917aca Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5405/1: ep93xx: remove unused gesbc9312.h header
  [ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC
  [ARM] omap: fix clock reparenting in omap2_clk_set_parent()
  [ARM] 5403/1: pxa25x_ep_fifo_flush() *ep->reg_udccs always set to 0
  [ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()
  [ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
  [ARM] 5400/1: Add support for inverted rdy_busy pin for Atmel nand device controller
  [ARM] 5391/1: AT91: Enable GPIO clocks earlier
  [ARM] 5390/1: AT91: Watchdog fixes
  [ARM] 5398/1: Add Wan ZongShun to MAINTAINERS for W90P910
  [ARM] omap: fix _omap2_clksel_get_src_field()
  [ARM] omap: fix omap2_divisor_to_clksel() error return value
2009-02-19 09:52:12 -08:00
Ingo Molnar
e9ce0c37c2 Merge branch 'x86/untangle2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers 2009-02-19 18:15:01 +01:00
Linus Torvalds
bcf8951fc2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
  x86, mce: use force_sig_info to kill process in machine check
  x86, mce: reinitialize per cpu features on resume
  x86, rcu: fix strange load average and ksoftirqd behavior
2009-02-19 09:14:35 -08:00
Hartley Sweeten
9dd446f657 [ARM] 5405/1: ep93xx: remove unused gesbc9312.h header
Remove the gesbc9312.h header since it is unused.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 16:13:02 +00:00
Cyrill Gorcunov
cb425afd21 x86: compressed head_32 - use ENTRY,ENDPROC macros
Impact: cleanup

Linker script will put startup_32 at predefined
address so using ENTRY will not bloat the
code size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:01 +01:00
Cyrill Gorcunov
2d4eeecb98 x86: compressed head_64 - use ENTRY,ENDPROC macros
Impact: cleanup

Linker script will put startup_32 at predefined
address so using ENTRY will not bloat the code
size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:01 +01:00
Cyrill Gorcunov
324bda9e47 x86: pmjump - use GLOBAL,ENDPROC macros
Impact: cleanup

We are in setup stage so we use GLOBAL
instead of ENTRY and do not increase code
size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:00 +01:00
Cyrill Gorcunov
2f79555097 x86: copy.S - use GLOBAL,ENDPROC macros
Impact: cleanup

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:00 +01:00
Cyrill Gorcunov
1b25f3b4e1 x86: linkage - get rid of _X86 macros
Impact: cleanup

There was an attempt to bring build-time checking for
missed ENTRY_X86/END_X86 and KPROBE... pairs. Using
them would make the code messy, so just get rid of them.
This change can easily be reverted if the need arises
in the future.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:12:59 +01:00
Cyrill Gorcunov
95695547a7 x86: asm linkage - introduce GLOBAL macro
If the code is time critical and the entry is called
from other places, we use ENTRY to have it globally defined
and specially aligned.

Conversely, we have some snippets which are size
critical, so there we use a plain ".globl name; name:"
directive. Introduce a GLOBAL macro for this.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:12:59 +01:00
Makito SHIOKAWA
9da616fb99 [ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC
READ_IMPLIES_EXEC must be set when:
o binary _is_ an executable stack (i.e. not EXSTACK_DISABLE_X)
o processor architecture is _under_ ARMv6 (XN bit is supported from ARMv6)
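
The two conditions can be sketched as a predicate, assuming either one alone is sufficient. The enum mirrors the EXSTACK_* names used above; ARCH_ARMV6 and its numeric encoding are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Names follow the EXSTACK_* constants mentioned above; values are assumed. */
enum exstack { EXSTACK_DEFAULT, EXSTACK_DISABLE_X, EXSTACK_ENABLE_X };

#define ARCH_ARMV6 6	/* hypothetical encoding of the architecture version */

static bool read_implies_exec(enum exstack stack, int cpu_arch)
{
	/* The binary asks for (or does not rule out) an executable stack. */
	if (stack != EXSTACK_DISABLE_X)
		return true;
	/* No XN bit before ARMv6, so readable pages are executable anyway. */
	if (cpu_arch < ARCH_ARMV6)
		return true;
	return false;
}
```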

Signed-off-by: Makito SHIOKAWA <lkhmkt@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 14:45:27 +00:00
Heiko Carstens
23d75d9cad [S390] fix "mem=" handling in case of standby memory
Standby memory detected with the sclp interface always gets registered
with add_memory calls without considering the limitation that the
"mem=" kernel parameter implies.
So fix this and only register standby memory that is below the specified
limit.
This fixes zfcpdump, since it uses "mem=32M". If approx. 2GB of standby
memory is present, all usable memory would be used for the struct pages
needed for standby memory.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-19 15:19:19 +01:00
Christian Borntraeger
d5cd0343d2 [S390] Fix timeval regression on s390
commit aa5e97ce4b
[PATCH] improve precision of process accounting.

Introduced a timing regression:
-bash-3.2# time ls
real    0m0.006s
user    0m1.754s
sys     0m1.094s

The problem was introduced by an error in cputime_to_timeval.
Cputime is now 1/4096 microsecond, therefore, we have to divide
the remainder by 4096 to get the microseconds.
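
A worked version of the conversion, as a user-space sketch. The unit (1/4096 microsecond) comes from the text above; the struct, macro, and function names are made up for illustration and are not the s390 code.

```c
#include <assert.h>
#include <stdint.h>

#define CPUTIME_PER_USEC	4096ULL		/* one cputime unit is 1/4096 us */
#define USECS_PER_SEC		1000000ULL

/* Hypothetical stand-in for struct timeval. */
struct tv {
	uint64_t sec;
	uint64_t usec;
};

static struct tv cputime_to_tv(uint64_t cputime)
{
	const uint64_t per_sec = CPUTIME_PER_USEC * USECS_PER_SEC;
	struct tv out;

	out.sec = cputime / per_sec;
	/* The remainder is still in cputime units: divide by 4096. */
	out.usec = (cputime % per_sec) / CPUTIME_PER_USEC;
	return out;
}
```

Skipping the final division leaves the microsecond field 4096 times too large, which is consistent with the inflated user and sys times shown above.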

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-19 15:19:19 +01:00
Russell King
41f3103fcf [ARM] omap: fix clock reparenting in omap2_clk_set_parent()
When changing the parent of a clock, it is necessary to keep the
clock use counts balanced, otherwise the parent state will
get corrupted.  Since we already disable and re-enable the clock,
we might as well use the recursive versions instead.
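
The balancing requirement can be shown with a toy clock tree. This is a simplified model of the idea, not the omap2 code: all names are hypothetical, and the reparent helper simply moves the child's reference from the old parent to the new one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clock node with a use count. */
struct clk {
	struct clk *parent;
	int usecount;
};

/* Recursive enable: the first user of a clock also takes its parent. */
static void clk_enable_rec(struct clk *c)
{
	if (c == NULL)
		return;
	if (c->usecount++ == 0)
		clk_enable_rec(c->parent);
}

/* Recursive disable: the last user of a clock also releases its parent. */
static void clk_disable_rec(struct clk *c)
{
	if (c == NULL)
		return;
	if (--c->usecount == 0)
		clk_disable_rec(c->parent);
}

/* Reparent while keeping both parents' use counts balanced. */
static void clk_set_parent_balanced(struct clk *c, struct clk *new_parent)
{
	int enabled = c->usecount > 0;

	if (enabled)
		clk_disable_rec(c->parent);	/* drop ref on old parent */
	c->parent = new_parent;
	if (enabled)
		clk_enable_rec(c->parent);	/* take ref on new parent */
}
```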

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 13:25:16 +00:00
Hiroshi Shimamoto
71d8f9784a x86: syscalls.h: remove asmlinkage from declaration of sys_rt_sigreturn()
Impact: cleanup

asmlinkage for sys_rt_sigreturn() no longer exists in arch/x86/kernel/signal.c.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 12:18:54 +01:00
Nicolas Pitre
3fd9825c42 [ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()
In the non highmem case, if two memory banks of 1GB each are provided,
the second bank would evade suppression since its virtual base would
be 0.  Fix this by disallowing any memory bank whose virtual base
address is found to be lower than PAGE_OFFSET.
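
The wrap-around is easy to reproduce with 32-bit arithmetic. The sketch below assumes a 3G/1G split (PAGE_OFFSET of 0xC0000000) and RAM starting at physical address 0; both values and the helper names are assumptions for illustration, and the existing upper-bound check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET	0xC0000000u	/* assumed 3G/1G split */
#define PHYS_OFFSET	0x00000000u	/* assumed start of RAM */

/* Linear phys-to-virt mapping; the result wraps modulo 2^32. */
static uint32_t phys_to_virt32(uint32_t phys)
{
	return phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* The added rule: reject banks whose virtual base fell below PAGE_OFFSET. */
static bool bank_is_mappable(uint32_t bank_start)
{
	return phys_to_virt32(bank_start) >= PAGE_OFFSET;
}
```

Under these assumptions a second 1GB bank starting at physical 0x40000000 maps to virtual address 0, which is why it slipped past a check that only looked at the upper bound.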

Reported-by: Lennert Buytenhek <buytenh@marvell.com>

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 09:49:45 +00:00
Jaswinder Singh Rajput
de5483029b x86: include/asm/processor.h remove double declaration of print_cpu_info
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 10:12:18 +01:00
KAMEZAWA Hiroyuki
cc2559bccc mm: fix memmap init for handling memory hole
Now, early_pfn_in_nid(PFN, NID) may return false if PFN is a hole,
and in that case memmap initialization was not done. This caused trouble
for sparc boot.

To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() to return true if PFN is a hole.
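
A minimal sketch of the changed semantics, assuming a lookup that returns -1 for a hole. The lookup table here is invented for the example; only the "a hole belongs to every node" rule reflects the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lookup: node id for a PFN, or -1 if the PFN is in a hole. */
static int lookup_nid(unsigned long pfn)
{
	if (pfn < 0x100)
		return 0;
	if (pfn >= 0x200 && pfn < 0x300)
		return 1;
	return -1;
}

/* Only a PFN that positively belongs to another node is rejected. */
static bool pfn_in_nid(unsigned long pfn, int node)
{
	int nid = lookup_nid(pfn);

	if (nid >= 0 && nid != node)
		return false;
	return true;	/* matching node, or a hole that still needs memmap init */
}
```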

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
f2dbcfa738 mm: clean up for early_pfn_to_nid()
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:

	BUG_ON(page_zone(start_page) != page_zone(end_page));

Once I knew this is what was happening, I added some annotations:

	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
		printk(KERN_ERR "move_freepages: Bogus zones: "
		       "start_page[%p] end_page[%p] zone[%p]\n",
		       start_page, end_page, zone);
		printk(KERN_ERR "move_freepages: "
		       "start_zone[%p] end_zone[%p]\n",
		       page_zone(start_page), page_zone(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
		       page_to_pfn(start_page), page_to_pfn(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_nid[%d] end_nid[%d]\n",
		       page_to_nid(start_page), page_to_nid(end_page));
 ...

And here's what I got:

	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
	move_freepages: start_nid[1] end_nid[0]

My memory layout on this box is:

[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff5d
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081fe50
[    0.000000]     1: 0x0081fed1 -> 0x0081fed8
[    0.000000]     1: 0x0081feda -> 0x0081fedb
[    0.000000]     1: 0x0081fedd -> 0x0081fee5
[    0.000000]     1: 0x0081fee7 -> 0x0081ff51
[    0.000000]     1: 0x0081ff59 -> 0x0081ff5d

So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.
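
The layout above can be checked mechanically. The sketch below copies the eight early_node_map ranges and assumes each one is half-open, [start, end), as the round 0x00020000 end of the first entry suggests. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One active PFN range, assumed half-open: start <= pfn < end. */
struct pfn_range {
	int nid;
	unsigned long start;
	unsigned long end;
};

/* Ranges copied from the early_node_map[] dump above. */
static const struct pfn_range node_map[] = {
	{ 0, 0x00000000, 0x00020000 },
	{ 1, 0x00800000, 0x0081f7ff },
	{ 1, 0x0081f800, 0x0081fe50 },
	{ 1, 0x0081fed1, 0x0081fed8 },
	{ 1, 0x0081feda, 0x0081fedb },
	{ 1, 0x0081fedd, 0x0081fee5 },
	{ 1, 0x0081fee7, 0x0081ff51 },
	{ 1, 0x0081ff59, 0x0081ff5d },
};

/* Returns the node id for a PFN, or -1 if it falls in a hole. */
static int pfn_to_nid(unsigned long pfn)
{
	size_t i;

	for (i = 0; i < sizeof(node_map) / sizeof(node_map[0]); i++)
		if (pfn >= node_map[i].start && pfn < node_map[i].end)
			return node_map[i].nid;
	return -1;
}
```

Under that assumption the start PFN 0x81f600 belongs to node 1, while the end PFN 0x81f7ff sits in a one-page hole between two node 1 ranges, which would explain why its struct page reports a different node and zone.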

This patch:

Declarations of early_pfn_to_nid() are scattered over per-arch include
files, and it is hard to tell which declaration is used when.
I think this makes the memmap-init fix harder.

This patch moves all declarations to include/linux/mm.h

After this,
  if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use static definition in include/linux/mm.h
  else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use generic definition in mm/page_alloc.c
  else
     -> per-arch back end function will be called.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Andi Kleen
07db1c140e x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
Impact: Bugfix

The ifdef for the apic clear on shutdown for the 64bit intel thermal
vector was incorrect and never triggered. Fix that.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:34 -08:00
Andi Kleen
380851bc6b x86, mce: use force_sig_info to kill process in machine check
Impact: bug fix (with tolerant == 3)

do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.

Based on an earlier patch by Ying Huang who debugged the problem.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:31 -08:00
Andi Kleen
6ec68bff3c x86, mce: reinitialize per cpu features on resume
Impact: Bug fix

This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.

Call the respective initialization functions on resume.

v2: Remove ancient init because they don't have a resume device anyways.
    Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:28 -08:00
Nicolas Pitre
fd4b9b3650 [ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
The GPIO interrupts can be configured as either level triggered or edge
triggered, with a default of level triggered.  When an edge triggered
interrupt is requested, the gpio_irq_set_type method is called which
currently switches the given IRQ descriptor between two struct irq_chip
instances: orion_gpio_irq_level_chip and orion_gpio_irq_edge_chip. This
happens via __setup_irq() which also calls irq_chip_set_defaults() to
assign default methods to uninitialized ones.  The problem is that
irq_chip_set_defaults() is called before the irq_chip reference is
switched, leaving the new irq_chip (orion_gpio_irq_edge_chip in this
case) with uninitialized methods such as chip->startup() causing a kernel
oops.

Many solutions are possible, such as making irq_chip_set_defaults() global
and calling it from gpio_irq_set_type(), or calling __irq_set_trigger()
before irq_chip_set_defaults() in __setup_irq().  But those require
modifications to the generic IRQ code which might have adverse effect on
other architectures, and that would still be a fragile arrangement.
Manually copying the missing methods from within gpio_irq_set_type()
would be really ugly and it would break again the day new methods with
automatic defaults are added.

A better solution is to have a single irq_chip instance which can deal
with both edge and level triggered interrupts.  It is also a good idea
to switch the IRQ handler instead, as the edge IRQ handler allows for
one edge IRQ event to be queued as the IRQ is actually masked only when
that second IRQ is received, at which point the hardware can queue an
additional IRQ event, making edge triggered interrupts a bit more
reliable.

Tested-by: Martin Michlmayr <tbm@cyrius.com>

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-17 22:37:09 +00:00
Linus Torvalds
f8effd1a4a Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  doc: mmiotrace.txt, buffer size control change
  trace: mmiotrace to the tracer menu in Kconfig
  mmiotrace: count events lost due to not recording
2009-02-17 14:29:15 -08:00
Linus Torvalds
35010334aa Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vm86: fix preemption bug
  x86, olpc: fix model detection without OFW
  x86, hpet: fix for LS21 + HPET = boot hang
  x86: CPA avoid repeated lazy mmu flush
  x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
  x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
  x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
  x86/cpa: make sure cpa is safe to call in lazy mmu mode
  x86, ptrace, mm: fix double-free on race
2009-02-17 14:27:39 -08:00
Linus Torvalds
b30b774930 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/vsx: Fix VSX alignment handler for regs 32-63
  powerpc/ps3: Move ps3_mm_add_memory to device_initcall
  powerpc/mm: Fix numa reserve bootmem page selection
  powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL
2009-02-17 14:23:49 -08:00
Ingo Molnar
9be1b56a3e x86, apic: separate 32-bit setup functionality out of apic_32.c
Impact: build fix, cleanup

A couple of arch setup callbacks were mistakenly in apic_32.c, breaking
the build.

Also simplify the code a bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 23:12:48 +01:00
Linus Torvalds
39a65762d4 Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: Flush volatile msrs before emulating rdmsr
  KVM: Fix assigned devices circular locking dependency
  KVM: x86: fix LAPIC pending count calculation
  KVM: Fix INTx for device assignment
  KVM: MMU: Map device MMIO as UC in EPT
  KVM: x86: disable kvmclock on non constant TSC hosts
  KVM: PIT: fix i8254 pending count read
  KVM: Fix racy in kvm_free_assigned_irq
  KVM: Add kvm_arch_sync_events to sync with asynchronize events
  KVM: mmu_notifiers release method
  KVM: Avoid using CONFIG_ in userspace visible headers
  KVM: ia64: fix fp fault/trap handler
2009-02-17 14:04:32 -08:00
Paul E. McKenney
bf51935f3e x86, rcu: fix strange load average and ksoftirqd behavior
Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

This shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.
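
The arithmetic can be reproduced from the first and last timestamps in the trace. The helper name is made up, and it divides the window by the number of calls, as the text does.

```c
#include <assert.h>

/* Average spacing in microseconds of `calls` events within a time window. */
static double avg_interval_us(double first_s, double last_s, int calls)
{
	return (last_s - first_s) * 1e6 / calls;
}
```

For the trace above, avg_interval_us(799.521187, 799.524770, 22) comes out a little over 160 microseconds.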

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.

Reported-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 22:47:45 +01:00
H. Peter Anvin
a7eb518998 x86: truncate ISA addresses to unsigned int
Impact: Cleanup; fix inappropriate macro use

ISA addresses on x86 are mapped 1:1 with the physical address space.
Since the ISA address space is only 24 bits (32 for VLB or LPC) it
will always fit in an unsigned int, and at least in the aha1542 driver
using a wider type would cause an undesirable promotion.  Hence
explicitly cast the ISA bus addresses to unsigned int.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
2009-02-17 13:01:51 -08:00
Ingo Molnar
2a05180fe2 x86, apic: move remaining APIC drivers to arch/x86/kernel/apic/*
Move the 32-bit extended-arch APIC drivers to arch/x86/kernel/apic/
too, and rename apic_64.c to probe_64.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 20:35:47 +01:00
Ingo Molnar
f62bae5009 x86, apic: move APIC drivers to arch/x86/kernel/apic/*
arch/x86/kernel/ is getting a bit crowded, and the APIC
drivers are scattered into various different files.

Move them to arch/x86/kernel/apic/*, and also remove
the 'gen' prefix from those which had it.

Also move APIC related functionality: the IO-APIC driver,
the NMI and the IPI code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 18:17:36 +01:00
Ingo Molnar
be163a159b x86, apic: rename 'genapic' to 'apic'
Impact: cleanup

Now that all APIC code is consolidated there's nothing 'gen' about
apics anymore - so rename 'struct genapic' to 'struct apic'.

This shortens the code and is nicer to read as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:57 +01:00
Ingo Molnar
ab6fb7c0b0 x86, apic: remove ->store_NMI_vector()
Impact: cleanup

It's not used by anything anymore.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:56 +01:00
Ingo Molnar
cb81eaedf1 x86, numaq_32: clean up, misc
Impact: cleanup

 - misc other cleanups that change the md5 signature
 - consolidate global variables
 - remove unnecessary __numaq_mps_oem_check() wrapper
 - make numaq_mps_oem_check static
 - update copyrights
 - misc other cleanups pointed out by checkpatch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:54 +01:00
Ingo Molnar
36afc3af04 x86, numaq_32: clean up
Impact: cleanup

- refactor smp_dump_qct()
- tidy up include files, remove duplicates
- misc other cleanups, pointed out by checkpatch

No code changed:

md5:
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.before.asm
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:51 +01:00
Ingo Molnar
7da18ed924 x86, es7000: misc cleanups
These are cleanups that change the md5 signature:

 - asm/ => linux/ include conversion
 - simplify the code flow of find_unisys_acpi_oem_table()
 - move ACPI methods into one #ifdef block
 - remove 0/NULL initialization of statics
 - simplify/standardize printouts
 - update copyrights
 - more cleanups, pointed out by checkpatch

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2693	    192	     44	   2929	    b71	es7000_32.o.before
   2688	    192	     44	   2924	    b6c	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:50 +01:00
Ingo Molnar
352887d1c9 x86, es7000: remove dead code, clean up
Impact: cleanup

 - a number of structure definitions were stale
 - remove needless wrappers around apic definitions
 - fix details noticed by checkpatch

No code changed:

md5:
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.before.asm
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:49 +01:00
Ingo Molnar
d3185b37df x86, es7000: remove externs
Impact: cleanup

In the subarch times there were a number of externs between
various bits of the ES7000 code. Now that there's a single
es7000-platform support file, the externs can be removed and
the functions can be changed to statics.

Beyond the cleanup factor, this also shrinks the size of the
kernel image a bit:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2693	    192	     44	   2929	    b71	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:48 +01:00
Ingo Molnar
b9e0d1aa97 x86, apic: remove apicid_cluster()
There were multiple definitions of apicid_cluster() scattered around
in APIC drivers - but the definitions are equivalent to the already
existing generic APIC_CLUSTER() method.

So remove apicid_cluster() and change all users to APIC_CLUSTER().
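
A sketch of the claimed equivalence. The macro definitions and the body of apicid_cluster() are assumptions about what the helpers looked like (cluster in the high nibble of an 8-bit APIC ID); the loop simply checks that the two forms agree.

```c
#include <assert.h>

/* Assumed xAPIC layout: the low 4 bits select the CPU within a cluster. */
#define XAPIC_DEST_CPUS_SHIFT	4
#define XAPIC_DEST_CPUS_MASK	((1u << XAPIC_DEST_CPUS_SHIFT) - 1)
#define XAPIC_DEST_CLUSTER_MASK	(XAPIC_DEST_CPUS_MASK << XAPIC_DEST_CPUS_SHIFT)
#define APIC_CLUSTER(apicid)	((apicid) & XAPIC_DEST_CLUSTER_MASK)

/* Assumed shape of one of the removed per-driver helpers. */
static unsigned int apicid_cluster(unsigned int apicid)
{
	return apicid & 0xF0;
}

/* Returns 1 if both forms agree for every 8-bit APIC ID. */
static int cluster_helpers_agree(void)
{
	unsigned int id;

	for (id = 0; id < 256; id++)
		if (apicid_cluster(id) != APIC_CLUSTER(id))
			return 0;
	return 1;
}
```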

No code changed:

md5:
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.before.asm
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.after.asm

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:47 +01:00
Ingo Molnar
2c4ce18c95 x86, es7000: clean up
No code changed:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2813	    192	     44	   3049	    be9	es7000_32.o.after

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00
Ingo Molnar
2f205bc47f x86, apic: clean up the cpu_2_logical_apicid declaration
extern declarations were scattered in 4 files - consolidate them
into apic.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00