In particular, allow over-large read- or write-requests to be downgraded
to a more reasonable range, rather than considering them outright errors.
We want to protect lower layers from (the sadly all too common) overflow
conditions, but prefer to do so by chopping the requests up, rather than
just refusing them outright.
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The first of these changes s/hotplug/uevent/ was needed to
compile sn2_defconfig (ia64/sn). The other three files
changed are blind changes of all remaining bus_type.hotplug
references I could find to bus_type.uevent.
This patch attempts to finish similar changes made in the
gregkh-driver-kill-hotplug-word-from-driver-core Nov 22 patch.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
To allow multiple platforms to use the PXA27x OHCI driver, the platform
code needs to be moved into the board specific files in
arch/arm/mach-pxa. This patch does this for mainstone and adds
preliminary hooks to allow other boards to use the driver.
This has been compile tested for mainstone and successfully run on Spitz
(Sharp Zaurus SL-C3000) with the addition of an appropriate board
support file.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The function ia64_pci_legacy_write() returns 0 for everything
except errors. This return value gets sent back to the user from
pci_write_legacy_io(), making it look like every write fails. The trivial
patch below copies the behavior of the SGI sn machvec and does what
would be expected from something implementing a write() function.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Sonny has noticed hotplug CPU on ppc64 is broken in 2.6.15-*. One of the
problems is that htab_initialize_secondary is called when a cpu is being
brought up, but it is marked __init.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently, we do not pass the correct start_pfn to e820_hole_size, to
calculate holes. Following patch fixes that.
The bug results in incorrect number of node_present_pages for each pgdat
and causes ugly output in /sys and probably VM inbalances.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Sighed-off-by: Shair Fultheim <shai@scalex86.org>
Sighed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix UML compilation when SKAS mode is disabled. Indeed, we were compiling
SKAS-only object files, which failed due to some SKAS-only headers being
excluded from the search path.
Thanks to the bug report from Pekka J Enberg.
Acked-by: Pekka J Enberg <penberg (at) cs ! helsinki ! fi>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Today, when compiling UML, I got warnings for two used unexported symbols:
readdir64 and truncate64. Indeed, my glibc headers are aliasing readdir to
readdir64 and truncate to truncate64 (and so on).
I'm then adding additional exports. Since I've no idea if the symbols where
always provided in the supported glibc's, I've added weak definitions too.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Don't use printk() where "current_thread_info()" is crap.
Until when we switch to running on init_stack, current_thread_info() evaluates
to crap. Printk uses "current" at times (in detail, ¤t is evaluated with
CONFIG_DEBUG_SPINLOCK to check the spinlock owner task).
And this leads to random segmentation faults.
Exactly, what happens is that ¤t = *(current_thread_info()), i.e. round
down $esp and dereference the value. I.e. access the stack below $esp, which
causes SIGSEGV on a VM_GROWSDOWN vma (see arch/i386/mm/fault.c).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's definition is wrong (-1 means "no limit" not 999),
only the Sparc SunOS/Solaris compat code uses it, so
let's just kill it off completely from limits.h and
all referencing code.
Noticed by Ulrich Drepper.
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a Kconfig symbol SPARC that is defined on both the sparc and
sparc64 architectures.
This symbol makes some dependencies more readable.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It turns out that commit f9bd170a87
broke the cascade from XICS to i8259 on pSeries machines; specifically
we ended up not ever doing the EOI on the XICS for the cascade. The
result was that interrupts from the serial ports (and presumably any
other devices using ISA interrupts) didn't get through. This fixes
it and also simplifies the code, by doing the EOI on the XICS in the
xics_get_irq routine after reading and acking the interrupt on the
i8259.
Signed-off-by: Paul Mackerras <paulus@samba.org>
It was a stupid workaround for the "static inline" vs.
"extern inline" issues of long ago, and it is what causes
schedule() to be inlined like crazy into kernel/sched.c
when -Os is specified.
MIPS and S390 should probably do the same.
Now CC_OPTIMIZE_FOR_SIZE can be safely used on sparc64
once more.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now needs to include the type 1 functions ("direct") too.
Reported by Pavel Roskin <proski@gnu.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The CPM2 interrupt handler does not return success to the IRQ subsystem, which
causes it to kill the IRQ line after 100,000 interrupts.
Signed-off-by: Edson Seabra <Edson.Seabra@cyclades.com>
Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since we don't restore the volatile registers in the syscall exit
path, we need to make sure we don't leak any potentially interesting
values from the kernel to userspace. This was already the case for
all except r11. This makes it use r11 for an MSR value, so r11 will
have an (uninteresting) MSR value in it on return to userspace.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Patch from Nicolas Pitre
Strictly speaking, the NPTL kernel helpers are required for pre ARMv6
only. They are available on ARMv6+ as well for obvious compatibility
reasons. However there are cases where extra memory barriers are needed
when using an SMP ARMv6 machine but not on pre-ARMv6.
This patch adds a memory barrier kernel helper that glibc can use as
needed for pre-ARMv6 binaries to be forward compatible with an SMP
kernel on ARMv6, as well as the necessary dmb instructions to the
cmpxchg helper.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
UML skas0 stub has been miscompiling for many people (incidentally not
the authors), depending on the used GCC versions.
I think (and testing on some GCC versions shows) this patch avoids the
fundamental issue which is behind this, namely gcc using the stack when
we have just replaced it, behind gcc's back. The remapping and storage
of the return value is hidden in a blob of asm, hopefully giving gcc no
room for creativity.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
So you may have seen the miniconfig stuff wander by, which means that my
build script exits if there's a .config error, and we have this:
fs/Kconfig:1749:warning: 'select' used by config symbol 'CIFS_UPCALL'
refer to undefined symbol 'CONNECTOR'
This makes it shut up.
Signed-off-by: Rob Landley <rob@landley.net>
[ Verified it makes sense. ]
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
The current UML build assumes that on x86-64 systems, /lib is a symlink
to /lib64, but in some distributions (like PLD and CentOS) they are
separate directories, so the 64 bit library loader isn't found. This
patch inserts /lib64 at the start of the rpath on x86-64 UML builds.
Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Duplicated code - the patch adding it was probably applied twice without
enough care.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rather than providing more wrappers for 6-arg syscalls, arrange for
them to be supported as standard. This just means that we always
store the 6th argument on the stack, rather than in the wrappers.
This means we eliminate the wrappers for:
* sys_futex
* sys_arm_fadvise64_64
* sys_mbind
* sys_ipc
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
DMA_MODE_{READ,WRITE} are declared in asm-powerpc/dma.h and their
declarations there match the definitions. Old declarations in
ppc4xx_dma.h are not right anymore (wrong type, to start with).
Killed them, added include of asm/dma.h where needed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use correct address when referencing mmconfig aperture while checking
for broken MCFG. This was a typo when porting the code from 64bit to
32bit. It caused oopses at boot on some ThinkPads.
Should definitely go into 2.6.15.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sparc64, i386 and x86_64 have support for a special data section dedicated
to rarely updated data that is frequently read. The section was created to
avoid false sharing of those rarely read data with frequently written kernel
data.
This patch creates such a data section for ia64 and will group rarely written
data into this section.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Change the NR_CPUS default for ia64/sn up to 1024.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: John Hesterberg <jh@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
I see why the problem exists only on SN. SN uses a different hardware
mechanism to purge TLB entries across nodes.
It looks like there is a bug in the SN TLB flushing code. During context switch,
kernel threads inherit the mm of the task that was previously running on the
cpu. This confuses the code in sn2_global_tlb_purge().
The result is a missed TLB purge for the task that owns the "borrowed" mm.
(I hit the problem running heavy stress where kswapd was purging code pages of
a user task that woke kswapd. The user task took a SIGILL fault trying to
execute code in the page that had been ripped out from underneath it).
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use raw_smp_processor_id() instead of get_cpu() as we don't need the
extra features of get_cpu().
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The udelay() inline for ia64 uses the ITC. If CONFIG_PREEMPT is enabled
and the platform has unsynchronized ITCs and the calling task migrates
to another CPU while doing the udelay loop, then the effective delay may
be too short or very, very long.
This patch disables preemption around 100 usec chunks of the overall
desired udelay time. This minimizes preemption-holdoffs.
udelay() is now too big to be inline, move it out of line and export it.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch from Daniel Jacobowitz
Handle new EABI relocations when loading kernel modules. This is
necessary for CONFIG_AEABI kernels, and also for some broken
(since fixed) old ABI toolchains.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As reported by Keith Mannthey, there are problems in populate_memnodemap()
The bug was that the compute_hash_shift() was returning 31, with incorrect
initialization of memnodemap[]
To correct the bug, we must use (1UL << shift) instead of (1 << shift) to
avoid an integer overflow, and we must check that shift < 64 to avoid an
infinite loop.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On systems that do not support the HPET legacy functions (basically the IBM
x460, but there could be others), in time_init() we accidentally fall into a
PM timer conditional and set the vxtime_hz value to the PM timer's frequency.
We then use this value with the HPET for timekeeping.
This patch (which mimics the behavior in time_init_gtod) corrects the
collision.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a register set is passed in don't try to fix up the pointer.
Noticed by Al Viro
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They report all busses as MMCONFIG capable, but it never works for the
internal devices in the CPU's builtin northbridge.
It just probes all func 0 devices on bus 0 (the internal northbridge is
currently always on bus 0) and if they are not accessible using MCFG they are
put into a special fallback bitmap.
On systems where it isn't we assume the BIOS vendor supplied correct MCFG.
Requires the earlier patch for mmconfig type1 fallback
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When there is no entry for a bus in MCFG fall back to type1. This is
especially important on K8 systems where always some devices can't be accessed
using mmconfig (in particular the builtin northbridge doesn't support it for
its own devices)
Cc: <gregkh@suse.de>
Cc: <jgarzik@pobox.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's illegal because it can sleep.
Use a two step lookup scheme instead. First look up the vm_struct, then
change the direct mapping, then finally unmap it. That's ok because nobody
can change the particular virtual address range as long as the vm_struct is
still in the global list.
Also added some LinuxDoc documentation to iounmap.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Disabling LAPIC timer isn't sufficient. In some situations, such as we
enabled NMI watchdog, there is still unexpected interrupt (such as NMI)
invoked in offline CPU. This also avoids offline CPU receives spurious
interrupt and anything similar.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Otherwise TSC->HPET fallback could see incorrect state and crash later.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With this fix, sparc links vmlinuz again using crosstool. Without this
fix, the final link fails missing several dozen dozen symbols, beginning
with:
kernel/built-in.o(.text+0x6fd0): In function `do_exit':
: undefined reference to `exit_io_context'
(exit_io_context is defined in block/ll_rw_blk.c).
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes one build error introduced in sparc with the patch of Oct 30,
resent Nov 4 "[patch 3/5] atomic: atomic_inc_not_zero" I still can't get
sparc to build, but at least it gets further after I remove this line.
Apparently, this change was agreed to by Andrew and Nick on Nov 14, but
everyone thought someone else was doing it.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I realized ZONE_DMA32 has a trivial bug at Kconfig for ia64. In
include/linux/gfp.h on 2.6.15-rc5-mm1, CONFIG is define like followings.
#ifdef CONFIG_DMA_IS_DMA32
#define __GFP_DMA32 ((__force gfp_t)0x01) /* ZONE_DMA is ZONE_DMA32
*/
:
:
So, CONFIG_"ZONE"_DMA_IS_DMA32 is clearly wrong.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When multiple probes are registered at the same address and if due to some
recursion (probe getting triggered within a probe handler), we skip calling
pre_handlers and just increment nmissed field.
The below patch make sure it walks the list for multiple probes case.
Without the below patch we get incorrect results of nmissed count for
multiple probe case.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Page count should be initialized to 1 on each of the MIPS empty zero pages,
to avoid a bad_page warning whenever one of them is freed from all mappings.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch/um/kernel/tt/uaccess.c: In function `copy_from_user_tt':
arch/um/kernel/tt/uaccess.c:11: error: `FIXADDR_USER_START' undeclared (first use in this function)
arch/um/kernel/tt/uaccess.c:11: error: (Each undeclared identifier is reported only once
arch/um/kernel/tt/uaccess.c:11: error: for each function it appears in.)
I get the compile error when I disable CONFIG_MODE_SKAS.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Paolo Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CPU hotplug enabled, NMI watchdog stoped working. It appears the
violation is the cpu_online check in nmi handler. local ACPI based NMI
watchdog is initialized before we set CPU online for APs. It's quite
possible a NMI is fired before we set CPU online, and that's what happens
here.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Lothar Wassmann
The patch makes sure, that the ouptut functions of pins are restored
before restoring the Alternat Function settings, preventing pins from
being intermediately configured for undefined or unwanted alternate
functions.
Here is the original comment:
I've got a PXA270 system that uses GPIO80 as nCS4. This system did
hang on resume. Digging into the problem I found that the processor
stalled immediately when restoring the GAFR2_U register which restored
the alternate function for GPIO80. Since the GPDR registers were
restored after the GAFR registers, the offending GPIO was configured
as input at this point.
Thus the alternate function that was in effect after restoring the
GAFR was in fact the input function "MBREQ" instead of the output
function "nCS4". The "PXA27x Processor Family Developer's Manual"
(Footnote in Table 6-1 on page 6-3) states that:
"The MBREQ alternate function must not be enabled until the PSSR[RDH]
bit field is cleared. For more details, see Table 3-15, "PSSR Bit
Definitions" on page 3-71."
There is another note in the Developer's Manual (chapter 24.4.2
"GPIO operation as Alternate Function" on page 24-4)
stating that:
"Configuring a GPIO for an alternate function that is not defined for
it causes unpredictable results."
Since some GPIOs have no input function defined, and to prevent
inadvertedly programming the MBREQ function on some pin, the GAFR
registers should be restored after the GPDR registers have been
restored.
Additional provisions have to be made when the MBREQ function is
actually required. The corresponding GAFR bits should not be restored
with the regular GAFR restore, but must be set only after the PSSR
bits have been cleared.
Signed-off-by: Lothar Wassmann <LW@KARO-electronics.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ppc32 kernel, when built with CONFIG_SMP and booted on a single CPU
machine, will not properly set smp_tb_synchronized, thus causing
gettimeofday() to not use the HW timebase and to be limited to jiffy
resolution. This, among others, causes unacceptable pauses when launching
X.org.
Signed-Off-By: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The code that sets the clock spreading feature of the Intrepid ASIC
must not be run on some machine models or those won't boot. This
fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Patch from Nikola Valerjev
Single stepping an application using ptrace() fails over ARM instructions BX and BLX.
Steps to reproduce:
Compile and link the following files
main.c
-----
void foo();
int main() {
foo();
return 0;
}
foo.s
-----
.text
.globl foo
foo:
BX LR
Using ptrace() functionality, run to main(), and start singlestepping.
Singlestep over \"BX LR\" instruction won\'t transfer the control back
to main, but run the code to completion.
This problems seems to be in the function get_branch_address() in
arch/arm/kernel/ptrace.c. The function doesn\'t seem to recognize BX
and BLX instructions as branches. BX and BLX instructions can be used
to convert from ARM to Thumb mode if the target address has the low
bit set. However, they are also perfectly legal in the ARM only mode.
Although other things in the kernel seem to indicate that only ARM
mode is accepted (and not Thumb), many compilers will generate BX
and BLX instructions even when generating ARM only code.
Signed-off-by: Nikola Valerjev <nikola@ghs.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ppc64, when opening a new hugepage region, we need to make sure any
old normal-page SLBs for the area are flushed on all CPUs. There was
a bug in this logic - after putting the new hugepage area masks into
the thread structure, we copied it into the paca (read by the SLB miss
handler) only on one CPU, not on all. This could cause incorrect SLB
entries to be loaded when a multithreaded program was running
simultaneously on several CPUs. This patch corrects the error,
copying the context information into the PACA on all CPUs using the mm
in question before flushing any existing SLB entries.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On most powerpc CPUs, the dcache and icache are not coherent so
between writing and executing a page, the caches must be flushed.
Userspace programs assume pages given to them by the kernel are icache
clean, so we must do this flush between the kernel clearing a page and
it being mapped into userspace for execute. We were not doing this
for hugepages, this patch corrects the situation.
We use the same lazy mechanism as we use for normal pages, delaying
the flush until userspace actually attempts to execute from the page
in question.
Tested on G5.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cache info is setup by walking the device tree in initialize_cache_info().
However, icache_flush_range might be called before that, in
slb_initialize()->patch_slb_encoding, which modifies the load immediate
instructions used with SLB fault code.
Not only that, but depending on memory layout, we might take SLB faults
during unflatten_device_tree. So that fault will load an SLB entry that
might not contain the right LLP flags for the segment.
Either we can walk the flattened device tree to setup cache info, or
we can pick the known defaults that are known to work. Doing it in the
flattened device tree is hairier since we need to know the machine type
to know what property to look for, etc, etc.
For now, it's just easier to go with the defaults. Worst thing that
happens from it is that we might waste a few cycles doing too small
dcbst/icbi increments.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some debug code wasn't properly removed from the initial 64k pages
patch, and while it's harmless, it's also slowing down significantly a
very hot code path, thus it should really be removed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 64k pages patch changed the meaning of one argument passed to the
low level hash functions (from "large" it became "psize" or page size
index), but one of the call sites wasn't properly updated, causing
potential random weird problems with huge pages. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This bug exists in the current code and prevents machines from booting
with numa enabled if there is a node that does not contain memory.
Workaround is to boot with 'numa=off'. Looks like a simple typo.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
What is the value shown in "cpu MHz" of /proc/cpuinfo when CPUs are capable of
changing frequency?
Today the answer is: It depends.
On i386:
SMP kernel - It is always the boot frequency
UP kernel - Scales with the frequency change and shows that was last set.
On x86_64:
There is one single variable cpu_khz that gets written by all the CPUs. So,
the frequency set by last CPU will be seen on /proc/cpuinfo of all the
CPUs in the system. What you see also depends on whether you have constant_tsc
capable CPU or not.
On ia64:
It is always boot time frequency of a particular CPU that gets displayed.
The patch below changes this to:
Show the last known frequency of the particular CPU, when cpufreq is present. If
cpu doesnot support changing of frequency through cpufreq, then boot frequency
will be shown. The patch affects i386, x86_64 and ia64 architectures.
Signed-off-by: Venkatesh Pallipadi<venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch moves away PMBASE reading and only performs it at
cpufreq_register_driver time by exiting with -ENODEV if unable to read
the value.
Signed-off-by: Mattia Dongili <malattia@linux.it>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
The attached patch introduces runtime latency measurement for ICH[234]
based chipsets instead of using CPUFREQ_ETERNAL. It includes
some sanity checks in case the measured value is out of range and
assigns a safe value of 500uSec that should still be enough on
problematics chipsets (current testing report values ~200uSec). The
measurement is currently done in speedstep_get_freqs in order to avoid
further unnecessary transitions and in the hope it'll come handy for SMI
also.
Signed-off-by: Mattia Dongili <malattia@linux.it>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
speedstep-ich.c | 4 ++--
speedstep-lib.c | 32 +++++++++++++++++++++++++++++++-
speedstep-lib.h | 1 +
speedstep-smi.c | 1 +
4 files changed, 35 insertions(+), 3 deletions(-)
If a user has booted with 'quiet', some important messages don't
get displayed which really should. We've seen at least one case
where powernow-k8 stopped working, and the user needed a BIOS update
that they didn't know about.
Signed-off-by: Dave Jones <davej@redhat.com>
The patch that added support for a new platform chipset (shub2) broke
PTC deadlock recovery on older versions of the chipset. (PTCs are the
SN platform-specific method for doing a global TLB purge). This
patch fixes deadlock recovery so that it works on both the old & new
chipsets.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
We have a customer application which trips a bug. The problem arises
when a driver attempts to call do_munmap on an area which is mapped, but
because current->thread.task_size has been set to 0xC0000000, the call
to do_munmap fails thinking it is an unmap beyond the user's address
space.
The comment in fs/binfmt_elf.c in load_elf_library() before the call
to SET_PERSONALITY() indicates that task_size must not be changed for
the running application until flush_thread, but is for ia64 executing
ia32 binaries.
This patch moves the setting of task_size from SET_PERSONALITY() to
flush_thread() as indicated. The customer application no longer is able
to trip the bug.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The per-node data structures are allocated with strided offsets that are a
function of the node number. This prevents excessive cache-aliasing from
occurring.
On systems with a large number of nodes, the strided offset becomes
too large. This patch restricts the maximum offset to 32MB. This is far larger
than the size of any current L3 cache.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix only patch to add fixup code that sets up
pci_controller->window. This code is a temporary
fix until ACPI support on Altix is added.
Also, corrects the usage of pci_dev->sysdata,
which had previously been used to reference
platform specific device info, to now point to
a pci_controller struct.
Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Return -EINTR instead of -ERESTARTSYS when signals are delivered during
a blocked read of /proc/sal/*/event. This allows salinfo_decode to
detect signals when it is blocked on a read of those files.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There's never been a hardware platform that has both pSeries/RPA LPAR
hypervisor and stab (pre-POWER4 segment management). This removes
the redundant code in stab_initalize().
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reverts commit da0825fd20, making
it so that if you select CONFIG_PPC_MULTIPLATFORM you get support
for PMAC, PREP and CHRP built in.
The reason for not allowing PMAC, PREP and CHRP to be selected
individually for ARCH=ppc is that there is too much interdependency
between them in the platform support code. For example, CHRP uses
the PMAC nvram code.
Configuring with ARCH=powerpc does allow you to select support for
PMAC and CHRP separately. Support for PREP is not there yet but
should be there soon.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The previous commit will use the page-at-a-time hypervisor call for
setting up IOMMU entries when we are using 64k pages and setting up
one 64k page, even though that means 16 calls to the hypervisor, since
the hypervisor still works on 4k pages. This optimizes this case by
using the multi-page IOMMU setup hypervisor call instead.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This updates the sn2_defconfig file for the Altix 330 hardware, enables
the AGP graphics for the SGI Prism, and removes prompts for the remainder
of the new features. Greg Edwards reviewed the changes.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Must adjust tcenum and npages by TCE_PAGE_FACTOR to convert between
64KB pages and TCE (4K) pages. (This is done in other places, except
for this one location.)
Signed-off-by: Michal Ostrowski <mostrows at watson ibm com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Correctly specify treeboot based image entrypoint. Currently makefile uses
$(ENTRYPOINT) which isn't defined anywhere. Each board port sets
entrypoint-$(CONFIG_BOARD_NAME) instead.
Without this patch I cannot boot Ocotea (PPC440GX eval board) anymore. I
was getting random "OS panic" errors from OpenBIOS for a while, but with
current kernel I get them all the time (probably because image became
bigger).
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The time to wait after deasserting PCI_RST has been counted with incorrect
value - this patch fixes the issue.
Signed-off-by: Vitaly Bordug <vbordug@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>