linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-17 17:24:17 +08:00

Author	SHA1	Message	Date
Zachary Amsden	f2f30ebca6	[PATCH] x86: introduce a write acessor for updating the current LDT Introduce a write acessor for updating the current LDT. This is required for hypervisors like Xen that do not allow LDT pages to be directly written. Testing - here's a fun little LDT test that can be trivially modified to test limits as well. /* * Copyright (c) 2005, Zachary Amsden (zach@vmware.com) * This is licensed under the GPL. / #include <stdio.h> #include <signal.h> #include <asm/ldt.h> #include <asm/segment.h> #include <sys/types.h> #include <unistd.h> #include <sys/mman.h> #define __KERNEL__ #include <asm/page.h> void main(void) { struct user_desc desc; char code; unsigned long long tsc; code = (char )mmap(0, 8192, PROT_EXEC\|PROT_READ\|PROT_WRITE, MAP_PRIVATE \| MAP_ANONYMOUS, -1, 0); desc.entry_number = 0; desc.base_addr = code; desc.limit = 1; desc.seg_32bit = 1; desc.contents = MODIFY_LDT_CONTENTS_CODE; desc.read_exec_only = 0; desc.limit_in_pages = 1; desc.seg_not_present = 0; desc.useable = 1; if (modify_ldt(1, &desc, sizeof(desc)) != 0) { perror("modify_ldt"); } printf("code base is 0x%08x\n", (unsigned)code); code[0x0ffe] = 0x0f; / rdtsc / code[0x0fff] = 0x31; code[0x1000] = 0xcb; / lret */ __asm__ __volatile("lcall $7,$0xffe" : "=A" (tsc)); printf("TSC is 0x%016llx\n", tsc); } Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:13 -07:00
Zachary Amsden	e9f86e351f	[PATCH] x86: remove redundant TSS clearing When reviewing GDT updates, I found the code: set_tss_desc(cpu,t); /* This just modifies memory; ... / per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS].b &= 0xfffffdff; This second line is unnecessary, since set_tss_desc() has already cleared the busy bit. Commented disassembly, line 1: c028b8bd: 8b 0c 86 mov (%esi,%eax,4),%ecx c028b8c0: 01 cb add %ecx,%ebx c028b8c2: 8d 0c 39 lea (%ecx,%edi,1),%ecx => %ecx = per_cpu(cpu_gdt_table, cpu) c028b8c5: 8d 91 80 00 00 00 lea 0x80(%ecx),%edx => %edx = &per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS] c028b8cb: 66 c7 42 00 73 20 movw $0x2073,0x0(%edx) c028b8d1: 66 89 5a 02 mov %bx,0x2(%edx) c028b8d5: c1 cb 10 ror $0x10,%ebx c028b8d8: 88 5a 04 mov %bl,0x4(%edx) c028b8db: c6 42 05 89 movb $0x89,0x5(%edx) => ((char )%edx)[5] = 0x89 (equivalent) ((char )per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS])[5] = 0x89 c028b8df: c6 42 06 00 movb $0x0,0x6(%edx) c028b8e3: 88 7a 07 mov %bh,0x7(%edx) c028b8e6: c1 cb 10 ror $0x10,%ebx => other bits Commented disassembly, line 2: c028b8e9: 8b 14 86 mov (%esi,%eax,4),%edx c028b8ec: 8d 04 3a lea (%edx,%edi,1),%eax => %eax = per_cpu(cpu_gdt_table, cpu) c028b8ef: 81 a0 84 00 00 00 ff andl $0xfffffdff,0x84(%eax) => per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS].b &= 0xfffffdff; (equivalent) ((char )per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS])[5] &= 0xfd Note that (0x89 & ~0xfd) == 0; i.e, set_tss_desc(cpu,t) has already stored the type field in the GDT with the busy bit clear. Eliminating redundant and obscure code is always a good thing; in fact, I pointed out this same optimization many moons ago in arch/i386/setup.c, back when it used to be called that. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:13 -07:00
Zachary Amsden	a520112930	[PATCH] x86: make IOPL explicit The pushf/popf in switch_to are ONLY used to switch IOPL. Making this explicit in C code is more clear. This pushf/popf pair was added as a bugfix for leaking IOPL to unprivileged processes when using sysenter/sysexit based system calls (sysexit does not restore flags). When requesting an IOPL change in sys_iopl(), it is just as easy to change the current flags and the flags in the stack image (in case an IRET is required), but there is no reason to force an IRET if we came in from the SYSENTER path. This change is the minimal solution for supporting a paravirtualized Linux kernel that allows user processes to run with I/O privilege. Other solutions require radical rewrites of part of the low level fault / system call handling code, or do not fully support sysenter based system calls. Unfortunately, this added one field to the thread_struct. But as a bonus, on P4, the fastest time measured for switch_to() went from 312 to 260 cycles, a win of about 17% in the fast case through this performance critical path. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:12 -07:00
Zachary Amsden	0998e4228a	[PATCH] x86: privilege cleanup Privilege checking cleanup. Originally, these diffs were much greater, but recent cleanups in Linux have already done much of the cleanup. I added some explanatory comments in places where the reasoning behind certain tests is rather subtle. Also, in traps.c, we can skip the user_mode check in handle_BUG(). The reason is, there are only two call chains - one via die_if_kernel() and one via do_page_fault(), both entering from die(). Both of these paths already ensure that a kernel mode failure has happened. Also, the original check here, if (user_mode(regs)) was insufficient anyways, since it would not rule out BUG faults from V8086 mode execution. Saving the %ss segment in show_regs() rather than assuming a fixed value also gives better information about the current kernel state in the register dump. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:12 -07:00
Zachary Amsden	f2ab446124	[PATCH] x86: more asm cleanups Some more assembler cleanups I noticed along the way. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:12 -07:00
Zachary Amsden	c9b02a2413	[PATCH] i386: use set_pte macros in a couple places where they were missing Also, setting PDPEs in PAE mode does not require atomic operations, since the PDPEs are cached by the processor, and only reloaded on an explicit or implicit reload of CR3. Since the four PDPEs must always be present in an active root, and the kernel PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using task gates (which reload CR3). Actually, much of this is moot, since the user PDPEs are never updated either, and the only usage of task gates is by the doublefault handler. It appears the only place PGDs get updated in PAE mode is in init_low_mappings() / zap_low_mapping() for initial page table creation and recovery from ACPI sleep state, and these sites are safe by inspection. Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on P4. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:12 -07:00
Zachary Amsden	e7a2ff593c	[PATCH] i386: load_tls() fix Subtle fix: load_TLS has been moved after saving %fs and %gs segments to avoid creating non-reversible segments. This could conceivably cause a bug if the kernel ever needed to save and restore fs/gs from the NMI handler. It currently does not, but this is the safest approach to avoiding fs/gs corruption. SMIs are safe, since SMI saves the descriptor hidden state. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:11 -07:00
Zachary Amsden	4d37e7e3fd	[PATCH] i386: inline assembler: cleanup and encapsulate descriptor and task register management i386 inline assembler cleanup. This change encapsulates descriptor and task register management. Also, it is possible to improve assembler generation in two cases; savesegment may store the value in a register instead of a memory location, which allows GCC to optimize stack variables into registers, and MOV MEM, SEG is always a 16-bit write to memory, making the casting in math-emu unnecessary. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:11 -07:00
Zachary Amsden	245067d167	[PATCH] i386: cleanup serialize msr i386 arch cleanup. Introduce the serialize macro to serialize processor state. Why the microcode update needs it I am not quite sure, since wrmsr() is already a serializing instruction, but it is a microcode update, so I will keep the semantic the same, since this could be a timing workaround. As far as I can tell, this has always been there since the original microcode update source. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:11 -07:00
Zachary Amsden	4bb0d3ec3e	[PATCH] i386: inline asm cleanup i386 Inline asm cleanup. Use cr/dr accessor functions. Also, a potential bugfix. Also, some CR accessors really should be volatile. Reads from CR0 (numeric state may change in an exception handler), writes to CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction re-ordering. I did not add memory clobber to CR3 / CR4 / CR0 updates, as it was not there to begin with, and in no case should kernel memory be clobbered, except when doing a TLB flush, which already has memory clobber. I noticed that page invalidation does not have a memory clobber. I can't find a bug as a result, but there is definitely a potential for a bug here: #define __flush_tlb_single(addr) \ __asm__ __volatile__("invlpg %0": :"m" ((char ) addr)) Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:11 -07:00
Roland McGrath	2a0694d15d	[PATCH] i386: clean up vDSO alignment padding This makes the vDSO use nops for all its padding around instructions, rather than sometimes zeros, and nop-pads the end of the area containing instructions to a 32-byte cache line, to keep text and data in separate lines. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:10 -07:00
Natalie.Protasevich@unisys.com	56f1d5d52a	[PATCH] ES7000 platform update (i386) This is subarch update for ES7000. I've modified platform check code and removed unnecessary OEM table parsing for newer systems that don't use OEM information during boot. Parsing the table in fact is causing problems, and the platform doesn't get recognized. The patch only affects the ES7000 subach. Signed-off-by: <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:10 -07:00
Venkatesh Pallipadi	252943efcf	[PATCH] x86: Add the check for all the cores in a package in cache information Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:10 -07:00
Venkatesh Pallipadi	911a62d423	[PATCH] x86: sutomatically enable bigsmp when we have more than 8 CPUs i386 generic subarchitecture requires explicit dmi strings or command line to enable bigsmp mode. The patch below removes that restriction, and uses bigsmp as soon as it finds more than 8 logical CPUs, Intel processors and xAPIC support. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:10 -07:00
Vivek Goyal	484b90c4b9	[PATCH] kdump: Save parameter segment in protected mode (x86) o With introduction of kexec as boot-loader, the assumption that parameter segment will always be loaded at lower address than kernel and will be addressable by early bootup page tables is no longer valid. In kexec on panic case parameter segment might well be loaded beyond kernel image and might not be addressable by early boot page tables. o This case might hit in the scenario where user has reserved a chunk of memory for second kernel, for example 16MB to 64MB, and has also built second kernel for physical memory location 16MB. In this case kexec has no choice but to load the parameter segment at a higher address than new kernel image at safe location where new kernel does not stomp it. o Though problem should automatically go away once relocatable kernel for i386 is in place and kexec can determine the location of new kernel at run time and load parameter segment at lower address than kernel image. But till then this patch can go in (assuming it does not break something else). o This patch moves up the boot parameter saving code. Now boot parameters are copied out in protected mode before page tables are initialized. This will ensure that parameter segment is always addressable irrespective of its physical location. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:09 -07:00
Petr Tesarik	5fd75ebb1a	[PATCH] vm86: Honor TF bit when emulating an instruction If the virtual 86 machine reaches an instruction which raises a General Protection Fault (such as CLI or STI), the instruction is emulated (in handle_vm86_fault). However, the emulation ignored the TF bit, so the hardware debug interrupt was not invoked after such an emulated instruction (and the DOS debugger missed it). This patch fixes the problem by emulating the hardware debug interrupt as the last action before control is returned to the VM86 program. Signed-off-by: Petr Tesarik <kernel@tesarici.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:09 -07:00
Matt Tolentino	7ae65fd334	[PATCH] x86: fix EFI memory map parsing The memory descriptors that comprise the EFI memory map are not fixed in stone such that the size could change in the future. This uses the memory descriptor size obtained from EFI to iterate over the memory map entries during boot. This enables the removal of an x86 specific pad (and ifdef) in the EFI header. I also couldn't stomach the broken up nature of the function to put EFI runtime calls into virtual mode any longer so I fixed that up a bit as well. For reference, this patch only impacts x86. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:09 -07:00
Venkatesh Pallipadi	4116c527ea	[PATCH] hpet: use read_timer_tsc only when CPU has TSC Only use read_timer_tsc only when CPU has TSC. Thanks to Andrea for pointing this out. Should not be issue on any platforms as all recent systems that has HPET also has CPUs that supports TSC. The patch is still required for correctness. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:09 -07:00
Ingo Molnar	869f96a00e	[PATCH] x86: compress the stack layout of do_page_fault() This patch pushes the creation of a rare signal frame (SIGBUS or SIGSEGV) into a separate function, thus saving stackspace in the main do_page_fault() stackframe. The effect is 132 bytes less of stack used by the typical do_page_fault() invocation - resulting in a denser cache-layout. (Another minor effect is that in case of kernel crashes that come from a pagefault, we add less space to the already existing frame, giving the crash functions a slightly higher chance to do their stuff without overflowing the stack.) (The changes also result in slightly cleaner code.) argument bugfix from "Guillaume C." <guichaz@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:06:09 -07:00
Chen, Kenneth W	0e5c9f39f6	[PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc() I don't think we need to call hugetlb_clean_stale_pgtable() anymore in 2.6.13 because of the rework with free_pgtables(). It now collect all the pte page at the time of munmap. It used to only collect page table pages when entire one pgd can be freed and left with staled pte pages. Not anymore with 2.6.13. This function will never be called and We should turn it into a BUG_ON. I also spotted two problems here, not Adam's fault :-) (1) in huge_pte_alloc(), it looks like a bug to me that pud is not checked before calling pmd_alloc() (2) in hugetlb_clean_stale_pgtable(), it also missed a call to pmd_free_tlb. I think a tlb flush is required to flush the mapping for the page table itself when we clear out the pmd pointing to a pte page. However, since hugetlb_clean_stale_pgtable() is never called, so it won't trigger the bug. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:46 -07:00
Adam Litke	02b0ccef90	[PATCH] hugetlb: check p?d_present in huge_pte_offset() For demand faulting, we cannot assume that the page tables will be populated. Do what the rest of the architectures do and test p?d_present() while walking down the page table. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: <linux-mm@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:46 -07:00
Adam Litke	7bf07f3d4b	[PATCH] hugetlb: move stale pte check into huge_pte_alloc() Initial Post (Wed, 17 Aug 2005) This patch moves the if (! pte_none(*pte)) hugetlb_clean_stale_pgtable(pte); logic into huge_pte_alloc() so all of its callers can be immune to the bug described by Kenneth Chen at http://lkml.org/lkml/2004/6/16/246 > It turns out there is a bug in hugetlb_prefault(): with 3 level page table, > huge_pte_alloc() might return a pmd that points to a PTE page. It happens > if the virtual address for hugetlb mmap is recycled from previously used > normal page mmap. free_pgtables() might not scrub the pmd entry on > munmap and hugetlb_prefault skips on any pmd presence regardless what type > it is. Unless I am missing something, it seems more correct to place the check inside huge_pte_alloc() to prevent a the same bug wherever a huge pte is allocated. It also allows checking for this condition when lazily faulting huge pages later in the series. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: <linux-mm@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:46 -07:00
Bob Picco	3e347261a8	[PATCH] sparsemem extreme implementation With cleanups from Dave Hansen <haveblue@us.ibm.com> SPARSEMEM_EXTREME makes mem_section a one dimensional array of pointers to mem_sections. This two level layout scheme is able to achieve smaller memory requirements for SPARSEMEM with the tradeoff of an additional shift and load when fetching the memory section. The current SPARSEMEM implementation is a one dimensional array of mem_sections which is the default SPARSEMEM configuration. The patch attempts isolates the implementation details of the physical layout of the sparsemem section array. SPARSEMEM_EXTREME requires bootmem to be functioning at the time of memory_present() calls. This is not always feasible, so architectures which do not need it may allocate everything statically by using SPARSEMEM_STATIC. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-05 00:05:38 -07:00
Ivan Kokshaysky	81d4af1340	[PATCH] x86: pci_assign_unassigned_resources() update I had some time to think about PCI assign issues in 2.6.13-rc series. The major problem here is that we call pci_assign_unassigned_resources() way too early - at subsys_initcall level. Therefore we give no chances to ACPI and PnP routines (called at fs_initcall level) to reserve their respective resources properly, as the comments in drivers/pnp/system.c and drivers/acpi/motherboard.c suggest: /** * Reserve motherboard resources after PCI claim BARs, * but before PCI assign resources for uninitialized PCI devices */ So I moved the pci_assign_unassigned_resources() call to pcibios_assign_resources() (fs_initcall), which should hopefully fix a lot of problems and make PCIBIOS_MIN_IO tweaks unnecessary. Other changes: - remove resource assignment code from pcibios_assign_resources(), since it duplicates pci_assign_unassigned_resources() functionality and actually does nothing in 2.6.13; - modify ROM assignment code as per Ben's suggestion: try to use firmware settings by default (if PCI_ASSIGN_ROMS is not set); - set CARDBUS_IO_SIZE back to 4K as it's a wonderful stress test for various setups. Confirmed by Tero Roponen <teanropo@cc.jyu.fi> (who had problems with the 4kB CardBus IO size previously). Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-30 11:14:48 -07:00
Steven Rostedt	69be8f1896	[PATCH] convert signal handling of NODEFER to act like other Unix boxes. It has been reported that the way Linux handles NODEFER for signals is not consistent with the way other Unix boxes handle it. I've written a program to test the behavior of how this flag affects signals and had several reports from people who ran this on various Unix boxes, confirming that Linux seems to be unique on the way this is handled. The way NODEFER affects signals on other Unix boxes is as follows: 1) If NODEFER is set, other signals in sa_mask are still blocked. 2) If NODEFER is set and the signal is in sa_mask, then the signal is still blocked. (Note: this is the behavior of all tested but Linux _and_ NetBSD 2.0 ). The way NODEFER affects signals on Linux: 1) If NODEFER is set, other signals are _not_ blocked regardless of sa_mask (Even NetBSD doesn't do this). 2) If NODEFER is set and the signal is in sa_mask, then the signal being handled is not blocked. The patch converts signal handling in all current Linux architectures to the way most Unix boxes work. Unix boxes that were tested: DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU 3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX. NetBSD was the only other Unix to behave like Linux on point #2. The main concern was brought up by point #1 which even NetBSD isn't like Linux. So with this patch, we leave NetBSD as the lonely one that behaves differently here with #2. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-29 10:03:11 -07:00
Chuck Ebbert	b1daec3089	[PATCH] i386: fix incorrect FP signal code i386 floating-point exception handling has a bug that can cause error code 0 to be sent instead of the proper code during signal delivery. This is caused by unconditionally checking the IS and c1 bits from the FPU status word when they are not always relevant. The IS bit tells whether an exception is a stack fault and is only relevant when the exception is IE (invalid operation.) The C1 bit determines whether a stack fault is overflow or underflow and is only relevant when IS and IE are set. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-23 19:52:37 -07:00
Steven Rostedt	cd3716ab40	[PATCH] Mobil Pentium 4 HT and the NMI I'm trying to get the nmi working with my laptop (IBM ThinkPad G41) and after debugging it a while, I found that the nmi code doesn't want to set it up for this particular CPU. Here I have: $ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 4 model name : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz stepping : 1 cpu MHz : 3320.084 cache size : 1024 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 3 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl est tm2 cid xtpr bogomips : 6642.39 processor : 1 vendor_id : GenuineIntel cpu family : 15 model : 4 model name : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz stepping : 1 cpu MHz : 3320.084 cache size : 1024 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 3 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl est tm2 cid xtpr bogomips : 6637.46 And the following code shows: $ cat linux-2.6.13-rc6/arch/i386/kernel/nmi.c [...] void setup_apic_nmi_watchdog (void) { switch (boot_cpu_data.x86_vendor) { case X86_VENDOR_AMD: if (boot_cpu_data.x86 != 6 && boot_cpu_data.x86 != 15) return; setup_k7_watchdog(); break; case X86_VENDOR_INTEL: switch (boot_cpu_data.x86) { case 6: if (boot_cpu_data.x86_model > 0xd) return; setup_p6_watchdog(); break; case 15: if (boot_cpu_data.x86_model > 0x3) return; Here I get boot_cpu_data.x86_model == 0x4. So I decided to change it and reboot. I now seem to have a working NMI. So, unless there's something know to be bad about this processor and the NMI. I'm submitting the following patch. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Acked-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-19 18:44:56 -07:00
Andi Kleen	6be382ea0c	[PATCH] x86: Remove obsolete get_cpu_vendor call Since early CPU identify is in this information is already available Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-18 12:53:59 -07:00
Ravikiran G Thirumalai	4b0271eb9d	[PATCH] Move the fix to align node_end_pfns to a proper location Move the fix to align node_end_pfns to a proper location. The earlier fix made the node_remap_start_vaddr to get misaligned causing remap_numa_kva to barf again :-/ Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-07 10:00:39 -07:00
Tom Duffy	46bdac9938	[PATCH] visws: linkage fix This patch add stubs to allow the visws subarch to link again. Signed-off-by: Tom Duffy <thomas.duffy.99@alumni.brown.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-07 10:00:38 -07:00
Eric W. Biederman	36cf446c2c	[PATCH] i386 visws: Add machine_shutdown and emergency_restart Another x86 subarchitecture bit I missed. This adds both machine_emergency_restart missed in my reboot fixes and machine_shutdown needed for kexec support. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-06 12:54:57 -07:00
Eric W. Biederman	094528a7fb	[PATCH] i386 voyager: Add machine_shutdown Here is one more bit of breakage my x86 sub-architecture confusion caused. Add machine_shutdown to voyager so it will compile with CONFIG_KEXEC. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-06 12:54:57 -07:00
James Bottomley	e6cb99413d	[PATCH] fix voyager compile after machine_emergency_restart breakage [PATCH] i386: Implement machine_emergency_reboot introduced this new function into arch/i386/reboot.c. However, subarchitectures are entitled to implement their own copies of reboot.c from which this new function is now missing. It looks like visws will also need a similar fixup Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-05 12:22:37 -07:00
Venkatesh Pallipadi	c91096d85c	[PATCH] remove special HPET_EMULATE_RTC config option We had a user whose apps weren't working correctly because his "rtc" wasn't working fully. For the sake of simplicity, it seems sensible to always enable HPET RTC emulation. Remove a special config option for HPET_EMULATE_RTC and make it directly depend on HPET_TIMER and RTC. This will avoid the hangs when EMULATE_RTC is not configured and when some userlevel script depends on RTC interrupt, as in: http://bugzilla.kernel.org/show_bug.cgi?id=4904 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-04 16:27:58 -07:00
Andrew Morton	39bbb07d7c	[PATCH] transmeta: CONFIG_PROC_FS=n build fix Fix bug found by Grant Coady <lkml@dodo.com.au>'s autobuild setup. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-01 21:38:00 -07:00
Eric Lammerts	cdf32eaa4e	[PATCH] disable addres space randomization default on transmeta CPUs We know that the randomisation slows down some workloads on Transmeta CPUs by quite large amounts. We think it's because the CPU needs to recode the same x86 instructions when they pop up at a different virtual address after a fork+exec. So disable randomization by default on those CPUs. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-01 19:13:59 -07:00
Ingo Molnar	6cb54819d7	[PATCH] remove sys_set_zone_reclaim() This removes sys_set_zone_reclaim() for now. While i'm sure Martin is trying to solve a real problem, we must not hard-code an incomplete and insufficient approach into a syscall, because syscalls are pretty much for eternity. I am quite strongly convinced that this syscall must not hit v2.6.13 in its current form. Firstly, the syscall lacks basic syscall design: e.g. it allows the global setting of VM policy for unprivileged users. (!) [ Imagine an Oracle installation and a SAP installation on the same NUMA box fighting over the 'optimal' setting for this flag. What will they do? Will they try to set the flag to their own preferred value every second or so? ] Secondly, it was added based on a single datapoint from Martin: http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2 where Martin characterizes the numbers the following way: ' Run-to-run variability for "make -j" is huge, so these numbers aren't terribly useful except to see that with reclaim the benchmark still finishes in a reasonable amount of time. ' in other words: the fundamental problem has likely not been solved, only a tendential move into the right direction has been observed, and a handful of numbers were picked out of a set of hugely variable results, without showing the variability data. How much variance is there run-to-run? I'd really suggest to first walk the walk and see what's needed to get stable & predictable kernel compilation numbers on that NUMA box, before adding random syscalls to tune a particular aspect of the VM ... which approach might not even matter once the whole picture has been analyzed and understood! The third, most important point is that the syscall exposes VM tuning internals in a completely unstructured way. What sense does it make to have a _GLOBAL_ per-node setting for 'should we go to another node for reclaim'? If then it might make sense to do this per-app, via numalib or so. The change is minimalistic in that it doesnt remove the syscall and the underlying infrastructure changes, only the user-visible changes. We could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a'ka /proc/sys/vm/swappiness, but even that looks quite counterproductive when the generic approach is that we are trying to reduce the number of external factors in the VM balance picture. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-01 10:03:56 -07:00
Len Brown	adbedd3424	merge 2.6.13-rc4 with ACPI's to-linus tree	2005-07-30 01:55:32 -04:00
Len Brown	d6ac1a7910	/home/lenb/src/to-linus branch 'acpi-2.6.12'	2005-07-29 23:31:17 -04:00
David Shaohua Li	87bec66b96	[ACPI] suspend/resume ACPI PCI Interrupt Links Add reference count and disable ACPI PCI Interrupt Link when no device still uses it. Warn when drivers have not released Link at suspend time. http://bugzilla.kernel.org/show_bug.cgi?id=3469 Signed-off-by: David Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-29 22:49:38 -04:00
Dominik Brodowski	4b31e77455	[ACPI] Always set P-state on initialization Otherwise a platform that supports ACPI based cpufreq and boots up at lowest possible speed could stay there forever. This because the governor may request max speed, but the code doesn't update if there is no change in speed, and it assumed the initial state of max speed. http://bugzilla.kernel.org/show_bug.cgi?id=4634 Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2005-07-29 18:29:47 -04:00
Natalie.Protasevich@unisys.com	e1afc3f522	[PATCH] x86: avoid wasting IRQs patch update The patch addresses a problem with ACPI SCI interrupt entry, which gets re-used, and the IRQ is assigned to another unrelated device. The patch corrects the code such that SCI IRQ is skipped and duplicate entry is avoided. Second issue came up with VIA chipset, the problem was caused by original patch assigning IRQs starting 16 and up. The VIA chipset uses 4-bit IRQ register for internal interrupt routing, and therefore cannot handle IRQ numbers assigned to its devices. The patch corrects this problem by allowing PCI IRQs below 16. Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 15:01:13 -07:00
Ravikiran G Thirumalai	94d2ac66c1	[PATCH] mm: Ensure proper alignment for node_remap_start_pfn While reserving KVA for lmem_maps of node, we have to make sure that node_remap_start_pfn[] is aligned to a proper pmd boundary. (node_remap_start_pfn[] gets its value from node_end_pfn[]) Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 15:01:13 -07:00
Linus Torvalds	590f47a1d9	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq	2005-07-29 14:40:08 -07:00
Dave Jones	094ce7fde4	arch/i386/kernel/cpu/cpufreq/powernow-k8.c: In function `powernow_k8_cpu_init_acpi': arch/i386/kernel/cpu/cpufreq/powernow-k8.c:740: warning: unused variable `vid' arch/i386/kernel/cpu/cpufreq/powernow-k8.c:739: warning: unused variable `fid' arch/i386/kernel/cpu/cpufreq/powernow-k8.c:743: warning: unused variable `vid' arch/i386/kernel/cpu/cpufreq/powernow-k8.c:742: warning: unused variable `fid' arch/i386/kernel/cpu/cpufreq/powernow-k8.c:746: `fid' undeclared (first use in this function) arch/i386/kernel/cpu/cpufreq/powernow-k8.c:746: `vid' undeclared (first use in this function) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2005-07-29 12:55:40 -07:00
Eric W. Biederman	e7b47ccaf6	[PATCH] i386 machine_kexec: Cleanup inline assembly For some reason I was telling my inline assembly that the input argument was an output argument. Playing in the trampoline code I have seen a couple of instances where lgdt get the wrong size (because the trampolines run in 16bit mode) so use lgdtl and lidtl to be explicit. Additionally gcc-3.3 and gcc-3.4 want's an lvalue for a memory argument and it doesn't think an array of characters is an lvalue so use a packed structure instead. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-29 12:17:26 -07:00
Dave Jones	2bcad935a3	Fix up powernow-k8 compile. (Missing definitions). From: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-07-29 09:56:41 -07:00
Linus Torvalds	33ac02aa4c	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq	2005-07-29 10:16:25 -07:00
Dave Hansen	e1474e2d9d	[PATCH] re-disable TSC on NUMAQ Somewhere recently, the TSC got re-enabled for timekeeping on NUMAQ machines. However, the hardware makes these get unsynchronized quite badly. So badly, in fact, that the code to fix up the skew can just hang on boot. This patch re-disables them. It's nicely confined to the numaq.c file. It would be great if this could make it into 2.6.13, I think it counts as a bugfix. Tested on a 16-proc 4-node NUMAQ. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:05 -07:00
Andi Kleen	e2cac78935	[PATCH] x86_64: When running cpuid4 need to run on the correct CPU Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-28 21:46:01 -07:00

1 2 3 4 5 ...

251 Commits