Commit Graph

48 Commits

Author SHA1 Message Date
Dmitry Vyukov
5c9a8750a6 kernel: add kcov code coverage
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing).  Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system.  A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/).  However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.

kcov does not aim to collect as much coverage as possible.  It aims to
collect more or less stable coverage that is function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disbled (e.g.  scheduler, locking).

Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes.  Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch).  I've
dropped the second mode for simplicity.

This patch adds the necessary support on kernel side.  The complimentary
compiler support was added in gcc revision 231296.

We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:

  https://github.com/google/syzkaller/wiki/Found-Bugs

We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation".  For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.

Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
typical coverage can be just a dozen of basic blocks (e.g.  an invalid
input).  In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M).  Cost of
kcov depends only on number of executed basic blocks/edges.  On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.

kcov exposes kernel PCs and control flow to user-space which is
insecure.  But debugfs should not be mapped as user accessible.

Based on a patch by Quentin Casasnovas.

[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Josh Poimboeuf
c0dd671686 objtool: Mark non-standard object files and directories
Code which runs outside the kernel's normal mode of operation often does
unusual things which can cause a static analysis tool like objtool to
emit false positive warnings:

 - boot image
 - vdso image
 - relocation
 - realmode
 - efi
 - head
 - purgatory
 - modpost

Set OBJECT_FILES_NON_STANDARD for their related files and directories,
which will tell objtool to skip checking them.  It's ok to skip them
because they don't affect runtime stack traces.

Also skip the following code which does the right thing with respect to
frame pointers, but is too "special" to be validated by a tool:

 - entry
 - mcount

Also skip the test_nx module because it modifies its exception handling
table at runtime, which objtool can't understand.  Fortunately it's
just a test module so it doesn't matter much.

Currently objtool is the only user of OBJECT_FILES_NON_STANDARD, but it
might eventually be useful for other tools.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/366c080e3844e8a5b6a0327dc7e8c2b90ca3baeb.1456719558.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29 08:35:02 +01:00
Andrey Ryabinin
c6d308534a UBSAN: run-time undefined behavior sanity checker
UBSAN uses compile-time instrumentation to catch undefined behavior
(UB).  Compiler inserts code that perform certain kinds of checks before
operations that could cause UB.  If check fails (i.e.  UB detected)
__ubsan_handle_* function called to print error message.

So the most of the work is done by compiler.  This patch just implements
ubsan handlers printing errors.

GCC has this capability since 4.9.x [1] (see -fsanitize=undefined
option and its suboptions).
However GCC 5.x has more checkers implemented [2].
Article [3] has a bit more details about UBSAN in the GCC.

[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

Issues which UBSAN has found thus far are:

Found bugs:

 * out-of-bounds access - 97840cb67f ("netfilter: nfnetlink: fix
   insufficient validation in nfnetlink_bind")

undefined shifts:

 * d48458d4a7 ("jbd2: use a better hash function for the revoke
   table")

 * 10632008b9 ("clockevents: Prevent shift out of bounds")

 * 'x << -1' shift in ext4 -
   http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com>

 * undefined rol32(0) -
   http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com>

 * undefined dirty_ratelimit calculation -
   http://lkml.kernel.org/r/<566594E2.3050306@odin.com>

 * undefined roundown_pow_of_two(0) -
   http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com>

 * [WONTFIX] undefined shift in __bpf_prog_run -
   http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com>

   WONTFIX here because it should be fixed in bpf program, not in kernel.

signed overflows:

 * 32a8df4e0b ("sched: Fix odd values in effective_load()
   calculations")

 * mul overflow in ntp -
   http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com>

 * incorrect conversion into rtc_time in rtc_time64_to_tm() -
   http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com>

 * unvalidated timespec in io_getevents() -
   http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com>

 * [NOTABUG] signed overflow in ktime_add_safe() -
   http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com>

[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Linus Torvalds
37507717de Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf updates from Ingo Molnar:
 "This series tightens up RDPMC permissions: currently even highly
  sandboxed x86 execution environments (such as seccomp) have permission
  to execute RDPMC, which may leak various perf events / PMU state such
  as timing information and other CPU execution details.

  This 'all is allowed' RDPMC mode is still preserved as the
  (non-default) /sys/devices/cpu/rdpmc=2 setting.  The new default is
  that RDPMC access is only allowed if a perf event is mmap-ed (which is
  needed to correctly interpret RDPMC counter values in any case).

  As a side effect of these changes CR4 handling is cleaned up in the
  x86 code and a shadow copy of the CR4 value is added.

  The extra CR4 manipulation adds ~ <50ns to the context switch cost
  between rdpmc-capable and rdpmc-non-capable mms"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
  perf/x86: Only allow rdpmc if a perf_event is mapped
  perf: Pass the event to arch_perf_update_userpage()
  perf: Add pmu callbacks to track event mapping and unmapping
  x86: Add a comment clarifying LDT context switching
  x86: Store a per-cpu shadow copy of CR4
  x86: Clean up cr4 manipulation
2015-02-16 14:58:12 -08:00
Andrey Ryabinin
ef7f0d6a6c x86_64: add KASan support
This patch adds arch specific code for kernel address sanitizer.

16TB of virtual addressed used for shadow memory.  It's located in range
[ffffec0000000000 - fffffc0000000000] between vmemmap and %esp fixup
stacks.

At early stage we map whole shadow region with zero page.  Latter, after
pages mapped to direct mapping address range we unmap zero pages from
corresponding shadow (see kasan_map_shadow()) and allocate and map a real
shadow memory reusing vmemmap_populate() function.

Also replace __pa with __pa_nodebug before shadow initialized.  __pa with
CONFIG_DEBUG_VIRTUAL=y make external function call (__phys_addr)
__phys_addr is instrumented, so __asan_load could be called before shadow
area initialized.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:41 -08:00
Andy Lutomirski
1e02ce4ccc x86: Store a per-cpu shadow copy of CR4
Context switches and TLB flushes can change individual bits of CR4.
CR4 reads take several cycles, so store a shadow copy of CR4 in a
per-cpu variable.

To avoid wasting a cache line, I added the CR4 shadow to
cpu_tlbstate, which is already touched in switch_mm.  The heaviest
users of the cr4 shadow will be switch_mm and __switch_to_xtra, and
__switch_to_xtra is called shortly after switch_mm during context
switch, so the cacheline is likely to be hot.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 12:10:42 +01:00
Peter Foley
a9358bc353 x86/build: Supress realmode.bin is up to date message
Supress this unnecessary message during kernel re-build:

   make[3]: 'arch/x86/realmode/rm/realmode.bin' is up to date.

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Link: http://lkml.kernel.org/r/1397584693-15902-1-git-send-email-pefoley2@pefoley.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-16 15:17:24 +02:00
H. Peter Anvin
4064e0ea3c Merge commit 'f4bcd8ccddb02833340652e9f46f5127828eb79d' into x86/build
Bring in upstream merge of x86/kaslr for future patches.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-29 09:07:00 -08:00
David Woodhouse
1c678da3bd x86: Remove duplication of 16-bit CFLAGS
Define them once in arch/x86/Makefile instead of twice.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Link: http://lkml.kernel.org/r/1389180083-23249-1-git-send-email-David.Woodhouse@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-22 04:21:45 -08:00
Linus Torvalds
2a0fede97f Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpu, amd: Fix a shadowed variable situation
  um, x86: Fix vDSO build
  x86: Delete non-required instances of include <linux/init.h>
  x86, realmode: Pointer walk cleanups, pull out invariant use of __pa()
  x86/traps: Clean up error exception handler definitions
2014-01-20 12:03:57 -08:00
Paul Gortmaker
663b55b9b3 x86: Delete non-required instances of include <linux/init.h>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-01-06 21:25:18 -08:00
H. Peter Anvin
7306006f10 x86, realmode: Pointer walk cleanups, pull out invariant use of __pa()
The pointer arithmetic in this function was really bizarre, where in
fact all we really wanted was a simple pointer array walk.  Use the
much more idiomatic construction for that (*ptr++).

Factor an invariant use of __pa() out of the relocation loop.  At
least on 64 bits it seems gcc isn't capable of doing that
automatically.

Change the scope of a couple of variables to make it extra obvious
that they are extremely local temp variables.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-rd908t9c8kvcojdabtmm94mb@git.kernel.org
2013-12-18 16:07:53 -08:00
H. Peter Anvin
8b3b005d67 x86, build: Pass in additional -mno-mmx, -mno-sse options
In checkin

    5551a34e5a x86-64, build: Always pass in -mno-sse

we unconditionally added -mno-sse to the main build, to keep newer
compilers from generating SSE instructions from autovectorization.
However, this did not extend to the special environments
(arch/x86/boot, arch/x86/boot/compressed, and arch/x86/realmode/rm).
Add -mno-sse to the compiler command line for these environments, and
add -mno-mmx to all the environments as well, as we don't want a
compiler to generate MMX code either.

This patch also removes a $(cc-option) call for -m32, since we have
long since stopped supporting compilers too old for the -m32 option,
and in fact hardcode it in other places in the Makefiles.

Reported-by: Kevin B. Smith <kevin.b.smith@intel.com>
Cc: Sunil K. Pandey <sunil.k.pandey@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: H. J. Lu <hjl.tools@gmail.com>
Link: http://lkml.kernel.org/n/tip-j21wzqv790q834n7yc6g80j1@git.kernel.org
Cc: <stable@vger.kernel.org> # build fix only
2013-12-09 15:52:39 -08:00
H. Peter Anvin
68d00bbebb Merge remote-tracking branch 'origin/x86/mm' into x86/mm2
Explicitly merging these two branches due to nontrivial conflicts and
to allow further work.

Resolved Conflicts:
	arch/x86/kernel/head32.c
	arch/x86/kernel/head64.c
	arch/x86/mm/init_64.c
	arch/x86/realmode/init.c

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-01 02:28:36 -08:00
Yinghai Lu
4f7b92263a x86, realmode: Separate real_mode reserve and setup
After we switch to use #PF handler help to set page table, init_level4_pgt
will only have entries set after init_mem_mapping().
We need to move copying init_level4_pgt to trampoline_pgd after that.

So split reserve and setup, and move the setup after init_mem_mapping()

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-11-git-send-email-yinghai@kernel.org
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29 15:13:24 -08:00
Yinghai Lu
9735e91e9c x86, 64bit, realmode: Use init_level4_pgt to set trampoline_pgd directly
with #PF handler way to set early page table, level3_ident will go away with
64bit native path.

So just use entries in init_level4_pgt to set them in trampoline_pgd.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-10-git-send-email-yinghai@kernel.org
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29 15:12:30 -08:00
Yinghai Lu
231b3642a3 x86, realmode: Set real_mode permissions early
Trampoline code is executed by APs with kernel low mapping on 64bit.
We need to set trampoline code to EXEC early before we boot APs.

Found the problem after switching to #PF handler set page table,
and we do not set initial kernel low mapping with EXEC anymore in
arch/x86/kernel/head_64.S.

Change to use early_initcall instead that will make sure trampoline
will have EXEC set.

-v2: Merge two comments according to Borislav Petkov <bp@alien8.de>

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-7-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29 15:12:25 -08:00
Alexander Duyck
fc8d782677 x86: Use __pa_symbol instead of __pa on C visible symbols
When I made an attempt at separating __pa_symbol and __pa I found that there
were a number of cases where __pa was used on an obvious symbol.

I also caught one non-obvious case as _brk_start and _brk_end are based on the
address of __brk_base which is a C visible symbol.

In mark_rodata_ro I was able to reduce the overhead of kernel symbol to
virtual memory translation by using a combination of __va(__pa_symbol())
instead of page_address(virt_to_page()).

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Link: http://lkml.kernel.org/r/20121116215640.8521.80483.stgit@ahduyck-cp1.jf.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-11-16 16:42:09 -08:00
H. Peter Anvin
1396adc3c2 x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID
The patch:

    73201dbe x86, suspend: On wakeup always initialize cr4 and EFER

... was incorrectly committed in an intermediate (unfinished) form.

- We need to test CF, not ZF, for a bit test with btl.
- We don't actually need to compute the existence of EFLAGS.ID,
  since we set a flag at suspend time if CR4 should be restored.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-02 08:42:31 +02:00
H. Peter Anvin
73201dbec6 x86, suspend: On wakeup always initialize cr4 and EFER
We already have a flag word to indicate the existence of MISC_ENABLES,
so use the same flag word to indicate existence of cr4 and EFER, and
always restore them if they exist.  That way if something passes a
nonzero value when the value *should* be zero, we will still
initialize it.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
2012-09-26 15:06:22 -07:00
Andrew Boie
484d90eec8 x86, build: Globally set -fno-pic
GCC built with nonstandard options can enable -fpic by default.
We never want this for 32-bit kernels and it will break the build.

[ hpa: Notably the Android toolchain apparently does this. ]

Change-Id: Iaab7d66e598b1c65ac4a4f0229eca2cd3d0d2898
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Link: http://lkml.kernel.org/r/1344624546-29691-1-git-send-email-andrew.p.boie@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-08-10 16:12:30 -07:00
H. Peter Anvin
9751d76275 x86-64, reboot: Be more paranoid in 64-bit reboot=bios
Be a bit more paranoid in the transition back to 16-bit mode.  In
particular, in case the kernel is residing above the 4 GiB mark,
switch to the trampoline GDT, and make the jump after turning off
paging a far jump.  In theory, none of this should matter, but it is
exactly the kind of things that broken SMM or virtualization software
could trip up on.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/tip-jopx7y6g6dbcx4tpal8q0jlr@git.kernel.org
2012-06-21 10:25:03 -07:00
H. Peter Anvin
650513979a x86-64, reboot: Allow reboot=bios and reboot-cpu override on x86-64
With the revamped realmode trampoline code, it is trivial to extend
support for reboot=bios to x86-64.  Furthermore, while we are at it,
remove the restriction that only we can only override the reboot CPU
on 32 bits.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/n/tip-jopx7y6g6dbcx4tpal8q0jlr@git.kernel.org
2012-06-17 10:51:01 -07:00
H. Peter Anvin
61f5446169 x86, realmode: Move end signature into header.S
The end signature was defined in wakeup_asm.S as it originally came
from the ACPI wakeup code.  However, we rely on the existence of the
.signature section to expand .bss, otherwise we would have to include
code to explicitly zero the .bss depending on the configuration.
Since the expanded .bss is just in .init.data anyway, it's easier to
always have it expanded.

This fixes failures when compiled without CONFIG_ACPI_SLEEP.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
2012-05-21 00:02:45 -07:00
H. Peter Anvin
638d957b51 x86, realmode: Change EFER to a single u64 field
Change EFER to be a single u64 field instead of two u32 fields; change
the order to maintain alignment.  Note that on x86-64 cr4 is really
also a 64-bit quantity, although we can only set the low 32 bits from
the trampoline code since it is still executing in 32-bit mode at that
point.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
2012-05-16 14:02:05 -07:00
H. Peter Anvin
1371270188 x86, realmode: Move kernel/realmode.c to realmode/init.c
Keep all the realmode code together, including initialization (only
the rm/ subdirectory is actually built as real-mode code, anyway.)

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
2012-05-16 13:49:10 -07:00
H. Peter Anvin
51edbe6a2f x86, realmode: Move not-common bits out of trampoline_common.S
Move the bits that aren't actually common out of trampoline_common.S
and into the arch-specific files.  Furthermore, make sure the page
directory is first in the .bss section for trampoline_64.S in order to
not waste an entire page of memory.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
2012-05-16 13:44:10 -07:00
Jarkko Sakkinen
34d0b02e08 x86, realmode: Fix no cache bits test in reboot_32.S
Before the new real-mode code infrastructure %edx was
used for testing CD and NW bits with andl in order to
decide whether to flush the processor caches or not.
The value of cr0 was also stored in %eax, which was
later used to set cr0 after masking out lower byte
(except TS bit) in order to enter real-mode.

In the new real-mode code infrastructure we wanted to
keep input parameter in %eax so we are using %edx for
both cr0 cases. This has caused regression since andl
overwrites the value of %edx.

This patch fixes the issue by replacing andl with testl,
which is essentially andl without writing result to the
register.

Special thanks to Paolo Bonzini for noting this and
proposing a fix.

Reported-and-tested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336633898-23743-1-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-10 19:31:22 -07:00
H. Peter Anvin
0f6f11eb00 x86, realmode: Make sure all generated files are listed in targets
Kbuild expects all generated files to be listed in the targets
variable.  If it isn't, weird things happen.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336595106-21135-1-git-send-email-jarkko.sakkinen@intel.com
2012-05-09 14:53:01 -07:00
Jarkko Sakkinen
c5403aed04 x86, realmode: build fix: remove duplicate build
Real-mode binary was built twice. This patch fixes
the issue by making realmode.relocs as target for
realmode.bin.

[ hpa: removed the direct dependency on realmode.relocs in
  arch/x86/realmode/Makefile ]

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336595106-21135-1-git-send-email-jarkko.sakkinen@intel.com
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-09 13:32:17 -07:00
Jarkko Sakkinen
cda846f101 x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline
This patch changes 64-bit trampoline so that CR4 and
EFER are provided by the kernel instead of using fixed
values.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-24-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 15:04:27 -07:00
Jarkko Sakkinen
f2604c141a x86, realmode: move relocs from scripts/ to arch/x86/tools
Moved relocs tool from scripts/ to arch/x86/tools because
it is architecture specific script. Added new target archscripts
that can be used to build scripts needed building an architecture.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-22-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Michal Marek <mmarek@suse.cz>
2012-05-08 15:03:35 -07:00
Jarkko Sakkinen
f37240f16b x86, realmode: header for trampoline code
Added header for trampoline code that can be used to supply
input data to it. This makes interface between real mode code
and kernel cleaner and simpler. Replaced two confusing pointers
to level4 pgt in trampoline_64.S with a single pointer to the
beginning of the page table.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-21-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
c4845474a0 x86, realmode: flattened rm hierachy
Simplified hierarchy under rm directory to a flat
directory because it is not anymore really justified
to have own directory for wakeup code. It only adds
more complexity.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-20-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
b429dbf6e8 x86, realmode: don't copy real_mode_header
Replaced copying of real_mode_header with a pointer
to beginning of RM memory.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-19-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
8e029fcdd8 x86, realmode: fix 64-bit wakeup sequence
There were number of issues in wakeup sequence:

- Wakeup stack was placed in hardcoded address.
- NX bit in EFER was not enabled.
- Initialization incorrectly set physical address
of secondary_startup_64.
- Some alignment issues.

This patch fixes these issues and in addition:

- Unifies coding conventions in .S files.
- Sets alignments of code and data right.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-18-git-send-email-jarkko.sakkinen@intel.com
Originally-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:11 -07:00
H. Peter Anvin
6feb592dce x86, realmode: Fix always-zero test in reboot_32.S
A test instruction is an "and", and an and with zero is always zero.
This would cause us to always take the BIOS path, not the APM path, in
case anyone actually cares...

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-17-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:04 -07:00
H. Peter Anvin
be60828920 x86, realmode: Move trampoline_*.S early in the link order
Move trampoline_*.S earlier in the link order so it ends up being
first in the text segment; since the SIPI vector requires 4K alignment
it otherwise ends up padding the .text segment with that much
completely unnecessarily.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-16-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
e5684ec438 x86, realmode: Replace open-coded ljmpw with a macro
We cannot code an ljmpw to the real-mode segment directly, because gas
refuses to assemble an ljmp with a symbolic segment.  Instead of
open-coding it everywhere, define a macro and use it for this case.

This is specifically an ljmpw from a 16-bit segment.  This is okay, as
one should never enter real mode from a 32-bit segment: if one do, the
CPU ends up in a bizarre (and useless) mode sometimes called "unreal
mode" where segments behave like real mode but the default address and
operand sizes is 32 bits.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-15-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
968ff9ee56 x86, realmode: Remove indirect jumps in trampoline_32 and wakeup_asm
Remove indirect jumps in trampoline_32.S and the 32-bit part of
wakeup_asm.S.  There exist systems which are known to do weird
things if an SMI comes in right after a mode switch, and the
safest way to deal with it is to always follow with a simple
absolute far jump.  In the 64-bit code we then to a register
indirect near jump; follow that pattern for the 32-bit code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-14-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
056a43a6d3 x86, realmode: Remove indirect jumps in trampoline_64.S
Remove indirect jumps in trampoline_64.S which are no longer
necessary: the realmode code can relocate the absolute jumps
correctly from the start.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-13-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
f7436a9da9 x86, realmode: Align .data section in trampoline_32.S
Specify the alignment of the .data section in trampoline_32.S.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-12-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
0247428611 x86, realmode: Move bits to the proper sections in trampoline_64.S
Move various bits to the sections they really belong in in
trampoline_64.S.  Use GLOBAL() rather than ENTRY() for data objects:
ENTRY() should only be used with code and forces alignment to 16
bytes.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-11-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:48:03 -07:00
H. Peter Anvin
487f50ffeb x86, realmode: Add .text64 section, make barrier symbols absolute
Add a .text64 section.  The purpose of this is to keep 16-, 32- and
64-bit code segregated into separate sections, mainly to keep
disassembly sane.

Move barrier symbols out of sections to avoid the "symbol in empty
section" problem in some versions of GNU ld.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-10-git-send-email-jarkko.sakkinen@intel.com
2012-05-08 11:47:18 -07:00
Jarkko Sakkinen
c9b77ccb52 x86, realmode: Move ACPI wakeup to unified realmode code
Migrated ACPI wakeup code to the real-mode blob.
Code existing in .x86_trampoline  can be completely
removed. Static descriptor table in wakeup_asm.S is
courtesy of H. Peter Anvin.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-7-git-send-email-jarkko.sakkinen@intel.com
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:46:05 -07:00
Jarkko Sakkinen
48927bbb97 x86, realmode: Move SMP trampoline to unified realmode code
Migrated SMP trampoline code to the real mode blob.
SMP trampoline code is not yet removed from
.x86_trampoline because it is needed by the wakeup
code.

[ hpa: always enable compiling startup_32_smp in head_32.S... it is
  only a few instructions which go into .init on UP builds, and it makes
  the rest of the code less #ifdef ugly. ]

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-6-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:51 -07:00
Jarkko Sakkinen
5a8c9aebe0 x86, realmode: Move reboot_32.S to unified realmode code
Migrated reboot_32.S from x86_trampoline to the real-mode
blob.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-5-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:50 -07:00
Jarkko Sakkinen
b3266bd6ff x86, realmode: realmode.bin infrastructure
Create realmode.bin and realmode.relocs files. Piggy
pack them into relocatable object that will be included
into .init.data section of the main kernel image.

The first file includes binary image of the real-mode code.
The latter file includes all relocations. The layout of the
binary image is specified in realmode.lds.S. The makefile
generates pa_ prefixed symbols for each exported global.
These are used in 32-bit code and in realmode header to
define symbols that need to be relocated.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-3-git-send-email-jarkko.sakkinen@intel.com
Originally-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:48 -07:00