Now that VMLINUX_SYMBOL() is no-op, clean up the linker script.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Commit 99492c39f3 ("earlycon: Fix __earlycon_table stride") tried to fix
__earlycon_table stride by forcing the earlycon_id struct alignment to 32
and asking the linker to 32-byte align the __earlycon_table symbol. This
fix was based on commit 07fca0e57f ("tracing: Properly align linker
defined symbols") which tried a similar fix for the tracing subsystem.
However, this fix doesn't quite work because there is no guarantee that
gcc will place structures packed into an array format. In fact, gcc 4.9
chooses to 64-byte align these structs by inserting additional padding
between the entries because it has no clue that they are supposed to be in
an array. If we are unlucky, the linker will assign symbol
"__earlycon_table" to a 32-byte aligned address which does not correspond
to the 64-byte aligned contents of section "__earlycon_table".
To address this same problem, the fix to the tracing system was
subsequently re-implemented using a more robust table of pointers approach
by commits:
3d56e331b6 ("tracing: Replace syscall_meta_data struct array with pointer array")
6549864629 ("tracepoints: Fix section alignment using pointer array")
e4a9ea5ee7 ("tracing: Replace trace_event struct array with pointer array")
Let's use this same "array of pointers to structs" approach for
EARLYCON_TABLE.
Fixes: 99492c39f3 ("earlycon: Fix __earlycon_table stride")
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Suggested-by: Aaron Durbin <adurbin@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nothing particularly stands out here, probably because people were tied
up with spectre/meltdown stuff last time around. Still, the main pieces
are:
- Rework of our CPU features framework so that we can whitelist CPUs that
don't require kpti even in a heterogeneous system
- Support for the IDC/DIC architecture extensions, which allow us to elide
instruction and data cache maintenance when writing out instructions
- Removal of the large memory model which resulted in suboptimal codegen
by the compiler and increased the use of literal pools, which could
potentially be used as ROP gadgets since they are mapped as executable
- Rework of forced signal delivery so that the siginfo_t is well-formed
and handling of show_unhandled_signals is consolidated and made
consistent between different fault types
- More siginfo cleanup based on the initial patches from Eric Biederman
- Workaround for Cortex-A55 erratum #1024718
- Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJaw1TCAAoJELescNyEwWM0gyQIAJVMK4QveBW+LwF96NYdZo16
p90Aa+nqKelh/s93govQArDMv1gxyuXdFlQZVOGPQHfqpz6RhJWmBA2tFsUbQrUc
OBcioPrRihqTmKBe+1r1XORwZxkVX6GGmCn0LYpPR7I3TjxXZpvxqaxGxiUvHkci
yVxWlDTyN/7eL3akhCpCDagN3Fxwk3QnJLqE3fxOFMlY7NvQcmUxcITiUl/s469q
xK6SWH9SRH1JK8jTHPitwUBiU//3FfCqSI9HLEdDIDoTuPcVM8UetWvi4QzrzJL1
UYg8lmU0CXNmflDzZJDaMf+qFApOrGxR0YVPpBzlQvxe0JIY69g48f+JzDPz8nc=
=+gNa
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Nothing particularly stands out here, probably because people were
tied up with spectre/meltdown stuff last time around. Still, the main
pieces are:
- Rework of our CPU features framework so that we can whitelist CPUs
that don't require kpti even in a heterogeneous system
- Support for the IDC/DIC architecture extensions, which allow us to
elide instruction and data cache maintenance when writing out
instructions
- Removal of the large memory model which resulted in suboptimal
codegen by the compiler and increased the use of literal pools,
which could potentially be used as ROP gadgets since they are
mapped as executable
- Rework of forced signal delivery so that the siginfo_t is
well-formed and handling of show_unhandled_signals is consolidated
and made consistent between different fault types
- More siginfo cleanup based on the initial patches from Eric
Biederman
- Workaround for Cortex-A55 erratum #1024718
- Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi
- Misc cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (70 commits)
arm64: uaccess: Fix omissions from usercopy whitelist
arm64: fpsimd: Split cpu field out from struct fpsimd_state
arm64: tlbflush: avoid writing RES0 bits
arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h
arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h
arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG
arm64: lse: Include compiler_types.h and export.h for out-of-line LL/SC
arm64: fpsimd: include <linux/init.h> in fpsimd.h
drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessor
perf: arm_spe: include linux/vmalloc.h for vmap()
Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)"
arm64: cpufeature: Avoid warnings due to unused symbols
arm64: Add work around for Arm Cortex-A55 Erratum 1024718
arm64: Delay enabling hardware DBM feature
arm64: Add MIDR encoding for Arm Cortex-A55 and Cortex-A35
arm64: capabilities: Handle shared entries
arm64: capabilities: Add support for checks based on a list of MIDRs
arm64: Add helpers for checking CPU MIDR against a range
arm64: capabilities: Clean up midr range helpers
arm64: capabilities: Change scope of VHE to Boot CPU feature
...
Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
kernel internal arguments of the tracepoints in their raw form.
>From bpf program point of view the access to the arguments look like:
struct bpf_raw_tracepoint_args {
__u64 args[0];
};
int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
{
// program can read args[N] where N depends on tracepoint
// and statically verified at program load+attach time
}
kprobe+bpf infrastructure allows programs access function arguments.
This feature allows programs access raw tracepoint arguments.
Similar to proposed 'dynamic ftrace events' there are no abi guarantees
to what the tracepoints arguments are and what their meaning is.
The program needs to type cast args properly and use bpf_probe_read()
helper to access struct fields when argument is a pointer.
For every tracepoint __bpf_trace_##call function is prepared.
In assembler it looks like:
(gdb) disassemble __bpf_trace_xdp_exception
Dump of assembler code for function __bpf_trace_xdp_exception:
0xffffffff81132080 <+0>: mov %ecx,%ecx
0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
where
TRACE_EVENT(xdp_exception,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp, u32 act),
The above assembler snippet is casting 32-bit 'act' field into 'u64'
to pass into bpf_trace_run3(), while 'dev' and 'xdp' args are passed as-is.
All of ~500 of __bpf_trace_*() functions are only 5-10 byte long
and in total this approach adds 7k bytes to .text.
This approach gives the lowest possible overhead
while calling trace_xdp_exception() from kernel C code and
transitioning into bpf land.
Since tracepoint+bpf are used at speeds of 1M+ events per second
this is valuable optimization.
The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
that returns anon_inode FD of 'bpf-raw-tracepoint' object.
The user space looks like:
// load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
prog_fd = bpf_prog_load(...);
// receive anon_inode fd for given bpf_raw_tracepoint with prog attached
raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
Ctrl-C of tracing daemon or cmdline tool that uses this feature
will automatically detach bpf program, unload it and
unregister tracepoint probe.
On the kernel side the __bpf_raw_tp_map section of pointers to
tracepoint definition and to __bpf_trace_*() probe function is used
to find a tracepoint with "xdp_exception" name and
corresponding __bpf_trace_xdp_exception() probe function
which are passed to tracepoint_probe_register() to connect probe
with tracepoint.
Addition of bpf_raw_tracepoint doesn't interfere with ftrace and perf
tracepoint mechanisms. perf_event_open() can be used in parallel
on the same tracepoint.
Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) are permitted.
Each with its own bpf program. The kernel will execute
all tracepoint probes and all attached bpf programs.
In the future bpf_raw_tracepoints can be extended with
query/introspection logic.
__bpf_raw_tp_map section logic was contributed by Steven Rostedt
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In commit 316ca8804e ("ACPI/IORT: Remove linker section for IORT entries
probing"), iort entries was removed in vmlinux.lds.h. But in
commit 2fcc112af3 ("clocksource/drivers: Rename clksrc table to timer"),
this line was back incorrectly.
It does no harm except for adding some useless symbols, so fix it.
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
Add injectable error types for each error-injectable function.
One motivation of error injection test is to find software flaws,
mistakes or mis-handlings of expectable errors. If we find such
flaws by the test, that is a program bug, so we need to fix it.
But if the tester miss input the error (e.g. just return success
code without processing anything), it causes unexpected behavior
even if the caller is correctly programmed to handle any errors.
That is not what we want to test by error injection.
To clarify what type of errors the caller must expect for each
injectable function, this introduces injectable error types:
- EI_ETYPE_NULL : means the function will return NULL if it
fails. No ERR_PTR, just a NULL.
- EI_ETYPE_ERRNO : means the function will return -ERRNO
if it fails.
- EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
(ERR_PTR) or NULL.
ALLOW_ERROR_INJECTION() macro is expanded to get one of
NULL, ERRNO, ERRNO_NULL to record the error type for
each function. e.g.
ALLOW_ERROR_INJECTION(open_ctree, ERRNO)
This error types are shown in debugfs as below.
====
/ # cat /sys/kernel/debug/error_injection/list
open_ctree [btrfs] ERRNO
io_ctl_init [btrfs] ERRNO
====
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Since error-injection framework is not limited to be used
by kprobes, nor bpf. Other kernel subsystems can use it
freely for checking safeness of error-injection, e.g.
livepatch, ftrace etc.
So this separate error-injection framework from kprobes.
Some differences has been made:
- "kprobe" word is removed from any APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to
ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
feature. It is automatically enabled if the arch supports
error injection feature for kprobe or ftrace etc.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Construct the init thread stack in the linker script rather than doing it
by means of a union so that ia64's init_task.c can be got rid of.
The following symbols are then made available from INIT_TASK_DATA() linker
script macro:
init_thread_union
init_stack
INIT_TASK_DATA() also expands the region to THREAD_SIZE to accommodate the
size of the init stack. init_thread_union is given its own section so that
it can be placed into the stack space in the right order. I'm assuming
that the ia64 ordering is correct and that the task_struct is first and the
thread_info second.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Using BPF we can override kprob'ed functions and return arbitrary
values. Obviously this can be a bit unsafe, so make this feature opt-in
for functions. Simply tag a function with KPROBE_ERROR_INJECT_SYMBOL in
order to give BPF access to that function for error injection purposes.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
I like _ONCE warnings because it's guaranteed that they don't flood the
log.
During testing I find it useful to reset the state of the once warnings,
so that I can rerun tests and see if they trigger again, or can
guarantee that a test run always hits the same warnings.
This patch adds a debugfs interface to reset all the _ONCE warnings so
that they appear again:
echo 1 > /sys/kernel/debug/clear_warn_once
This is implemented by putting all the warning booleans into a special
section, and clearing it.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20171017221455.6740-1-andi@firstfloor.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 core updates from Ingo Molnar:
"Note that in this cycle most of the x86 topics interacted at a level
that caused them to be merged into tip:x86/asm - but this should be a
temporary phenomenon, hopefully we'll back to the usual patterns in
the next merge window.
The main changes in this cycle were:
Hardware enablement:
- Add support for the Intel UMIP (User Mode Instruction Prevention)
CPU feature. This is a security feature that disables certain
instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)
[ Note that this is disabled by default for now, there are some
smaller enhancements in the pipeline that I'll follow up with in
the next 1-2 days, which allows this to be enabled by default.]
- Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
feature, on top of SME (Secure Memory Encryption) support that was
added in v4.14. (Tom Lendacky, Brijesh Singh)
- Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)
Other changes:
- A big series of entry code simplifications and enhancements (Andy
Lutomirski)
- Make the ORC unwinder default on x86 and various objtool
enhancements. (Josh Poimboeuf)
- 5-level paging enhancements (Kirill A. Shutemov)
- Micro-optimize the entry code a bit (Borislav Petkov)
- Improve the handling of interdependent CPU features in the early
FPU init code (Andi Kleen)
- Build system enhancements (Changbin Du, Masahiro Yamada)
- ... plus misc enhancements, fixes and cleanups"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
x86/build: Make the boot image generation less verbose
selftests/x86: Add tests for the STR and SLDT instructions
selftests/x86: Add tests for User-Mode Instruction Prevention
x86/traps: Fix up general protection faults caused by UMIP
x86/umip: Enable User-Mode Instruction Prevention at runtime
x86/umip: Force a page fault when unable to copy emulated result to user
x86/umip: Add emulation code for UMIP instructions
x86/cpufeature: Add User-Mode Instruction Prevention definitions
x86/insn-eval: Add support to resolve 16-bit address encodings
x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
x86/insn-eval: Add support to resolve 32-bit address encodings
x86/insn-eval: Compute linear address in several utility functions
resource: Fix resource_size.cocci warnings
X86/KVM: Clear encryption attribute when SEV is active
X86/KVM: Decrypt shared per-cpu variables when SEV is active
percpu: Introduce DEFINE_PER_CPU_DECRYPTED
x86: Add support for changing memory encryption attribute in early boot
x86/io: Unroll string I/O when SEV is active
x86/boot: Add early boot support when running with SEV active
...
KVM guest defines three per-CPU variables (steal-time, apf_reason, and
kvm_pic_eoi) which are shared between a guest and a hypervisor.
When SEV is active, memory is encrypted with a guest-specific key, and if
the guest OS wants to share the memory region with the hypervisor then it
must clear the C-bit (i.e set decrypted) before sharing it.
DEFINE_PER_CPU_DECRYPTED can be used to define the per-CPU variables
which will be shared between a guest and a hypervisor.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: linux-arch@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Lameter <cl@linux.com>
Link: https://lkml.kernel.org/r/20171020143059.3291-16-brijesh.singh@amd.com
Rename the unwinder config options from:
CONFIG_ORC_UNWINDER
CONFIG_FRAME_POINTER_UNWINDER
CONFIG_GUESS_UNWINDER
to:
CONFIG_UNWINDER_ORC
CONFIG_UNWINDER_FRAME_POINTER
CONFIG_UNWINDER_GUESS
... in order to give them a more logical config namespace.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Using .text.unlikely for refcount exceptions isn't safe because gcc may
move entire functions into .text.unlikely (e.g. in6_dev_dev()), which
would cause any uses of a protected refcount_t function to stay inline
with the function, triggering the protection unconditionally:
.section .text.unlikely,"ax",@progbits
.type in6_dev_get, @function
in6_dev_getx:
.LFB4673:
.loc 2 4128 0
.cfi_startproc
...
lock; incl 480(%rbx)
js 111f
.pushsection .text.unlikely
111: lea 480(%rbx), %rcx
112: .byte 0x0f, 0xff
.popsection
113:
This creates a unique .text..refcount section and adds an additional
test to the exception handler to WARN in the case of having none of OF,
SF, nor ZF set so we can see things like this more easily in the future.
The double dot for the section name keeps it out of the TEXT_MAIN macro
namespace, to avoid collisions and so it can be put at the end with
text.unlikely to keep the cold code together.
See commit:
cb87481ee8 ("kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured")
... which matches C names: [a-zA-Z0-9_] but not ".".
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch <linux-arch@vger.kernel.org>
Fixes: 7a46ec0e2f ("locking/refcounts, x86/asm: Implement fast refcount overflow protection")
Link: http://lkml.kernel.org/r/1504382986-49301-2-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
General updates:
* Constify pci_device_id in various drivers
* Constify device_type
* Remove pad control code from the Gemini driver
* Use %pOF to print OF node full_name
* Various fixes in the physmap_of driver
* Remove unused vars in mtdswap
* Check devm_kzalloc() return value in the spear_smi driver
* Check clk_prepare_enable() return code in the st_spi_fsm driver
* Create per MTD device debugfs enties
NAND updates, from Boris Brezillon:
* Fix memory leaks in the core
* Remove unused NAND locking support
* Rename nand.h into rawnand.h (preparing support for spi NANDs)
* Use NAND_MAX_ID_LEN where appropriate
* Fix support for 20nm Hynix chips
* Fix support for Samsung and Hynix SLC NANDs
* Various cleanup, improvements and fixes in the qcom driver
* Fixes for bugs detected by various static code analysis tools
* Fix mxc ooblayout definition
* Add a new part_parsers to tmio and sharpsl platform data in order to
define a custom list of partition parsers
* Request the reset line in exclusive mode in the sunxi driver
* Fix a build error in the orion-nand driver when compiled for ARMv4
* Allow 64-bit mvebu platforms to select the PXA3XX driver
SPI NOR updates, from Cyrille Pitchen and Marek Vasut:
* add support to the JEDEC JESD216B specification (SFDP tables).
* add support to the Intel Denverton SPI flash controller.
* fix error recovery for Spansion/Cypress SPI NOR memories.
* fix 4-byte address management for the Aspeed SPI controller.
* add support to some Microchip SST26 memory parts
* remove unneeded pinctrl header Write a message for tag:
-----BEGIN PGP SIGNATURE-----
iQJABAABCAAqBQJZrav6Ixxib3Jpcy5icmV6aWxsb25AZnJlZS1lbGVjdHJvbnMu
Y29tAAoJEGXtNgF+CLcABwkP/joDrq09RIC9n5gP+ubJe6O1jKvNWDd6bIVXD3Ke
73R0a0ANwwWlNYWTChTdrb8UeewVS1bzutyy5O2Sbdb6Jc6s7xkfQDTsbET2HWOK
S7Lt/zjlC6/6cow59B6h43PGS6wmIFaZD3K+70sGhvFnV8epVUzS2Aa783xS8LXm
so2djZOdUYnW+yE0eho24VQR6nS4YP4Vc+7Mm9skjU0ifjB9mJiWRkzoQnqIgORO
M+Iab+qjDs9KR/edWh6mZtnvjps0VSW4I40YsClpcgIn550w1DSXe4u6/8Nk+2Bp
gfrALls91gob0ocxmEdIyLID+M0410HcN/Lvh36nw+tkkGTaXf0D6mkqzdKNrZ3w
yz+UV9uf19kr1c6zFGcCvUlD0btn9KT+F2legnhgURtwUyDFQcaYQlkpDIeEzUMV
ZrtzKbSE2v9810YKXjtCnseewdP+Eph/ewN6ODX5yg/fs8K0fyQYTRtYYM50U69X
md8zznBBDPhJVu5T2Of7my9V1SxvCP8a7LrKjAXuFHpZ/CHiPe+QOWBgG2L+zXXT
e10/rTg7T2pcyKpBvL/3/mCYeJ+Iup3lKT1EHGCXcKnLGecVgOsbvdG+JnvQMI2J
FLmu1exvrzi0Gcrs/05hqwyUvkHZ5FB1a+heNOtmQ+h1U0ElXqILyu7brzghupRe
3phO
=UgCd
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20170904' of git://git.infradead.org/linux-mtd
Pull MTD updates from Boris Brezillon:
"General updates:
- Constify pci_device_id in various drivers
- Constify device_type
- Remove pad control code from the Gemini driver
- Use %pOF to print OF node full_name
- Various fixes in the physmap_of driver
- Remove unused vars in mtdswap
- Check devm_kzalloc() return value in the spear_smi driver
- Check clk_prepare_enable() return code in the st_spi_fsm driver
- Create per MTD device debugfs enties
NAND updates, from Boris Brezillon:
- Fix memory leaks in the core
- Remove unused NAND locking support
- Rename nand.h into rawnand.h (preparing support for spi NANDs)
- Use NAND_MAX_ID_LEN where appropriate
- Fix support for 20nm Hynix chips
- Fix support for Samsung and Hynix SLC NANDs
- Various cleanup, improvements and fixes in the qcom driver
- Fixes for bugs detected by various static code analysis tools
- Fix mxc ooblayout definition
- Add a new part_parsers to tmio and sharpsl platform data in order
to define a custom list of partition parsers
- Request the reset line in exclusive mode in the sunxi driver
- Fix a build error in the orion-nand driver when compiled for ARMv4
- Allow 64-bit mvebu platforms to select the PXA3XX driver
SPI NOR updates, from Cyrille Pitchen and Marek Vasut:
- add support to the JEDEC JESD216B specification (SFDP tables).
- add support to the Intel Denverton SPI flash controller.
- fix error recovery for Spansion/Cypress SPI NOR memories.
- fix 4-byte address management for the Aspeed SPI controller.
- add support to some Microchip SST26 memory parts
- remove unneeded pinctrl header Write a message for tag:"
* tag 'for-linus-20170904' of git://git.infradead.org/linux-mtd: (74 commits)
mtd: nand: complain loudly when chip->bits_per_cell is not correctly initialized
mtd: nand: make Samsung SLC NAND usable again
mtd: nand: tmio: Register partitions using the parsers
mfd: tmio: Add partition parsers platform data
mtd: nand: sharpsl: Register partitions using the parsers
mtd: nand: sharpsl: Add partition parsers platform data
mtd: nand: qcom: Support for IPQ8074 QPIC NAND controller
mtd: nand: qcom: support for IPQ4019 QPIC NAND controller
dt-bindings: qcom_nandc: IPQ8074 QPIC NAND documentation
dt-bindings: qcom_nandc: IPQ4019 QPIC NAND documentation
dt-bindings: qcom_nandc: fix the ipq806x device tree example
mtd: nand: qcom: support for different DEV_CMD register offsets
mtd: nand: qcom: QPIC data descriptors handling
mtd: nand: qcom: enable BAM or ADM mode
mtd: nand: qcom: erased codeword detection configuration
mtd: nand: qcom: support for read location registers
mtd: nand: qcom: support for passing flags in DMA helper functions
mtd: nand: qcom: add BAM DMA descriptor handling
mtd: nand: qcom: allocate BAM transaction
mtd: nand: qcom: DMA mapping support for register read buffer
...
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZo2HiAAoJEHm+PkMAQRiG3OcIAJqSeVK2uQ/QhmqFN1ExYay4
bdTjSTtSk7GH6PxI2C0cqfZvsxOUU7ICDHG8bYM1LA0S0SxfOtoFhHGKc/BcFLX8
MiKJWlF51ZbX0mkIEpKF+C8pRrXPgSqtk3N450/k2BzG9qCZSM93A2NCOB7v9T9w
XOBUIYHqfTS2tdmCinjwu8Ls+w8oPOGH1gLjxZyGnBlg4lTqHMcUufmHeVEAh11d
giGByqqqXH69kGD1HNC7H6quzXN9rz4n0gEwEG0mIhfkJ98b+ESSWwSEXXypOAQD
QT5/6+2YizXf5DPCqR46xasQCPjRsS6Sv0cF2cntW2PEAb4jBjhx5gTFlJcoOC8=
=efWJ
-----END PGP SIGNATURE-----
Merge tag 'v4.13-rc7' into mtd/next
Merge v4.13-rc7 back to resolve merge conflicts in
drivers/mtd/nand/nandsim.c and include/asm-generic/vmlinux.lds.h.
When XIP_KERNEL is enabled, some functions are defined in the .data
ELF section because we require them to be in RAM whenever we communicate
with the flash chip. However this causes problems when FTRACE is
enabled and gcc emits calls to __gnu_mcount_nc in the function
prolog:
drivers/built-in.o: In function `cfi_chip_setup':
:(.data+0x272fc): relocation truncated to fit: R_ARM_CALL against symbol `__gnu_mcount_nc' defined in .text section in arch/arm/kernel/built-in.o
drivers/built-in.o: In function `cfi_probe_chip':
:(.data+0x27de8): relocation truncated to fit: R_ARM_CALL against symbol `__gnu_mcount_nc' defined in .text section in arch/arm/kernel/built-in.o
/tmp/ccY172rP.s: Assembler messages:
/tmp/ccY172rP.s:70: Warning: ignoring changed section attributes for .data
/tmp/ccY172rP.s: Error: 1 warning, treating warnings as errors
make[5]: *** [drivers/mtd/chips/cfi_probe.o] Error 1
/tmp/ccK4rjeO.s: Assembler messages:
/tmp/ccK4rjeO.s:421: Warning: ignoring changed section attributes for .data
/tmp/ccK4rjeO.s: Error: 1 warning, treating warnings as errors
make[5]: *** [drivers/mtd/chips/cfi_util.o] Error 1
/tmp/ccUvhCYR.s: Assembler messages:
/tmp/ccUvhCYR.s:1895: Warning: ignoring changed section attributes for .data
/tmp/ccUvhCYR.s: Error: 1 warning, treating warnings as errors
Specifically, this does not work because the .data section is not
marked executable, which leads LD to not generate trampolines for
long calls.
This moves the __xipram functions into their own .xiptext section instead.
The section is still placed next to .data and located in RAM but is marked
executable, which avoids the build errors.
Also, we only need to place the XIP functions into a separate section
if both CONFIG_XIP_KERNEL and CONFIG_MTD_XIP are set: When only MTD_XIP
is used, the whole kernel is still in RAM and we do not need to worry
about pulling out the rug under it. When only XIP_KERNEL but not MTD_XIP
is set, the kernel is in some form of ROM, but we never write to it.
Note that MTD_XIP has been broken on ARM since around 2011 or 2012. I
have sent another patch[2] to fix compilation, which I plan to merge
through arm-soc unless there are objections. The obvious alternative
to that would be to completely rip out the MTD_XIP support from the
kernel, since obviously nobody has been using it in a long while.
Link: [1] https://patchwork.kernel.org/patch/8109771/
Link: [2] https://patchwork.kernel.org/patch/9855225/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Generate irqentry and softirqentry text sections without
any Kconfig dependencies. This will add extra sections, but
there should be no performace impact.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S . Miller <davem@davemloft.net>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-arch@vger.kernel.org
Cc: linux-cris-kernel@axis.com
Cc: mathieu.desnoyers@efficios.com
Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The .data and .bss sections were modified in the generic linker script to
pull in sections named .data.<C identifier>, which are generated by gcc with
-ffunction-sections and -fdata-sections options.
The problem with this pattern is it can also match section names that Linux
defines explicitly, e.g., .data.unlikely. This can cause Linux sections to
get moved into the wrong place.
The way to avoid this is to use ".." separators for explicit section names
(the dot character is valid in a section name but not a C identifier).
However currently there are sections which don't follow this rule, so for
now just disable the wild card by default.
Example: http://marc.info/?l=linux-arm-kernel&m=150106824024221&w=2
Cc: <stable@vger.kernel.org> # 4.9
Fixes: b67067f117 ("kbuild: allow archs to select link dead code/data elimination")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y.
It plugs into the existing x86 unwinder framework.
It relies on objtool to generate the needed .orc_unwind and
.orc_unwind_ip sections.
For more details on why ORC is used instead of DWARF, see
Documentation/x86/orc-unwinder.txt - but the short version is
that it's a simplified, fundamentally more robust debugninfo
data structure, which also allows up to two orders of magnitude
faster lookups than the DWARF unwinder - which matters to
profiling workloads like perf.
Thanks to Andy Lutomirski for the performance improvement ideas:
splitting the ORC unwind table into two parallel arrays and creating a
fast lookup table to search a subset of the unwind table.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com
[ Extended the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Highlights include:
- Support for STRICT_KERNEL_RWX on 64-bit server CPUs.
- Platform support for FSP2 (476fpe) board
- Enable ZONE_DEVICE on 64-bit server CPUs.
- Generic & powerpc spin loop primitives to optimise busy waiting
- Convert VDSO update function to use new update_vsyscall() interface
- Optimisations to hypercall/syscall/context-switch paths
- Improvements to the CPU idle code on Power8 and Power9.
As well as many other fixes and improvements.
Thanks to:
Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman Khandual, Anton
Blanchard, Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe
Lombard, Colin Ian King, Dan Carpenter, Gautham R. Shenoy, Hari Bathini, Ian
Munsie, Ivan Mikhaylov, Javier Martinez Canillas, Madhavan Srinivasan,
Masahiro Yamada, Matt Brown, Michael Neuling, Michal Suchanek, Murilo
Opsfelder Araujo, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul
Mackerras, Pavel Machek, Russell Currey, Santosh Sivaraj, Stephen Rothwell,
Thiago Jung Bauermann, Yang Li.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZXyPCAAoJEFHr6jzI4aWAI9QQAISf2x5y//cqCi4ISyQB5pTq
KLS/yQajNkQOw7c0fzBZOaH5Xd/SJ6AcKWDg8yDlpDR3+sRRsr98iIRECgKS5I7/
DxD9ywcbSoMXFQQo1ZMCp5CeuMUIJRtugBnUQM+JXCSUCPbznCY5DchDTLyTBTpO
MeMVhI//JxthhoOMA9MudiEGaYCU9ho442Z4OJUSiLUv8WRbvQX9pTqoc4vx1fxA
BWf2mflztBVcIfKIyxIIIlDLukkMzix6gSYPMCbC7lzkbnU7JSqKiheJXjo1gJS2
ePHKDxeNR2/QP0g/j3aT/MR1uTt9MaNBSX3gANE1xQ9OoJ8m1sOtCO4gNbSdLWae
eXhDnoiEp930DRZOeEioOItuWWoxFaMyYk3BMmRKV4QNdYL3y3TRVeL2/XmRGqYL
Lxz4IY/x5TteFEJNGcRX90uizNSi8AaEXPF16pUk8Ctt6eH3ZSwPMv2fHeYVCMr0
KFlKHyaPEKEoztyzLcUR6u9QB56yxDN58bvLpd32AeHvKhqyxFoySy59x9bZbatn
B2y8mmDItg860e0tIG6jrtplpOVvL8i5jla5RWFVoQDuxxrLAds3vG9JZQs+eRzx
Fiic93bqeUAS6RzhXbJ6QQJYIyhE7yqpcgv7ME1W87SPef3HPBk9xlp3yIDwdA2z
bBDvrRnvupusz8qCWrxe
=w8rj
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Support for STRICT_KERNEL_RWX on 64-bit server CPUs.
- Platform support for FSP2 (476fpe) board
- Enable ZONE_DEVICE on 64-bit server CPUs.
- Generic & powerpc spin loop primitives to optimise busy waiting
- Convert VDSO update function to use new update_vsyscall() interface
- Optimisations to hypercall/syscall/context-switch paths
- Improvements to the CPU idle code on Power8 and Power9.
As well as many other fixes and improvements.
Thanks to: Akshay Adiga, Andrew Donnellan, Andrew Jeffery, Anshuman
Khandual, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt,
Christophe Leroy, Christophe Lombard, Colin Ian King, Dan Carpenter,
Gautham R. Shenoy, Hari Bathini, Ian Munsie, Ivan Mikhaylov, Javier
Martinez Canillas, Madhavan Srinivasan, Masahiro Yamada, Matt Brown,
Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Naveen N.
Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pavel Machek,
Russell Currey, Santosh Sivaraj, Stephen Rothwell, Thiago Jung
Bauermann, Yang Li"
* tag 'powerpc-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits)
powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs
powerpc/mm/radix: Implement STRICT_RWX/mark_rodata_ro() for Radix
powerpc/mm/hash: Implement mark_rodata_ro() for hash
powerpc/vmlinux.lds: Align __init_begin to 16M
powerpc/lib/code-patching: Use alternate map for patch_instruction()
powerpc/xmon: Add patch_instruction() support for xmon
powerpc/kprobes/optprobes: Use patch_instruction()
powerpc/kprobes: Move kprobes over to patch_instruction()
powerpc/mm/radix: Fix execute permissions for interrupt_vectors
powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()
powerpc/64s: Blacklist rtas entry/exit from kprobes
powerpc/64s: Blacklist functions invoked on a trap
powerpc/64s: Un-blacklist system_call() from kprobes
powerpc/64s: Move system_call() symbol to just after setting MSR_EE
powerpc/64s: Blacklist system_call() and system_call_common() from kprobes
powerpc/64s: Convert .L__replay_interrupt_return to a local label
powerpc64/elfv1: Only dereference function descriptor for non-text symbols
cxl: Export library to support IBM XSL
powerpc/dts: Use #include "..." to include local DT
powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8
...
- Added TRACE_DEFINE_SIZEOF() which allows trace events that use
sizeof() it the TP_printk() to be converted to the actual size such
that trace-cmd and perf can parse them correctly.
- Some rework of the TRACE_DEFINE_ENUM() such that the above
TRACE_DEFINE_SIZEOF() could reuse the same code.
- Recording of tgid (Thread Group ID). This is similar to how
task COMMs are recorded (cached at sched_switch), where it is
in a table and used on output of the trace and trace_pipe files.
- Have ":mod:<module>" be cached when written into set_ftrace_filter.
Then the functions of the module will be traced at module load.
- Some random clean ups and small fixes.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJZXjYuFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
fsgIAKUvhpn2igoYCR9tWqu+DovEmwxCIumbCzmCFQcRKlLttRte94yY5+W9hnV0
JPzd9T9zBDVqq1fI7iIop1SuTwEfKW6lJom0usZ8AFpK+YKm6FHnQ28POlvHzre2
lzO41tpRWiehLQsITZ47eByhsvEfhx86mYT/oM1JSR6Pii1OpjyNYmDMw6BaMNBT
kSCQFgIhzAhVuHjwAnB/S++E/ou7M5bCwCb5CNh7MubKubV5upHpoJcgYGO+WWa6
56H/iEhff4EECTGJVefd8e78MtJPL8EsuM0nAcMPlnl8AaiOpP7XCdlgTwdefLvP
b3o+nP15voSHkARGXC6eM6gH0po=
=rvGB
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"The new features of this release:
- Added TRACE_DEFINE_SIZEOF() which allows trace events that use
sizeof() it the TP_printk() to be converted to the actual size such
that trace-cmd and perf can parse them correctly.
- Some rework of the TRACE_DEFINE_ENUM() such that the above
TRACE_DEFINE_SIZEOF() could reuse the same code.
- Recording of tgid (Thread Group ID). This is similar to how task
COMMs are recorded (cached at sched_switch), where it is in a table
and used on output of the trace and trace_pipe files.
- Have ":mod:<module>" be cached when written into set_ftrace_filter.
Then the functions of the module will be traced at module load.
- Some random clean ups and small fixes"
* tag 'trace-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (26 commits)
ftrace: Test for NULL iter->tr in regex for stack_trace_filter changes
ftrace: Decrement count for dyn_ftrace_total_info for init functions
ftrace: Unlock hash mutex on failed allocation in process_mod_list()
tracing: Add support for display of tgid in trace output
tracing: Add support for recording tgid of tasks
ftrace: Decrement count for dyn_ftrace_total_info file
ftrace: Remove unused function ftrace_arch_read_dyn_info()
sh/ftrace: Remove only user of ftrace_arch_read_dyn_info()
ftrace: Have cached module filters be an active filter
ftrace: Implement cached modules tracing on module load
ftrace: Have the cached module list show in set_ftrace_filter
ftrace: Add :mod: caching infrastructure to trace_array
tracing: Show address when function names are not found
ftrace: Add missing comment for FTRACE_OPS_FL_RCU
tracing: Rename update the enum_map file
tracing: Add TRACE_DEFINE_SIZEOF() macros
tracing: define TRACE_DEFINE_SIZEOF() macro to map sizeof's to their values
tracing: Rename enum_replace to eval_replace
trace: rename enum_map functions
trace: rename trace.c enum functions
...
The config option name is now renamed to 'TIMER_OF' for consistency with
the CLOCKSOURCE_OF_DECLARE => TIMER_OF_DECLARE change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
The table name is now renamed to 'timer' for consistency with
the CLOCKSOURCE_OF_DECLARE => TIMER_OF_DECLARE change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
The kernel and its modules have sections containing the enum
string to value conversions. Rename this section because we
intend to store more than enums in it.
Link: http://lkml.kernel.org/r/20170531215653.3240-2-jeremy.linton@arm.com
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
After discussing it, this feature is dropped as it is not considered
adequate:
https://patchwork.kernel.org/patch/9639317/
There is no user of this macro yet, so there is no impact on the drivers.
This reverts commit 376bc27150.
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add --orphan-handling=warn to final link flags. This ensures we can
handle all sections explicitly. This would have caught subtle breakage
such as 7de3b27bac at build-time.
Also bring existing orphan sections into the fold:
- .text.hot and .text.unlikely are compiler generated sections.
- .sdata2, .dynsbss, .plt are used by PPC32
- We previously did not specify DWARF_DEBUG or STABS_DEBUG
- DWARF_DEBUG did not include all DWARF sections that can be emitted
- A number of sections are unused and can be discarded.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This includes:
* Some code optimizations for the Intel VT-d driver
* Code to switch off a previously enabled Intel IOMMU
* Support for 'struct iommu_device' for OMAP, Rockchip and
Mediatek IOMMUs
* Some header optimizations for IOMMU core code headers and a
few fixes that became necessary in other parts of the kernel
because of that
* ACPI/IORT updates and fixes
* Some Exynos IOMMU optimizations
* Code updates for the IOMMU dma-api code to bring it closer to
use per-cpu iova caches
* New command-line option to set default domain type allocated
by the iommu core code
* Another command line option to allow the Intel IOMMU switched
off in a tboot environment
* ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using
an IDENTITY domain in conjunction with DMA ops, Support for
SMR masking, Support for 16-bit ASIDs (was previously broken)
* Various other small fixes and improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZEY4XAAoJECvwRC2XARrjth0QAKV56zjnFclv39aDo6eCq9CT
51+XT4bPY5VKQ2+Jx76TBNObHmGK+8KEMHfT9khpWJtFCDyy25SGckLry1nYqmZs
tSTsbj4sOeCyKzOLITlRN9/OzKXkjKAxYuq+sQZZFDFYf3kCM/eag0dGAU6aVLNp
tkIal3CSpGjCQ9M5JohrtQ1mwiGqCIkMIgvnBjRw+bfpLnQNG+VL6VU2G3RAkV2b
5Vbdoy+P7ZQnJSZr/bibYL2BaQs2diR4gOppT5YbsfniMq4QYSjheu1xBboGX8b7
sx8yuPi4370irSan0BDvlvdQdjBKIRiDjfGEKDhRwPhtvN6JREGakhEOC8MySQ37
mP96B72Lmd+a7DEl5udOL7tQILA0DcUCX0aOyF714khnZuFU5tVlCotb/36xeJ+T
FPc3RbEVQ90m8dYU6MNJ+ahtb/ZapxGTRfisIigB6wlnZa0Evabp9EJSce6oJMkm
whbBhDubeEU18n9XAaofMbu+P2LAzq8cxiRMlsDvT4mIy7jO86jjCmhpu1Tfn2GY
4wrEQZdWOMvhUsIhObXA0aC3BzC506uvnKPW3qy041RaxBuelWiBi29qzYbhxzkr
DLDpWbUZNYPyFJjttpavyQb2/XRduBTJdVP1pQpkJNDsW5jLiBkpSqm9xNADapRY
vLSYRX0JCIquaD+PAuxn
=3aE8
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- code optimizations for the Intel VT-d driver
- ability to switch off a previously enabled Intel IOMMU
- support for 'struct iommu_device' for OMAP, Rockchip and Mediatek
IOMMUs
- header optimizations for IOMMU core code headers and a few fixes that
became necessary in other parts of the kernel because of that
- ACPI/IORT updates and fixes
- Exynos IOMMU optimizations
- updates for the IOMMU dma-api code to bring it closer to use per-cpu
iova caches
- new command-line option to set default domain type allocated by the
iommu core code
- another command line option to allow the Intel IOMMU switched off in
a tboot environment
- ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an
IDENTITY domain in conjunction with DMA ops, Support for SMR masking,
Support for 16-bit ASIDs (was previously broken)
- various other small fixes and improvements
* tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits)
soc/qbman: Move dma-mapping.h include to qman_priv.h
soc/qbman: Fix implicit header dependency now causing build fails
iommu: Remove trace-events include from iommu.h
iommu: Remove pci.h include from trace/events/iommu.h
arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
ACPI/IORT: Fix CONFIG_IOMMU_API dependency
iommu/vt-d: Don't print the failure message when booting non-kdump kernel
iommu: Move report_iommu_fault() to iommu.c
iommu: Include device.h in iommu.h
x86, iommu/vt-d: Add an option to disable Intel IOMMU force on
iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed
iommu/arm-smmu: Correct sid to mask
iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code
omap3isp: Remove iommu_group related code
iommu/omap: Add iommu-group support
iommu/omap: Make use of 'struct iommu_device'
iommu/omap: Store iommu_dev pointer in arch_data
iommu/omap: Move data structures to omap-iommu.h
iommu/omap: Drop legacy-style device support
...
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- unwinder fixes and enhancements
- improve ftrace interaction with the unwinder
- optimize the code footprint of WARN() and related debugging
constructs
- ... plus misc updates, cleanups and fixes"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/unwind: Dump all stacks in unwind_dump()
x86/unwind: Silence more entry-code related warnings
x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder
x86/unwind: Remove unused 'sp' parameter in unwind_dump()
x86/unwind: Prepend hex mask value with '0x' in unwind_dump()
x86/unwind: Properly zero-pad 32-bit values in unwind_dump()
x86/unwind: Ensure stack pointer is aligned
debug: Avoid setting BUGFLAG_WARNING twice
x86/unwind: Silence entry-related warnings
x86/unwind: Read stack return address in update_stack_state()
x86/unwind: Move common code into update_stack_state()
debug: Fix __bug_table[] in arch linker scripts
debug: Add _ONCE() logic to report_bug()
x86/debug: Define BUG() again for !CONFIG_BUG
x86/debug: Implement __WARN() using UD0
x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o
x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set
x86/ftrace: Clean up ftrace_regs_caller
x86/ftrace: Add stack frame pointer to ftrace_caller
x86/ftrace: Move the ftrace specific code out of entry_32.S
...
The IORT linker section introduced by commit 34ceea275f
("ACPI/IORT: Introduce linker section for IORT entries probing")
was needed to make sure SMMU drivers are registered (and therefore
probed) in the kernel before devices using the SMMU have a chance
to probe in turn.
Through the introduction of deferred IOMMU configuration the linker
section based IORT probing infrastructure is not needed any longer, in
that device/SMMU probe dependencies are managed through the probe
deferral mechanism, making the IORT linker section infrastructure
unused, so that it can be removed.
Remove the unused IORT linker section probing infrastructure
from the kernel to complete the ACPI IORT IOMMU configure probe
deferral mechanism implementation.
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When __{start,end}_ro_after_init is referenced from C code, we run into
the following build errors on blackfin:
kernel/extable.c:169: undefined reference to `__start_ro_after_init'
kernel/extable.c:169: undefined reference to `__end_ro_after_init'
The build error is due to the fact that blackfin is one of the few
arches that prepends an underscore '_' to all symbols defined in C.
Fix this by wrapping __{start,end}_ro_after_init in vmlinux.lds.h with
VMLINUX_SYMBOL(), which adds the necessary prefix for arches that have
HAVE_UNDERSCORE_SYMBOL_PREFIX.
Link: http://lkml.kernel.org/r/1491259387-15869-1-git-send-email-jeyu@redhat.com
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Eddie Kovsky <ewk@edkovsky.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull timer fixes from Thomas Gleixner:
"Two small fixes for the new CLKEVT_OF infrastructure"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
vmlinux.lds: Add __clkevt_of_table to kernel
clockevents: Fix syntax error in clkevt-of macro
A section name for .data..ro_after_init was added by both:
commit d07a980c1b ("s390: add proper __ro_after_init support")
and
commit d7c19b066d ("mm: kmemleak: scan .data.ro_after_init")
The latter adds incorrect wrapping around the existing s390 section, and
came later. I'd prefer the s390 naming, so this moves the s390-specific
name up to the asm-generic/sections.h and renames the section as used by
kmemleak (and in the future, kernel/extable.c).
Link: http://lkml.kernel.org/r/20170327192213.GA129375@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390 parts]
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Eddie Kovsky <ewk@edkovsky.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh suggested moving the _ONCE logic inside the trap handler, using a
bit in the bug_entry::flags field, avoiding the need for the extra
variable.
Sadly this only works for WARN_ON_ONCE(), since the others have
printk() statements prior to triggering the trap.
Still, this saves a fair amount of text and some data:
text data filename
10682460 4530992 defconfig-build/vmlinux.orig
10665111 4530096 defconfig-build/vmlinux.patched
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The code introduced by commit 0c8893c9095d ("clockevents: Add a
clkevt-of mechanism like clksrc-of") refer to __clkevt_of_table
what doesn't exist in the vmlinux. As a result kernel build
failed with error: "clkevt-probe.c:63: undefined reference to
`__clkevt_of_table’"
Fixes: 0c8893c9095d ("clockevents: Add a clkevt-of mechanism like clksrc-of")
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Pull kbuild updates from Michal Marek:
- prototypes for x86 asm-exported symbols (Adam Borowski) and a warning
about missing CRCs (Nick Piggin)
- asm-exports fix for LTO (Nicolas Pitre)
- thin archives improvements (Nick Piggin)
- linker script fix for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION (Nick
Piggin)
- genksyms support for __builtin_va_list keyword
- misc minor fixes
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
x86/kbuild: enable modversions for symbols exported from asm
kbuild: fix scripts/adjust_autoksyms.sh* for the no modules case
scripts/kallsyms: remove last remnants of --page-offset option
make use of make variable CURDIR instead of calling pwd
kbuild: cmd_export_list: tighten the sed script
kbuild: minor improvement for thin archives build
kbuild: modpost warn if export version crc is missing
kbuild: keep data tables through dead code elimination
kbuild: improve linker compatibility with lib-ksyms.o build
genksyms: Regenerate parser
kbuild/genksyms: handle va_list type
kbuild: thin archives for multi-y targets
kbuild: kallsyms allow 3-pass generation if symbols size has changed
Since commit e647b53227 ("ACPI: Add early device probing
infrastructure") the kernel has gained the infrastructure that allows
adding linker script section entries to execute ACPI driver callbacks
(ie probe routines) for all subsystems that register a table entry
in the respective kernel section (eg clocksource, irqchip).
Since ARM IOMMU devices data is described through IORT tables when
booting with ACPI, the ARM IOMMU drivers must be made able to hook ACPI
callback routines that are called to probe IORT entries and initialize
the respective IOMMU devices.
To avoid adding driver specific hooks into IORT table initialization
code (breaking therefore code modularity - ie ACPI IORT code must be made
aware of ARM SMMU drivers ACPI init callbacks), this patch adds code
that allows ARM SMMU drivers to take advantage of the ACPI early probing
infrastructure, so that they can add linker script section entries
containing drivers callback to be executed on IORT tables detection.
Since IORT nodes are differentiated by a type, the callback routines
can easily parse the IORT table entries, check the IORT nodes and
carry out some actions whenever the IORT node type associated with
the driver specific callback is matched.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Tomasz Nowicki <tn@semihalf.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Tomasz Nowicki <tn@semihalf.com>
Cc: Tomasz Nowicki <tn@semihalf.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is enabled we must ensure
that we still keep various programatically-accessed tables.
[npiggin: Fold Paul's patches into one, and add a few more tables.
diff symbol tables of allyesconfig with/without -gc-sections shows up
lost tables quite easily.]
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning. To achieve this we need to add
symbols for start and end of the section to the linker scripts.
The problem was been uncovered by commit 56989f6d85 ("genetlink: mark
families as __ro_after_init").
Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kbuild updates from Michal Marek:
- EXPORT_SYMBOL for asm source by Al Viro.
This does bring a regression, because genksyms no longer generates
checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
working on a patch to fix this.
Plus, we are talking about functions like strcpy(), which rarely
change prototypes.
- Fixes for PPC fallout of the above by Stephen Rothwell and Nick
Piggin
- fixdep speedup by Alexey Dobriyan.
- preparatory work by Nick Piggin to allow architectures to build with
-ffunction-sections, -fdata-sections and --gc-sections
- CONFIG_THIN_ARCHIVES support by Stephen Rothwell
- fix for filenames with colons in the initramfs source by me.
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
initramfs: Escape colons in depfile
ppc: there is no clear_pages to export
powerpc/64: whitelist unresolved modversions CRCs
kbuild: -ffunction-sections fix for archs with conflicting sections
kbuild: add arch specific post-link Makefile
kbuild: allow archs to select link dead code/data elimination
kbuild: allow architectures to use thin archives instead of ld -r
kbuild: Regenerate genksyms lexer
kbuild: genksyms fix for typeof handling
fixdep: faster CONFIG_ search
ia64: move exports to definitions
sparc32: debride memcpy.S a bit
[sparc] unify 32bit and 64bit string.h
sparc: move exports to definitions
ppc: move exports to definitions
arm: move exports to definitions
s390: move exports to definitions
m68k: move exports to definitions
alpha: move exports to actual definitions
x86: move exports to actual definitions
...
When doing an nmi backtrace of many cores, most of which are idle, the
output is a little overwhelming and very uninformative. Suppress
messages for cpus that are idling when they are interrupted and just
emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".
We do this by grouping all the cpuidle code together into a new
.cpuidle.text section, and then checking the address of the interrupted
PC to see if it lies within that section.
This commit suitably tags x86 and tile idle routines, and only adds in
the minimal framework for other architectures.
Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Tested-by: Petr Mladek <pmladek@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enabling -ffunction-sections modified the generic linker script to
pull .text.* sections into regular TEXT_TEXT section, conflicting
with some architectures. Revert that change and require archs that
enable the option to ensure they have no conflicting section names,
and do the appropriate merging.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: b67067f117 ("kbuild: allow archs to select link dead code/data elimination")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to
select to build with -ffunction-sections, -fdata-sections, and link
with --gc-sections. It requires some work (documented) to ensure all
unreferenced entrypoints are live, and requires toolchain and build
verification, so it is made a per-arch option for now.
On a random powerpc64le build, this yelds a significant size saving,
it boots and runs fine, but there is a lot I haven't tested as yet, so
these savings may be reduced if there are bugs in the link.
text data bss dec filename
11169741 1180744 1923176 14273661 vmlinux
10445269 1004127 1919707 13369103 vmlinux.dce
~700K text, ~170K data, 6% removed from kernel image size.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Pull kbuild updates from Michal Marek:
- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.
- reduction of the gcc commandline by Arnd Bergmann.
- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada
- bin2c fix by Michael Tautschnig
- setlocalversion fix by Wolfram Sang
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support
Pull s390 updates from Martin Schwidefsky:
"There are a couple of new things for s390 with this merge request:
- a new scheduling domain "drawer" is added to reflect the unusual
topology found on z13 machines. Performance tests showed up to 8
percent gain with the additional domain.
- the new crc-32 checksum crypto module uses the vector-galois-field
multiply and sum SIMD instruction to speed up crc-32 and crc-32c.
- proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
the generic vmlinux.lds linker script definitions.
- kcov instrumentation support. A prerequisite for that is the
inline assembly basic block cleanup, which is the reason for the
net/iucv/iucv.c change.
- support for 2GB pages is added to the hugetlbfs backend.
Then there are two removals:
- the oprofile hardware sampling support is dead code and is removed.
The oprofile user space uses the perf interface nowadays.
- the ETR clock synchronization is removed, this has been superseeded
be the STP clock synchronization. And it always has been
"interesting" code..
And the usual bug fixes and cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
s390/smp: clean up a condition
s390/cio/chp : Remove deprecated create_singlethread_workqueue
s390/chsc: improve channel path descriptor determination
s390/chsc: sanitize fmt check for chp_desc determination
s390/cio: make fmt1 channel path descriptor optional
s390/chsc: fix ioctl CHSC_INFO_CU command
s390/cio/device_ops: fix kernel doc
s390/cio: allow to reset channel measurement block
s390/console: Make preferred console handling more consistent
s390/mm: fix gmap tlb flush issues
s390/mm: add support for 2GB hugepages
s390: have unique symbol for __switch_to address
s390/cpuinfo: show maximum thread id
s390/ptrace: clarify bits in the per_struct
s390: stack address vs thread_info
s390: remove pointless load within __switch_to
s390: enable kcov support
s390/cpumf: use basic block for ecctr inline assembly
s390/hypfs: use basic block for diag inline assembly
...
If CONFIG_KASAN is enabled and gcc is configured with
--disable-initfini-array and/or gold linker is used, gcc emits
.ctors/.dtors and .text.startup/.text.exit sections instead of
.init_array/.fini_array. .dtors section is not explicitly accounted in
the linker script and messes vvar/percpu layout.
We want:
ffffffff822bfd80 D _edata
ffffffff822c0000 D __vvar_beginning_hack
ffffffff822c0000 A __vvar_page
ffffffff822c0080 0000000000000098 D vsyscall_gtod_data
ffffffff822c1000 A __init_begin
ffffffff822c1000 D init_per_cpu__irq_stack_union
ffffffff822c1000 A __per_cpu_load
ffffffff822d3000 D init_per_cpu__gdt_page
We got:
ffffffff8279a600 D _edata
ffffffff8279b000 A __vvar_page
ffffffff8279c000 A __init_begin
ffffffff8279c000 D init_per_cpu__irq_stack_union
ffffffff8279c000 A __per_cpu_load
ffffffff8279e000 D __vvar_beginning_hack
ffffffff8279e080 0000000000000098 D vsyscall_gtod_data
ffffffff827ae000 D init_per_cpu__gdt_page
This happens because __vvar_page and .vvar get different addresses in
arch/x86/kernel/vmlinux.lds.S:
. = ALIGN(PAGE_SIZE);
__vvar_page = .;
.vvar : AT(ADDR(.vvar) - LOAD_OFFSET) {
/* work around gold bug 13023 */
__vvar_beginning_hack = .;
Discard .dtors/.fini_array/.text.exit, since we don't call dtors.
Merge .text.startup into init text.
Link: http://lkml.kernel.org/r/1467386363-120030-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The use of config_enabled() against config options is ambiguous.
Now, IS_ENABLED() is implemented purely with macro expansion, so
let's replace config_enabled() with IS_ENABLED().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
commit c74ba8b348 ("arch: Introduce post-init read-only memory")
introduced the __ro_after_init attribute which allows to add variables
to the ro_after_init data section.
This new section was added to rodata, even though it contains writable
data. This in turn causes problems on architectures which mark the
page table entries read-only that point to rodata very early.
This patch allows architectures to implement an own handling of the
.data..ro_after_init section.
Usually that would be:
- mark the rodata section read-only very early
- mark the ro_after_init section read-only within mark_rodata_ro
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
04633df0c4 ("x86/cpu: Call verify_cpu() after having entered long mode too")
added the call to verify_cpu() for sanitizing CPU configuration.
The latter uses the stack minimally and it can happen that we land in
startup_64() directly from a 64-bit bootloader. Then we want to use our
own, known good stack.
Do that.
APs don't need this as the trampoline sets up a stack for them.
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459434062-31055-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.
Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here's the big tty/serial driver pull request for 4.6-rc1.
Lots of changes in here, Peter has been on a tear again, with lots of
refactoring and bugs fixes, many thanks to the great work he has been
doing. Lots of driver updates and fixes as well, full details in the
shortlog.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlbp8z8ACgkQMUfUDdst+ym1vwCgnOOCORaZyeQ4QrcxPAK5pHFn
VrMAoNHvDgNYtG+Hmzv25Lgp3HnysPin
=MLRG
-----END PGP SIGNATURE-----
Merge tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here's the big tty/serial driver pull request for 4.6-rc1.
Lots of changes in here, Peter has been on a tear again, with lots of
refactoring and bugs fixes, many thanks to the great work he has been
doing. Lots of driver updates and fixes as well, full details in the
shortlog.
All have been in linux-next for a while with no reported issues"
* tag 'tty-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (220 commits)
serial: 8250: describe CONFIG_SERIAL_8250_RSA
serial: samsung: optimize UART rx fifo access routine
serial: pl011: add mark/space parity support
serial: sa1100: make sa1100_register_uart_fns a function
tty: serial: 8250: add MOXA Smartio MUE boards support
serial: 8250: convert drivers to use up_to_u8250p()
serial: 8250/mediatek: fix building with SERIAL_8250=m
serial: 8250/ingenic: fix building with SERIAL_8250=m
serial: 8250/uniphier: fix modular build
Revert "drivers/tty/serial: make 8250/8250_ingenic.c explicitly non-modular"
Revert "drivers/tty/serial: make 8250/8250_mtk.c explicitly non-modular"
serial: mvebu-uart: initial support for Armada-3700 serial port
serial: mctrl_gpio: Add missing module license
serial: ifx6x60: avoid uninitialized variable use
tty/serial: at91: fix bad offset for UART timeout register
tty/serial: at91: restore dynamic driver binding
serial: 8250: Add hardware dependency to RT288X option
TTY, devpts: document pty count limiting
tty: goldfish: support platform_device with id -1
drivers: tty: goldfish: Add device tree bindings
...
One of the easiest ways to protect the kernel from attack is to reduce
the internal attack surface exposed when a "write" flaw is available. By
making as much of the kernel read-only as possible, we reduce the
attack surface.
Many things are written to only during __init, and never changed
again. These cannot be made "const" since the compiler will do the wrong
thing (we do actually need to write to them). Instead, move these items
into a memory region that will be made read-only during mark_rodata_ro()
which happens after all kernel __init code has finished.
This introduces __ro_after_init as a way to mark such memory, and adds
some documentation about the existing __read_mostly marking.
This improves the security of the Linux kernel by marking formerly
read-write memory regions as read-only on a fully booted up system.
Based on work by PaX Team and Brad Spengler.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Brown <david.brown@linaro.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-arch <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1455748879-21872-5-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use a single common table of struct earlycon_id for both command line
and devicetree. Re-define OF_EARLYCON_DECLARE() macro to instance a
unique earlycon declaration (the declaration is only guaranteed to be
unique within a compilation unit; separate compilation units must still
use unique earlycon names).
The semantics of OF_EARLYCON_DECLARE() is different; it declares an
earlycon which can matched either on the command line or by devicetree.
EARLYCON_DECLARE() is semantically unchanged; it declares an earlycon
which is matched by command line only. Remove redundant instances of
EARLYCON_DECLARE().
This enables all earlycons to properly initialize struct console
with the appropriate name and index, which improves diagnostics and
enables direct earlycon-to-console handoff.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
DT enjoys a rather nice probing infrastructure for clocksources,
while ACPI is so far stuck into a very distant past.
This patch introduces a declarative API, allowing clocksources
to be self-contained and be called when parsing the GTDT table.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
DT enjoys a rather nice probing infrastructure for irqchips, while
ACPI is so far stuck into a very distant past.
This patch introduces a declarative API, allowing irqchips to be
self-contained and be called when a particular entry is matched
in the MADT table.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
IRQ controllers and timers are the two types of device the kernel
requires before being able to use the device driver model.
ACPI so far lacks a proper probing infrastructure similar to the one
we have with DT, where we're able to declare IRQ chips and
clocksources inside the driver code, and let the core code pick it up
and call us back on a match. This leads to all kind of really ugly
hacks all over the arm64 code and even in the ACPI layer.
In order to allow some basic probing based on the ACPI tables,
introduce "struct acpi_probe_entry" which contains just enough
data and callbacks to match a table, an optional subtable, and
call a probe function. A driver can, at build time, register itself
and expect being called if the right entry exists in the ACPI
table.
A acpi_probe_device_table() is provided, taking an identifier for
a set of acpi_prove_entries, and iterating over the registered
entries.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When building a kernel with .text.unlikely text the unlikely text for
each translation unit was put next to the main .text code in the
final vmlinux.
The problem is that the linker doesn't allow more specific submatches
of a section name in a different linker script statement after the
main match.
So we need to move them all into one line. With that change
.text.unlikely is at the end of everything again.
I also moved .text.hot into the same statement though, even though
that's not strictly needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Here's the big tty/serial driver update for 4.1-rc1.
It was delayed for a bit due to some questions surrounding some of the
console command line parsing changes that are in here. There's still
one tiny regression for people who were previously putting multiple
console command lines and expecting them all to be ignored for some odd
reason, but Peter is working on fixing that. If not, I'll send a revert
for the offending patch, but I have faith that Peter can address it.
Other than the console work here, there's the usual serial driver
updates and changes, and a buch of 8250 reworks to try to make that
driver easier to maintain over time, and have it support more devices in
the future.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlU2IcUACgkQMUfUDdst+ylFqACcC8LPhFEZg9aHn0hNUoqGK3rE
5dUAnR4b8r/NYqjVoE9FJZgZfB/TqVi1
=lyN/
-----END PGP SIGNATURE-----
Merge tag 'tty-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here's the big tty/serial driver update for 4.1-rc1.
It was delayed for a bit due to some questions surrounding some of the
console command line parsing changes that are in here. There's still
one tiny regression for people who were previously putting multiple
console command lines and expecting them all to be ignored for some
odd reason, but Peter is working on fixing that. If not, I'll send a
revert for the offending patch, but I have faith that Peter can
address it.
Other than the console work here, there's the usual serial driver
updates and changes, and a buch of 8250 reworks to try to make that
driver easier to maintain over time, and have it support more devices
in the future.
All of these have been in linux-next for a while"
* tag 'tty-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (119 commits)
n_gsm: Drop unneeded cast on netdev_priv
sc16is7xx: expose RTS inversion in RS-485 mode
serial: 8250_pci: port failed after wakeup from S3
earlycon: 8250: Document kernel command line options
earlycon: 8250: Fix command line regression
earlycon: Fix __earlycon_table stride
tty: clean up the tty time logic a bit
serial: 8250_dw: only get the clock rate in one place
serial: 8250_dw: remove useless ACPI ID check
dmaengine: hsu: move memory allocation to GFP_NOWAIT
dmaengine: hsu: remove redundant pieces of code
serial: 8250_pci: add Intel Tangier support
dmaengine: hsu: add Intel Tangier PCI ID
serial: 8250_pci: replace switch-case by formula for Intel MID
serial: 8250_pci: replace switch-case by formula
tty: cpm_uart: replace CONFIG_8xx by CONFIG_CPM1
serial: jsm: some off by one bugs
serial: xuartps: Fix check in console_setup().
serial: xuartps: Get rid of register access macros.
serial: xuartps: Fix iobase use.
...
Pull ARM updates from Russell King:
"Included in this update are both some long term fixes and some new
features.
Fixes:
- An integer overflow in the calculation of ELF_ET_DYN_BASE.
- Avoiding OOMs for high-order IOMMU allocations
- SMP requires the data cache to be enabled for synchronisation
primitives to work, so prevent the CPU_DCACHE_DISABLE option being
visible on SMP builds.
- A bug going back 10+ years in the noMMU ARM94* CPU support code,
where it corrupts registers. Found by folk getting Linux running
on their cameras.
- Versatile Express needs an errata workaround enabled for CPU
hot-unplug to work.
Features:
- Clean up module linker by handling out of range relocations
separately from relocation cases we don't handle.
- Fix a long term bug in the pci_mmap_page_range() code, which we
hope won't impact userspace (we hope there's no users of the
existing broken interface.)
- Don't map DMA coherent allocations when we don't have a MMU.
- Drop experimental status for SMP_ON_UP.
- Warn when DT doesn't specify ePAPR mandatory cache properties.
- Add documentation concerning how we find the start of physical
memory for AUTO_ZRELADDR kernels, detailing why we have chosen the
mask and the implications of changing it.
- Updates from Ard Biesheuvel to address some issues with large
kernels (such as allyesconfig) failing to link.
- Allow hibernation to work on modern (ARMv7) CPUs - this appears to
have never worked in the past on these CPUs.
- Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output
format (hopefully without userspace breaking... let's hope that if
it causes someone a problem, they tell us.)
- Fix tegra-ahb DT offsets.
- Rework ARM errata 643719 code (and ARMv7 flush_cache_louis()/
flush_dcache_all()) code to be more efficient, and enable this
errata workaround by default for ARMv7+SMP CPUs. This complements
the Versatile Express fix above.
- Rework ARMv7 context code for errata 430973, so that only Cortex A8
CPUs are impacted by the branch target buffer flush when this
errata is enabled. Also update the help text to indicate that all
r1p* A8 CPUs are impacted.
- Switch ARM to the generic show_mem() implementation, it conveys all
the information which we were already reporting.
- Prevent slow timer sources being used for udelay() - timers running
at less than 1MHz are not useful for this, and can cause udelay()
to return immediately, without any wait. Using such a slow timer
is silly.
- VDSO support for 32-bit ARM, mainly for gettimeofday() using the
ARM architected timer.
- Perf support for Scorpion performance monitoring units"
vdso semantic conflict fixed up as per linux-next.
* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits)
ARM: update errata 430973 documentation to cover Cortex A8 r1p*
ARM: ensure delay timer has sufficient accuracy for delays
ARM: switch to use the generic show_mem() implementation
ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs
ARM: enable ARM errata 643719 workaround by default
ARM: cache-v7: optimise test for Cortex A9 r0pX devices
ARM: cache-v7: optimise branches in v7_flush_cache_louis
ARM: cache-v7: consolidate initialisation of cache level index
ARM: cache-v7: shift CLIDR to extract appropriate field before masking
ARM: cache-v7: use movw/movt instructions
ARM: allow 16-bit instructions in ALT_UP()
ARM: proc-arm94*.S: fix setup function
ARM: vexpress: fix CPU hotplug with CT9x4 tile.
ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP
ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address
ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address
ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros
ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility
ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations
...
- Generic PM domains support update including new PM domain
callbacks to handle device initialization better (Russell King,
Rafael J Wysocki, Kevin Hilman).
- Unified device properties API update including a new mechanism
for accessing data provided by platform initialization code
(Rafael J Wysocki, Adrian Hunter).
- ARM cpuidle update including ARM32/ARM64 handling consolidation
(Daniel Lezcano).
- intel_idle update including support for the Silvermont Core in
the Baytrail SOC and for the Airmont Core in the Cherrytrail and
Braswell SOCs (Len Brown, Mathias Krause).
- New cpufreq driver for Hisilicon ACPU (Leo Yan).
- intel_pstate update including support for the Knights Landing
chip (Dasaratharaman Chandramouli, Kristen Carlson Accardi).
- QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann).
- powernv cpufreq driver update (Shilpasri G Bhat).
- devfreq update including Tegra support changes (Tomeu Vizoso,
MyungJoo Ham, Chanwoo Choi).
- powercap RAPL (Running-Average Power Limit) driver update
including support for Intel Broadwell server chips (Jacob Pan,
Mathias Krause).
- ACPI device enumeration update related to the handling of the
special PRP0001 device ID allowing DT-style 'compatible' property
to be used for ACPI device identification (Rafael J Wysocki).
- ACPI EC driver update including limited _DEP support (Lan Tianyu,
Lv Zheng).
- ACPI backlight driver update including a new mechanism to allow
native backlight handling to be forced on non-Windows 8 systems
and a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede).
- New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu).
- Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki).
- Fixes related to suspend-to-idle for the iTCO watchdog driver and
the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu).
- PM tracing support for the suspend phase of system suspend/resume
transitions (Zhonghui Fu).
- Configurable delay for the system suspend/resume testing facility
(Brian Norris).
- PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJVLbO+AAoJEILEb/54YlRx5N4QAJXsmEW1FL2l6mMAyTQkEsVj
nbqjF9I6aJgYM9+i8GKaZJxpN17SAZ7Ii7aCAXjPwX8AvjT70+gcZr+KDWtPir61
B75VNVEcUYOR4vOF5Z6rQcQMlhGPkfMOJYXFMahpOG6DdPbVh1x2/tuawfc6IC0V
a6S/fln6WqHrXQ+8swDSv1KuZsav6+8AQaTlNUQkkuXdY9b3k/3xiy5C2K26APP8
x1B39iAF810qX6ipnK0gEOC3Vs29dl7hvNmgOVmmkBGVS7+pqTuy5n1/9M12cDRz
78IQ7DXB0NcSwr5tdrmGVUyH0Q6H9lnD3vO7MJkYwKDh5a/2MiBr2GZc4KHDKDWn
E1sS27f1Pdn9qnpWLzTcY+yYNV3EEyre56L2fc+sh+Xq9sNOjUah+Y/eAej/IxYD
XYRf+GAj768yCJgNP+Y3PJES/PRh+0IZ/dn5k0Qq2iYvc8mcObyG6zdQIvCucv/i
70uV1Z2GWEb31cI9TUV8o5GrMW3D0KI9EsCEEpiFFUnhjNog3AWcerGgFQMHxu7X
ZnNSzudvek+XJ3NtpbPgTiJAmnMz8bDvBQm3G1LUO2TQdjYTU6YMUHsfzXs8DL6c
aIMWO4stkVuDtWrlT/hfzIXepliccyXmSP6sbH+zNNCepulXe5C4M2SftaDi4l/B
uIctXWznvHoGys+EFL+v
=erd3
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"These are mostly fixes and cleanups all over, although there are a few
items that sort of fall into the new feature category.
First off, we have new callbacks for PM domains that should help us to
handle some issues related to device initialization in a better way.
There also is some consolidation in the unified device properties API
area allowing us to use that inferface for accessing data coming from
platform initialization code in addition to firmware-provided data.
We have some new device/CPU IDs in a few drivers, support for new
chips and a new cpufreq driver too.
Specifics:
- Generic PM domains support update including new PM domain callbacks
to handle device initialization better (Russell King, Rafael J
Wysocki, Kevin Hilman)
- Unified device properties API update including a new mechanism for
accessing data provided by platform initialization code (Rafael J
Wysocki, Adrian Hunter)
- ARM cpuidle update including ARM32/ARM64 handling consolidation
(Daniel Lezcano)
- intel_idle update including support for the Silvermont Core in the
Baytrail SOC and for the Airmont Core in the Cherrytrail and
Braswell SOCs (Len Brown, Mathias Krause)
- New cpufreq driver for Hisilicon ACPU (Leo Yan)
- intel_pstate update including support for the Knights Landing chip
(Dasaratharaman Chandramouli, Kristen Carlson Accardi)
- QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)
- powernv cpufreq driver update (Shilpasri G Bhat)
- devfreq update including Tegra support changes (Tomeu Vizoso,
MyungJoo Ham, Chanwoo Choi)
- powercap RAPL (Running-Average Power Limit) driver update including
support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)
- ACPI device enumeration update related to the handling of the
special PRP0001 device ID allowing DT-style 'compatible' property
to be used for ACPI device identification (Rafael J Wysocki)
- ACPI EC driver update including limited _DEP support (Lan Tianyu,
Lv Zheng)
- ACPI backlight driver update including a new mechanism to allow
native backlight handling to be forced on non-Windows 8 systems and
a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)
- New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)
- Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)
- Fixes related to suspend-to-idle for the iTCO watchdog driver and
the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)
- PM tracing support for the suspend phase of system suspend/resume
transitions (Zhonghui Fu)
- Configurable delay for the system suspend/resume testing facility
(Brian Norris)
- PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"
* tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
ACPI / scan: Rework modalias creation when "compatible" is present
intel_idle: mark cpu id array as __initconst
powercap / RAPL: mark rapl_ids array as __initconst
powercap / RAPL: add ID for Broadwell server
intel_pstate: Knights Landing support
intel_pstate: remove MSR test
cpufreq: fix qoriq uniprocessor build
ACPI / scan: Take the PRP0001 position in the list of IDs into account
ACPI / scan: Simplify acpi_match_device()
ACPI / scan: Generalize of_compatible matching
device property: Introduce firmware node type for platform data
device property: Make it possible to use secondary firmware nodes
PM / watchdog: iTCO: stop watchdog during system suspend
cpufreq: hisilicon: add acpu driver
ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
cpufreq: powernv: Report cpu frequency throttling
intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
intel_idle: Update support for Silvermont Core in Baytrail SOC
PM / devfreq: tegra: Register governor on module init
...
The compiler and the linker must agree on the alignment of
struct earlycon_id; empirical testing and commit 07fca0e57f
("tracing: Properly align linker defined symbols") suggests
32-byte alignment is the LCD.
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.
For example, the tracepoint trace_tlb_flush() has:
__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
Which maps the enum values to the strings they represent. But perf and
trace-cmd do no know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.
With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values:
By adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
$ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })
The above is what userspace expects to see, and tools do not need to
be modified to parse them.
Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This introduces a new .text.fixup input section that gets emitted
together with the .text section for each input object file.
Note that
*(.text)
*(.text.fixup)
is not the same as
*(.text .text.fixup)
and we are looking for the latter, to ensure that fixup snippets that
are assembled into a separate section in the object file do not end
up out of range for the relative branch instructions it contains if
the .text section itself grows very large.
This helps prevent linker failures on large ARM kernels.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Earlycon matching can only be triggered if 'earlycon=...' has been
specified on the kernel command line. To workaround this limitation
requires tight coupling between arches and specific serial drivers
in order to start an earlycon. Devicetree avoids this limitation
with a link table that contains the required data to match earlycons.
Mirror this approach for earlycon match by name. Re-purpose
EARLYCON_DECLARE to generate a table entry which associates name with
setup() function. Re-purpose setup_earlycon() to scan this table for
an earlycon match, which is registered if found.
Declare one "earlycon" early_param, which calls setup_earlycon().
This design allows setup_earlycon() to be called directly with a
param string (as if 'earlycon=...' had been set on the command line).
Re-registration (either directly or by early_param) is prevented.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current state of the different cpuidle drivers is the different PM
operations are passed via the platform_data using the platform driver
paradigm.
This approach allowed to split the low level PM code from the arch specific
and the generic cpuidle code.
Unfortunately there are complaints about this approach as, in the context of the
single kernel image, we have multiple drivers loaded in memory for nothing and
the platform driver is not adequate for cpuidle.
This patch provides a common interface via cpuidle ops for all new cpuidle
driver and a definition for the device tree.
It will allow with the next patches to a have a common definition with ARM64
and share the same cpuidle driver.
The code is optimized to use the __init section intensively in order to reduce
the memory footprint after the driver is initialized and unify the function
names with ARM64.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rob Herring <robherring2@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
KASan uses constructors for initializing redzones for global variables.
Globals instrumentation in GCC 4.9.2 produces constructors with priority
(.init_array.00099)
Currently kernel ignores such constructors. Only constructors with
default priority supported (.init_array)
This patch adds support for constructors with priorities. For kernel
image we put pointers to constructors between __ctors_start/__ctors_end
and do_ctors() will call them on start up. For modules we merge
.init_array.* sections into resulting .init_array. Module code properly
handles constructors in .init_array section.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IOMMU drivers must be initialised before any of their upstream devices,
otherwise the relevant iommu_ops won't be configured for the bus in
question. To solve this, a number of IOMMU drivers use initcalls to
initialise the driver before anything has a chance to be probed.
Whilst this solves the immediate problem, it leaves the job of probing
the IOMMU completely separate from the iommu_ops to configure the IOMMU,
which are called on a per-bus basis and require the driver to figure out
exactly which instance of the IOMMU is being requested. In particular,
the add_device callback simply passes a struct device to the driver,
which then has to parse firmware tables or probe buses to identify the
relevant IOMMU instance.
This patch takes the first step in addressing this problem by adding an
early initialisation pass for IOMMU drivers, giving them the ability to
store some per-instance data in their iommu_ops structure and store that
in their of_node. This can later be used when parsing OF masters to
identify the IOMMU instance in question.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch changes the __init_end address to a
page align address, so that free_initmem() can
free the whole .init section, because if the end
address is not page aligned, it will round down to
a page align address, then the tail unligned page
will not be freed.
Signed-off-by: wang <yalin.wang2010@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This resolves a number of merge issues with changes in this tree and
Linus's tree at the same time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a typo that named the read_mostly section of percpu as
readmostly. It works fine with SMP because the linker script specifies
.data..percpu..readmostly. However, UP kernel builds don't have percpu
sections defined and the non-percpu version of the section is called
data..read_mostly, so .data..readmostly will float around and may break
things unexpectedly.
Looking at the original change that introduced data..percpu..readmostly
(commit c957ef2c59), it looks like this
was the original intention.
Tested: Built UP kernel and confirmed the sections got merged.
- Before the patch:
$ objdump -h vmlinux.o | grep '\.data\.\.read.*mostly'
38 .data..read_mostly 00004418 0000000000000000 0000000000000000 00431ac0 2**6
50 .data..readmostly 00000014 0000000000000000 0000000000000000 00444000 2**3
- After the patch:
$ objdump -h vmlinux.o | grep '\.data\.\.read.*mostly'
38 .data..read_mostly 00004438 0000000000000000 0000000000000000 00431ac0 2**6
Signed-off-by: Zhengyu He <hzy@google.com>
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add pci_fixup_suspend_late as a new pci_fixup_pass. The pass is called
from suspend_noirq and poweroff_noirq. Using the same pass for suspend
and hibernate is consistent with resume_early which is called by
resume_noirq and restore_noirq.
The new quirk pass is required for Thunderbolt support on Apple
hardware.
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull more perf updates from Ingo Molnar:
"A second round of perf updates:
- wide reaching kprobes sanitization and robustization, with the hope
of fixing all 'probe this function crashes the kernel' bugs, by
Masami Hiramatsu.
- uprobes updates from Oleg Nesterov: tmpfs support, corner case
fixes and robustization work.
- perf tooling updates and fixes from Jiri Olsa, Namhyung Ki, Arnaldo
et al:
* Add support to accumulate hist periods (Namhyung Kim)
* various fixes, refactorings and enhancements"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
perf: Differentiate exec() and non-exec() comm events
perf: Fix perf_event_comm() vs. exec() assumption
uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
perf/documentation: Add description for conditional branch filter
perf/x86: Add conditional branch filtering support
perf/tool: Add conditional branch filter 'cond' to perf record
perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
uprobes: Teach copy_insn() to support tmpfs
uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
perf/x86: Use common PMU interrupt disabled code
perf/ARM: Use common PMU interrupt disabled code
perf: Disable sampled events if no PMU interrupt
perf: Fix use after free in perf_remove_from_context()
perf tools: Fix 'make help' message error
perf record: Fix poll return value propagation
perf tools: Move elide bool into perf_hpp_fmt struct
perf tools: Remove elide setup for SORT_MODE__MEMORY mode
perf tools: Fix "==" into "=" in ui_browser__warning assignment
perf tools: Allow overriding sysfs and proc finding with env var
perf tools: Consider header files outside perf directory in tags target
...
This adds the infrastructure to generic earlycon for earlycon setup
using DT. The actual setup is not enabled until a following commit to
add the FDT parsing.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Grant Likely <grant.likely@linaro.org>
OF table sections all have the same pattern, so create a macro to define
them and insure consistency.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <robh@kernel.org>
The cpu_method_of_table is the oddball of the various OF linker sections.
In preparation to have common linker section definitions, align the
cpu_method_of_table with the other definitions for the naming and ending
with a blank struct.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Make the irqchip OF match table section naming aligned with other
OF match table sections in preparation to have a common definition.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Herring <robh@kernel.org>
ARC Linux (not supporting native unaligned access) was failing
to boot because __start_kprobe_blacklist was not aligned.
This was because per generated vmlinux.lds it was emitted right
next to .rodata with strings etc hence could be randomly
unaligned.
Fix that by ensuring a word alignment. While 4 would suffice for
32bit arches and problem at hand, it is probably better to put 8.
| Path: (null) CPU: 0 PID: 1 Comm: swapper Not tainted
| 3.15.0-rc3-next-20140430 #2
| task: 8f044000 ti: 8f01e000 task.ti: 8f01e000
|
| [ECR ]: 0x00230400 => Misaligned r/w from 0x800fb0d3
| [EFA ]: 0x800fb0d3
| [BLINK ]: do_one_initcall+0x86/0x1bc
| [ERET ]: init_kprobes+0x52/0x120
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: <torvalds@linux-foundation.org>
Cc: <rusty@rustcorp.com.au>
Cc: <rdunlap@infradead.org>
Cc: <jeremy@goop.org>
Cc: <arnd@arndb.de>
Cc: <dl9pf@gmx.de>
Cc: <sparse@chrisli.org>
Cc: <anil.s.keshavamurthy@intel.com>
Cc: <davem@davemloft.net>
Cc: <ananth@in.ibm.com>
Cc: <masami.hiramatsu.pt@hitachi.com>
Cc: <chrisw@sous-sol.org>
Cc: <akataria@vmware.com>
Cc: anton Kolesov <Anton.Kolesov@synopsys.com>
Link: http://lkml.kernel.org/r/5361DB14.7010406@synopsys.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce NOKPROBE_SYMBOL() macro which builds a kprobes
blacklist at kernel build time.
The usage of this macro is similar to EXPORT_SYMBOL(),
placed after the function definition:
NOKPROBE_SYMBOL(function);
Since this macro will inhibit inlining of static/inline
functions, this patch also introduces a nokprobe_inline macro
for static/inline functions. In this case, we must use
NOKPROBE_SYMBOL() for the inline function caller.
When CONFIG_KPROBES=y, the macro stores the given function
address in the "_kprobe_blacklist" section.
Since the data structures are not fully initialized by the
macro (because there is no "size" information), those
are re-initialized at boot time by using kallsyms.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20140417081705.26341.96719.stgit@ltc230.yrl.intra.hitachi.co.jp
Cc: Alok Kataria <akataria@vmware.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christopher Li <sparse@chrisli.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jan-Simon Möller <dl9pf@gmx.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-sparse@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lots of changes specific to one of the SoC families. Some that
stick out are:
* mach-qcom gains new features, most importantly SMP support for
the newer chips (Stephen Boyd, Rohit Vaswani)
* mvebu gains support for three new SoCs: Armada 375, 380 and 385
(Thomas Petazzoni and Free-electrons team)
* SMP support for Rockchips (Heiko Stübner)
* Lots of i.MX changes (Shawn Guo)
* Added support for BCM5301x SoC (Hauke Mehrtens)
* Multiplatform support for Marvell Kirkwood and Dove
(Andrew Lunn and Sebastian Hesselbarth doing the final part
of a long journey)
* Unify davinci platforms and remove obsolete ones (Sekhar Nori,
Arnd Bergmann)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUz/yT2CrR//JCVInAQJN8A/9Ft1rfp4LEe8Lpr9yAZydG4UaJKy8Hh7Z
fmohMAuy88J+8jzdwQKKCeEiId+nIf+WmFIQDn9YRDev1/T2v32Ax49XuGtY47JX
4loIC2wR0+j1aSwhEVOmlM03lX7Hbu6iNDkxaLkDKTRrt3DhDNA6cPZYwNOT273W
Yx7hIDpvsoOVN3zbPwqhwLrXgywsaNB9E7ly1GixRd1thdg46kMRcM0LJSXPH3we
pyx7sZbILTVMeUx79XUTvBDJYsbjJWFZknVDYXGkrS5YxAASVsVW2KW9fP9E+UXE
wTmOxg6spsHGgCezwy8NL5UmfaAOXL3mm6ginFwWpyz7Iu+P5IvfR1W+8UA/O8tp
K9y8wLA64chPQJkAGaPQBqUPq9QkNHodZWgaPKxKuuv3qF481DCnQKkFRz+sl7mu
oQVGnoMCnTY6L6yYcIq/GpgiJ731vwefirAwPR8FEBN/gw/gC01b+DDchx/5inPJ
6V6dCEtPZxXMOsIaYBWFauk3pMFU3E8coklmteyYDQg7eb+55Zq3vsNEpu/vb6ll
M660AQzzbkZ7lgsSBdNODEvkNH15kC35G2UCfwy99uCE4k/0Vi7reJ1BzXkc+dtJ
+maBtA6NMALXQ/EI+B+fZLccI4Hv7avwFy1rQJaf+TLiFvTd9yp0qUX8JjXWDPgu
pPWQOC4a9mU=
=AGpV
-----END PGP SIGNATURE-----
Merge tag 'soc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC specific changes from Arnd Bergmann:
"Lots of changes specific to one of the SoC families. Some that stick
out are:
- mach-qcom gains new features, most importantly SMP support for the
newer chips (Stephen Boyd, Rohit Vaswani)
- mvebu gains support for three new SoCs: Armada 375, 380 and 385
(Thomas Petazzoni and Free-electrons team)
- SMP support for Rockchips (Heiko Stübner)
- Lots of i.MX changes (Shawn Guo)
- Added support for BCM5301x SoC (Hauke Mehrtens)
- Multiplatform support for Marvell Kirkwood and Dove (Andrew Lunn
and Sebastian Hesselbarth doing the final part of a long journey)
- Unify davinci platforms and remove obsolete ones (Sekhar Nori, Arnd
Bergmann)"
* tag 'soc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (126 commits)
ARM: sunxi: Select HAVE_ARM_ARCH_TIMER
ARM: cache-tauros2: remove ARMv6 code
ARM: mvebu: don't select CONFIG_NEON
ARM: davinci: fix DT booting with default defconfig
ARM: configs: bcm_defconfig: enable bcm590xx regulator support
ARM: davinci: remove tnetv107x support
MAINTAINERS: Update ARM STi maintainers
ARM: restrict BCM_KONA_UART to ARCH_BCM_MOBILE
ARM: bcm21664: Add board support.
ARM: sunxi: Add the new watchog compatibles to the reboot code
ARM: enable ARM_HAS_SG_CHAIN for multiplatform
ARM: davinci: remove da8xx_omapl_defconfig
ARM: davinci: da8xx: fix multiple watchdog device registration
ARM: davinci: add da8xx specific configs to davinci_all_defconfig
ARM: davinci: enable da8xx build concurrently with older devices
ARM: BCM5301X: workaround suppress fault
ARM: BCM5301X: add early debugging support
ARM: BCM5301X: initial support for the BCM5301X/BCM470X SoCs with ARM CPU
ARM: mach-bcm: Remove GENERIC_TIME
ARM: shmobile: APMU: Fix warnings due to improper printk formats
...
Add support for custom reserved memory drivers. Call their init() function
for each reserved region and prepare for using operations provided by them
with by the reserved_mem->ops array.
Based on previous code provided by Josh Cartwright <joshc@codeaurora.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
The goal of multi-platform kernels is to remove the need for mach
directories and machine descriptors. To further that goal,
introduce CPU_METHOD_OF_DECLARE() to allow cpu hotplug/smp
support to be separated from the machine descriptors.
Implementers should specify an enable-method property in their
cpus node and then implement a matching set of smp_ops in their
hotplug/smp code, wiring it up with the CPU_METHOD_OF_DECLARE()
macro. When the kernel is compiled we'll collect all the
enable-method smp_ops into one section for use at boot.
At boot time we'll look for an enable-method in each cpu node and
try to match that against all known CPU enable methods in the
kernel. If there are no enable-methods in the cpu nodes we
fallback to the cpus node and try to use any enable-method found
there. If that doesn't work we fall back to the old way of using
the machine descriptor.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: <devicetree@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kumar Gala <galak@codeaurora.org>
This adds the .init_array section as yet another section with constructors. This
is needed because gcc could add __gcov_init calls to .init_array or .ctors
section, depending on gcc (and binutils) version .
v2: - reuse mod->ctors for .init_array section for modules, because gcc uses
.ctors or .init_array, but not both at the same time
v3: - fail to load if that does happen somehow.
Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several tracepoints (mostly in RCU), that reference a string
pointer and uses the print format of "%s" to display the string that
exists in the kernel, instead of copying the actual string to the
ring buffer (saves time and ring buffer space).
But this has an issue with userspace tools that read the binary buffers
that has the address of the string but has no access to what the string
itself is. The end result is just output that looks like:
rcu_dyntick: ffffffff818adeaa 1 0
rcu_dyntick: ffffffff818adeb5 0 140000000000000
rcu_dyntick: ffffffff818adeb5 0 140000000000000
rcu_utilization: ffffffff8184333b
rcu_utilization: ffffffff8184333b
The above is pretty useless when read by the userspace tools. Ideally
we would want something that looks like this:
rcu_dyntick: Start 1 0
rcu_dyntick: End 0 140000000000000
rcu_dyntick: Start 140000000000000 0
rcu_callback: rcu_preempt rhp=0xffff880037aff710 func=put_cred_rcu 0/4
rcu_callback: rcu_preempt rhp=0xffff880078961980 func=file_free_rcu 0/5
rcu_dyntick: End 0 1
The trace_printk() which also only stores the address of the string
format instead of recording the string into the buffer itself, exports
the mapping of kernel addresses to format strings via the printk_format
file in the debugfs tracing directory.
The tracepoint strings can use this same method and output the format
to the same file and the userspace tools will be able to decipher
the address without any modification.
The tracepoint strings need its own section to save the strings because
the trace_printk section will cause the trace_printk() buffers to be
allocated if anything exists within the section. trace_printk() is only
used for debugging and should never exist in the kernel, we can not use
the trace_printk sections.
Add a new tracepoint_str section that will also be examined by the output
of the printk_format file.
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull first stage of __cpuinit removal from Paul Gortmaker:
"The two commits here 1) dummy out all the __cpuinit macros so that we
no longer generate such sections, and then 2) remove all the section
processing that we used to do for those sections.
This makes all the __cpuinit and friends no-ops, so that we can remove
the use cases of it at our leisure. Expect stage 2, which does the
tree wide removal sweep at the end of the merge window."
* 'cpuinit-delete' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
modpost: remove all traces of cpuinit/cpuexit sections
init.h: remove __cpuinit sections from the kernel
Rework RapidIO switch drivers to add an option to build them as loadable
kernel modules.
This patch removes RapidIO-specific vmlinux section and converts switch
drivers to be compatible with LDM driver registration method. To simplify
registration of device-specific callback routines this patch introduces
rio_switch_ops data structure. The sw_sysfs() callback is removed from
the list of device-specific operations because under the new structure its
functions can be handled by switch driver's probe() and remove() routines.
If a specific switch device driver is not loaded the RapidIO subsystem
core will use default standard-based operations to configure a switch.
Because the current implementation of RapidIO enumeration/discovery method
relies on availability of device-specific operations for error management,
switch device drivers must be loaded before the RapidIO
enumeration/discovery starts.
This patch also moves several common routines from enumeration/discovery
module into the RapidIO core code to make switch-specific operations
accessible to all components of RapidIO subsystem.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@Prodrive.nl>
Cc: Micha Nelissen <micha.nelissen@Prodrive.nl>
Cc: Stef van Os <stef.van.os@Prodrive.nl>
Cc: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Delete all audit rules that were checking how the .cpuXYZ
related sections were inter-operating with other __init
like sections, now that __cpuinit is gone. Update the linker
script to not have any knowledge of .cpuinit sections.
[lds.h update courtesy of Ralf Baechle <ralf@linux-mips.org>]
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ever since commit 45f035ab9b ("CONFIG_HOTPLUG should be always on"),
it has been basically impossible to build a kernel with CONFIG_HOTPLUG
turned off. Remove all the remaining references to it.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have CONFIG_SYMBOL_PREFIX, which three archs define to the string
"_". But Al Viro broke this in "consolidate cond_syscall and
SYSCALL_ALIAS declarations" (in linux-next), and he's not the first to
do so.
Using CONFIG_SYMBOL_PREFIX is awkward, since we usually just want to
prefix it so something. So various places define helpers which are
defined to nothing if CONFIG_SYMBOL_PREFIX isn't set:
1) include/asm-generic/unistd.h defines __SYMBOL_PREFIX.
2) include/asm-generic/vmlinux.lds.h defines VMLINUX_SYMBOL(sym)
3) include/linux/export.h defines MODULE_SYMBOL_PREFIX.
4) include/linux/kernel.h defines SYMBOL_PREFIX (which differs from #7)
5) kernel/modsign_certificate.S defines ASM_SYMBOL(sym)
6) scripts/modpost.c defines MODULE_SYMBOL_PREFIX
7) scripts/Makefile.lib defines SYMBOL_PREFIX on the commandline if
CONFIG_SYMBOL_PREFIX is set, so that we have a non-string version
for pasting.
(arch/h8300/include/asm/linkage.h defines SYMBOL_NAME(), too).
Let's solve this properly:
1) No more generic prefix, just CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX.
2) Make linux/export.h usable from asm.
3) Define VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR().
4) Make everyone use them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com> (metag)
A large number of cleanups, all over the platforms. This is dominated
largely by the Samsung platforms (s3c, s5p, exynos) and a few of the
others moving code out of arch/arm into more appropriate subsystems.
The clocksource and irqchip drivers are now abstracted to the point
where platforms that are already cleaned up do not need to even specify
the driver they use, it can all get configured from the device tree
as we do for normal device drivers. The clocksource changes basically
touch every single platform in the process.
We further clean up the use of platform specific header files here,
with the goal of turning more of the platforms over to being
"multiplatform" enabled, which implies that they cannot expose
their headers to architecture independent code any more.
It is expected that no functional changes are part of the cleanup.
The overall reduction in total code lines is mostly the result of
removing broken and obsolete code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUSUyKmCrR//JCVInAQIN8RAAnb/uPytmlMjn5yCksF4Mvb/FVbn/TVwz
KRIGpCHOzyKK1q7pM8NRUVWfjW2SZqbXJFqx6zBGKSlDPvFTOhsLyyupU+Tnyu5W
IX4eIUBwb+a6H7XDHw0X2YI8uHzi5RNLhne0A1QyDKcnuHs1LDAttXnJHaK4Ap6Y
NN2YFt3l3ld7DXWXJtMsw5v8lC10aeIFGTvXefaPDAdeMLivmI57qEUMDXknNr7W
Odz/Rc0/cw3BNBVl/zNHA0jw7FOjKAymCYYNUa4xDCJEr+JnIRTqizd0N/YIIC7x
aA2xjJ3oKUFyF51yiJE6nFuTyJznhwtehc+uiMOSIkjrPLym52LEHmd7G5Yqlmjz
oiei09qBb870q3lGxwfht9iaeIwYgQFYGfD0yW5QWArCO5pxhtCPLPH7YZNZtcQd
ZJRSGGqT/ljBz3bm0K9OLESeeTTN7+Nxvtpiz/CD+Piegz0gWJzDYJRTzkJ3UWpA
WTVhVQdWUeX2JrNkgM7Z3Tu8iXOe+LIEs7kVXGJZSREmIIZiRvR36UrODZtAkp9I
7YQ+srX/uaR832pgK0RrHK0zY0psU6MmIvhYxJZFbx7keiPA9eH6drb0x7tGqcUD
FzEUzvcZvyqppndfBi+R60H/YKAhJDEXdwxzo6dyCpPQaW1T9GnzIqXuE1zin+Aw
X7Y8YywMbHI=
=DvgJ
-----END PGP SIGNATURE-----
Merge tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Arnd Bergmann:
"A large number of cleanups, all over the platforms. This is dominated
largely by the Samsung platforms (s3c, s5p, exynos) and a few of the
others moving code out of arch/arm into more appropriate subsystems.
The clocksource and irqchip drivers are now abstracted to the point
where platforms that are already cleaned up do not need to even
specify the driver they use, it can all get configured from the device
tree as we do for normal device drivers. The clocksource changes
basically touch every single platform in the process.
We further clean up the use of platform specific header files here,
with the goal of turning more of the platforms over to being
"multiplatform" enabled, which implies that they cannot expose their
headers to architecture independent code any more.
It is expected that no functional changes are part of the cleanup.
The overall reduction in total code lines is mostly the result of
removing broken and obsolete code."
* tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (133 commits)
ARM: mvebu: correct gated clock documentation
ARM: kirkwood: add missing include for nsa310
ARM: exynos: move exynos4210-combiner to drivers/irqchip
mfd: db8500-prcmu: update resource passing
drivers/db8500-cpufreq: delete dangling include
ARM: at91: remove NEOCORE 926 board
sunxi: Cleanup the reset code and add meaningful registers defines
ARM: S3C24XX: header mach/regs-mem.h local
ARM: S3C24XX: header mach/regs-power.h local
ARM: S3C24XX: header mach/regs-s3c2412-mem.h local
ARM: S3C24XX: Remove plat-s3c24xx directory in arch/arm/
ARM: S3C24XX: transform s3c2443 subirqs into new structure
ARM: S3C24XX: modify s3c2443 irq init to initialize all irqs
ARM: S3C24XX: move s3c2443 irq code to irq.c
ARM: S3C24XX: transform s3c2416 irqs into new structure
ARM: S3C24XX: modify s3c2416 irq init to initialize all irqs
ARM: S3C24XX: move s3c2416 irq init to common irq code
ARM: S3C24XX: Modify s3c_irq_wake to use the hwirq property
ARM: S3C24XX: Move irq syscore-ops to irq-pm
clocksource: always define CLOCKSOURCE_OF_DECLARE
...
Modify of_clk_init function so that it will determine which
driver to initialize based on device tree instead of each driver
registering to it.
Based on a similar patch for drivers/irqchip by Thomas Petazzoni and
drivers/clocksource by Stephen Warren.
Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Tested-by: Tony Prisk <linux@prisktech.co.nz>
Tested-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Rob Herring <rob.herring@calxeda.com>
Tested-by: Josh Cartwright <josh.cartwright@ni.com>
Reviewed-by: Josh Cartwright <josh.cartwright@ni.com>
Acked-by: Maxime Ripard <maxime.ripard@anandra.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: merge conflict from missing CLKSRC_OF_TABLES()]
Signed-off-by: Mike Turquette <mturquette@linaro.org>
This creates irqchip initialization infrastructure from Thomas
Petazzoni. The VIC and GIC irqchip code is moved to drivers/irqchips
and adapted to use the new infrastructure. All DT enabled platforms
using GIC and VIC are converted over to use the new irqchip_init.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJQ8ZobAAoJEMhvYp4jgsXiihIH/2VvxmSHZb0e3jN6AR0B42b7
9EwX0IE0B23t91hNTwdzzmTJQYA7pMmWkgHNfd3vIeqSepJAmrVv/gp4iM9CtPwE
KNh+kDWOK2ZsOH4Vb0lYRJHN8WQOIQHuCUr9+MdYLNOgf/pPL6G/Y9kv9A1e7fTC
W+tFRjC5N1ilZMGyowX12L1wnwDk6kHzed6YV6bskC17cZ9/pg8PhSVbM4A/3kAv
NXYKqbXJb+eCsWGXg/knZXOL6V9gBwvVYoe4O9X3nQ0226AWB9caad8l8tchAjRB
fmrYF1tbkpOWPnLxhvQy5b5MJichJgTMJHh7RgiEcc/3f63kOljjlx4QKiqHvT0=
=q7gm
-----END PGP SIGNATURE-----
Merge tag 'gic-vic-to-irqchip' of git://sources.calxeda.com/kernel/linux into next/cleanup
From Rob Herring:
Initial irqchip init infrastructure and GIC and VIC clean-ups
This creates irqchip initialization infrastructure from Thomas
Petazzoni. The VIC and GIC irqchip code is moved to drivers/irqchips
and adapted to use the new infrastructure. All DT enabled platforms
using GIC and VIC are converted over to use the new irqchip_init.
* tag 'gic-vic-to-irqchip' of git://sources.calxeda.com/kernel/linux:
irqchip: Move ARM vic.h to include/linux/irqchip/arm-vic.h
ARM: picoxcell: use common irqchip_init function
ARM: spear: use common irqchip_init function
irqchip: Move ARM VIC to drivers/irqchip
ARM: samsung: remove unused tick.h
ARM: remove unneeded vic.h includes
ARM: remove mach .handle_irq for VIC users
ARM: VIC: set handle_arch_irq in VIC initialization
ARM: VIC: shrink down vic.h
irqchip: Move ARM gic.h to include/linux/irqchip/arm-gic.h
ARM: use common irqchip_init for GIC init
irqchip: Move ARM GIC to drivers/irqchip
ARM: remove mach .handle_irq for GIC users
ARM: GIC: set handle_arch_irq in GIC initialization
ARM: GIC: remove direct use of gic_raise_softirq
ARM: GIC: remove assembly ifdefs from gic.h
ARM: mach-ux500: use SGI0 to wake up the other core
arm: add set_handle_irq() to register the parent IRQ controller handler function
irqchip: add basic infrastructure
irqchip: add to the directories part of the IRQ subsystem in MAINTAINERS
Fixed up massive merge conflicts with the timer cleanup due to adjacent changes:
Signed-off-by: Olof Johansson <olof@lixom.net>
Conflicts:
arch/arm/mach-bcm/board_bcm.c
arch/arm/mach-cns3xxx/cns3420vb.c
arch/arm/mach-ep93xx/adssphere.c
arch/arm/mach-ep93xx/edb93xx.c
arch/arm/mach-ep93xx/gesbc9312.c
arch/arm/mach-ep93xx/micro9.c
arch/arm/mach-ep93xx/simone.c
arch/arm/mach-ep93xx/snappercl15.c
arch/arm/mach-ep93xx/ts72xx.c
arch/arm/mach-ep93xx/vision_ep9307.c
arch/arm/mach-highbank/highbank.c
arch/arm/mach-imx/mach-imx6q.c
arch/arm/mach-msm/board-dt-8960.c
arch/arm/mach-netx/nxdb500.c
arch/arm/mach-netx/nxdkn.c
arch/arm/mach-netx/nxeb500hmi.c
arch/arm/mach-nomadik/board-nhk8815.c
arch/arm/mach-picoxcell/common.c
arch/arm/mach-realview/realview_eb.c
arch/arm/mach-realview/realview_pb1176.c
arch/arm/mach-realview/realview_pb11mp.c
arch/arm/mach-realview/realview_pba8.c
arch/arm/mach-realview/realview_pbx.c
arch/arm/mach-socfpga/socfpga.c
arch/arm/mach-spear13xx/spear1310.c
arch/arm/mach-spear13xx/spear1340.c
arch/arm/mach-spear13xx/spear13xx.c
arch/arm/mach-spear3xx/spear300.c
arch/arm/mach-spear3xx/spear310.c
arch/arm/mach-spear3xx/spear320.c
arch/arm/mach-spear3xx/spear3xx.c
arch/arm/mach-spear6xx/spear6xx.c
arch/arm/mach-tegra/board-dt-tegra20.c
arch/arm/mach-tegra/board-dt-tegra30.c
arch/arm/mach-u300/core.c
arch/arm/mach-ux500/board-mop500.c
arch/arm/mach-ux500/cpu-db8500.c
arch/arm/mach-versatile/versatile_ab.c
arch/arm/mach-versatile/versatile_dt.c
arch/arm/mach-versatile/versatile_pb.c
arch/arm/mach-vexpress/v2m.c
include/asm-generic/vmlinux.lds.h
With the recent creation of the drivers/irqchip/ directory, it is
desirable to move irq controller drivers here. At the moment, the only
driver here is irq-bcm2835, the driver for the irq controller found in
the ARM BCM2835 SoC, present in Rasberry Pi systems. This irq
controller driver was exporting its initialization function and its
irq handling function through a header file in
<linux/irqchip/bcm2835.h>.
When proposing to also move another irq controller driver in
drivers/irqchip, Rob Herring raised the very valid point that moving
things to drivers/irqchip was good in order to remove more stuff from
arch/arm, but if it means adding gazillions of headers files in
include/linux/irqchip/, it would not be very nice.
So, upon the suggestion of Rob Herring and Arnd Bergmann, this commit
introduces a small infrastructure that defines a central
irqchip_init() function in drivers/irqchip/irqchip.c, which is meant
to be called as the ->init_irq() callback of ARM platforms. This
function calls of_irq_init() with an array of match strings and init
functions generated from a special linker section.
Note that the irq controller driver initialization function is
responsible for setting the global handle_arch_irq() variable, so that
ARM platforms no longer have to define the ->handle_irq field in their
DT_MACHINE structure.
A global header, <linux/irqchip.h> is also added to expose the single
irqchip_init() function to the reset of the kernel.
A further commit moves the BCM2835 irq controller driver to this new
small infrastructure, therefore removing the include/linux/irqchip/
directory.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
[rob.herring: reword commit message to reflect use of linker sections.]
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
It is desirable to move all clocksource drivers to drivers/clocksource,
yet each requires its own initialization function. We'd rather not
pollute <linux/> with a header for each function. Instead, create a
single of_clksrc_init() function which will determine which clocksource
driver to initialize based on device tree.
Based on a similar patch for drivers/irqchip by Thomas Petazzoni.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Instead of just sorting the ip's of the functions per ftrace page,
sort the entire list before adding them to the ftrace pages.
This will allow the bsearch algorithm to be sped up as it can
also sort by pages, not just records within a page.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch adds a set of macros that can be used to declare
kernel parameters to be parsed _before_ initcalls at a chosen
level are executed. We rename the now-unused "flags" field of
struct kernel_param as the level. It's signed, for when we
use this for early params as well, in future.
Linker macro collating init calls had to be modified in order
to add additional symbols between levels that are later used
by the init code to split the calls into blocks.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Due to the alignment of following variables, these typically consume
more than just the single byte that 'bool' requires, and as there are a
few hundred instances, the cache pollution (not so much the waste of
memory) sums up. Put these variables into their own section, outside of
any half way frequently used memory range.
Do the same also to the __warned variable of rcu_lockdep_assert().
(Don't, however, include the ones used by printk_once() and alike, as
they can potentially be hot.)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Markers have removed already twice:
1: fc5377668c
2: eb878b3bc0
But a little bit is still here.
Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Unify input section names
percpu: Avoid extra NOP in percpu_cmpxchg16b_double
percpu: Cast away printk format warning
percpu: Always align percpu output section to PAGE_SIZE
Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
perf stat: Add more cache-miss percentage printouts
perf stat: Add -d -d and -d -d -d options to show more CPU events
ftrace/kbuild: Add recordmcount files to force full build
ftrace: Add self-tests for multiple function trace users
ftrace: Modify ftrace_set_filter/notrace to take ops
ftrace: Allow dynamically allocated function tracers
ftrace: Implement separate user function filtering
ftrace: Free hash with call_rcu_sched()
ftrace: Have global_ops store the functions that are to be traced
ftrace: Add ops parameter to ftrace_startup/shutdown functions
ftrace: Add enabled_functions file
ftrace: Use counters to enable functions to trace
ftrace: Separate hash allocation and assignment
ftrace: Create a global_ops to hold the filter and notrace hashes
ftrace: Use hash instead for FTRACE_FL_FILTER
ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
perf bench, x86: Add alternatives-asm.h wrapper
x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
...
This patch places every exported symbol in its own section
(i.e. "___ksymtab+printk"). Thus the linker will use its SORT() directive
to sort and finally merge all symbol in the right and final section
(i.e. "__ksymtab").
The symbol prefixed archs use an underscore as prefix for symbols.
To avoid collision we use a different character to create the temporary
section names.
This work was supported by a hardware donation from the CE Linux Forum.
Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (folded in '+' fixup)
Tested-by: Dirk Behme <dirk.behme@googlemail.com>
Conflicts:
include/linux/perf_event.h
Merge reason: pick up the latest jump-label enhancements, they are cooked ready.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce:
static __always_inline bool static_branch(struct jump_label_key *key);
instead of the old JUMP_LABEL(key, label) macro.
In this way, jump labels become really easy to use:
Define:
struct jump_label_key jump_key;
Can be used as:
if (static_branch(&jump_key))
do unlikely code
enable/disale via:
jump_label_inc(&jump_key);
jump_label_dec(&jump_key);
that's it!
For the jump labels disabled case, the static_branch() becomes an
atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
atomic_dec() operations. We show testing results for this change below.
Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
Since we now require a 'struct jump_label_key *key', we can store a pointer into
the jump table addresses. In this way, we can enable/disable jump labels, in
basically constant time. This change allows us to completely remove the previous
hashtable scheme. Thanks to Peter Zijlstra for this re-write.
Testing:
I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
configurations, where tracepoints were disabled.
jump label configured in
avg: 815.6
jump label *not* configured in (using atomic reads)
avg: 800.1
jump label *not* configured in (regular reads)
avg: 803.4
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110316212947.GA8792@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The two percpu helper macros have the section names duplicated. So create
a new define to merge the two. This also allows arches who need to link
things more directly themselves to avoid duplicating the input sections in
their linker script.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Percpu allocator honors alignment request upto PAGE_SIZE and both the
percpu addresses in the percpu address space and the translated kernel
addresses should be aligned accordingly. The calculation of the
former depends on the alignment of percpu output section in the kernel
image.
The linker script macros PERCPU_VADDR() and PERCPU() are used to
define this output section and the latter takes @align parameter.
Several architectures are using @align smaller than PAGE_SIZE breaking
percpu memory alignment.
This patch removes @align parameter from PERCPU(), renames it to
PERCPU_SECTION() and makes it always align to PAGE_SIZE. While at it,
add PCPU_SETUP_BUG_ON() checks such that alignment problems are
reliably detected and remove percpu alignment comment recently added
in workqueue.c as the condition would trigger BUG way before reaching
there.
For um, this patch raises the alignment of percpu area. As the area
is in .init, there shouldn't be any noticeable difference.
This problem was discovered by David Howells while debugging boot
failure on mn10300.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
percpu: Generic support for this_cpu_cmpxchg_double()
alpha: use L1_CACHE_BYTES for cacheline size in the linker script
percpu: align percpu readmostly subsection to cacheline
Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
Put x86 entry code into a separate link section: .entry.text.
Separating the entry text section seems to have performance
benefits - caused by more efficient instruction cache usage.
Running hackbench with perf stat --repeat showed that the change
compresses the icache footprint. The icache load miss rate went
down by about 15%:
before patch:
19417627 L1-icache-load-misses ( +- 0.147% )
after patch:
16490788 L1-icache-load-misses ( +- 0.180% )
The motivation of the patch was to fix a particular kprobes
bug that relates to the entry text section, the performance
advantage was discovered accidentally.
Whole perf output follows:
- results for current tip tree:
Performance counter stats for './hackbench/hackbench 10' (500 runs):
19417627 L1-icache-load-misses ( +- 0.147% )
2676914223 instructions # 0.497 IPC ( +- 0.079% )
5389516026 cycles ( +- 0.144% )
0.206267711 seconds time elapsed ( +- 0.138% )
- results for current tip tree with the patch applied:
Performance counter stats for './hackbench/hackbench 10' (500 runs):
16490788 L1-icache-load-misses ( +- 0.180% )
2717734941 instructions # 0.502 IPC ( +- 0.079% )
5414756975 cycles ( +- 0.148% )
0.206747566 seconds time elapsed ( +- 0.137% )
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: masami.hiramatsu.pt@hitachi.com
Cc: ananth@in.ibm.com
Cc: davem@davemloft.net
Cc: 2nddept-manager@sdl.hitachi.co.jp
LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the syscall_meta structures for the syscall tracepoints are
placed in the __syscall_metadata section, and at link time, the linker
makes one large array of all these syscall metadata structures. On boot
up, this array is read (much like the initcall sections) and the syscall
data is processed.
The problem is that there is no guarantee that gcc will place complex
structures nicely together in an array format. Two structures in the
same file may be placed awkwardly, because gcc has no clue that they
are suppose to be in an array.
A hack was used previous to force the alignment to 4, to pack the
structures together. But this caused alignment issues with other
architectures (sparc).
Instead of packing the structures into an array, the structures' addresses
are now put into the __syscall_metadata section. As pointers are always the
natural alignment, gcc should always pack them tightly together
(otherwise initcall, extable, etc would also fail).
By having the pointers to the structures in the section, we can still
iterate the trace_events without causing unnecessary alignment problems
with other architectures, or depending on the current behaviour of
gcc that will likely change in the future just to tick us kernel developers
off a little more.
The __syscall_metadata section is also moved into the .init.data section
as it is now only needed at boot up.
Suggested-by: David Miller <davem@davemloft.net>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Make the tracepoints more robust, making them solid enough to handle compiler
changes by not relying on anything based on compiler-specific behavior with
respect to structure alignment. Implement an approach proposed by David Miller:
use an array of const pointers to refer to the individual structures, and export
this pointer array through the linker script rather than the structures per se.
It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
for the pointers), but are less likely to break due to compiler changes.
History:
commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
added the aligned(32) type and variable attribute to the tracepoint structures
to deal with gcc happily aligning statically defined structures on 32-byte
multiples.
One attempt was to use a 8-byte alignment for tracepoint structures by applying
both the variable and type attribute to tracepoint structures definitions and
declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.
The reason is that the "aligned" attribute only specify the _minimum_ alignment
for a structure, leaving both the compiler and the linker free to align on
larger multiples. Because tracepoint.c expects the structures to be placed as an
array within each section, up-alignment cause NULL-pointer exceptions due to the
extra unexpected padding.
(this patch applies on top of -tip)
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: David S. Miller <davem@davemloft.net>
LKML-Reference: <20110126222622.GA10794@Krystal>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently the trace_event structures are placed in the _ftrace_events
section, and at link time, the linker makes one large array of all
the trace_event structures. On boot up, this array is read (much like
the initcall sections) and the events are processed.
The problem is that there is no guarantee that gcc will place complex
structures nicely together in an array format. Two structures in the
same file may be placed awkwardly, because gcc has no clue that they
are suppose to be in an array.
A hack was used previous to force the alignment to 4, to pack the
structures together. But this caused alignment issues with other
architectures (sparc).
Instead of packing the structures into an array, the structures' addresses
are now put into the _ftrace_event section. As pointers are always the
natural alignment, gcc should always pack them tightly together
(otherwise initcall, extable, etc would also fail).
By having the pointers to the structures in the section, we can still
iterate the trace_events without causing unnecessary alignment problems
with other architectures, or depending on the current behaviour of
gcc that will likely change in the future just to tick us kernel developers
off a little more.
The _ftrace_event section is also moved into the .init.data section
as it is now only needed at boot up.
Suggested-by: David Miller <davem@davemloft.net>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.
This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.
This is based on Shaohua's x86 only patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Currently only drivers that are built as modules have their versions
shown in /sys/module/<module_name>/version, but this information might
also be useful for built-in drivers as well. This especially important
for drivers that do not define any parameters - such drivers, if
built-in, are completely invisible from userspace.
This patch changes MODULE_VERSION() macro so that in case when we are
compiling built-in module, version information is stored in a separate
section. Kernel then uses this data to create 'version' sysfs attribute
in the same fashion it creates attributes for module parameters.
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The readmostly section should end at a cacheline aligned address,
otherwise the last several data might share cachline with other data and
make the readmostly data still have cache bounce.
For example, in ia64, secpath_cachep is the last readmostly data, and it
shares cacheline with init_uts_ns.
a000000100e80480 d secpath_cachep
a000000100e80488 D init_uts_ns
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for linking device tree blob(s) into
vmlinux. Modifies asm-generic/vmlinux.lds.h to add linking
.dtb sections into vmlinux. To maintain compatiblity with the of/fdt
driver code platforms MUST copy the blob to a non-init memory location
before the kernel frees the .init.* sections in the image.
Modifies scripts/Makefile.lib to add a kbuild command to
compile DTS files to device tree blobs and a rule to create objects to
wrap the blobs for linking.
STRUCT_ALIGNMENT is defined in vmlinux.lds.h for use in the rule to
create wrapper objects for the dtb in Makefile.lib. The
STRUCT_ALIGN() macro in vmlinux.lds.h is modified to use the
STRUCT_ALIGNMENT definition.
The DTB's are placed on 32 byte boundries to allow parsing the blob
with driver/of/fdt.c during early boot without having to copy the blob
to get the structure alignment GCC expects.
A DTB is linked in by adding the DTB object to the list of objects to
be linked into vmlinux in the archtecture specific Makefile using
obj-y += foo.dtb.o
Signed-off-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Acked-by: Michal Marek <mmarek@suse.cz>
[grant.likely@secretlab.ca: cleaned up whitespace inconsistencies]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
initramfs: Fix build break on symbol-prefixed archs
initramfs: fix initramfs size calculation
initramfs: generalize initramfs_data.xxx.S variants
scripts/kallsyms: Enable error messages while hush up unnecessary warnings
scripts/setlocalversion: update comment
kbuild: Use a single clean rule for kernel and external modules
kbuild: Do not run make clean in $(srctree)
scripts/mod/modpost.c: fix commentary accordingly to last changes
kbuild: Really don't clean bounds.h and asm-offsets.h
The new init ramfs format (cpio based) requires an alignment of 4 (per the
documentation and per the source files themselves). As for compressed
sources, the decompressors can all deal with unaligned buffers.
The cpio source is also found in the __init sections of the kernel, so
once they are read and expanded into a tmpfs, the source is freed. That
means there is no need to force page alignment here either.
This has been used on Blackfin systems for many releases without issue.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the recent change "net: remove time limit in process_backlog()", the
softnet_data variable changed from "DEFINE_PER_CPU()" to
"DEFINE_PER_CPU_ALIGNED()" which moved it from the .data section to the
.data.shared_align section. I'm not saying this patch is wrong, just that
is what caused me to notice this larger problem. No one else in the
kernel is using this aligned macro variant, so I imagine that's why no one
has noticed yet.
Since .data..shared_align isn't declared in any vmlinux files that I can
see, the linker just places it last. This "just works" for most people,
but when building a ROM kernel on Blackfin systems, it causes section
overlap errors:
bfin-uclinux-ld.real:
section .init.data [00000000202e06b8 -> 00000000202e48b7] overlaps
section .data.shared_aligned [00000000202e06b8 -> 00000000202e0723]
I imagine other arches which support the ROM config option and thus do
funky placement would see similar issues ...
On x86, it is stuck in a dedicated section at the end:
[8] .data PROGBITS ffffffff810ec000 2ec0000303a8 00 WA 0 0 4096
[9] .data.shared_alig PROGBITS ffffffff8111c3c0 31c3c00000c8 00 WA 0 0 64
So make sure we include this section in the DATA_DATA macro so that it is
placed in the right location.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, percpu: Correct the ordering of the percpu readmostly section
x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G
x86: Spread tlb flush vector between nodes
percpu: Introduce a read-mostly percpu API
x86, mm: Fix incorrect data type in vmalloc_sync_all()
x86, mm: Hold mm->page_table_lock while doing vmalloc_sync
x86, mm: Fix bogus whitespace in sync_global_pgds()
x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation
x86, mm: Add RESERVE_BRK_ARRAY() helper
mm, x86: Saving vmcore with non-lazy freeing of vmas
x86, kdump: Change copy_oldmem_page() to use cached addressing
x86, mm: fix uninitialized addr in kernel_physical_mapping_init()
x86, kmemcheck: Remove double test
x86, mm: Make spurious_fault check explicitly check the PRESENT bit
x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes
x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions
x86, mm: Avoid unnecessary TLB flush
Checkin c957ef2c59 had inconsistent
ordering of .data..percpu..page_aligned and .data..percpu..readmostly;
the still-broken version affected x86-32 at least.
The page aligned version really must be page aligned...
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1287544022.4571.7.camel@sli10-conroe.sh.intel.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Add a new readmostly percpu section and API. This can be used to
avoid dirtying data lines which are generally not written to, which is
especially important for data which may be accessed by processors
other than the one for which the percpu area belongs to.
[ hpa: moved it *after* the page-aligned section, for obvious
reasons. ]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <1287544022.4571.7.camel@sli10-conroe.sh.intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The size of a built-in initramfs is calculated in init/initramfs.c by
"__initramfs_end - __initramfs_start". Those symbols are defined in the
linker script include/asm-generic/vmlinux.lds.h:
#define INIT_RAM_FS \
. = ALIGN(PAGE_SIZE); \
VMLINUX_SYMBOL(__initramfs_start) = .; \
*(.init.ramfs) \
VMLINUX_SYMBOL(__initramfs_end) = .;
If the initramfs file has an odd number of bytes, the "__initramfs_end"
symbol points to an odd address, for example, the symbols in the
System.map might look like:
0000000000572000 T __initramfs_start
00000000005bcd05 T __initramfs_end <-- odd address
At least on s390 this causes a problem:
Certain s390 instructions, especially instructions for loading addresses
(larl) or branch addresses must be on even addresses. The compiler loads
the symbol addresses with the "larl" instruction. This instruction sets
the last bit to 0 and, therefore, for odd size files, the calculated size
is one byte less than it should be:
0000000000540a9c <populate_rootfs>:
540a9c: eb cf f0 78 00 24 stmg %r12,%r15,120(%r15),
540aa2: c0 10 00 01 8a af larl %r1,572000 <__initramfs_start>
540aa8: c0 c0 00 03 e1 2e larl %r12,5bcd04 <initramfs_end>
(Instead of 5bcd05)
...
540abe: 1b c1 sr %r12,%r1
To fix the problem, this patch introduces the global variable
__initramfs_size, which is calculated in the "usr/initramfs_data.S" file.
The populate_rootfs() function can then use the start marker of the
.init.ramfs section and the value of __initramfs_size for loading the
initramfs. Because the start marker and size is sufficient, the
__initramfs_end symbol is no longer needed and is removed.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statment. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current usecase that these are being implemented for.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com>
[ cleaned up some formating ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...
Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
* upstream/pvhvm:
Introduce CONFIG_XEN_PVHVM compile option
blkfront: do not create a PV cdrom device if xen_hvm_guest
support multiple .discard.* sections to avoid section type conflicts
xen/pvhvm: fix build problem when !CONFIG_XEN
xenfs: enable for HVM domains too
x86: Call HVMOP_pagetable_dying on exit_mmap.
x86: Unplug emulated disks and nics.
x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.
xen: Fix find_unbound_irq in presence of ioapic irqs.
xen: Add suspend/resume support for PV on HVM guests.
xen: Xen PCI platform device driver.
x86/xen: event channels delivery on HVM.
x86: early PV on HVM features initialization.
xen: Add support for HVM hypercalls.
Conflicts:
arch/x86/xen/enlighten.c
arch/x86/xen/time.c
gcc 4.4.4 will complain if you use a .discard section for both text and
data ("causes a section type conflict"). Add support for ".discard.*"
sections, and use .discard.text for a dummy function in the x86
RESERVE_BRK() macro.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
The .data..init_task output section was missing
a load offset causing a popwerpc target to fail to boot.
Sean MacLennan tracked it down to the definition of
INIT_TASK_DATA_SECTION().
There are only two users of INIT_TASK_DATA_SECTION()
in the kernel today: cris and popwerpc.
cris do not support relocatable kernels and is thus not
impacted by this change.
Fix INIT_TASK_DATA_SECTION() to specify load offset like
all other output sections.
Reported-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We define a number of symbols in the linker scipt like this:
__start_syscalls_metadata = .;
*(__syscalls_metadata)
But we do not know the alignment of "." when we assign
the __start_syscalls_metadata symbol.
gcc started to uses bigger alignment for structs (32 bytes),
so we saw situations where the linker due to alignment
constraints increased the value of "." after the symbol assignment.
This resulted in boot fails.
Fix this by forcing a 32 byte alignment of "." before the
assignment.
This patch introduces the forced alignment for
ftrace_events and syscalls_metadata.
It may be required in more places.
Reported-by: Zeev Tarantov <zeev.tarantov@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <20100710063459.GA14596@merkur.ravnborg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Markers have been removed, but we forgot to remove their
section.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits)
kbuild: Revert part of e8d400a to resolve a conflict
kbuild: Fix checking of scm-identifier variable
gconfig: add support to show hidden options that have prompts
menuconfig: add support to show hidden options which have prompts
gconfig: remove show_debug option
gconfig: remove dbg_print_ptype() and dbg_print_stype()
kconfig: fix zconfdump()
kconfig: some small fixes
add random binaries to .gitignore
kbuild: Include gen_initramfs_list.sh and the file list in the .d file
kconfig: recalc symbol value before showing search results
.gitignore: ignore *.lzo files
headerdep: perlcritic warning
scripts/Makefile.lib: Align the output of LZO
kbuild: Generate modules.builtin in make modules_install
Revert "kbuild: specify absolute paths for cscope"
kbuild: Do not unnecessarily regenerate modules.builtin
headers_install: use local file handles
headers_check: fix perl warnings
export_report: fix perl warnings
...
Modify the way how RapidIO switch operations are declared. Multiple
assignments through the linker script replaced by single initialization
call.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add RapidIO Port-Write message handling in the context of Error
Management Extensions Specification Rev.1.3.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Tested-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
The next commit will require the use of MODULE_SYMBOL_PREFIX in
.tmp_exports-asm.S. Currently it is mixed in with C structure
definitions in "asm/module.h". Move the definition of this arch option
into Kconfig, so it can be easily accessed by any code.
This also lets modpost.c use the same definition. Previously modpost
relied on a hardcoded list of architectures in mk_elfconfig.c.
A build test for blackfin, one of the two MODULE_SYMBOL_PREFIX archs,
showed the generated code was unchanged. vmlinux was identical save
for build ids, and an apparently randomized suffix on a single "__key"
symbol in the kallsyms data).
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Mike Frysinger <vapier@gentoo.org> (blackfin)
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The old RW_DATA_SECTION had INIT_TASK_DATA (which was
more-than-PAGE_SIZE-aligned), followed by a bunch of small alignment
stuff, followed by more PAGE_SIZE-aligned stuff, so you wasted memory
in the middle of .data re-aligning back up to PAGE_SIZE.
This patch sorts the sections by alignment requirements, which should
pack them essentially optimally.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__start_mcount_loc[] is unused after init, yet occupies RAM forever
as part of .rodata. 152kiB is typical on a 64-bit architecture. Instead,
__start_mcount_loc should be in the interval [__init_begin, __init_end)
so that the space is reclaimed after init.
__start_mcount_loc[] is generated during the load portion
of kernel build, and is used only by ftrace_init(). ftrace_init is declared
'__init' and is in .init.text, which is freed after init.
__start_mcount_loc is placed into .rodata by a call to MCOUNT_REC inside
the RO_DATA macro of include/asm-generic/vmlinux.lds.h. The array *is*
read-only, but more importantly it is not used after init. So the call to
MCOUNT_REC should be moved from RO_DATA to INIT_DATA.
This patch has been tested on x86_64 with CONFIG_DEBUG_PAGEALLOC=y
which verifies that the address range never is accessed after init.
Signed-off-by: John Reiser <jreiser@BitWagon.com>
LKML-Reference: <4A6DF0B6.7080402@bitwagon.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Conflicts:
arch/sparc/kernel/smp_64.c
arch/x86/kernel/cpu/perf_counter.c
arch/x86/kernel/setup_percpu.c
drivers/cpufreq/cpufreq_ondemand.c
mm/percpu.c
Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
The BSS section macros in vmlinux.lds.h currently place the .sbss
input section outside the bounds of [__bss_start, __bss_end]. On all
architectures except for microblaze that handle both .sbss and
__bss_start/__bss_end, this is wrong: the .sbss input section is
within the range [__bss_start, __bss_end]. Relatedly, the example
code at the top of the file actually has __bss_start/__bss_end defined
twice; I believe the right fix here is to define them in the
BSS_SECTION macro but not in the BSS macro.
Another problem with the current macros is that several
architectures have an ALIGN(4) or some other small number just before
__bss_stop in their linker scripts. The BSS_SECTION macro currently
hardcodes this to 4; while it should really be an argument. It also
ignores its sbss_align argument; fix that.
mn10300 is the only user at present of any of the macros touched by
this patch. It looks like mn10300 actually was incorrectly converted
to use the new BSS() macro (the alignment of 4 prior to conversion was
a __bss_stop alignment, but the argument to the BSS macro is a start
alignment). So fix this as well.
I'd like acks from Sam and David on this one. Also CCing Paul, since
he has a patch from me which will need to be updated to use
BSS_SECTION(0, PAGE_SIZE, 4) once this gets merged.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Discarded sections in different archs share some commonality but have
considerable differences. This led to linker script for each arch
implementing its own /DISCARD/ definition, which makes maintaining
tedious and adding new entries error-prone.
This patch makes all linker scripts to move discard definitions to the
end of the linker script and use the common DISCARDS macro. As ld
uses the first matching section definition, archs can include default
discarded sections by including them earlier in the linker script.
ia64 is notable because it first throws away some ia64 specific
subsections and then include the rest of the sections into the final
image, so those sections must be discarded before the inclusion.
defconfig compile tested for x86, x86-64, powerpc, powerpc64, ia64,
alpha, sparc, sparc64 and s390. Michal Simek tested microblaze.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Cc: linux-arch@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Tony Luck <tony.luck@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: finally remove the obsolete variable $TOPDIR
gitignore: ignore scripts/ihex2fw
Kbuild: Disable the -Wformat-security gcc flag
gitignore: ignore gcov output files
kbuild: deb-pkg ship changelog
Add new __init_task_data macro to be used in arch init_task.c files.
asm-generic/vmlinux.lds.h: shuffle INIT_TASK* macro names in vmlinux.lds.h
Add new macros for page-aligned data and bss sections.
asm-generic/vmlinux.lds.h: Fix up RW_DATA_SECTION definition.
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes. As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.
Conflicts:
arch/alpha/include/asm/percpu.h
arch/mn10300/kernel/vmlinux.lds.S
include/linux/percpu-defs.h
The ctors section for each object file is eight byte aligned (on 64 bit).
However the __ctors_start symbol starts at an arbitrary address dependent
on the size of the previous sections.
Therefore the linker may add some zeroes after __ctors_start to make sure
the ctors contents are properly aligned. However the extra zeroes at the
beginning aren't expected by the code. When walking the functions
pointers contained in there and extra zeroes are added this may result in
random jumps. So make sure that the __ctors_start symbol is always
aligned as well.
Fixes this crash on an allyesconfig on s390:
[ 0.582482] Kernel BUG at 0000000000000012 [verbose debug info unavailable]
[ 0.582489] illegal operation: 0001 [#1] SMP DEBUG_PAGEALLOC
[ 0.582496] Modules linked in:
[ 0.582501] CPU: 0 Tainted: G W 2.6.31-rc1-dirty #273
[ 0.582506] Process swapper (pid: 1, task: 000000003f218000, ksp: 000000003f2238e8)
[ 0.582510] Krnl PSW : 0704200180000000 0000000000000012 (0x12)
[ 0.582518] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
[ 0.582524] Krnl GPRS: 0000000000036727 0000000000000010 0000000000000001 0000000000000001
[ 0.582529] 00000000001dfefa 0000000000000000 0000000000000000 0000000000000040
[ 0.582534] 0000000001fff0f0 0000000001790628 0000000002296048 0000000002296048
[ 0.582540] 00000000020c438e 0000000001786000 0000000002014a66 000000003f223e60
[ 0.582553] Krnl Code:>0000000000000012: 0000 unknown
[ 0.582559] 0000000000000014: 0000 unknown
[ 0.582564] 0000000000000016: 0000 unknown
[ 0.582570] 0000000000000018: 0000 unknown
[ 0.582575] 000000000000001a: 0000 unknown
[ 0.582580] 000000000000001c: 0000 unknown
[ 0.582585] 000000000000001e: 0000 unknown
[ 0.582591] 0000000000000020: 0000 unknown
[ 0.582596] Call Trace:
[ 0.582599] ([<0000000002014a46>] kernel_init+0x622/0x7a0)
[ 0.582607] [<0000000000113e22>] kernel_thread_starter+0x6/0xc
[ 0.582615] [<0000000000113e1c>] kernel_thread_starter+0x0/0xc
[ 0.582621] INFO: lockdep is turned off.
[ 0.582624] Last Breaking-Event-Address:
[ 0.582627] [<0000000002014a64>] kernel_init+0x640/0x7a0
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We recently added a INIT_TASK(align) in include/asm-generic/vmlinux.lds.h,
but there is already a macro INIT_TASK in include/linux/init_task.h, which
is quite confusing. We should switch the macro in the linker script to
INIT_TASK_DATA. (Sorry that I missed this in reviewing the patch). Since
the macros are new, there is only one user of the INIT_TASK in
vmlinux.lds.h, arch/mn10300/kernel/vmlinux.lds.S.
However, we are currently using INIT_TASK_DATA for laying down an entire
.data.init_task section. So rename that to INIT_TASK_DATA_SECTION.
I would be worried about changing the meaning of INIT_TASK_DATA, but the
old INIT_TASK_DATA implementation had no users, and in fact if anyone had
tried to use it, it would have failed to compile because it didn't pass
the alignment to the old INIT_TASK.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jesper Nilsson <Jesper.Nilsson@axis.com
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
RW_DATA_SECTION is defined to take 4 different alignment parameters,
while NOSAVE_DATA currently uses a fixed PAGE_SIZE alignment as noted
in the comments.
There are presently no in-tree users of this at present, and I just
stumbled across this while implementing the simplified script on a new
architecture port, which subsequently resulted in a syntax error.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
x86 throws away .discard section but no other archs do. Also,
.discard is not thrown away while linking modules. Make every arch
and module linking throw it away. This will be used to define dummy
variables for percpu declarations and definitions.
This patch is based on Ivan Kokshaysky's alpha percpu patch.
[ Impact: always throw away everything in .discard ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
In asm-generic/vmlinux.lds.h, name INIT_RAM_FS consistently, no matter the
setting of CONFIG_BLK_DEV_INITRD. This corrects:
commit ef53dae865
Author: Sam Ravnborg <sam@ravnborg.org>
Date: Sun Jun 7 20:46:37 2009 +0200
Subject: Improve vmlinux.lds.h support for arch specific linker scripts
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Call constructors (gcc-generated initcall-like functions) during kernel
start and module load. Constructors are e.g. used for gcov data
initialization.
Disable constructor support for usermode Linux to prevent conflicts with
host glibc.
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Li Wei <W.Li@Sun.COM>
Cc: Michael Ellerman <michaele@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (53 commits)
.gitignore: ignore *.lzma files
kbuild: add generic --set-str option to scripts/config
kbuild: simplify argument loop in scripts/config
kbuild: handle non-existing options in scripts/config
kallsyms: generalize text region handling
kallsyms: support kernel symbols in Blackfin on-chip memory
documentation: make version fix
kbuild: fix a compile warning
gitignore: Add GNU GLOBAL files to top .gitignore
kbuild: fix delay in setlocalversion on readonly source
README: fix misleading pointer to the defconf directory
vmlinux.lds.h update
kernel-doc: cleanup perl script
Improve vmlinux.lds.h support for arch specific linker scripts
kbuild: fix headers_exports with boolean expression
kbuild/headers_check: refine extern check
kbuild: fix "Argument list too long" error for "make headers_check",
ignore *.patch files
Remove bashisms from scripts
menu: fix embedded menu presentation
...
Updated after review by Tim Abbott.
- Use HEAD_TEXT_SECTION
- Drop use of section-names.h and delete file
- Introduce EXIT_CALL
Deleting section-names.h required a few simple
updates of init.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Tim Abbott <tabbott@ksplice.com>
To support alingment of the individual architecture specific linker scripts
provide a set of general definitions in vmlinux.lds.h
With these definitions applied the diverse linekr scripts can be reduced
in line count and their readability are improved - IMO.
A sample linker script is included to give the preferred
order of the sections for the architectures that do not
have any special requirments.
These definitions are also a first step towards eventual
support for -ffunction-sections.
The definitions makes it much easier to do a global
renaming of section names - but the main purpose is
to clean up the linker scripts.
Tim Aboot has provided a lot of inputs to improve
the definitions - all faults are mine.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Tim Abbott <tabbott@mit.edu>
- add .init.rodata to INIT_DATA, and group all initconst flavors
together
- move strings generated from __setup_param() into .init.rodata
- add .*init.rodata to modpost's sets of init sections
- make modpost warn about references between meminit and cpuinit
as well as memexit and cpuexit sections (as CPU and memory
hotplug are independently selectable features)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Merge reason: tracing/core was on a .30-rc1 base and was missing out on
on a handful of tracing fixes present in .30-rc5-almost.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The old refok sections
.text.init.refok
.data.init.refok
.exit.text.refok
have been deprecated since commit
312b1485fb. After the other patches in
this patch series nothing is put in these sections, so clean things up
by eliminating all the remaining references to them.
Signed-off-by: Tim Abbott <tabbott@mit.edu>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is preparation for replacing all uses of ".head.text" or
".text.head" in the kernel with macros, so that the section name can
later be changed without having to touch a lot of the kernel.
Since some linker scripts do more complex things than referencing
HEAD_TEXT, we add a HEAD_TEXT_SECTION macro that just contains the
actual name.
I've defined HEAD_TEXT_SECTION in a new header,
include/linux/section-names.h, so that this section name only needs to
appear in one place. I anticipate creating similar macro structures
for a number of other section names.
The long-term goal here is to be able to change the kernel's magic
section names to those that are compatible with -ffunction-sections
-fdata-sections. This requires renaming all magic sections with names
of the form ".text.foo".
Signed-off-by: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new config option, CONFIG_EVENT_TRACING that gets selected
when CONFIG_TRACING is selected and adds everything needed by the stuff
in trace_export - basically all the event tracing support needed by e.g.
bprint, minus the actual events, which are only included if
CONFIG_EVENT_TRACER is selected.
So CONFIG_EVENT_TRACER can be used to turn on or off the generated events
(what I think of as the 'event tracer'), while CONFIG_EVENT_TRACING turns
on or off the base event tracing support used by both the event tracer and
the other things such as bprint that can't be configured out.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1239178441.10295.34.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.
The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.
for example, enabled all debug messages in module 'nf_conntrack':
echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
to disable them:
echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
A further explanation can be found in the documentation patch.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: new feature
This adds the generic support for syscalls tracing. This is
currently exploited through a devoted tracer but other tracing
engines can use it. (They just have to play with
{start,stop}_ftrace_syscalls() and use the display callbacks
unless they want to override them.)
The syscalls prototypes definitions are abused here to steal
some metadata informations:
- syscall name, param types, param names, number of params
The syscall addr is not directly saved during this definition
because we don't know if its prototype is available in the
namespace. But we don't really need it. The arch has just to
build a function able to resolve the syscall number to its
metadata struct.
The current tracer prints the syscall names, parameters names
and values (and their types optionally). Currently the value is
a raw hex but higher level values diplaying is on my TODO list.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix kernel crash when using trace_printk()
trace_printk_fmt section is defined into the readonly section.
But we do:
trace_printk_fmt = fmt;
to fill in that table of format strings - which is not read-only.
Under CONFIG_DEBUG_RODATA=y this crashes ...
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add a generic printk() for tracing, like trace_printk()
trace_bprintk() uses the infrastructure to record events on ring_buffer.
[ fweisbec@gmail.com: ported to latest -tip, made it work if
!CONFIG_MODULES, never free the format strings from modules
because we can't keep track of them and conditionnaly create
the ftrace format strings section (reported by Steven Rostedt) ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch creates the event tracing infrastructure of ftrace.
It will create the files:
/debug/tracing/available_events
/debug/tracing/set_event
The available_events will list the trace points that have been
registered with the event tracer.
set_events will allow the user to enable or disable an event hook.
example:
# echo sched_wakeup > /debug/tracing/set_event
Will enable the sched_wakeup event (if it is registered).
# echo "!sched_wakeup" >> /debug/tracing/set_event
Will disable the sched_wakeup event (and only that event).
# echo > /debug/tracing/set_event
Will disable all events (notice the '>')
# cat /debug/tracing/available_events > /debug/tracing/set_event
Will enable all registered event hooks.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: fix linker screwup on x86_32
Recent x86_64 zerobased patches introduced PERCPU_VADDR() to put
.data.percpu to a predefined address and re-defined PERCPU() in terms
of it. The new macro defined one extra symbol, __per_cpu_load, for
LMA of the section so that the init data could be accessed. This new
symbol introduced the following problems to x86_32.
1. If __per_cpu_load is defined outside of .data.percpu as an absolute
symbol, relocation generation for relocatable kernel fails due to
absolute relocation.
2. If __per_cpu_load is put inside .data.percpu with absolute address
assignment to work around #1, linker gets confused and under
certain configurations ends up relocating the symbol against
.data.percpu such that the load address gets added on top of
already set load address.
As x86_32 doesn't use predefined address for .data.percpu, there's no
need for it to care about the possibility of __per_cpu_load being
different from __per_cpu_start.
This patch defines PERCPU() separately so that __per_cpu_load is
defined inside .data.percpu so that everything is ordinary
linking-wise.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 5a611268b6.
It is causing occasional boot crashes, caused by certain
linker versions (GNU ld version 2.18.50.0.6-2 20080403) messing up:
82dcc000 D __per_cpu_load
c16e6000 A __per_cpu_load_abs
The __per_cpu_load value is out of whack. Hpa noticed the following
detail:
* (gdb) p/x -(0xc16e6000-0x82dcc000)
* $2 = 0xc16e6000
* I.e. one is the other << 1
The two symbols should be equal.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes this linker error:
WARNING: Absolute relocations present
Offset Info Type Sym.Value Sym.Name
c0a4e07d 00e78001 R_386_32 c0ab0000 __per_cpu_load
Now, __per_cpu_load is a section-relative symbol:
c0aa4000 D __per_cpu_load
c0aa4000 A __per_cpu_load_abs
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The newly added PERCPU_*() macros define and use __per_cpu_load but
VMLINUX_SYMBOL() was missing from usages causing build failures on
archs where linker visible symbol is different from C symbols
(e.g. blackfin).
Signed-off-by: Tejun Heo <tj@kernel.org>
[ Based on original patch from Christoph Lameter and Mike Travis. ]
Currently pdas and percpu areas are allocated separately. %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.
Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40. To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.
After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda. This means that setup_per_cpu_areas() need
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().
This patch also removes now unnecessary get_local_pda() and its call
sites.
A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ Based on original patch from Christoph Lameter and Mike Travis. ]
This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S. A new
PHDR is added as existing ones cannot contain sections near address
zero. PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.
The following adjustments have been made to accomodate the address
change.
* code to locate percpu gdt_page in head_64.S is updated to add the
load address to the gdt_page offset.
* __per_cpu_load is used in places where access to the init data area
is necessary.
* pda->data_offset is initialized soon after C code is entered as zero
value doesn't work anymore.
This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: let the function-graph-tracer be aware of the irq entrypoints
Add a new .irqentry.text section to store the irq entrypoints functions
inside the same section. This way, the tracer will be able to signal
an interrupts triggering on output by recognizing these entrypoints.
Also, make this section recordable for dynamic tracing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature to profile if statements
This patch adds a branch profiler for all if () statements.
The results will be found in:
/debugfs/tracing/profile_branch
For example:
miss hit % Function File Line
------- --------- - -------- ---- ----
0 1 100 x86_64_start_reservations head64.c 127
0 1 100 copy_bootdata head64.c 69
1 0 0 x86_64_start_kernel head64.c 111
32 0 0 set_intr_gate desc.h 319
1 0 0 reserve_ebda_region head.c 51
1 0 0 reserve_ebda_region head.c 47
0 1 100 reserve_ebda_region head.c 42
0 0 X maxcpus main.c 165
Miss means the branch was not taken. Hit means the branch was taken.
The percent is the percentage the branch was taken.
This adds a significant amount of overhead and should only be used
by those analyzing their system.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up to make one profiler of like and unlikely tracer
The likely and unlikely profiler prints out the file and line numbers
of the annotated branches that it is profiling. It shows the number
of times it was correct or incorrect in its guess. Having two
different files or sections for that matter to tell us if it was a
likely or unlikely is pretty pointless. We really only care if
it was correct or not.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: API *CHANGE*. Must update all tracepoint users.
Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
structure in a single spot for all the kernel. It helps reducing memory
consumption, especially when declaring a lot of tracepoints, e.g. for
kmalloc tracing.
*API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
to do it. The name previously used was misleading.
Updates scheduler instrumentation to follow this API change.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: name change of unlikely tracer and profiler
Ingo Molnar suggested changing the config from UNLIKELY_PROFILE
to BRANCH_PROFILING. I never did like the "unlikely" name so I
went one step farther, and renamed all the unlikely configurations
to a "BRANCH" variant.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new unlikely/likely profiler
Andrew Morton recently suggested having an in-kernel way to profile
likely and unlikely macros. This patch achieves that goal.
When configured, every(*) likely and unlikely macro gets a counter attached
to it. When the condition is hit, the hit and misses of that condition
are recorded. These numbers can later be retrieved by:
/debugfs/tracing/profile_likely - All likely markers
/debugfs/tracing/profile_unlikely - All unlikely markers.
# cat /debug/tracing/profile_unlikely | head
correct incorrect % Function File Line
------- --------- - -------- ---- ----
2167 0 0 do_arch_prctl process_64.c 832
0 0 0 do_arch_prctl process_64.c 804
2670 0 0 IS_ERR err.h 34
71230 5693 7 __switch_to process_64.c 673
76919 0 0 __switch_to process_64.c 639
43184 33743 43 __switch_to process_64.c 624
12740 64181 83 __switch_to process_64.c 594
12740 64174 83 __switch_to process_64.c 590
# cat /debug/tracing/profile_unlikely | \
awk '{ if ($3 > 25) print $0; }' |head -20
44963 35259 43 __switch_to process_64.c 624
12762 67454 84 __switch_to process_64.c 594
12762 67447 84 __switch_to process_64.c 590
1478 595 28 syscall_get_error syscall.h 51
0 2821 100 syscall_trace_leave ptrace.c 1567
0 1 100 native_smp_prepare_cpus smpboot.c 1237
86338 265881 75 calc_delta_fair sched_fair.c 408
210410 108540 34 calc_delta_mine sched.c 1267
0 54550 100 sched_info_queued sched_stats.h 222
51899 66435 56 pick_next_task_fair sched_fair.c 1422
6 10 62 yield_task_fair sched_fair.c 982
7325 2692 26 rt_policy sched.c 144
0 1270 100 pre_schedule_rt sched_rt.c 1261
1268 48073 97 pick_next_task_rt sched_rt.c 884
0 45181 100 sched_info_dequeued sched_stats.h 177
0 15 100 sched_move_task sched.c 8700
0 15 100 sched_move_task sched.c 8690
53167 33217 38 schedule sched.c 4457
0 80208 100 sched_info_switch sched_stats.h 270
30585 49631 61 context_switch sched.c 2619
# cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
39900 36577 47 pick_next_task sched.c 4397
20824 15233 42 switch_mm mmu_context_64.h 18
0 7 100 __cancel_work_timer workqueue.c 560
617 66484 99 clocksource_adjust timekeeping.c 456
0 346340 100 audit_syscall_exit auditsc.c 1570
38 347350 99 audit_get_context auditsc.c 732
0 345244 100 audit_syscall_entry auditsc.c 1541
38 1017 96 audit_free auditsc.c 1446
0 1090 100 audit_alloc auditsc.c 862
2618 1090 29 audit_alloc auditsc.c 858
0 6 100 move_masked_irq migration.c 9
1 198 99 probe_sched_wakeup trace_sched_switch.c 58
2 2 50 probe_wakeup trace_sched_wakeup.c 227
0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144
4514 2090 31 __grab_cache_page filemap.c 2149
12882 228786 94 mapping_unevictable pagemap.h 50
4 11 73 __flush_cpu_slab slub.c 1466
627757 330451 34 slab_free slub.c 1731
2959 61245 95 dentry_lru_del_init dcache.c 153
946 1217 56 load_elf_binary binfmt_elf.c 904
102 82 44 disk_put_part genhd.h 206
1 1 50 dst_gc_task dst.c 82
0 19 100 tcp_mss_split_point tcp_output.c 1126
As you can see by the above, there's a bit of work to do in rethinking
the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
that were more than 25%.
Note: After submitting my first version of this patch, Andrew Morton
showed me a version written by Daniel Walker, where I picked up
the following ideas from:
1) Using __builtin_constant_p to avoid profiling fixed values.
2) Using __FILE__ instead of instruction pointers.
3) Using the preprocessor to stop all profiling of likely
annotations from vsyscall_64.c.
Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
for their feed back on this patch.
(*) Not ever unlikely is recorded, those that are used by vsyscalls
(a few of them) had to have profiling disabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Base infrastructure to enable per-module debug messages.
I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently, <debugfs>/dynamic_printk/modules. When, CONFIG_DYNAMIC_PRINTK_DEBUG,
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
affect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.
The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.
Future plans include extending this functionality to subsystems, that define
their own debug levels and flags.
Usage:
Dynamic debugging is controlled by the debugfs file,
<debugfs>/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:
<module_name> <enabled=0/1>
.
.
.
<module_name> : Name of the module in which the debug call resides
<enabled=0/1> : whether the messages are enabled or not
For example:
snd_hda_intel enabled=0
fixup enabled=1
driver enabled=0
Enable a module:
$echo "set enabled=1 <module_name>" > dynamic_printk/modules
Disable a module:
$echo "set enabled=0 <module_name>" > dynamic_printk/modules
Enable all modules:
$echo "set enabled=1 all" > dynamic_printk/modules
Disable all modules:
$echo "set enabled=0 all" > dynamic_printk/modules
Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.
[gkh: minor cleanups and tweaks to make the build work quietly]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch creates a section in the kernel called "__mcount_loc".
This will hold a list of pointers to the mcount relocation for
each call site of mcount.
For example:
objdump -dr init/main.o
[...]
Disassembly of section .text:
0000000000000000 <do_one_initcall>:
0: 55 push %rbp
[...]
000000000000017b <init_post>:
17b: 55 push %rbp
17c: 48 89 e5 mov %rsp,%rbp
17f: 53 push %rbx
180: 48 83 ec 08 sub $0x8,%rsp
184: e8 00 00 00 00 callq 189 <init_post+0xe>
185: R_X86_64_PC32 mcount+0xfffffffffffffffc
[...]
We will add a section to point to each function call.
.section __mcount_loc,"a",@progbits
[...]
.quad .text + 0x185
[...]
The offset to of the mcount call site in init_post is an offset from
the start of the section, and not the start of the function init_post.
The mcount relocation is at the call site 0x185 from the start of the
.text section.
.text + 0x185 == init_post + 0xa
We need a way to add this __mcount_loc section in a way that we do not
lose the relocations after final link. The .text section here will
be attached to all other .text sections after final link and the
offsets will be meaningless. We need to keep track of where these
.text sections are.
To do this, we use the start of the first function in the section.
do_one_initcall. We can make a tmp.s file with this function as a reference
to the start of the .text section.
.section __mcount_loc,"a",@progbits
[...]
.quad do_one_initcall + 0x185
[...]
Then we can compile the tmp.s into a tmp.o
gcc -c tmp.s -o tmp.o
And link it into back into main.o.
ld -r main.o tmp.o -o tmp_main.o
mv tmp_main.o main.o
But we have a problem. What happens if the first function in a section
is not exported, and is a static function. The linker will not let
the tmp.o use it. This case exists in main.o as well.
Disassembly of section .init.text:
0000000000000000 <set_reset_devices>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: e8 00 00 00 00 callq 9 <set_reset_devices+0x9>
5: R_X86_64_PC32 mcount+0xfffffffffffffffc
The first function in .init.text is a static function.
00000000000000a8 t __setup_set_reset_devices
000000000000105f t __setup_str_set_reset_devices
0000000000000000 t set_reset_devices
The lowercase 't' means that set_reset_devices is local and is not exported.
If we simply try to link the tmp.o with the set_reset_devices we end
up with two symbols: one local and one global.
.section __mcount_loc,"a",@progbits
.quad set_reset_devices + 0x10
00000000000000a8 t __setup_set_reset_devices
000000000000105f t __setup_str_set_reset_devices
0000000000000000 t set_reset_devices
U set_reset_devices
We still have an undefined reference to set_reset_devices, and if we try
to compile the kernel, we will end up with an undefined reference to
set_reset_devices, or even worst, it could be exported someplace else,
and then we will have a reference to the wrong location.
To handle this case, we make an intermediate step using objcopy.
We convert set_reset_devices into a global exported symbol before linking
it with tmp.o and set it back afterwards.
00000000000000a8 t __setup_set_reset_devices
000000000000105f t __setup_str_set_reset_devices
0000000000000000 T set_reset_devices
00000000000000a8 t __setup_set_reset_devices
000000000000105f t __setup_str_set_reset_devices
0000000000000000 T set_reset_devices
00000000000000a8 t __setup_set_reset_devices
000000000000105f t __setup_str_set_reset_devices
0000000000000000 t set_reset_devices
Now we have a section in main.o called __mcount_loc that we can place
somewhere in the kernel using vmlinux.ld.S and access it to convert
all these locations that call mcount into nops before starting SMP
and thus, eliminating the need to do this with kstop_machine.
Note, A well documented perl script (scripts/recordmcount.pl) is used
to do all this in one location.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implementation of kernel tracepoints. Inspired from the Linux Kernel
Markers. Allows complete typing verification by declaring both tracing
statement inline functions and probe registration/unregistration static
inline functions within the same macro "DEFINE_TRACE". No format string
is required. See the tracepoint Documentation and Samples patches for
usage examples.
Taken from the documentation patch :
"A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty (checking
a condition for a branch) and space penalty (adding a few bytes for the
function call at the end of the instrumented function and adds a data
structure in a separate section). When a tracepoint is "on", the
function you provide is called each time the tracepoint is executed, in
the execution context of the caller. When the function provided ends its
execution, it returns to the caller (continuing from the tracepoint
site).
You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters, which
prototypes are described in a tracepoint declaration placed in a header
file."
Addition and removal of tracepoints is synchronized by RCU using the
scheduler (and preempt_disable) as guarantees to find a quiescent state
(this is really RCU "classic"). The update side uses rcu_barrier_sched()
with call_rcu_sched() and the read/execute side uses
"preempt_disable()/preempt_enable()".
We make sure the previous array containing probes, which has been
scheduled for deletion by the rcu callback, is indeed freed before we
proceed to the next update. It therefore limits the rate of modification
of a single tracepoint to one update per RCU period. The objective here
is to permit fast batch add/removal of probes on _different_
tracepoints.
Changelog :
- Use #name ":" #proto as string to identify the tracepoint in the
tracepoint table. This will make sure not type mismatch happens due to
connexion of a probe with the wrong type to a tracepoint declared with
the same name in a different header.
- Add tracepoint_entry_free_old.
- Change __TO_TRACE to get rid of the 'i' iterator.
Masami Hiramatsu <mhiramat@redhat.com> :
Tested on x86-64.
Performance impact of a tracepoint : same as markers, except that it
adds about 70 bytes of instructions in an unlikely branch of each
instrumented function (the for loop, the stack setup and the function
call). It currently adds a memory read, a test and a conditional branch
at the instrumentation site (in the hot path). Immediate values will
eventually change this into a load immediate, test and branch, which
removes the memory read which will make the i-cache impact smaller
(changing the memory read for a load immediate removes 3-4 bytes per
site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
also saves the d-cache hit).
About the performance impact of tracepoints (which is comparable to
markers), even without immediate values optimizations, tests done by
Hideo Aoki on ia64 show no regression. His test case was using hackbench
on a kernel where scheduler instrumentation (about 5 events in code
scheduler code) was added.
Quoting Hideo Aoki about Markers :
I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
tree, which includes several markers for LTTng, using an ia64 server.
While the immediate trace mark feature isn't implemented on ia64, there
is no major performance regression. So, I think that we don't have any
issues to propose merging marker point patches into Linus's tree from
the viewpoint of performance impact.
I prepared two kernels to evaluate. The first one was compiled without
CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS.
I downloaded the original hackbench from the following URL:
http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
I ran hackbench 5 times in each condition and calculated the average and
difference between the kernels.
The parameter of hackbench: every 50 from 50 to 800
The number of CPUs of the server: 2, 4, and 8
Below is the results. As you can see, major performance regression
wasn't found in any case. Even if number of processes increases,
differences between marker-enabled kernel and marker- disabled kernel
doesn't increase. Moreover, if number of CPUs increases, the differences
doesn't increase either.
Curiously, marker-enabled kernel is better than marker-disabled kernel
in more than half cases, although I guess it comes from the difference
of memory access pattern.
* 2 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 4.811 | 4.872 | +0.061 | +1.27 |
100 | 9.854 | 10.309 | +0.454 | +4.61 |
150 | 15.602 | 15.040 | -0.562 | -3.6 |
200 | 20.489 | 20.380 | -0.109 | -0.53 |
250 | 25.798 | 25.652 | -0.146 | -0.56 |
300 | 31.260 | 30.797 | -0.463 | -1.48 |
350 | 36.121 | 35.770 | -0.351 | -0.97 |
400 | 42.288 | 42.102 | -0.186 | -0.44 |
450 | 47.778 | 47.253 | -0.526 | -1.1 |
500 | 51.953 | 52.278 | +0.325 | +0.63 |
550 | 58.401 | 57.700 | -0.701 | -1.2 |
600 | 63.334 | 63.222 | -0.112 | -0.18 |
650 | 68.816 | 68.511 | -0.306 | -0.44 |
700 | 74.667 | 74.088 | -0.579 | -0.78 |
750 | 78.612 | 79.582 | +0.970 | +1.23 |
800 | 85.431 | 85.263 | -0.168 | -0.2 |
--------------------------------------------------------------
* 4 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.586 | 2.584 | -0.003 | -0.1 |
100 | 5.254 | 5.283 | +0.030 | +0.56 |
150 | 8.012 | 8.074 | +0.061 | +0.76 |
200 | 11.172 | 11.000 | -0.172 | -1.54 |
250 | 13.917 | 14.036 | +0.119 | +0.86 |
300 | 16.905 | 16.543 | -0.362 | -2.14 |
350 | 19.901 | 20.036 | +0.135 | +0.68 |
400 | 22.908 | 23.094 | +0.186 | +0.81 |
450 | 26.273 | 26.101 | -0.172 | -0.66 |
500 | 29.554 | 29.092 | -0.461 | -1.56 |
550 | 32.377 | 32.274 | -0.103 | -0.32 |
600 | 35.855 | 35.322 | -0.533 | -1.49 |
650 | 39.192 | 38.388 | -0.804 | -2.05 |
700 | 41.744 | 41.719 | -0.025 | -0.06 |
750 | 45.016 | 44.496 | -0.520 | -1.16 |
800 | 48.212 | 47.603 | -0.609 | -1.26 |
--------------------------------------------------------------
* 8 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.094 | 2.072 | -0.022 | -1.07 |
100 | 4.162 | 4.273 | +0.111 | +2.66 |
150 | 6.485 | 6.540 | +0.055 | +0.84 |
200 | 8.556 | 8.478 | -0.078 | -0.91 |
250 | 10.458 | 10.258 | -0.200 | -1.91 |
300 | 12.425 | 12.750 | +0.325 | +2.62 |
350 | 14.807 | 14.839 | +0.032 | +0.22 |
400 | 16.801 | 16.959 | +0.158 | +0.94 |
450 | 19.478 | 19.009 | -0.470 | -2.41 |
500 | 21.296 | 21.504 | +0.208 | +0.98 |
550 | 23.842 | 23.979 | +0.137 | +0.57 |
600 | 26.309 | 26.111 | -0.198 | -0.75 |
650 | 28.705 | 28.446 | -0.259 | -0.9 |
700 | 31.233 | 31.394 | +0.161 | +0.52 |
750 | 34.064 | 33.720 | -0.344 | -1.01 |
800 | 36.320 | 36.114 | -0.206 | -0.57 |
--------------------------------------------------------------
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: 'Peter Zijlstra' <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
ARCH=h8300:
init/main.c:781: undefined reference to `___early_initcall_end'
Same problem have
__start___bug_table
__stop___bug_table
__tracedata_start
__tracedata_end
__per_cpu_start
__per_cpu_end
When defining a symbol in vmlinux.lds, use the VMLINUX_SYMBOL macro.
VMLINUX_SYMBOL adds a prefix charactor.
You can't just use straight symbol names in common header files as they
dont take into consideration weird arch-specific ABI conventions. in the
case of Blackfin/h8300, the ABI dictates that any C-visible symbols have
an underscore prefixed to them. Thus all symbols in vmlinux.lds.h need to
be wrapped in VMLINUX_SYMBOL() so that each arch can put hide this magic
in their own files.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Mike Frysinger" <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (25 commits)
setlocalversion: do not describe if there is nothing to describe
kconfig: fix typos: "Suport" -> "Support"
kconfig: make defconfig is no longer chatty
kconfig: make oldconfig is now less chatty
kconfig: speed up all*config + randconfig
kconfig: set all new symbols automatically
kconfig: add diffconfig utility
kbuild: remove Module.markers during mrproper
kbuild: sparse needs CF not CHECKFLAGS
kernel-doc: handle/strip __init
vmlinux.lds: move __attribute__((__cold__)) functions back into final .text section
init: fix URL of "The GNU Accounting Utilities"
kbuild: add arch/$ARCH/include to search path
kbuild: asm symlink support for arch/$ARCH/include
kbuild: support arch/$ARCH/include for tags, cscope
kbuild: prepare headers_* for arch/$ARCH/include
kbuild: install all headers when arch is changed
kbuild: make clean removes *.o.* as well
kbuild: optimize headers_* targets
kbuild: only one call for include/ in make headers_*
...