Impact: improve workqueue tracer output
Currently, /sys/kernel/debug/tracing/trace_stat/workqueues can display
wrong and strange thread names.
Why?
Currently, ftrace has tracing_record_cmdline()/trace_find_cmdline()
convenience function that implements a task->comm string cache.
This can avoid unnecessary memcpy overhead and the workqueue tracer
uses it.
However, in general, any trace statistics feature shouldn't use
tracing_record_cmdline() because trace statistics can display
very old process. Then comm cache can return wrong string because
recent process overrides the cache.
Fortunately, workqueue trace guarantees that displayed processes
are live. Thus we can search comm string from PID at display time.
<before>
% cat workqueues
# CPU INSERTED EXECUTED NAME
# | | | |
7 431913 431913 kondemand/7
7 0 0 tail
7 21 21 git
7 0 0 ls
7 9 9 cat
7 832632 832632 unix_chkpwd
7 236292 236292 ls
Note: tail, git, ls, cat unix_chkpwd are obiously not workqueue thread.
<after>
% cat workqueues
# CPU INSERTED EXECUTED NAME
# | | | |
7 510 510 kondemand/7
7 0 0 kmpathd/7
7 15 15 ata/7
7 0 0 aio/7
7 11 11 kblockd/7
7 1063 1063 work_on_cpu/7
7 167 167 events/7
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: micro-optimization
trace_printk() does this unconditionally:
trace_printk_fmt = fmt;
Where trace_printk_fmt is an entry into a global array. This is
very SMP-unfriendly.
So only write it once per bootup.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix kernel crash when using trace_printk()
trace_printk_fmt section is defined into the readonly section.
But we do:
trace_printk_fmt = fmt;
to fill in that table of format strings - which is not read-only.
Under CONFIG_DEBUG_RODATA=y this crashes ...
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
x86, bts: remove bad warning
x86: add Dell XPS710 reboot quirk
x86, math-emu: fix init_fpu for task != current
x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP
x86: fix DMI on EFI
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (28 commits)
Blackfin arch: SPI_MMC is now mainlined MMC_SPI
Blackfin arch: disable legacy /proc/scsi/ support by default
Blackfin arch: remove duplicated ANOMALY_05000448 ifdef check
Blackfin arch: add stubs for anomalies 447 and 448
Blackfin arch: cleanup bfin_sport.h header and export it to userspace
Blackfin arch: fix bug - gdb signull case make trunk kernel panic frequently
Blackfin arch: remove spurious dash when dcache is off
Blackfin arch: mark init_pda as __init as only __init funcs all it
Blackfin arch: fix bug - On bf548-ezkit, ethernet fails to work after wakeup from "mem"
Blackfin arch: Random read/write errors are a bad thing
Blackfin arch: update default kernel config, select KSZ8893M driver for BF518
Blackfin arch: Fix bug - KGDB single step into the middle of a 4 bytes instruction on bf561 after soft bp is hit
Blackfin arch: Fix bug - make ksz8893m driver available when bfin_mac is enabled
Blackfin arch: make sure people do not set the kernel load address too high
Blackfin arch: fix bug - The SPORT_HYS bit is not set for BF561 0.5
Blackfin arch: update anomaly sheets to match latest public info
Blackfin arch: Fix BUG - kernel fails to build in pm.c when allow wakeup fromi standby by GPIO
Blackfin arch: PM_BFIN_WAKE_GP: update help
Blackfin arch: fix bug - kgdb fails to continue after setting breakpoint on bf561-ezkit kernel with smp patch
Blackfin arch: Enable Write Back Cache on all Blackfin Boards
...
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmatest: fix use after free in dmatest_exit
ipu_idmac: fix spinlock type
iop-adma, mv_xor: fix mem leak on self-test setup failure
fsldma: fix off by one in dma_halt
I/OAT: fail self-test if callback test reaches timeout
I/OAT: update driver version and copyright dates
I/OAT: list usage cleanup
I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
I/OAT: cancel watchdog before dma remove
I/OAT: fail initialization on zero channels detection
I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
dmaengine: update kerneldoc
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ata: add CFA specific identify data words
remove stale comment from <linux/hdreg.h>
AT91: initialize Compact Flash on AT91SAM9263 cpu
ide: add at91_ide driver
ide: allow to wrap interrupt handler
ide-iops: fix odd-length ATAPI PIO transfers
ide: NULL noise: drivers/ide/ide-*.c
ide: expiry() returns int, negative expiry() return values won't be noticed
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Don't trust current capacity values in identify words 57-58
libata: make sure port is thawed when skipping resets
sata_nv: fix module parameter description
ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
libata: don't use on-stack sense buffer
libata: align ap->sector_buf
libata: fix dma_unmap_sg misuse
libata: change drive ready wait after hard reset to 5s
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: frag_size should be signed, as it can hold an error result
Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB
Squashfs: Fix oops when reading fsfuzzer corrupted filesystems
* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix headphone-detect regression with multiple HP jacks
ALSA: hda - Fix typos in slave controls in patch_sigmatel.c
This is a build fix required after "x86-64: seccomp: fix 32/64 syscall
hole" (commit 5b1017404a). MIPS doesn't
have the issue that was fixed for x86-64 by that patch.
This also doesn't solve the N32 issue which is that N32 seccomp processes
will be treated as non-compat processes thus only have access to N64
syscalls.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There isn't a trace_max_latency file, there is tracing_max_latency.
Fix it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <20090307235409.5A87.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit 0d3e0460f3
"MMC: CSD and CID timeout values" inadvertently broke
the timeout for the MMC command SEND_EXT_CSD.
This patch puts it back again.
Depending on the characteristics of the controller,
this bug may prevent the use of MMC cards.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.
Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
We recently discovered a problem with passing of DMA attributes on SN
systems with the older PIC chips.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Cc: <habeck@sgi.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Impact: cleanup
Remove a few leftovers and clean up the code a bit.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: faster and lighter tracing
Now that we have trace_bprintk() which is faster and consume lesser
memory than trace_printk() and has the same purpose, we can now drop
the old implementation in favour of the binary one from trace_bprintk(),
which means we move all the implementation of trace_bprintk() to
trace_printk(), so the Api doesn't change except that we must now use
trace_seq_bprintk() to print the TRACE_PRINT entries.
Some changes result of this:
- Previously, trace_bprintk depended of a single tracer and couldn't
work without. This tracer has been dropped and the whole implementation
of trace_printk() (like the module formats management) is now integrated
in the tracing core (comes with CONFIG_TRACING), though we keep the file
trace_printk (previously trace_bprintk.c) where we can find the module
management. Thus we don't overflow trace.c
- changes some parts to use trace_seq_bprintk() to print TRACE_PRINT entries.
- change a bit trace_printk/trace_vprintk macros to support non-builtin formats
constants, and fix 'const' qualifiers warnings. But this is all transparent for
developers.
- etc...
V2:
- Rebase against last changes
- Fix mispell on the changelog
V3:
- Rebase against last changes (moving trace_printk() to kernel.h)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add a generic printk() for tracing, like trace_printk()
trace_bprintk() uses the infrastructure to record events on ring_buffer.
[ fweisbec@gmail.com: ported to latest -tip, made it work if
!CONFIG_MODULES, never free the format strings from modules
because we can't keep track of them and conditionnaly create
the ftrace format strings section (reported by Steven Rostedt) ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: save on memory for tracing
Current tracers are typically using a struct(like struct ftrace_entry,
struct ctx_switch_entry, struct special_entr etc...)to record a binary
event. These structs can only record a their own kind of events.
A new kind of tracer need a new struct and a lot of code too handle it.
So we need a generic binary record for events. This infrastructure
is for this purpose.
[fweisbec@gmail.com: rebase against latest -tip, make it safe while sched
tracing as reported by Steven Rostedt]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
An new optimization is making its way to ftrace. Its purpose is to
make trace_printk() consuming less memory and be faster.
Written by Lai Jiangshan, the approach is to delay the formatting
job from tracing time to output time.
Currently, a call to trace_printk() will format the whole string and
insert it into the ring buffer. Then you can read it on /debug/tracing/trace
file.
The new implementation stores the address of the format string and
the binary parameters into the ring buffer, making the packet more compact
and faster to insert.
Later, when the user exports the traces, the format string is retrieved
with the binary parameters and the formatting job is eventually done.
The new implementation rewrites a lot of format decoding bits from
vsnprintf() function, making now 3 differents functions to maintain
in their duplicated parts of printf format decoding bits.
Suggested by Ingo Molnar, this patch tries to factorize the most
possible common bits from these functions.
The real common part between them is the format decoding. Although
they do somewhat similar jobs, their way to export or import the parameters
is very different. Thus, only the decoding layer is extracted, unless you see
other parts that could be worth factorized.
Changes in V2:
- Address a suggestion from Linus to group the format_decode() parameters inside
a structure.
Changes in v3:
- Address other cleanups suggested by Ingo and Linus such as passing the
printf_spec struct to the format helpers: pointer()/number()/string()
Note that this struct is passed by copy and not by address. This is to
avoid side effects because these functions often change these values and the
changes shoudn't be persistant when a callee helper returns.
It would be too risky.
- Various cleanups (code alignement, switch/case instead of if/else fountains).
- Fix a bug that printed the first format specifier following a %p
Changes in v4:
- drop unapropriate const qualifier loss while casting fmt to a char *
(thanks to Vegard Nossum for having pointed this out).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236356510-8381-6-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new APIs for binary trace printk infrastructure
vbin_printf(): write args to binary buffer, string is copied
when "%s" is occurred.
bstr_printf(): read from binary buffer for args and format a string
[fweisbec@gmail.com: rebase]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1236356510-8381-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use fixmaps instead of vmap/vunmap in text_poke() for avoiding
page allocation and delayed unmapping.
At the result of above change, text_poke() becomes atomic and can be called
from stop_machine() etc.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <49B14352.2040705@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use the mutual exclusion provided by the text edit lock in alternatives code.
Since alternative_smp_* will be called from module init code, etc,
we'd better protect it from other subsystems.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <49B14332.9030109@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use the mutual exclusion provided by the text edit lock in the kprobes code. It
allows coherent manipulation of the kernel code by other subsystems.
Changelog:
Move the kernel_text_lock/unlock out of the for loops.
Use text_mutex directly instead of a function.
Remove whitespace modifications.
(note : kprobes_mutex is always taken outside of text_mutex)
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <49B14306.2080202@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is an architecture independant synchronization around kernel text
modifications through use of a global mutex.
A mutex has been chosen so that kprobes, the main user of this, can sleep
during memory allocation between the memory read of the instructions it
must replace and the memory write of the breakpoint.
Other user of this interface: immediate values.
Paravirt and alternatives are always done when SMP is inactive, so there
is no need to use locks.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <49B142D8.7020601@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration to be written.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.
Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.
Remove the bad warning. I will add it again when Oleg's changes are in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix
The 'struct power_trace' definition is needed (for the event tracer) even if
the power-tracer plugin is turned off in the .config.
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <20090306104106.GF31042@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix deadlock while using set_ftrace_pid
Reproducer:
# cd /sys/kernel/debug/tracing
# echo $$ > set_ftrace_pid
then, console becomes hung.
Details:
when writing set_ftracepid, kernel callstack is following
ftrace_pid_write()
mutex_lock(&ftrace_lock);
ftrace_update_pid_func()
mutex_lock(&ftrace_lock);
mutex_unlock(&ftrace_lock);
mutex_unlock(&ftrace_lock);
then, system always deadlocks when ftrace_pid_write() is called.
In past days, ftrace_pid_write() used ftrace_start_lock, but
commit e6ea44e9b4 consolidated
ftrace_start_lock to ftrace_lock.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <20090306151155.0778.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The recent changes over the DAC detection mechanism in patch_sigmatel.c
breaks the HP detection on the machines with multiple HP jacks.
It's basically because of the workaround to support the multi-channel
output. Since the HP detection is more important feature, disable
the HP-swap workaroud temporarily.
Reference: Novell bnc#482052
https://bugzilla.novell.com/show_bug.cgi?id=482052
Signed-off-by: Takashi Iwai <tiwai@suse.de>
"Headphone Playback ..." appears twice in slave_vols[] and slave_sws[].
They should be "Headphone Playback2 ..."
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 1e42807918 introduced a bug where we
don't get front/back segment sizes in the bio in blk_recount_segments().
Fix this by tracking the back bio as well as the front bio in
__blk_recalc_rq_segments(), this also cleans up the interface by getting
rid of the segment size pointer passing.
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Impact: allow user apps to read binary format of basic ftrace entries
Currently, only defined raw events export their formats so a binary
reader can parse them. There's no reason that the default ftrace entries
can't export their formats.
This patch adds a subsystem called "ftrace" in the events directory
that includes the ftrace entries for basic ftrace recorded items.
These only have three files in the events directory:
type : printf
available_types : printf
format : format for the event entry
For example:
# cat /debug/tracing/events/ftrace/wakeup/format
name: wakeup
ID: 3
format:
field:unsigned char type; offset:0; size:1;
field:unsigned char flags; offset:1; size:1;
field:unsigned char preempt_count; offset:2; size:1;
field:int pid; offset:4; size:4;
field:int tgid; offset:8; size:4;
field:unsigned int prev_pid; offset:12; size:4;
field:unsigned char prev_prio; offset:16; size:1;
field:unsigned char prev_state; offset:17; size:1;
field:unsigned int next_pid; offset:20; size:4;
field:unsigned char next_prio; offset:24; size:1;
field:unsigned char next_state; offset:25; size:1;
field:unsigned int next_cpu; offset:28; size:4;
print fmt: "%u:%u:%u ==+ %u:%u:%u [%03u]"
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Impact: clean up
Move the macro that creates the event format file to a separate header.
This will allow the default ftrace events to use this same macro
to create the formats to read those events.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>