Commit Graph

11 Commits

Author SHA1 Message Date
Emilio G. Cota
fe9959a275 qsp: QEMU's Synchronization Profiler
The goal of this module is to profile synchronization primitives (i.e.
mutexes, recursive mutexes and condition variables) so that scalability
issues can be quickly diagnosed.

Sync primitives are profiled by QSP based on the vaddr of the object accessed
as well as the call site (file:line_nr). That means the same object called
from two different call sites will be tracked in separate entries, which
might be reported together or separately (see subsequent commit on
call site coalescing).

Some perf numbers:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Command: taskset -c 0 tests/atomic_add-bench -d 5 -m

- Before: 54.80 Mops/s
- After:  54.75 Mops/s

That is, a negligible slowdown due to the now indirect call to
qemu_mutex_lock. Note that using a branch instead of an indirect
call introduces a more severe slowdown (53.65 Mops/s, i.e. 2% slowdown).

Enabling the profiler (with -p, added in this series) is more interesting:

- No profiling: 54.75 Mops/s
- W/ profiling: 12.53 Mops/s

That is, a 4.36X slowdown.

We can break down this slowdown by removing the get_clock calls or
the entry lookup:

- No profiling:     54.75 Mops/s
- W/o get_clock:    25.37 Mops/s
- W/o entry lookup: 19.30 Mops/s
- W/ profiling:     12.53 Mops/s

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
Emilio G. Cota
32359d529f qht: return existing entry when qht_insert fails
The meaning of "existing" is now changed to "matches in hash and
ht->cmp result". This is saner than just checking the pointer value.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by:  Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Emilio G. Cota
61b8cef1d4 qht: require a default comparison function
qht_lookup now uses the default cmp function. qht_lookup_custom is defined
to retain the old behaviour, that is a cmp function is explicitly provided.

qht_insert will gain use of the default cmp in the next patch.

Note that we move qht_lookup_custom's @func to be the last argument,
which makes the new qht_lookup as simple as possible.
Instead of this (i.e. keeping @func 2nd):
0000000000010750 <qht_lookup>:
   10750:       89 d1                   mov    %edx,%ecx
   10752:       48 89 f2                mov    %rsi,%rdx
   10755:       48 8b 77 08             mov    0x8(%rdi),%rsi
   10759:       e9 22 ff ff ff          jmpq   10680 <qht_lookup_custom>
   1075e:       66 90                   xchg   %ax,%ax

We get:
0000000000010740 <qht_lookup>:
   10740:       48 8b 4f 08             mov    0x8(%rdi),%rcx
   10744:       e9 37 ff ff ff          jmpq   10680 <qht_lookup_custom>
   10749:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-06-15 07:42:55 -10:00
Markus Armbruster
719a30776b Purge uses of banned g_assert_FOO()
We banned use of certain g_assert_FOO() functions outside tests, and
made checkpatch.pl flag them (commit 6e9389563e).  We neglected to
purge existing uses.  Do that now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180608170231.27912-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: John Snow <jsnow@redhat.com>
2018-06-13 13:47:35 +02:00
Emilio G. Cota
76b553b308 qht: fix unlock-after-free segfault upon resizing
The old map's bucket locks are being unlocked *after*
that same old map has been passed to RCU for destruction.
This is a bug that can cause a segfault, since there's
no guarantee that the deletion will be deferred (e.g.
there may be no concurrent readers).

The segfault is easily triggered in RHEL6/CentOS6 with qht-test,
particularly on a single-core system or by pinning qht-test
to a single core.

Fix it by unlocking the map's bucket locks right after having
published the new map, and (crucially) before marking the map
for deletion via call_rcu().

While at it, expand qht_do_resize() to atomically do (1) a reset,
(2) a resize, or (3) a reset+resize. This simplifies the calling
code, since the new function (qht_do_resize_reset()) acquires
and releases the buckets' locks.

Note that no qht_do_reset inline is provided, since it would have
no users--qht_reset() already performs a reset without taking
ht->lock.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-3-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-06 18:04:13 +02:00
Emilio G. Cota
f555a9d0b3 qht: simplify qht_reset_size
Sometimes gcc doesn't pick up the fact that 'new' is properly
set if 'resize == true', which may generate an unnecessary
build warning.

Fix it by removing 'resize' and directly checking that 'new'
is non-NULL.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1475706880-10667-2-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-06 18:04:13 +02:00
Alex Bennée
a890643958 util/qht: atomically set b->hashes
ThreadSanitizer detects a possible race between reading/writing the
hashes. The ordering semantics are already documented for QHT however
for true C11 compliance we should use relaxed atomic primitives for
accesses that are done across threads. On x86 this slightly changes to
the code to not do a load/compare in a single instruction leading to a
slight performance degradation.

Running 'taskset -c 0 tests/qht-bench -n 1 -d 10' (i.e. all lookups) 10
times, we get:

before the patch:
 $ ./mean.pl 34.04 34.24 34.38 34.25 34.18 34.51 34.46 34.44 34.29 34.08
 34.287 +- 0.160072900059109
after:
 $ ./mean.pl 33.94 34.00 33.52 33.46 33.55 33.71 34.27 34.06 34.28 34.58
 33.937 +- 0.374731014640279

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20160930213106.20186-10-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-04 10:00:26 +02:00
Emilio G. Cota
7266ae91a1 qht: do not segfault when gathering stats from an uninitialized qht
So far, QHT functions assume that the passed qht has previously been
initialized--otherwise they segfault.

This patch makes an exception for qht_statistics_init, with the goal
of simplifying calling code. For instance, qht_statistics_init is
called from the 'info jit' dump, and given that under KVM the TB qht
is never initialized, we get a segfault. Thus, instead of complicating
the 'info jit' code with additional checks, let's allow passing an
uninitialized qht to qht_statistics_init.

While at it, add a test for this to test-qht.

Before the patch (for $ qemu -enable-kvm [...]):
(qemu) info jit
[...]
direct jump count   0 (0%) (2 jumps=0 0%)
Program received signal SIGSEGV, Segmentation fault.

After the patch the "TB hash buckets", "TB hash occupancy"
and "TB hash avg chain" lines are omitted.
(qemu) info jit
[...]
direct jump count   0 (0%) (2 jumps=0 0%)
TB hash buckets     0/0 (-nan% head buckets used)
TB hash occupancy   nan% avg chain occ. Histogram: (null)
TB hash avg chain   nan buckets. Histogram: (null)
[...]

Reported by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1469205390-14369-1-git-send-email-cota@braap.org>
[Extract printing statistics to an entirely separate function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-02 12:03:58 +02:00
Paolo Bonzini
34506b30e4 util/qht: Document memory ordering assumptions
It is naturally expected that some memory ordering should be provided
around qht_insert() and qht_lookup(). Document these assumptions in the
header file and put some comments in the source to denote how that
memory ordering requirements are fulfilled.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Sergey Fedorov: commit title and message provided;
comment on qht_remove() elided]
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-Id: <20160715175852.30749-2-sergey.fedorov@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-02 12:03:58 +02:00
Paolo Bonzini
e9abfcb57f clean-includes: run it once more
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:03 +02:00
Emilio G. Cota
2e11264aaf qht: QEMU's fast, resizable and scalable Hash Table
This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.

A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:20 +00:00