Commit Graph

27652 Commits

Author SHA1 Message Date
Ian Rogers
94dbfd6781 perf parse-events: Architecture specific leader override
Currently topdown events must appear after a slots event:

  $ perf stat -e '{slots,topdown-fe-bound}' /bin/true

   Performance counter stats for '/bin/true':

         3,183,090      slots
           986,133      topdown-fe-bound

Reversing the events yields:

  $ perf stat -e '{topdown-fe-bound,slots}' /bin/true
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound).

For metrics the order of events is determined by iterating over a
hashmap, and so slots isn't guaranteed to be first which can yield this
error.

Change the set_leader in parse-events, called when a group is closed, so
that rather than always making the first event the leader, if the slots
event exists then it is made the leader. It is then moved to the head of
the evlist otherwise it won't be opened in the correct order.

The result is:

  $ perf stat -e '{topdown-fe-bound,slots}' /bin/true

   Performance counter stats for '/bin/true':

         3,274,795      slots
         1,001,702      topdown-fe-bound

A problem with this approach is that the slots event is identified by
name. Names can be overridden, as in 'cpu/slots,name=foo/', and this
causes the leader change to fail.
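
The leader selection described above can be sketched roughly as follows.
This is an illustration only, not the code in parse-events: struct
sketch_event, pick_leader() and move_leader_to_head() are hypothetical
names, and the sketch assumes an exact match on the name "slots".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an event in a group; not perf's struct evsel. */
struct sketch_event {
	const char *name;
};

/*
 * Return the index of the event that should lead the group: the first
 * event named "slots" if there is one, otherwise index 0 (the default).
 */
static size_t pick_leader(const struct sketch_event *events, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(events[i].name, "slots") == 0)
			return i;
	}
	return 0;
}

/*
 * Move the chosen leader to the front, shifting the events that preceded
 * it down by one so that their relative order is preserved.
 */
static void move_leader_to_head(struct sketch_event *events, size_t n)
{
	size_t leader;
	struct sketch_event tmp;

	if (n == 0)
		return;
	leader = pick_leader(events, n);
	tmp = events[leader];
	memmove(&events[1], &events[0], leader * sizeof(events[0]));
	events[0] = tmp;
}
```

Because the match is on the name, an event renamed with name=foo is not
recognised and the first event stays the leader, which is the limitation
noted above.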

The change also modifies and fixes mixed groups. For example, with the
change:

  $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2

   Performance counter stats for 'system wide':

        5574985410      slots
         971981616      instructions
        1348461887      topdown-fe-bound

       2.001263120 seconds time elapsed

Without the change:

  $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2

   Performance counter stats for 'system wide':

     <not counted>      instructions
     <not counted>      slots
   <not supported>      topdown-fe-bound

       2.006247990 seconds time elapsed

Something that may be undesirable here is that the events are reordered
in the output.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Link: http://lore.kernel.org/lkml/20211130174945.247604-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Ian Rogers
ecdcf630d7 perf evlist: Allow setting arbitrary leader
The leader of a group is normally its first event, but allow it to be an
arbitrary list member so that for Intel topdown events slots may always
be the group leader.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Link: http://lore.kernel.org/lkml/20211130174945.247604-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Ian Rogers
6b6b16b3bb perf metric: Reduce multiplexing with duration_time
It is common to use the same counters with and without duration_time.
The ID sharing code treats duration_time as if it were a hardware event
placed in the same group. This causes unnecessary multiplexing such as
in the following example where l3_cache_access isn't shared:

  $ perf stat -M l3 -a sleep 1

   Performance counter stats for 'system wide':

         3,117,007      l3_cache_miss         #    199.5 MB/s  l3_rd_bw
                                              #     43.6 %  l3_hits
                                              #     56.4 %  l3_miss                 (50.00%)
         5,526,447      l3_cache_access                                             (50.00%)
         5,392,435      l3_cache_access       # 5389191.2 access/s  l3_access_rate  (50.00%)
     1,000,601,901 ns   duration_time

       1.000601901 seconds time elapsed

Fix this by placing duration_time in all groups unless metric
sharing has been disabled on the command line:

  $ perf stat -M l3 -a sleep 1

   Performance counter stats for 'system wide':

         3,597,972      l3_cache_miss         #    230.3 MB/s  l3_rd_bw
                                              #     48.0 %  l3_hits
                                              #     52.0 %  l3_miss
         6,914,459      l3_cache_access       # 6909935.9 access/s  l3_access_rate
     1,000,654,579 ns   duration_time

       1.000654579 seconds time elapsed

  $ perf stat --metric-no-merge -M l3 -a sleep 1

   Performance counter stats for 'system wide':

         3,501,834      l3_cache_miss         #     53.5 %  l3_miss                (24.99%)
         6,548,173      l3_cache_access                                            (24.99%)
         3,417,622      l3_cache_miss         #     45.7 %  l3_hits                (25.04%)
         6,294,062      l3_cache_access                                            (25.04%)
         5,923,238      l3_cache_access       # 5919688.1 access/s  l3_access_rate (24.99%)
     1,000,599,683 ns   duration_time
         3,607,486      l3_cache_miss         #    230.9 MB/s  l3_rd_bw            (49.97%)

       1.000599683 seconds time elapsed

v2. Doesn't count duration_time in the metric_list_cmp function that
    sorts larger metrics first. Without this a metric with duration_time
    and an event is sorted the same as a metric with two events,
    possibly not allowing the first metric to share with the second.
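
The v2 sorting rule can be illustrated with a small comparator. This is
a sketch under assumptions, not the metric_list_cmp function itself:
struct sketch_metric and both helper names are hypothetical, and the IDs
are modelled as plain strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical metric description: a name plus the IDs it references. */
struct sketch_metric {
	const char *name;
	const char *const *ids;
	size_t nr_ids;
};

/* Count the IDs that need a counter, i.e. everything but duration_time. */
static size_t nr_counted_ids(const struct sketch_metric *m)
{
	size_t n = 0;

	for (size_t i = 0; i < m->nr_ids; i++) {
		if (strcmp(m->ids[i], "duration_time") != 0)
			n++;
	}
	return n;
}

/* qsort() comparator: metrics with more counted IDs sort first. */
static int metric_cmp_sketch(const void *a, const void *b)
{
	size_t na = nr_counted_ids(a);
	size_t nb = nr_counted_ids(b);

	return (na < nb) - (na > nb);
}
```

With this ordering a metric made of one event plus duration_time sorts
after a metric made of two events, so the smaller metric gets a chance
to share the larger metric's events.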

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124015226.3317994-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Gang Li
b4515ad6e1 perf trace: Enable ignore_missing_thread for trace
perf already supports ignore_missing_thread for -u/-p, but it is not yet
applied to `perf trace`. This patch enables ignore_missing_thread
for `perf trace`.

Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1481538943-21874-6-git-send-email-jolsa@kernel.org
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
Link: http://lore.kernel.org/lkml/20211123074018.11406-1-ligang.bdlg@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Sandipan Das
7a2e14962c perf docs: Update link to AMD documentation
This updates the link to documentation on AMD processors.  The new link
points to a page where users can find the Processor Programming
Reference (PPR) documents for the family and model codes corresponding
to processors they are using.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Robert Richter <rrichter@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Link: https://lore.kernel.org/r/20211123084613.243792-2-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Sandipan Das
4edb117e64 perf docs: Add info on AMD raw event encoding
AMD processors have events with event select codes and unit masks larger
than a byte. The core PMU, for example, uses 12-bit event select codes
split between bits 0-7 and 32-35 of the PERF_CTL MSRs as can be seen
from /sys/bus/event_source/devices/cpu/format/*.

The Processor Programming Reference (PPR) lists the event codes as
unified 12-bit hexadecimal values instead and the split between the bits
is not apparent to someone who is not aware of the layout of the
PERF_CTL MSRs.

8-bit event select codes continue to work as the layout matches that of
the PERF_CTL MSRs i.e. bits 0-7 for event select and 8-15 for unit mask.

This adds more details in the perf man pages about using
/sys/bus/event_source/devices/*/format/* for determining the correct
raw event encoding scheme.

E.g. the "op_cache_hit_miss.op_cache_hit" event with code 0x28f and
umask 0x03 can be programmed using its symbolic name as:

  $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x20000038f
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

One might use a simple eventsel+umask combination based on what the
current man pages say and incorrectly program the event as:

  $ sudo perf --debug perf-event-open stat -e r0328f sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x328f
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]

When it should have been based on the format from sysfs:

  $ cat /sys/bus/event_source/devices/cpu/format/event
  config:0-7,32-35

  $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             128
    config                           0x20000038f
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  [...]
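
The arithmetic behind the correct raw value can be sketched as below.
The function name amd_raw_config_sketch() is hypothetical, and the
sketch assumes the layout quoted above: event select in config bits 0-7
and 32-35, unit mask in bits 8-15. Other PMUs may use a different
layout, so the sysfs format files remain the reference.

```c
#include <stdint.h>

/*
 * Pack a 12-bit event select code and an 8-bit unit mask into a raw
 * config value, assuming event = config:0-7,32-35 and umask = config:8-15.
 */
static uint64_t amd_raw_config_sketch(uint16_t event, uint8_t umask)
{
	uint64_t event_lo = event & 0xffu;        /* code bits 0-7  -> config 0-7   */
	uint64_t event_hi = (event >> 8) & 0xfu;  /* code bits 8-11 -> config 32-35 */

	return event_lo | ((uint64_t)umask << 8) | (event_hi << 32);
}
```

For event code 0x28f and umask 0x03 this gives 0x20000038f, the config
value shown in the debug output above.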

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Robert Richter <rrichter@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Link: https://lore.kernel.org/r/20211123084613.243792-1-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:24 -03:00
Shunsuke Nakamura
a7f3713f6b libperf tests: Add test_stat_multiplexing test
Add a test for a counter obtained using the read() system call during
multiplexing.

  $ sudo make tests -C ./tools/lib/perf/ V=1
  make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf
  make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests
  make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
  running static:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...
  Event  0 -- Raw count = 298049842, run = 270269503, enable = 456262127
           Scaled count = 503160191 (59.24%, 270269503/456262127)
  Event  1 -- Raw count = 299134173, run = 271075173, enable = 456257234
           Scaled count = 503484435 (59.41%, 271075173/456257234)
  Event  2 -- Raw count = 300461996, run = 272069283, enable = 456253417
           Scaled count = 503867290 (59.63%, 272069283/456253417)
  Event  3 -- Raw count = 301308704, run = 273063387, enable = 456249352
           Scaled count = 503443183 (59.85%, 273063387/456249352)
  Event  4 -- Raw count = 302531164, run = 274102932, enable = 456244712
           Scaled count = 503563543 (60.08%, 274102932/456244712)
  Event  5 -- Raw count = 303710254, run = 275406214, enable = 456228165
           Scaled count = 503115633 (60.37%, 275406214/456228165)
  Event  6 -- Raw count = 304531302, run = 276396076, enable = 456221130
           Scaled count = 502661313 (60.58%, 276396076/456221130)
  Event  7 -- Raw count = 304486460, run = 276601890, enable = 456213754
           Scaled count = 502205212 (60.63%, 276601890/456213754)
  Event  8 -- Raw count = 304116681, run = 276631326, enable = 456205562
           Scaled count = 501532936 (60.64%, 276631326/456205562)
  Event  9 -- Raw count = 303567766, run = 276188567, enable = 456196839
           Scaled count = 501420666 (60.54%, 276188567/456196839)
  Event 10 -- Raw count = 302238014, run = 275144001, enable = 456185300
           Scaled count = 501106833 (60.31%, 275144001/456185300)
  Event 11 -- Raw count = 300805716, run = 273824589, enable = 456175608
           Scaled count = 501124573 (60.03%, 273824589/456175608)
  Event 12 -- Raw count = 299959051, run = 272834556, enable = 456166593
           Scaled count = 501517477 (59.81%, 272834556/456166593)
  Event 13 -- Raw count = 299037090, run = 271820805, enable = 456157086
           Scaled count = 501830195 (59.59%, 271820805/456157086)
  Event 14 -- Raw count = 298327042, run = 270784311, enable = 456147546
           Scaled count = 502544433 (59.36%, 270784311/456147546)
     Expected: 501614268
     High: 503867290   Low:  298049842   Average:  502438527
     Average Error = 0.16%
  OK
  - running tests/test-evsel.c...
          loop = 65536, count = 328182
          loop = 131072, count = 660214
          loop = 262144, count = 1315534
          loop = 524288, count = 2635364
          loop = 1048576, count = 5271971
          loop = 65536, count = 491952
          loop = 131072, count = 850061
          loop = 262144, count = 1648608
          loop = 524288, count = 3162059
          loop = 1048576, count = 6353393
  OK
  running dynamic:
  - running tests/test-cpumap.c...OK
  - running tests/test-threadmap.c...OK
  - running tests/test-evlist.c...
  Event  0 -- Raw count = 300218292, run = 297528154, enable = 496789343
           Scaled count = 501281125 (59.89%, 297528154/496789343)
  Event  1 -- Raw count = 301438606, run = 298515328, enable = 496784768
           Scaled count = 501649643 (60.09%, 298515328/496784768)
  Event  2 -- Raw count = 302342618, run = 298798983, enable = 496782015
           Scaled count = 502673648 (60.15%, 298798983/496782015)
  Event  3 -- Raw count = 303132319, run = 299230407, enable = 496778508
           Scaled count = 503256412 (60.23%, 299230407/496778508)
  Event  4 -- Raw count = 302758195, run = 299218047, enable = 496774243
           Scaled count = 502651743 (60.23%, 299218047/496774243)
  Event  5 -- Raw count = 303158458, run = 299204274, enable = 496769146
           Scaled count = 503334281 (60.23%, 299204274/496769146)
  Event  6 -- Raw count = 303471397, run = 299197479, enable = 496763124
           Scaled count = 503859189 (60.23%, 299197479/496763124)
  Event  7 -- Raw count = 303583387, run = 299196861, enable = 496756458
           Scaled count = 504039405 (60.23%, 299196861/496756458)
  Event  8 -- Raw count = 303096897, run = 299186924, enable = 496748667
           Scaled count = 503240507 (60.23%, 299186924/496748667)
  Event  9 -- Raw count = 301424173, run = 297845086, enable = 496739994
           Scaled count = 502709122 (59.96%, 297845086/496739994)
  Event 10 -- Raw count = 300876415, run = 296851339, enable = 496729034
           Scaled count = 503464297 (59.76%, 296851339/496729034)
  Event 11 -- Raw count = 300239338, run = 296547963, enable = 496719538
           Scaled count = 502902612 (59.70%, 296547963/496719538)
  Event 12 -- Raw count = 299751948, run = 296547195, enable = 496710036
           Scaled count = 502077926 (59.70%, 296547195/496710036)
  Event 13 -- Raw count = 299341883, run = 296549981, enable = 496700423
           Scaled count = 501376663 (59.70%, 296549981/496700423)
  Event 14 -- Raw count = 299145476, run = 296561684, enable = 496690949
           Scaled count = 501018366 (59.71%, 296561684/496690949)
     Expected: 501669431
     High: 504039405   Low:  300218292   Average:  502635662
     Average Error = 0.19%
  OK
  - running tests/test-evsel.c...
          loop = 65536, count = 329275
          loop = 131072, count = 664638
          loop = 262144, count = 1315367
          loop = 524288, count = 2629617
          loop = 1048576, count = 5273657
          loop = 65536, count = 459641
          loop = 131072, count = 978402
          loop = 262144, count = 1581219
          loop = 524288, count = 3774908
          loop = 1048576, count = 7694417
  OK
  make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211109085831.3770594-4-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
Shunsuke Nakamura
f2c4dcf191 libperf: Remove scaling process from perf_mmap__read_self()
Remove the scaling process from perf_mmap__read_self(), and unify the
counters that can be obtained from perf_evsel__read() to "no scaling".
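
With unscaled values being returned, a caller that wants an estimate
under multiplexing has to scale the raw count by enabled time over
running time. A minimal sketch of that calculation follows; struct
sketch_counts and scale_count_sketch() are hypothetical names, and this
is not the libperf implementation.

```c
#include <stdint.h>

/* Hypothetical holder for a raw count and its two time values. */
struct sketch_counts {
	uint64_t val;  /* raw count */
	uint64_t ena;  /* time the event was enabled */
	uint64_t run;  /* time the event was actually running */
};

/*
 * Estimate the count the event would have reached had it been running
 * for the whole enabled time: val * ena / run, rounded to nearest.
 */
static uint64_t scale_count_sketch(const struct sketch_counts *c)
{
	if (c->run == 0)
		return 0;          /* never scheduled, nothing to scale */
	if (c->run >= c->ena)
		return c->val;     /* not multiplexed, raw count is exact */
	return (uint64_t)((double)c->val * (double)c->ena /
			  (double)c->run + 0.5);
}
```

The double arithmetic avoids overflowing a 64-bit product at the cost of
some precision, which is an assumption of this sketch rather than a
statement about how libperf does it.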

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211109085831.3770594-3-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
Shunsuke Nakamura
9a5b2d1afa libperf: Adopt perf_counts_values__scale() from tools/perf/util
Move perf_counts_values__scale() from tools/perf/util to tools/lib/perf
so that it can be used with libperf.

Committer notes:

As noted by Jiri, use __s8 instead of s8 on the exported function.

Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211109085831.3770594-2-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
John Garry
c77a78c291 tools build: Enable warnings through HOSTCFLAGS
The tools build system uses the KBUILD_HOSTCFLAGS symbol for host
compiler flags.

However, this is not set for anything under tools/.

As such, host tools are built with no compiler warnings enabled.

Declare HOSTCFLAGS for perf tools build, and also use that symbol in
declaration of host_c_flags. HOSTCFLAGS comes from EXTRA_WARNINGS, which
is independent of target platform/arch warning flags.

Suggested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1635525041-151876-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
Arnaldo Carvalho de Melo
e9c08f7229 perf test sigtrap: Print errno string when failing
Helps the user figure out why it is failing:

Before:

  $ perf test sigtrap
  73: Sigtrap                                                         : FAILED!
  $ perf test -v sigtrap
  73: Sigtrap                                                         :
  --- start ---
  test child forked, pid 3816772
  FAILED sys_perf_event_open()
  test child finished with -1
  ---- end ----
  Sigtrap: FAILED!
  $

After:

  $ perf test sigtrap
  73: Sigtrap                                                         : FAILED!
  $ perf test -v sigtrap
  73: Sigtrap                                                         :
  --- start ---
  test child forked, pid 3816772
  FAILED sys_perf_event_open(): Permission denied
  test child finished with -1
  ---- end ----
  Sigtrap: FAILED!
  $
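
The idea is simply to append the strerror() text for the saved errno to
the failure message. A minimal sketch, with the hypothetical helper name
format_failure_sketch() and no claim about how the test itself formats
its output:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a failure message that names the failing call and appends the
 * strerror() text for the given errno value. Returns what snprintf()
 * returns.
 */
static int format_failure_sketch(char *buf, size_t len, const char *what,
				 int err)
{
	return snprintf(buf, len, "FAILED %s: %s", what, strerror(err));
}
```

The errno value should be saved right after the failing call, since
later library calls may overwrite it.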

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kasan-dev@googlegroups.com
Link: http://lore.kernel.org/lkml/YZOpSVOCXe0zWeRs@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
Marco Elver
5504f67944 perf test sigtrap: Add basic stress test for sigtrap handling
Add basic stress test for sigtrap handling as a perf tool built-in test.
This allows sanity checking the basic sigtrap functionality from within
the perf tool.

Committer notes:

Reported that !root was getting -EPERM, applied a fixup from Marco to
set .exclude_{hv,kernel} that made it work.

Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kasan-dev@googlegroups.com
Link: http://lore.kernel.org/lkml/20211115112822.4077224-1-elver@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 22:18:23 -03:00
Song Liu
5a897531e0 perf bpf_skel: Do not use typedef to avoid error on old clang
When building bpf_skel with clang-10, typedef causes confusion like:

  libbpf: map 'prev_readings': unexpected def kind var.

Fix this by removing the typedef.

Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/BEF5C312-4331-4A60-AEC0-AD7617CB2BC4@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Song Liu
f7c4e85bcc perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros
Arnaldo reported that, when building all his containers with
BUILD_BPF_SKEL=1 with the goal of making this the default, he found
problems in some distros where the system linux/bpf.h file was being
used and lacked this:

   util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS'
           __uint(map_flags, BPF_F_PRESERVE_ELEMS);

So instead use the vmlinux.h file generated by bpftool from BTF info.

This fixed these as well, getting the build back working on debian:11,
debian:experimental and ubuntu:21.10:

  In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33:
  :
  In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111:
  :
  /usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from  util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error:
  : : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found:

  /usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
  #include <asm/types.h>#include <asm/types.h>

           ^~~~~~~~~~~~~         ^~~~~~~~~~~~~

  #include <asm/types.h>
           ^~~~~~~~~~~~~
  1 error generated.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CF175681-8101-43D1-ABDB-449E644BE986@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4747395082 perf header: Fix memory leaks when processing feature headers
These leaks were found with leak sanitizer running "perf pipe recording
and injection test".

In pipe mode feat_fd may hold onto an events struct that needs freeing.

When string features are processed they may overwrite an already created
string, so free this before the overwrite.
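
The free-before-overwrite pattern for string features can be sketched as
below. set_string_sketch() is a hypothetical helper, not a function from
perf's header code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Replace *dst with a copy of value, freeing any string already stored
 * there. Returns 0 on success, or -1 if the copy could not be allocated,
 * in which case *dst is left unchanged.
 */
static int set_string_sketch(char **dst, const char *value)
{
	size_t len = strlen(value) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, value, len);
	free(*dst);   /* free(NULL) is a no-op, so the first call is safe */
	*dst = copy;
	return 0;
}
```

Without the free(), processing the same feature twice would drop the
only pointer to the first allocation, which is the kind of leak a leak
sanitizer reports.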

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211118201730.2302927-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
1aa79e5773 perf test: Reset shadow counts before loading
Otherwise load counting is an average. Without this change
duration_time in test_memory_bandwidth will alter its value if an
earlier test contains duration_time.

This patch fixes an issue that's introduced in the proposed patch:
https://lore.kernel.org/lkml/20211124015226.3317994-1-irogers@google.com/
in perf test "Parse and process metrics".

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211128085810.4027314-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Thomas Richter
6c481031c9 perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
Some platforms do not have CPU die support, for example s390.

Commit fdf1e29b61 ("perf expr: Add metric literals for topology.") fails
on s390:

  # perf test -Fv 7
    ...
  # FAILED tests/expr.c:173 #num_dies >= #num_packages
    ---- end ----
    Simple expression parser: FAILED!
  #

Investigating this issue leads to these functions:

 build_cpu_topology()
   +--> has_die_topology(void)
        {
           struct utsname uts;

           if (uname(&uts) < 0)
                  return false;
           if (strncmp(uts.machine, "x86_64", 6))
                  return false;
           ....
        }

which always returns false on s390. The caller build_cpu_topology()
checks the has_die_topology() return value. On false the struct
cpu_topology::die_cpus_list is not constructed and has zero entries. This
leads to the failing comparison: #num_dies >= #num_packages.  s390 of
course has a positive number of packages.

Fix this by checking whether the function build_cpu_topology() did build
up a die_cpus_list. The number of entries in this list should be larger
than 0. If the number of list elements is zero, the die_cpus_list has
not been created and the check in function test__expr():

    TEST_ASSERT_VAL("#num_dies >= #num_packages", \
		    num_dies >= num_packages)

always fails.
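
The guarded comparison can be sketched as a small predicate. The name
die_count_ok_sketch() is hypothetical and this is not the code in
tests/expr.c; it only shows the shape of the check described above.

```c
#include <stdbool.h>

/*
 * Return true when the topology counts are consistent. A num_dies of
 * zero is treated as "no die information on this architecture" and
 * skips the comparison instead of failing it.
 */
static bool die_count_ok_sketch(unsigned int num_dies,
				unsigned int num_packages)
{
	if (num_dies == 0)
		return true;
	return num_dies >= num_packages;
}
```

On a machine without die topology, such as the s390 case above, the
predicate passes regardless of the number of packages.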

Output after:

  # perf test -Fv 7
   7: Simple expression parser                                        :
   --- start ---
   division by zero
   syntax error
   ---- end ----
   Simple expression parser: Ok
  #

Fixes: fdf1e29b61 ("perf expr: Add metric literals for topology.")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20211129112339.3003036-1-tmricht@linux.ibm.com
[ Added comment in the added 'if (num_dies)' line about architectures not having die topology ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
3d1d57debe tools build: Remove needless libpython-version feature check that breaks test-all fast path
Since 66dfdff03d ("perf tools: Add Python 3 support") we don't use
the tools/build/feature/test-libpython-version.c version in any Makefile
feature check:

  $ find tools/ -type f | xargs grep feature-libpython-version
  $

The only place where this was used was removed in 66dfdff03d:

  -        ifneq ($(feature-libpython-version), 1)
  -          $(warning Python 3 is not yet supported; please set)
  -          $(warning PYTHON and/or PYTHON_CONFIG appropriately.)
  -          $(warning If you also have Python 2 installed, then)
  -          $(warning try something like:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make PYTHON=python2)
  -          $(warning $(and ,))
  -          $(warning Otherwise, disable Python support entirely:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make NO_LIBPYTHON=1)
  -          $(warning $(and ,))
  -          $(error   $(and ,))
  -        else
  -          LDFLAGS += $(PYTHON_EMBED_LDFLAGS)
  -          EXTLIBS += $(PYTHON_EMBED_LIBADD)
  -          LANG_BINDINGS += $(obj-perf)python/perf.so
  -          $(call detected,CONFIG_LIBPYTHON)
  -        endif

And nowadays we either build with PYTHON=python3 or just install the
python3 devel packages and perf will build against it.

But the leftover feature-libpython-version check broke the fast path
feature detection in all cases except when python2 devel files
were installed:

  $ rpm -qa | grep python.*devel
  python3-devel-3.9.7-1.fc34.x86_64
  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
  <SNIP>
  $ cat /tmp/build/perf/feature/test-all.make.output
  In file included from test-all.c:18:
  test-libpython-version.c:5:10: error: #error
      5 |         #error
        |          ^~~~~
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000)
  $

As python3 is the norm these days, fix this by just removing the unused
feature-libpython-version feature check, making the test-all fast path
work in the common case.

With this:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin |& head
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
    HOSTLD  /tmp/build/perf/fixdep-in.o
    LINK    /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000)
  $ cat /tmp/build/perf/feature/test-all.make.output
  $

Reviewed-by: James Clark <james.clark@arm.com>
Fixes: 66dfdff03d ("perf tools: Add Python 3 support")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YaYmeeC6CS2b8OSz@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4ffbe87e2d perf tools: Fix SMT detection fast read path
sysfs__read_int() returns 0 on success, and so the fast read path was
always failing.
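
A sketch of the failure mode, assuming the fast path compared the return
value against zero with a strict "greater than". fake_read_int() is a
hypothetical stand-in that only mimics the "0 on success, negative on
error" convention described above. It is not the real sysfs__read_int():

```c
#include <stdbool.h>

/* Hypothetical stand-in: returns 0 on success, negative on error. */
int fake_read_int(bool file_exists, int file_value, int *out)
{
	if (!file_exists)
		return -1;
	*out = file_value;
	return 0;
}

/* Buggy pattern: only a positive return counts, so success is never seen. */
bool fast_path_buggy(bool file_exists, int file_value, int *out)
{
	return fake_read_int(file_exists, file_value, out) > 0;
}

/* Fixed pattern: any non-negative return counts as success. */
bool fast_path_fixed(bool file_exists, int file_value, int *out)
{
	return fake_read_int(file_exists, file_value, out) >= 0;
}
```

The buggy variant falls through to the slow path even when the value was
read correctly, which matches the "always failing" behaviour.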

Fixes: bb629484d9 ("perf tools: Simplify checking if SMT is active.")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124001231.3277836-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
cba43fcf7a tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall
To pick the changes in this cset:

  a0eb2da92b ("futex: Wireup futex_waitv syscall")

That adds support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449)
  mmap size 528384B
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
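
To illustrate the shape of the "(id == 221 || id == 449)" part of that
expression, here is a small sketch that joins syscall numbers into such a
string. build_id_filter() is a hypothetical helper written for this
example. It does not show how 'perf trace' builds its filter:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "(id == A || id == B ...)" into buf.
 * Returns the string length, or -1 if n is 0 or buf is too small.
 */
int build_id_filter(char *buf, size_t size, const int *ids, size_t n)
{
	size_t used = 0;

	if (n == 0)
		return -1;

	for (size_t i = 0; i < n; i++) {
		/* "(" opens the expression, " || " joins later terms. */
		int ret = snprintf(buf + used, size - used, "%sid == %d",
				   i == 0 ? "(" : " || ", ids[i]);

		if (ret < 0 || (size_t)ret >= size - used)
			return -1;
		used += (size_t)ret;
	}

	if (size - used < 2)	/* room for ")" and the terminator */
		return -1;
	buf[used++] = ')';
	buf[used] = '\0';
	return (int)used;
}
```

Passing the numbers 221 and 449 from the table below reproduces the
second half of the filter printed by the third command.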

  $ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
  221	32	futex				sys_futex_time32
  221	64	futex				sys_futex
  221	spu	futex				sys_futex
  422	32	futex_time64			sys_futex			sys_futex
  449	common  futex_waitv                     sys_futex_waitv
  $

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YZ%2F1OU9mJuyS2HMa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Adrian Hunter
c29d979260 perf inject: Fix itrace space allowed for new attributes
The space allowed for new attributes can be too small if existing header
information is large. That can happen, for example, if there are very
many CPUs, due to having an event ID per CPU per event being stored in the
header information.

Fix by adding the existing header.data_offset. Also increase the extra
space allowed to 8KiB and align to a 4KiB boundary for neatness.
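
The arithmetic can be sketched as follows, assuming the new data offset is
the existing offset plus the extra space, rounded up to the boundary. The
names below are hypothetical and are not taken from the perf sources:

```c
#include <stdint.h>

#define EXTRA_SPACE	8192u	/* 8KiB of extra room, per the text above */
#define BOUNDARY	4096u	/* 4KiB alignment, per the text above */

/* Round x up to a multiple of a, where a must be a power of two. */
uint64_t round_up_pow2(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical helper: where data may start once room has been reserved. */
uint64_t new_data_offset(uint64_t existing_data_offset)
{
	return round_up_pow2(existing_data_offset + EXTRA_SPACE, BOUNDARY);
}
```

Because the existing offset is part of the sum, a large header (many CPUs,
many event IDs) pushes the data start further out instead of overflowing
a fixed allowance.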

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20211125071457.2066863-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Arnaldo Carvalho de Melo
71a16df164 tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall
To pick the changes in this cset:

  6c122360cf ("s390: wire up sys_futex_waitv system call")

That adds support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449)
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl
  238  common	futex			sys_futex			sys_futex_time32
  422	32	futex_time64		-				sys_futex
  449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
  $

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/lkml/YZ%2F2qRW%2FTScYTP1U@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Jiri Olsa
3f8d657716 Revert "perf bench: Fix two memory leaks detected with ASan"
This reverts commit 92723ea0f1.

  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRR Ok

yep, it seems the perf bench is broken, so the counts won't correlate. If
I revert this one:

  92723ea0f1 perf bench: Fix two memory leaks detected with ASan

it works for me again. It seems to break the -t option:

   [root@dell-r440-01 perf]# ./perf bench sched messaging -g 1 -l 100 -t
   # Running 'sched/messaging' benchmark:
   RRRperf: CLIENT: ready write: Bad file descriptor
   Rperf: SENDER: write: Bad file descriptor

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/lkml/YZev7KClb%2Fud43Lc@krava/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Linus Torvalds
f5d54a42d3 Merge tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix a couple of SWAPGS fencing issues in the x86 entry code

 - Use the proper operand types in __{get,put}_user() to prevent
   truncation in SEV-ES string io

 - Make sure the kernel mappings are present in trampoline_pgd in order
   to prevent any potential accesses to unmapped memory after switching
   to it

 - Fix a trivial list corruption in objtool's pv_ops validation

 - Disable the clocksource watchdog for TSC on platforms which claim
   that the TSC is constant, doesn't stop in sleep states, CPU has TSC
   adjust and the number of sockets of the platform are max 2, to
   prevent erroneous markings of the TSC as unstable.

 - Make sure TSC adjust is always checked not only when going idle

 - Prevent a stack leak by initializing struct _fpx_sw_bytes properly in
   the FPU code

 - Fix INTEL_FAM6_RAPTORLAKE define naming to adhere to the convention

* tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
  x86/entry: Use the correct fence macro after swapgs in kernel CR3
  x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
  x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
  x86/64/mm: Map all kernel memory into trampoline_pgd
  objtool: Fix pv_ops noinstr validation
  x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
  x86/tsc: Add a timer to make sure TSC_adjust is always checked
  x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog()
  x86/cpu: Drop spurious underscore from RAPTOR_LAKE #define
2021-12-05 08:43:35 -08:00
Peter Zijlstra
988f01683c objtool: Fix pv_ops noinstr validation
Boris reported that in one of his randconfig builds, objtool got
infinitely stuck. Turns out there's trivial list corruption in the
pv_ops tracking when a function is both in a static table and in a code
assignment.

Avoid re-adding functions to the pv_ops[] lists when they're already on
them.
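
A simplified sketch of why a second add corrupts the list and how a
membership check avoids it. The types and the on_list flag are
hypothetical. objtool's real structures and list helpers differ:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct func {
	const char *name;
	struct func *next;	/* link inside one pv_ops[] slot list */
	bool on_list;
};

struct slot {
	struct func *head;
};

/* Add f unless it is already linked. Returns true if it was added. */
bool slot_add(struct slot *s, struct func *f)
{
	if (f->on_list)
		return false;	/* re-linking could make f->next point at f */
	f->next = s->head;
	s->head = f;
	f->on_list = true;
	return true;
}

/* Count entries; this would never return if the list had a cycle. */
size_t slot_len(const struct slot *s)
{
	size_t n = 0;

	for (const struct func *f = s->head; f; f = f->next)
		n++;
	return n;
}
```

Without the check, adding the current head a second time sets its next
pointer to itself, and any walk over the list spins forever.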

Fixes: db2b0c5d7b ("objtool: Support pv_ops indirect calls for noinstr")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net
2021-12-03 09:11:42 +01:00
Linus Torvalds
a51e3ac43d Merge tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, and wireguard.

  Mostly scattered driver changes this week, with one big clump in
  mv88e6xxx. Nothing of note, really.

  Current release - regressions:

   - smc: keep smc_close_final()'s error code during active close

  Current release - new code bugs:

   - iwlwifi: various static checker fixes (int overflow, leaks, missing
     error codes)

   - rtw89: fix size of firmware header before transfer, avoid crash

   - mt76: fix timestamp check in tx_status; fix pktid leak

   - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()

  Previous releases - regressions:

   - smc: fix list corruption in smc_lgr_cleanup_early

   - ipv4: convert fib_num_tclassid_users to atomic_t

  Previous releases - always broken:

   - tls: fix authentication failure in CCM mode

   - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
     incorrect processing

   - dsa: mv88e6xxx: fixes for various device errata

   - rds: correct socket tunable error in rds_tcp_tune()

   - ipv6: fix memory leak in fib6_rule_suppress

   - wireguard: reset peer src endpoint when netns exits

   - wireguard: improve resilience to DoS around incoming handshakes

   - tcp: fix page frag corruption on page fault which involves TCP

   - mpls: fix missing attributes in delete notifications

   - mt7915: fix NULL pointer dereference with ad-hoc mode

  Misc:

   - rt2x00: be more lenient about EPROTO errors during start

   - mlx4_en: update reported link modes for 1/10G"

* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: dsa: b53: Add SPI ID table
  gro: Fix inconsistent indenting
  selftests: net: Correct case name
  net/rds: correct socket tunable error in rds_tcp_tune()
  mctp: Don't let RTM_DELROUTE delete local routes
  net/smc: Keep smc_close_final rc during active close
  ibmvnic: drop bad optimization in reuse_tx_pools()
  ibmvnic: drop bad optimization in reuse_rx_pools()
  net/smc: fix wrong list_del in smc_lgr_cleanup_early
  Fix Comment of ETH_P_802_3_MIN
  ethernet: aquantia: Try MAC address from device tree
  ipv4: convert fib_num_tclassid_users to atomic_t
  net: avoid uninit-value from tcp_conn_request
  net: annotate data-races on txq->xmit_lock_owner
  octeontx2-af: Fix a memleak bug in rvu_mbox_init()
  net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
  vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
  net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
  net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
  net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
  ...
2021-12-02 11:22:06 -08:00
Li Zhijian
a05431b22b selftests: net: Correct case name
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, the bind
tests would not be run by default due to the wrong case names.

Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:19:08 +00:00
Linus Torvalds
f080815fdb Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix constant sign extension affecting TCR_EL2 and preventing
     running on ARMv8.7 models due to spurious bits being set

   - Fix use of helpers using PSTATE early on exit by always sampling it
     as soon as the exit takes place

   - Move pkvm's 32bit handling into a common helper

  RISC-V:

   - Fix incorrect KVM_MAX_VCPUS value

   - Unmap stage2 mapping when deleting/moving a memslot

  x86:

   - Fix and downgrade BUG_ON due to uninitialized cache

   - Many APICv and MOVE_ENC_CONTEXT_FROM fixes

   - Correctly emulate TLB flushes around nested vmentry/vmexit and when
     the nested hypervisor uses VPID

   - Prevent modifications to CPUID after the VM has run

   - Other smaller bugfixes

  Generic:

   - Memslot handling bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
  KVM: fix avic_set_running for preemptable kernels
  KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
  KVM: SEV: accept signals in sev_lock_two_vms
  KVM: SEV: do not take kvm->lock when destroying
  KVM: SEV: Prohibit migration of a VM that has mirrors
  KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs locked
  selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
  KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: initialize regions_list of a mirror VM
  KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: do not use list_replace_init on an empty list
  KVM: x86: Use a stable condition around all VT-d PI paths
  KVM: x86: check PIR even for vCPUs with disabled APICv
  KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
  KVM: x86/mmu: Handle "default" period when selectively waking kthread
  KVM: MMU: shadow nested paging does not have PKU
  KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
  KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
  KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
  ...
2021-11-30 09:22:15 -08:00
Matthew Wilcox (Oracle)
d6e6a27d96 tools: Fix math.h breakage
Commit 98e1385ef2 ("include/linux/radix-tree.h: replace kernel.h with
the necessary inclusions") broke the radix tree test suite in two
different ways; first by including math.h which didn't exist in the
tools directory, and second by removing an implicit include of
spinlock.h before lockdep.h.  Fix both issues.

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-30 09:14:42 -08:00
Paolo Bonzini
17d44a96f0 KVM: SEV: Prohibit migration of a VM that has mirrors
VMs that mirror an encryption context rely on the owner to keep the
ASID allocated.  Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
would cause a dangling ASID:

1. copy context from A to B (gets ref to A)
2. move context from A to L (moves ASID from A to L)
3. close L (releases ASID from L, B still references it)

The right way to do the handoff instead is to create a fresh mirror VM
on the destination first:

1. copy context from A to B (gets ref to A)
[later] 2. close B (releases ref to A)
3. move context from A to L (moves ASID from A to L)
4. copy context from L to M

So, catch the situation by adding a count of how many VMs are
mirroring this one's encryption context.
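
The counting scheme can be modelled in a few lines. This is only a sketch:
struct vm, the function names and the error codes are hypothetical and do
not mirror the KVM code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a VM and its encryption context. */
struct vm {
	bool has_context;	/* owns an ASID */
	struct vm *owner;	/* set when this VM mirrors another */
	int num_mirroring_vms;	/* how many VMs mirror this one */
};

/* "copy context": mirror takes a reference on owner. */
int vm_copy_context(struct vm *mirror, struct vm *owner)
{
	if (!owner->has_context || mirror->has_context || mirror->owner)
		return -EINVAL;
	mirror->owner = owner;
	owner->num_mirroring_vms++;
	return 0;
}

/* Closing a mirror drops its reference on the owner. */
void vm_close_mirror(struct vm *mirror)
{
	if (mirror->owner) {
		mirror->owner->num_mirroring_vms--;
		mirror->owner = NULL;
	}
}

/* "move context": refused while any VM still mirrors the source. */
int vm_move_context(struct vm *dst, struct vm *src)
{
	if (!src->has_context || dst->has_context || dst->owner)
		return -EINVAL;
	if (src->num_mirroring_vms > 0)
		return -EBUSY;
	dst->has_context = true;
	src->has_context = false;
	return 0;
}
```

Replaying the two sequences above against this model, the move from A to
L is rejected while B still mirrors A, and succeeds once B is closed.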

Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:14 -05:00
Paolo Bonzini
dc79c9f4eb selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
I am putting the tests in sev_migrate_tests because the failure conditions are
very similar and some of the setup code can be reused, too.

The tests cover both successful creation of a mirror VM, and error
conditions.

Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Message-Id: <20211123005036.2954379-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:13 -05:00
Maciej S. Szmigiero
81835ee113 KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
A kvm_page_table_test run with its default settings fails on VMX due to
memory region add failure:
> ==== Test Assertion Failure ====
>  lib/kvm_util.c:952: ret == 0
>  pid=10538 tid=10538 errno=17 - File exists
>     1  0x00000000004057d1: vm_userspace_mem_region_add at kvm_util.c:947
>     2  0x0000000000401ee9: pre_init_before_test at kvm_page_table_test.c:302
>     3   (inlined by) run_test at kvm_page_table_test.c:374
>     4  0x0000000000409754: for_each_guest_mode at guest_modes.c:53
>     5  0x0000000000401860: main at kvm_page_table_test.c:500
>     6  0x00007f82ae2d8554: ?? ??:0
>     7  0x0000000000401894: _start at ??:?
>  KVM_SET_USER_MEMORY_REGION IOCTL failed,
>  rc: -1 errno: 17
>  slot: 1 flags: 0x0
>  guest_phys_addr: 0xc0000000 size: 0x40000000

This is because the memory range that this test is trying to add
(0x0c0000000 - 0x100000000) conflicts with LAPIC mapping at 0x0fee00000.

Looking at the code it seems that guest_test_*phys*_mem variable gets
mistakenly overwritten with guest_test_*virt*_mem while trying to adjust
the former for alignment.
With the correct variable adjusted this test runs successfully.
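
A small sketch of the two pieces involved: aligning the physical address
from its own value, and checking the resulting range against the LAPIC
page. The helper names are hypothetical, and the 4KiB LAPIC size used in
the usage note is an assumption, not something stated in the report:

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr down to a multiple of alignment (a power of two). */
uint64_t align_down_pow2(uint64_t addr, uint64_t alignment)
{
	return addr & ~(alignment - 1);
}

/* True if [a, a + a_size) and [b, b + b_size) share at least one byte. */
bool ranges_overlap(uint64_t a, uint64_t a_size, uint64_t b, uint64_t b_size)
{
	return a < b + b_size && b < a + a_size;
}

/*
 * The bug pattern was of the form
 *	phys = align_down(virt, alignment);
 * whereas the intended adjustment is
 *	phys = align_down(phys, alignment);
 */
uint64_t adjust_phys(uint64_t phys, uint64_t alignment)
{
	return align_down_pow2(phys, alignment);
}
```

For the failing run, ranges_overlap(0xc0000000, 0x40000000, 0xfee00000,
0x1000) is true, which is the conflict reported by the ioctl.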

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <52e487458c3172923549bbcf9dfccfbe6faea60b.1637940473.git.maciej.szmigiero@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:12:13 -05:00
Jason A. Donenfeld
20ae1d6aa1 wireguard: device: reset peer src endpoint when netns exits
Each peer's endpoint contains a dst_cache entry that takes a reference
to another netdev. When the containing namespace exits, we take down the
socket and prevent future sockets from being created (by setting
creating_net to NULL), which removes that potential reference on the
netns. However, it doesn't release references to the netns that a netdev
cached in dst_cache might be taking, so the netns still might fail to
exit. Since the socket is gimped anyway, we can simply clear all the
dst_caches (by way of clearing the endpoint src), which will release all
references.

However, the current dst_cache_reset function only releases those
references lazily. But it turns out that all of our usages of
wg_socket_clear_peer_endpoint_src are called from contexts that are not
exactly high-speed or bottle-necked. For example, when there's
connection difficulty, or when userspace is reconfiguring the interface.
And in particular for this patch, when the netns is exiting. So for
those cases, it makes more sense to call dst_release immediately. For
that, we add a small helper function to dst_cache.
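
The difference between lazy and immediate release can be modelled with a
plain reference count. The types and function names here are hypothetical
and much simpler than the kernel's dst_cache:

```c
#include <stddef.h>

/* Hypothetical model: a refcounted route entry and a one-slot cache. */
struct dst {
	int refcnt;
};

struct cache {
	struct dst *dst;
	int stale;	/* set by a lazy reset */
};

/* Cache d and take a reference on it. */
void cache_set(struct cache *c, struct dst *d)
{
	d->refcnt++;
	c->dst = d;
	c->stale = 0;
}

/* Lazy reset: only marks the entry; the reference is still held. */
void cache_reset_lazy(struct cache *c)
{
	c->stale = 1;
}

/* Immediate reset: drop the reference right away. */
void cache_reset_now(struct cache *c)
{
	if (c->dst) {
		c->dst->refcnt--;
		c->dst = NULL;
	}
	c->stale = 0;
}

/* Lookup: a stale entry is only released when somebody looks it up. */
struct dst *cache_get(struct cache *c)
{
	if (c->stale)
		cache_reset_now(c);
	return c->dst;
}
```

In this model a lazy reset leaves the count untouched until the next
lookup, which never comes once the socket is gone, so only the immediate
variant lets the referenced object go away.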

This patch also adds a test to netns.sh from Hangbin Liu to ensure this
doesn't regress.

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: Xiumei Mu <xmu@redhat.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: 900575aa33 ("wireguard: device: avoid circular netns references")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:45 -08:00
Li Zhijian
7e938beb83 wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLIST
DEBUG_PI_LIST was renamed to DEBUG_PLIST since 8e18faeac3 ("lib/plist:
rename DEBUG_PI_LIST to DEBUG_PLIST").

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Fixes: 8e18faeac3 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:40 -08:00
Jason A. Donenfeld
782c72af56 wireguard: selftests: actually test for routing loops
We previously removed the restriction on looping to self, and then added
a test to make sure the kernel didn't blow up during a routing loop. The
kernel didn't blow up, thankfully, but on certain architectures where
skb fragmentation is easier, such as ppc64, the skbs weren't actually
being discarded after a few rounds through. But the test wasn't catching
this. So actually test explicitly for massive increases in tx to see if
we have a routing loop. Note that the actual loop problem will need to
be addressed in a different commit.

Fixes: b673e24aad ("wireguard: socket: remove errant restriction on looping to self")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:29 -08:00
Jason A. Donenfeld
03ff1b1def wireguard: selftests: increase default dmesg log size
The selftests currently parse the kernel log at the end to track
potential memory leaks. With these tests now reading off the end of the
buffer, due to recent optimizations, some creation messages were lost,
making the tests think that there was a free without an alloc. Fix this
by increasing the kernel log size.

Fixes: 24b70eeeb4 ("wireguard: use synchronize_net rather than synchronize_rcu")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:29 -08:00
Linus Torvalds
c5c17547b7 Networking fixes for 5.16-rc3, including fixes from netfilter.
Current release - regressions:
 
  - r8169: fix incorrect mac address assignment
 
  - vlan: fix underflow for the real_dev refcnt when vlan creation fails
 
  - smc: avoid warning of possible recursive locking
 
 Current release - new code bugs:
 
  - vsock/virtio: suppress used length validation
 
  - neigh: fix crash in v6 module initialization error path
 
 Previous releases - regressions:
 
  - af_unix: fix change in behavior in read after shutdown
 
  - igb: fix netpoll exit with traffic, avoid warning
 
  - tls: fix splice_read() when starting mid-record
 
  - lan743x: fix deadlock in lan743x_phy_link_status_change()
 
  - marvell: prestera: fix bridge port operation
 
 Previous releases - always broken:
 
  - tcp_cubic: fix spurious Hystart ACK train detections for
    not-cwnd-limited flows
 
  - nexthop: fix refcount issues when replacing IPv6 groups
 
  - nexthop: fix null pointer dereference when IPv6 is not enabled
 
  - phylink: force link down and retrigger resolve on interface change
 
  - mptcp: fix delack timer length calculation and incorrect early
    clearing
 
  - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds
 
  - nfc: virtual_ncidev: change default device permissions
 
  - netfilter: ctnetlink: fix error codes and flags used for kernel side
    filtering of dumps
 
  - netfilter: flowtable: fix IPv6 tunnel addr match
 
  - ncsi: align payload to 32-bit to fix dropped packets
 
  - iavf: fix deadlock and loss of config during VF interface reset
 
  - ice: avoid bpf_prog refcount underflow
 
  - ocelot: fix broken PTP over IP and PTP API violations
 
 Misc:
 
  - marvell: mvpp2: increase MTU limit when XDP enabled
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

* tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  net: dsa: microchip: implement multi-bridge support
  net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
  net: mscc: ocelot: set up traps for PTP packets
  net: ptp: add a definition for the UDP port for IEEE 1588 general messages
  net: mscc: ocelot: create a function that replaces an existing VCAP filter
  net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
  net: hns3: fix incorrect components info of ethtool --reset command
  net: hns3: fix one incorrect value of page pool info when queried by debugfs
  net: hns3: add check NULL address for page pool
  net: hns3: fix VF RSS failed problem after PF enable multi-TCs
  net: qed: fix the array may be out of bound
  net/smc: Don't call clcsock shutdown twice when smc shutdown
  net: vlan: fix underflow for the real_dev refcnt
  ptp: fix filter names in the documentation
  ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
  nfc: virtual_ncidev: change default device permissions
  net/sched: sch_ets: don't peek at classes beyond 'nbands'
  net: stmmac: Disable Tx queues when reconfiguring the interface
  selftests: tls: test for correct proto_ops
  tls: fix replacing proto_ops
  ...
2021-11-26 12:58:53 -08:00
Vitaly Kuznetsov
908fa88e42 KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE
With the elevated 'KVM_CAP_MAX_VCPUS' value, the kvm_create_max_vcpus test
may hit RLIMIT_NOFILE limits:

 # ./kvm_create_max_vcpus
 KVM_CAP_MAX_VCPU_ID: 4096
 KVM_CAP_MAX_VCPUS: 1024
 Testing creating 1024 vCPUs, with IDs 0...1023.
 /dev/kvm not available (errno: 24), skipping test

Adjust the RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
opened. Note that raising the hard limit ('rlim_max') requires the
CAP_SYS_RESOURCE capability, which is generally not needed to run KVM
selftests (but without raising the limit the test is doomed to fail anyway).
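
A minimal sketch of this kind of adjustment, using only getrlimit() and setrlimit(). The helper name ensure_nofile_limit is hypothetical and this is not the code from the selftest; a real test would pass a value derived from KVM_CAP_MAX_VCPUS plus some headroom for its other fds.

```c
#include <sys/resource.h>

/* Make sure at least `needed` file descriptors can be opened.
 * Returns 0 on success, -1 if the limit could not be raised. */
int ensure_nofile_limit(rlim_t needed)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;

	if (rl.rlim_cur >= needed)
		return 0;	/* soft limit is already high enough */

	/* Raising the soft limit up to the hard limit needs no privilege. */
	rl.rlim_cur = needed;

	/* Raising the hard limit itself requires CAP_SYS_RESOURCE. */
	if (rl.rlim_max < needed)
		rl.rlim_max = needed;

	return setrlimit(RLIMIT_NOFILE, &rl) == 0 ? 0 : -1;
}
```

When the helper returns -1 because the hard limit is too low and the process lacks CAP_SYS_RESOURCE, the caller can skip the test rather than fail it.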

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
[Skip the test if the hard limit cannot be raised. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 08:14:20 -05:00
Vitaly Kuznetsov
6c1186430a KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test
hyperv_features's sole purpose is to test access to various Hyper-V MSRs
and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
is deprecated and soon-to-be forbidden, avoid it by re-creating the test VM
for each sub-test.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 08:14:19 -05:00
Paolo Bonzini
826bff439f selftests: sev_migrate_tests: free all VMs
Ensure that the ASIDs are freed promptly, which becomes more important
when more tests are added to this file.

Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 06:43:30 -05:00
Paolo Bonzini
4916ea8b06 selftests: fix check for circular KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM leaves the source VM in a dead state,
so migrating back to the original source VM fails the ioctl.  Adjust
the test.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 06:43:29 -05:00
Jakub Kicinski
f884a34262 selftests: tls: test for correct proto_ops
The previous patch fixes callbacks being overridden incorrectly.
Triggering the crash in sendpage_locked would be more spectacular, but
it's hard to get to, so take the easier path of proving this is broken
and call getname. We're currently getting IPv4 socket info on an
IPv6 socket.
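
The kind of check described here can be sketched as follows: ask getsockname() which address family a socket reports and compare it with the family the socket was created with. The helper name reported_family is hypothetical, and the sketch uses a plain TCP socket; the TLS setup that the selftest performs before calling getname is omitted.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Return the address family that getsockname() reports for a fresh TCP
 * socket of the given family, or -1 if the socket cannot be created or
 * queried (for example when IPv6 is unavailable). */
int reported_family(int family)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);
	int fd, ret = -1;

	fd = socket(family, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (getsockname(fd, (struct sockaddr *)&ss, &len) == 0)
		ret = ss.ss_family;

	close(fd);
	return ret;
}
```

A socket created with AF_INET6 should report AF_INET6. Getting AF_INET back on such a socket would be the symptom the commit message describes.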

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:17 -08:00
Jakub Kicinski
274af0f9e2 selftests: tls: test splicing decrypted records
Add tests for half-received and peeked records.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:16 -08:00
Jakub Kicinski
d87d67fd61 selftests: tls: test splicing cmsgs
Make sure we correctly reject splicing non-data records.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:16 -08:00
Jakub Kicinski
ef0fc0b3cc selftests: tls: add tests for handling of bad records
Test broken records.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:15 -08:00
Jakub Kicinski
31180adb0b selftests: tls: factor out cmsg send/receive
Add helpers for sending and receiving special record types.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:15 -08:00
Jakub Kicinski
a125f91fe7 selftests: tls: add helper for creating sock pairs
We have the same code in three places and are about to add a fourth copy.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:15 -08:00
James Prestwood
619ca0d010 selftests: add arp_ndisc_evict_nocarrier to Makefile
This test was previously added to selftests but never added
to the Makefile.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Link: https://lore.kernel.org/r/20211122171806.3529401-1-prestwoj@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23 20:10:13 -08:00
Eric Dumazet
710d5835b7 tools: sync uapi/linux/if_link.h header
This file has not been updated for a while.

Sync it before BIG TCP patch series.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211122184810.769159-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23 19:39:28 -08:00
Nikolay Aleksandrov
02ebe49ab0 selftests: net: fib_nexthops: add test for group refcount imbalance bug
The new selftest runs a sequence that causes a circular refcount
dependency between deleted objects, which then cannot be released,
resulting in a netdevice refcount imbalance.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 15:44:49 +00:00